Skip to main contentEnhanced Usage Monitoring
January 31, 2026
New comprehensive usage tracking and reporting features for better resource management:
- Datasource-level breakdowns for granular usage visibility
- Account-based tracking with improved join keys for accurate reporting
- 10-minute update intervals for near real-time usage insights
- Automated cleanup of expired data for accurate retention calculations
Improved Onboarding Experience
January 30, 2026
Streamlined onboarding with enhanced user flows:
- Redesigned onboarding cards with clearer visual hierarchy
- “My First Playground” experience for hands-on experimentation
- Role collection during signup for personalized setup
- Custom hover states matching each card’s accent color
Real-Time Evaluations
January 30, 2026
Run evaluations immediately on incoming data with real-time ingestion:
- Instant evaluation of production traces without delays
- Latent evaluation support for updating earlier spans
- Seamless cutover between batch and real-time processing
- Available across all Arize AX tiers by default
AWS Bedrock Custom Endpoints
January 30, 2026
Enhanced AWS Bedrock integration for enterprise deployments:
- Custom base URL support for private endpoints
- Inference profile ARNs for multi-region routing
- Custom model configurations for specialized deployments
- Simplified regional management with unified tracking
Wildcard Array Path Variables
January 30, 2026
Access array data more flexibly in templates and experiments:
- Wildcard (
*) patterns to reference all array elements
- Last-index (
-1) access for the most recent item
- Automatic generation of wildcard variants for convenience
- Support in task variables and experiment columns
Improved Queue Management
January 29, 2026
Better user experience when managing annotation queues:
- Duplicate detection with clear error messages
- Added and skipped record counts after bulk operations
- Actionable feedback when attempting to add existing records
Session Evaluations with Conversation Context
January 23, 2026
Evaluate entire conversation flows with new virtual attributes:
{conversation} template variable for session-level evaluations
- Chronologically ordered input/output pairs
- Automatic aggregation of multi-turn dialogues
- Root span filtering for accurate session context
Circuit Breaker for Evaluation Tasks
January 29, 2026
Protect resources during evaluation failures with intelligent circuit breaking:
- Immediate abort on authentication errors (401/403)
- Automatic detection of systemic issues after 10 consecutive failures
- Failure rate monitoring to stop doomed batches early
- Resource optimization by preventing guaranteed-to-fail requests
Tracing Configuration for Evaluation Tasks
January 23, 2026
Enable detailed debugging for evaluation tasks:
- Toggle tracing on/off in Advanced Options
- Automatic trace generation for monitoring and debugging
- Persistent settings saved with your tasks
- Production-ready visibility into evaluation execution
Enhanced RBAC System
January 27, 2026
Fine-grained access control with the new RBAC system:
- Custom roles with specific permissions
- Space-level role bindings for granular access management
- Coexistence with legacy roles during migration
- UI support for role assignment across all user management pages
- Automatic fallback to legacy roles when custom roles are deleted
Enhanced Dashboard Time Persistence
January 22, 2026
Your dashboard preferences now persist automatically:
- Auto-save time range, time zone, and granularity selections
- Instant restoration when returning to dashboards
- Per-dashboard settings for customized views
- Seamless experience across sessions
Trace Table Performance Improvements
January 21, 2026
Faster loading times for the tracing table:
- 30-50% faster initial load times
- String truncation for large content
- Lazy loading of full values in tooltips
- Minimal impact on user experience
Expandable Trace Hierarchy
January 20, 2026
View trace structure directly in the table:
- Expand traces to see child spans inline
- Hierarchical visualization without opening slideouts
- Faster navigation through complex traces
- Contextual understanding of request flow
Custom Prompt Release Labels
January 20, 2026
Organize and track prompt versions with custom labels:
- Tag prompt versions with meaningful identifiers
- Environment markers like “staging” or “production”
- Dynamic label suggestions from existing prompts
- Easy retrieval of specific prompt releases
Enhanced Annotation Configs
January 12, 2026
More powerful annotation workflows with improved configs:
- Color-coded categories based on optimization direction
- Read-only view for reviewing existing configs
- Optimization direction control (maximize, minimize, or none)
- Clear label guidance for consistent evaluations
Eval Hub Enhancements
January 16, 2026
Improved evaluation management and visibility:
- Model information in evaluator listings with provider icons
- Evaluator counts in running tasks with hover details
- Automatic save when creating or editing evaluators
- Streamlined task flow for faster evaluation setup
Todo List Management Improvements
January 16, 2026
More reliable task tracking in Alyx conversations:
- Visual status indicators for all todo states
- Dynamic reminders with exact update calls needed
- Plan preservation across human-in-the-loop pauses
- Clearer instructions positioned near the plan
Stacked Bar Chart Widgets
January 9, 2026
Visualize multi-dimensional data with new chart types:
- Stacked bar charts for comparing categories over time
- Druid-powered queries for fast rendering
- Customizable groupings and dimensions
- Dashboard integration for comprehensive monitoring
Scatter Plot Widgets
January 21, 2026
Explore relationships between variables with scatter plots:
- Correlation analysis for two numeric dimensions
- Interactive data points for detailed investigation
- Dashboard integration for visual analytics
- Customizable axes and filtering
Enhanced Monitor Configuration
January 21, 2026
More control over monitor behavior:
- Configurable auto-threshold lookback windows via feature flag
- Extended lookback periods for sparse data projects
- Flexible threshold calculation based on historical patterns
- Account-specific customization for unique requirements
Java SDK Space ID Support
January 14, 2026
Modern authentication for Java applications:
- Space ID authentication (space keys deprecated)
- Backward compatibility maintained with existing constructors
- Updated documentation and examples
- Test coverage for new authentication method
Improved Error Handling for Exceptions
January 23, 2026
Better filtering and debugging capabilities:
- Filter by
exception.type and exception.message in the UI
- OpenInference semantic convention support for exceptions
- Consistent data structure across datasources
- Faster troubleshooting of error patterns
SAML Role Mapping Search
January 23, 2026
Navigate large role mapping configurations easily:
- Client-side search across attributes, spaces, roles, and organizations
- Visual highlighting of search matches
- Keyboard navigation through results
- Improved usability for enterprise customers
Span-to-Queue Workflow
January 15, 2026
Add spans and dataset examples to annotation queues seamlessly:
- Multiple entry points from spans table, trace slideover, and queue records
- New or existing queue selection
- Batch operations for efficient queue population
- Dataclusters integration for reliable processing
Enhanced Session Slideover
January 21, 2026
Better conversation visualization and navigation:
- Trace labels with links to detailed views
- Visual separators between traces
- Hover highlighting synchronized between list and conversation
- Improved readability for multi-turn interactions
Batch Annotation Updates
January 12, 2026
Efficiently annotate large volumes of data:
- Optimization direction support in annotation configs
- Category-based labeling for issue detection
- Best practice guidance for naming and structure
- Streamlined categorization workflows
Prompt Optimization on Experiments
January 6, 2026
Run prompt optimization directly on experiment results:
- Experiment selector in optimization task creation
- Dynamic column resolution for experiment data
- Enhanced iteration on proven prompts
- Seamless workflow from experiments to optimization
Custom Metrics with LIKE Operator
January 27, 2026
More powerful filtering in custom metrics:
- LIKE and ILIKE operators for pattern matching
- Wildcard support with
% syntax
- Case-insensitive matching with ILIKE
- Direct Druid mapping for performance
Dashboard Template Filtering
January 27, 2026
Cleaner dashboard creation experience:
- LLM-only space filtering shows only relevant templates
- Context-aware templates based on project types
- Reduced clutter in template selection
- Consistent experience across spaces and projects
Pivot Table Widget Schema
January 27, 2026
Foundation for advanced tabular data visualization:
- Grouped categorical dimensions for organized views
- Configurable numeric columns with aggregations
- Flexible filtering and time range support
- Dashboard integration ready
Enhanced Space Model Schema
January 14, 2026
More control over data retention and lookback:
- Space-level schema lookback overrides for custom retention
- Model-specific configurations for unique requirements
- Flexible data management across different use cases
Exact Match Code Evaluator
January 14, 2026
New built-in evaluator for validation:
- String equality checks for exact matches
- Expected vs actual comparisons for testing
- Multi-field access with dataset row support
- Alphabetically sorted evaluator list in UI
Experiment Task Timeout Configuration
January 21, 2026
Accommodate long-running evaluations:
- Configurable timeout parameter beyond 120 seconds
- Function-level control in run_experiment and evaluate_experiment
- Backward compatibility with default values
- Support for complex evaluators requiring extended processing
Arrow Schema Reconciliation
January 14, 2026
Improved data handling across distributed segments:
- Parallel schema fetching from historicals
- Unified schema reconciliation across partitions
- Automatic conversion for schema consistency
- Support for both Druid and Arrow segments
Atlantis Terraform Automation
January 15, 2026
Streamlined infrastructure-as-code workflows:
- Pull request integration for Terraform plans
- Automated plan posting as PR comments
- DevOps team permissions for webhook debugging
- Structured review process before applying changes
Google Analytics 4 BigQuery Sync
January 8, 2026
Automated analytics data export:
- Daily GA4 to BigQuery transfers via Terraform
- Raw event data access for advanced analysis
- Overcome GA4 limitations like sampling and retention
- Custom reporting capabilities with full data access
Vertex AI Migration
January 8, 2026
Updated integration with Google Cloud AI:
- Seamless Vertex AI connectivity for LLM applications
- Enhanced observability for Google Cloud deployments
- Modernized instrumentation for better tracing
Custom Model Migrations
January 7, 2026
Expanded support for custom integrations:
- Custom model endpoint support in evaluations
- Higher traffic model optimization for performance
- Flexible integration options for enterprise deployments
Generative Service Monitoring
January 8, 2026
Comprehensive monitoring for evaluation infrastructure:
- Uptime and health alerts with paging
- CPU and memory monitoring with warnings
- Dedicated Grafana dashboard for visibility
- Runbook documentation for incident response
Labeling Queue Annotations
January 5, 2026
More flexible annotation management:
- Clear annotations (reset to null) anywhere
- Support across spans, queues, and experiments for consistent workflows
- Improved annotation lifecycle management
Enhanced Eval Hub Empty States
January 9, 2026
Better guidance for getting started:
- Improved empty state design with clear next steps
- Documentation links for learning resources
- Actionable cards for common workflows
Resizable Trace Slideover
January 22, 2026
Customize your viewing experience:
- Draggable slideover width for optimal layout
- Persistent sizing preferences across sessions
- Better content visibility for long traces
Configurable Experiment Timeout
January 21, 2026
Handle complex evaluation scenarios:
- Custom timeout values for long-running tasks
- Per-experiment configuration for flexibility
- Backward compatible defaults for existing code
Enhanced Platform Stability
January 2026
Numerous improvements to platform reliability and performance:
- Configuration drift resolution in GCP Terraform
- Enhanced error handling across services
- Improved logging and monitoring for faster troubleshooting
- Database migration optimizations for schema updates
- Better resource management for high-volume workloads