Our Vision for Alyx
The Vision: Claude Code and Cursor for AI Engineering
When Claude Code and Cursor revolutionized software development, they didn't just add AI to an existing workflow—they fundamentally reimagined how developers interact with code. Instead of manually searching through files, writing queries, or configuring tools, developers could simply describe what they needed in natural language and have an intelligent agent understand context, navigate codebases, and take action.
We're building Alyx with the same vision for AI engineering.
Just as Claude Code and Cursor became essential tools for code development, we envision Alyx as the indispensable agent for building, debugging, and optimizing AI systems. The core insight is the same: the future of complex technical work isn't about building more dashboards or exposing more configuration. It's about building intelligent agents that understand your applications and can execute complex workflows on your behalf through natural conversation.
Our Approach: Agent-First Architecture
Intelligent Orchestration Over Static Queries
Alyx is built around a sophisticated orchestration system that manages multi-step workflows, coordinates tool calls, and maintains context across complex analyses. Rather than requiring users to know which tool to use or what information to provide in their queries, Alyx dynamically routes requests, selects appropriate tools, and chains operations together to answer user questions.
The orchestrator architecture enables:
Automatic task decomposition - Complex questions are broken down into multi-step plans
Tool call coordination - The agent selects and sequences tools based on context
Conversation continuity - Context is maintained across iterations and tool calls
Error handling and recovery - The agent can adapt when operations fail or need retries
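To make the orchestration idea concrete, here is a minimal Python sketch of a loop that sequences tools, carries context between steps, and retries on failure. It is an illustration under stated assumptions, not Alyx's actual implementation: the `Orchestrator` class, the `fetch_traces` and `find_slowest` tools, and the sample trace records are all hypothetical. The plan is passed in as a fixed list of steps; in a real agent that list would be produced by task decomposition.

```python
from dataclasses import dataclass, field
from typing import Callable

# A tool receives the shared context and returns new key/value pairs to merge in.
# (Hypothetical interface, chosen only for this sketch.)
Tool = Callable[[dict], dict]


@dataclass
class Orchestrator:
    tools: dict[str, Tool]
    max_attempts: int = 3
    context: dict = field(default_factory=dict)   # shared across all steps
    log: list[str] = field(default_factory=list)  # human-readable progress

    def _run_step(self, step: str) -> bool:
        """Try one tool up to max_attempts times; return True on success."""
        for attempt in range(1, self.max_attempts + 1):
            try:
                self.context.update(self.tools[step](self.context))
                self.log.append(f"{step}: ok (attempt {attempt})")
                return True
            except RuntimeError as exc:
                self.log.append(f"{step}: failed (attempt {attempt}): {exc}")
        return False

    def run(self, plan: list[str]) -> dict:
        """Run the planned steps in order, stopping at the first step that keeps failing."""
        for step in plan:
            if not self._run_step(step):
                self.context["failed_step"] = step
                break
        return self.context


def make_flaky_fetch() -> Tool:
    """Build a fake trace fetcher that fails on its first call, to exercise retries."""
    calls = {"n": 0}

    def fetch_traces(ctx: dict) -> dict:
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("timeout")
        return {"traces": [{"id": "t1", "latency_ms": 1200},
                           {"id": "t2", "latency_ms": 300}]}

    return fetch_traces


def find_slowest(ctx: dict) -> dict:
    """Use the traces that an earlier step placed in the shared context."""
    slowest = max(ctx["traces"], key=lambda t: t["latency_ms"])
    return {"slowest_trace": slowest["id"]}


orchestrator = Orchestrator(tools={"fetch_traces": make_flaky_fetch(),
                                   "find_slowest": find_slowest})
result = orchestrator.run(["fetch_traces", "find_slowest"])
print(result["slowest_trace"])
print(orchestrator.log)
```

The second tool never asks the user for trace data; it reads what the first tool left in the shared context, which is the sense in which context is maintained across tool calls.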
Native Understanding of AI System Data
Traditional analysis tools treat AI system data as generic observability traces. Alyx is purpose-built to understand the unique structure and semantics of LLM workflows and AI agents. The system has deep knowledge of:
Traces and spans with inputs, outputs, tool calls, prompts, latency, and error tracebacks
Evaluation frameworks and how to interpret evaluation results
Prompt structure and optimization strategies
Experiment workflows from dataset creation through evaluation
Annotation schemas for categorizing and labeling issues
This domain expertise means Alyx can answer questions precisely without requiring you to manually describe what your system does, which columns are important, or how your data is structured. Alyx understands that a "latency bottleneck" in an LLM system requires different analysis than a latency issue in a traditional service, and it knows which traces, spans, and metrics matter most for your specific question.
Multi-Step Planning with Visibility
One of our core design principles is transparency into the agent's reasoning and progress. Alyx uses an explicit todo management system that:
Plans complex tasks before execution
Tracks progress through multi-step analyses
Provides visibility into what the agent is doing and why
Enables iteration as users refine their questions
When you ask Alyx to "find what's wrong with this model and suggest improvements," it doesn't just start executing. It first creates a plan.

This planning approach, inspired by how experienced engineers approach complex problems, ensures thorough analysis and gives users confidence in the agent's methodology.
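As a rough illustration of what explicit todo tracking can look like, the Python sketch below models a plan as a list of items with a status each. It is an assumption-laden example rather than a description of Alyx's internals: the `Plan`, `Todo`, and `Status` names and the sample steps are hypothetical and are not the plan Alyx would actually generate.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    DONE = "done"


@dataclass
class Todo:
    description: str
    status: Status = Status.PENDING


@dataclass
class Plan:
    goal: str
    todos: list[Todo] = field(default_factory=list)

    def next_pending(self) -> Optional[Todo]:
        """Return the first item that has not been started, or None if there is none."""
        return next((t for t in self.todos if t.status is Status.PENDING), None)

    def render(self) -> str:
        """Format the plan so a user can see what is done and what remains."""
        marks = {Status.PENDING: "[ ]", Status.IN_PROGRESS: "[~]", Status.DONE: "[x]"}
        lines = [f"Goal: {self.goal}"]
        lines += [f"{marks[t.status]} {t.description}" for t in self.todos]
        return "\n".join(lines)


# Hypothetical steps, for illustration only.
plan = Plan(
    goal="Find what's wrong with this model and suggest improvements",
    todos=[
        Todo("Review recent traces for errors"),
        Todo("Summarize evaluation results"),
        Todo("Propose prompt changes"),
    ],
)
plan.todos[0].status = Status.DONE
plan.todos[1].status = Status.IN_PROGRESS
print(plan.render())
```

Keeping the plan as data rather than as hidden agent state is what makes it possible to show progress to the user and to revise remaining items when the question changes.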
Analysis, Not Just Categorization
Counting and categorizing alone aren't enough. Without context, labels and aggregate metrics don't explain what patterns mean, why they matter, or what to do next.
Alyx takes a fundamentally different approach: instead of just categorizing and aggregating, it analyzes your data to provide actual insights. When Alyx identifies patterns in your traces, it explains:
What the pattern means in the context of your system
Why it's happening based on the specific data it's analyzing
What actions to take — including actions Alyx can execute on your behalf
What to investigate next as you iterate on improvements

This insight-driven approach means you don't just get statistics—you get understanding.
Why This Approach Matters
Complexity Without Friction
Modern AI systems are complex: multi-agent workflows, multi-step prompt chains, evaluation results, and operational concerns all interact in ways that are difficult to understand. Traditional tools expose this complexity directly, requiring users to understand query languages, tool configurations, and data structures.
Alyx absorbs this complexity into an intelligent agent that understands your intent and handles the technical details. You can ask "what's causing the latency spikes?" and Alyx knows to:
Query trace data
Calculate latency contributions
Identify bottleneck spans
Correlate with error patterns
Provide specific recommendations
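The middle steps of that list can be sketched in a few lines of Python. This is a simplified illustration, not how Alyx computes anything: the span records, field names (`name`, `latency_ms`, `error`), and both helper functions are hypothetical, and span latencies are treated as non-overlapping so that they can simply be summed.

```python
from collections import defaultdict

# Hypothetical span records, as they might look after querying trace data.
spans = [
    {"trace_id": "t1", "name": "retriever", "latency_ms": 200, "error": False},
    {"trace_id": "t1", "name": "llm_call", "latency_ms": 1500, "error": False},
    {"trace_id": "t2", "name": "retriever", "latency_ms": 250, "error": False},
    {"trace_id": "t2", "name": "llm_call", "latency_ms": 2700, "error": True},
    {"trace_id": "t2", "name": "tool_call", "latency_ms": 350, "error": False},
]


def latency_contributions(spans):
    """Return (span name, share of total latency) pairs, largest share first."""
    totals = defaultdict(float)
    for span in spans:
        totals[span["name"]] += span["latency_ms"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    shares = {name: ms / grand_total for name, ms in totals.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


def error_rate_by_span(spans):
    """Return the fraction of spans with an error, per span name."""
    counts = defaultdict(lambda: [0, 0])  # name -> [error count, span count]
    for span in spans:
        counts[span["name"]][0] += int(span["error"])
        counts[span["name"]][1] += 1
    return {name: errors / total for name, (errors, total) in counts.items()}


ranked = latency_contributions(spans)
bottleneck, share = ranked[0]
errors = error_rate_by_span(spans)
print(f"{bottleneck} accounts for {share:.0%} of latency; "
      f"error rate {errors[bottleneck]:.0%}")
```

With this sample data the `llm_call` span dominates total latency and is also the only span type with errors, which is the kind of correlation the final two steps in the list are meant to surface.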
Democratizing AI Engineering Expertise
Building and operating AI systems requires specialized knowledge: understanding evaluation metrics, prompt engineering techniques, trace analysis, and system optimization. Alyx makes this expertise accessible through natural language, allowing more team members to contribute to AI system improvement without becoming experts in every aspect of the stack.
Iterative Improvement Through Conversation
The conversational interface enables an iterative workflow that's impossible with static dashboards or one-shot queries. Users can:
Ask follow-up questions based on initial results
Refine analyses as they learn more
Explore different angles of investigation
Get explanations for technical concepts
This conversational loop makes complex system analysis feel like a collaboration rather than a solo investigation.
Building for the Future of AI Engineering
We believe the future of AI engineering tools looks like Alyx: intelligent agents that understand your domain, can execute complex workflows, and enable natural language interaction with sophisticated systems. This isn't about adding AI features to existing tools—it's about reimagining how humans and AI collaborate on technical work.
The same revolution that Claude Code and Cursor brought to software development—moving from manual navigation and configuration to intelligent, conversational assistance—is coming to AI engineering. Alyx is our vision for what that looks like.
Alyx represents a new paradigm in AI engineering tools: not more dashboards or more metrics, but an intelligent partner that understands your AI systems and helps you make them better.

