Our Vision for Alyx

The Vision: Claude Code and Cursor for AI Engineering

When Claude Code and Cursor revolutionized software development, they didn't just add AI to an existing workflow—they fundamentally reimagined how developers interact with code. Instead of manually searching through files, writing queries, or configuring tools, developers could simply describe what they needed in natural language and have an intelligent agent understand context, navigate codebases, and take action.

We're building Alyx with the same vision for AI engineering.

Just as Claude Code and Cursor became essential tools for code development, we envision Alyx as the indispensable agent for building, debugging, and optimizing AI systems. The core insight is the same: the future of complex technical work isn't about building more dashboards or exposing more configuration—it's about building intelligent agents that understand your applications and can execute complex workflows through natural conversation on your behalf.

Our Approach: Agent-First Architecture

Intelligent Orchestration Over Static Queries

Alyx is built around a sophisticated orchestration system that manages multi-step workflows, coordinates tool calls, and maintains context across complex analyses. Rather than requiring users to know which tool to use or what information to provide in their queries, Alyx dynamically routes requests, selects appropriate tools, and chains operations together to answer user questions.

The orchestrator architecture enables:

  • Automatic task decomposition - Complex questions are broken down into multi-step plans

  • Tool call coordination - The agent selects and sequences tools based on context

  • Conversation continuity - Context is maintained across iterations and tool calls

  • Error handling and recovery - The agent can adapt when operations fail or need retries
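To make the capabilities above concrete, here is a minimal sketch of an orchestration loop. It is not Alyx's implementation: the `Orchestrator` and `Step` classes and the `query_traces` and `summarize` tools are hypothetical names invented for illustration. The sketch assumes that a plan is an ordered list of tool calls, that tools share one context dictionary, and that a failed step is retried a fixed number of times.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical tool signature: a tool reads the shared context and returns new facts.
Tool = Callable[[dict], dict]


@dataclass
class Step:
    tool: str          # name of the tool to call (illustrative)
    done: bool = False


@dataclass
class Orchestrator:
    tools: dict[str, Tool]
    max_retries: int = 2
    context: dict = field(default_factory=dict)  # carried across steps

    def run(self, plan: list[Step]) -> dict:
        for step in plan:
            for _ in range(self.max_retries + 1):
                try:
                    # Merge the tool's result into the context for later steps.
                    self.context.update(self.tools[step.tool](self.context))
                    step.done = True
                    break
                except RuntimeError as exc:
                    # Record the failure, then retry until attempts run out.
                    self.context.setdefault("errors", []).append(f"{step.tool}: {exc}")
            if not step.done:
                break  # later steps depend on this one, so stop the chain
        return self.context


# Illustrative tools: the first call to query_traces fails, the retry succeeds.
calls = {"n": 0}


def query_traces(ctx: dict) -> dict:
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("timeout")
    return {"latencies_ms": [120, 450, 90]}


def summarize(ctx: dict) -> dict:
    return {"slowest_ms": max(ctx["latencies_ms"])}


plan = [Step("query_traces"), Step("summarize")]
orchestrator = Orchestrator(tools={"query_traces": query_traces, "summarize": summarize})
result = orchestrator.run(plan)
print(result)
```

In this sketch the second step only works because the first step's output stays in the shared context, which is the kind of continuity the list above describes.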

Native Understanding of AI System Data

Traditional analysis tools treat AI system data as generic observability traces. Alyx is purpose-built to understand the unique structure and semantics of LLM workflows and AI agents. The system has deep knowledge of:

  • Traces and spans with inputs, outputs, tool calls, prompts, latency, and error tracebacks

  • Evaluation frameworks and how to interpret evaluation results

  • Prompt structure and optimization strategies

  • Experiment workflows from dataset creation through evaluation

  • Annotation schemas for categorizing and labeling issues

This domain expertise means Alyx can answer questions with precision without requiring you to manually configure what your system does, which columns are important, or how your data is structured. Alyx automatically understands that a "latency bottleneck" in an LLM system requires different analysis than a latency issue in a traditional service, and it knows which traces, spans, and metrics matter most for your specific question.

Multi-Step Planning with Visibility

One of our core design principles is transparency into the agent's reasoning and progress. Alyx uses an explicit todo management system that:

  • Plans complex tasks before execution

  • Tracks progress through multi-step analyses

  • Provides visibility into what the agent is doing and why

  • Enables iteration as users refine their questions

When you ask Alyx to "find what's wrong with this model and suggest improvements," it doesn't just start executing. It first creates a plan, then works through that plan step by step.

This planning approach, inspired by how experienced engineers approach complex problems, ensures thorough analysis and gives users confidence in the agent's methodology.
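An explicit todo list of this kind could be modeled roughly as follows. This is a sketch only: the `TodoList` class, the status markers, and the three plan steps are all hypothetical and do not show Alyx's actual plan format or output.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    DONE = "done"


@dataclass
class Todo:
    description: str
    status: Status = Status.PENDING


class TodoList:
    """Hypothetical plan tracker: plan first, then advance one item at a time."""

    def __init__(self, descriptions):
        self.items = [Todo(d) for d in descriptions]

    def start_next(self):
        """Mark the first pending item as in progress and return it, or None."""
        for item in self.items:
            if item.status is Status.PENDING:
                item.status = Status.IN_PROGRESS
                return item
        return None

    def complete(self, item):
        item.status = Status.DONE

    def render(self):
        """Return a human-readable view of progress, one line per item."""
        marks = {Status.PENDING: "[ ]", Status.IN_PROGRESS: "[~]", Status.DONE: "[x]"}
        return "\n".join(f"{marks[i.status]} {i.description}" for i in self.items)


# Made-up plan steps, for illustration only.
todos = TodoList([
    "Sample recent traces",
    "Group failures by span",
    "Suggest prompt changes",
])
first = todos.start_next()
todos.complete(first)
todos.start_next()
progress = todos.render()
print(progress)
```

Rendering the list after every state change is what gives the user visibility: at any point they can see which steps are finished, which one is running, and which remain.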

Analysis, Not Just Categorization

Counting and categorizing alone aren't enough. Without context, labels and aggregate metrics don't explain what patterns mean, why they matter, or what to do next.

Alyx takes a fundamentally different approach: instead of just categorizing and aggregating, it analyzes your data to provide actual insights. When Alyx identifies patterns in your traces, it explains:

  • What the pattern means in the context of your system

  • Why it's happening based on the specific data it's analyzing

  • What actions to take — including actions Alyx can execute on your behalf

  • What to investigate next as you iterate on improvements

This insight-driven approach means you don't just get statistics—you get understanding.

Why This Approach Matters

Complexity Without Friction

Modern AI systems are complex: multi-agent workflows, prompt chains, evaluation results, and operational concerns all interact in ways that are difficult to understand. Traditional tools expose this complexity directly, requiring users to understand query languages, tool configurations, and data structures.

Alyx absorbs this complexity into an intelligent agent that understands your intent and handles the technical details. You can ask "what's causing the latency spikes?" and Alyx knows to:

  • Query trace data

  • Calculate latency contributions

  • Identify bottleneck spans

  • Correlate with error patterns

  • Provide specific recommendations
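As a rough illustration of the middle steps in that list, the sketch below computes each span's latency contribution and picks the bottleneck. It is not how Alyx performs this analysis. The `Span` fields and the sample trace are assumptions, and the calculation assumes child spans run sequentially without overlapping, so a span's self time is its duration minus the durations of its direct children.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    # Hypothetical, simplified span record.
    span_id: str
    parent_id: Optional[str]
    name: str
    duration_ms: float
    error: bool = False


def self_times(spans):
    """Self time = span duration minus the total duration of its direct children."""
    child_total = defaultdict(float)
    for s in spans:
        if s.parent_id is not None:
            child_total[s.parent_id] += s.duration_ms
    return {s.span_id: s.duration_ms - child_total[s.span_id] for s in spans}


def bottleneck(spans):
    """Return the span with the largest self time and its share of the root duration."""
    times = self_times(spans)
    worst = max(spans, key=lambda s: times[s.span_id])
    root = next(s for s in spans if s.parent_id is None)
    return worst, times[worst.span_id] / root.duration_ms


# Made-up trace: one root span with two sequential children.
spans = [
    Span("a", None, "agent", 1000.0),
    Span("b", "a", "retriever", 200.0),
    Span("c", "a", "llm_call", 700.0, error=True),
]
worst, share = bottleneck(spans)
print(f"{worst.name}: {share:.0%} of trace latency, error={worst.error}")
```

Using self time rather than raw duration keeps the root span from always looking like the slowest one, and carrying the `error` flag alongside it is a simple way to check whether the bottleneck span is also a failing one.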

Democratizing AI Engineering Expertise

Building and operating AI systems requires specialized knowledge: understanding evaluation metrics, prompt engineering techniques, trace analysis, and system optimization. Alyx makes this expertise accessible through natural language, allowing more team members to contribute to AI system improvement without becoming experts in every aspect of the stack.

Iterative Improvement Through Conversation

The conversational interface enables an iterative workflow that's impossible with static dashboards or one-shot queries. Users can:

  • Ask follow-up questions based on initial results

  • Refine analyses as they learn more

  • Explore different angles of investigation

  • Get explanations for technical concepts

This conversational loop makes complex system analysis feel like a collaboration rather than a solo investigation.

Building for the Future of AI Engineering

We believe the future of AI engineering tools looks like Alyx: intelligent agents that understand your domain, can execute complex workflows, and enable natural language interaction with sophisticated systems. This isn't about adding AI features to existing tools—it's about reimagining how humans and AI collaborate on technical work.

The same revolution that Claude Code and Cursor brought to software development—moving from manual navigation and configuration to intelligent, conversational assistance—is coming to AI engineering. Alyx is our vision for what that looks like.


Alyx represents a new paradigm in AI engineering tools: not more dashboards or more metrics, but an intelligent partner that understands your AI systems and helps you make them better.
