AX Agent Improvement Loop runs workers in isolated sandboxes. Each run uses an agent preset (harness, sandbox provider, optional skills, optional repo) and a project for trace and eval context. Workers read telemetry from Arize and optionally call external tools through skills. They write artifacts—investigation files, eval output, branches, or PRs—for your team to review. Workers do not deploy to your production environment on their own.

What a run uses

  • Harness — The agent runtime inside the sandbox (for example, Claude Code or Codex). Defined on the preset.
  • Sandbox — The compute environment (Arize-managed Kubernetes, Claude-managed, or another connected provider). The worker clones a repo, installs dependencies, and loads skills here.
  • Project — The LLM project whose traces and evals the worker may read for that run. Bound in Studio or on the Signal automation.
  • Skills — Account-level integrations (GitHub, Arize, Datadog, custom skills) attached to the preset. Each skill injects credentials into the sandbox as environment variables. See Skills and permissions.
  • Repo — Optional GitHub repository on the preset. Requires a GitHub skill. Used for code context and opening PRs.
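
As a rough illustration of how these pieces relate, the sketch below models a preset as a plain Python data class and checks the one dependency stated above: a repo requires a GitHub skill. The class name `AgentPreset`, its field names, and the example values are all hypothetical; they are not part of any Arize API or configuration schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentPreset:
    """Hypothetical model of a preset; names are illustrative, not a real schema."""

    harness: str                    # agent runtime, e.g. "claude-code" or "codex"
    sandbox_provider: str           # compute environment, e.g. "arize-kubernetes"
    skills: tuple[str, ...] = ()    # account-level skills attached to the preset
    repo: Optional[str] = None      # optional GitHub repository

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the preset is consistent."""
        problems = []
        if not self.harness:
            problems.append("harness is required")
        if not self.sandbox_provider:
            problems.append("sandbox provider is required")
        # Per the description above, a repo only makes sense with a GitHub skill.
        if self.repo is not None and "github" not in self.skills:
            problems.append("repo requires a GitHub skill")
        return problems


preset = AgentPreset(
    harness="claude-code",
    sandbox_provider="arize-kubernetes",
    skills=("github", "datadog"),
    repo="example-org/example-repo",  # placeholder repository name
)
print(preset.validate())
```

The project is deliberately left out of the class: it is bound per run (in Studio or on the Signal automation) rather than stored on the preset.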

Session vs automation

Sessions and automations use the same preset and sandbox model.
  • Session — Started from Agent Studio. You can follow the transcript and send follow-ups.
  • Automation — Same worker configuration on a cron, metric threshold, or monitor trigger. Signal is a built-in automation on each LLM project.
Manage all runs from Fleet Observability.

What workers can access

| Source | Scope |
| --- | --- |
| Arize traces (default) | Read spans on the bound project; provisioned automatically at job start, no Arize skill required |
| Arize skill (optional) | Broader Arize API access (datasets, experiments, evaluators, etc.) per the ARIZE_API_KEY you attach |
| GitHub | Repos and tokens you configure on the GitHub skill; repo field on the preset when GitHub is enabled |
| Datadog | APIs allowed by keys on the Datadog skill |
| Custom skills | GitHub repos you list as the skill install source (clone into the sandbox) |

Default trace access respects the project binding and the user's space and project RBAC. An attached Arize skill adds whatever permissions its API key has; it does not bypass RBAC beyond what that key allows.
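
One way to picture the default rule is as a set intersection: a run can read a project only if that project is the one bound to the run and the user's RBAC also allows it. The sketch below is a simplified mental model, not Arize's implementation; the function names and project names are hypothetical, and it does not model the extra permissions an attached Arize skill may add.

```python
def default_trace_access(bound_project: str, rbac_projects: set[str]) -> set[str]:
    """Projects readable by default: the bound project, if RBAC also allows it."""
    return {bound_project} & rbac_projects


def can_read(project: str, bound_project: str, rbac_projects: set[str]) -> bool:
    """True only when the project is both bound to the run and permitted by RBAC."""
    return project in default_trace_access(bound_project, rbac_projects)


# Hypothetical projects the user is allowed to see.
rbac = {"checkout-bot", "support-bot"}

print(can_read("checkout-bot", "checkout-bot", rbac))  # bound and allowed by RBAC
print(can_read("support-bot", "checkout-bot", rbac))   # allowed by RBAC, but not bound
print(can_read("billing-bot", "billing-bot", rbac))    # bound, but RBAC does not allow it
```

Only the first call grants access; in the other two cases the narrower of the two constraints wins.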

What workers cannot do

  • Run as your customer-facing production agent — the improvement loop improves systems you observe in Arize; it does not replace your app’s runtime.
  • Change production without review — Code changes go through PRs; you merge in GitHub.
  • Access projects or spaces outside the binding you set on the run (or your RBAC, whichever is narrower).

Credentials and secrets

  • Skill secrets (API keys, tokens) are stored encrypted at rest on the account and injected into the sandbox for the run.
  • Starting a worker may create a short-lived service key for sandbox provisioning. Creating jobs requires appropriate space permissions; see Skills and permissions.
  • Do not put secrets in task prompts. Configure them on the skill or preset.
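
Because skills inject credentials as environment variables, code running in the sandbox can read them from the environment instead of receiving them in a prompt. The following is a minimal sketch of that pattern using only the Python standard library. `ARIZE_API_KEY` is the variable name mentioned above; the helper names `require_secret` and `redact` are hypothetical and introduced here for illustration.

```python
import os


def require_secret(name: str) -> str:
    """Read a secret from the environment and fail clearly if it is missing."""
    value = os.environ.get(name, "")
    if not value:
        # The message names the variable but never echoes a secret value.
        raise RuntimeError(f"{name} is not set; configure it on the skill or preset")
    return value


def redact(value: str, visible: int = 4) -> str:
    """Mask a secret for logs, keeping only the last few characters."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


if __name__ == "__main__":
    try:
        key = require_secret("ARIZE_API_KEY")
        print("Using key", redact(key))
    except RuntimeError as err:
        print("Configuration problem:", err)
```

Redacting before logging matters here because agent execution is traced, so anything printed may end up in a job transcript.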

Sandboxes and isolation

Each run gets a dedicated sandbox instance. When the run ends, the environment is torn down according to your sandbox provider settings. Agent execution is traced so you can audit tool use and outputs in job detail. Treat sandbox workers like any privileged automation: scope repos and API keys to least privilege, rotate credentials on the skill definitions, and review PRs before merge.

Human review

Design workflows assuming a person approves outcomes:
  • Read Signal investigations before acting.
  • Review PRs in GitHub.
  • Inspect job transcripts and artifacts in Fleet Observability.