What this enables: Run Arize AX experiment evaluations automatically as part of your Azure Pipelines — on every PR, on a schedule, or on-demand. Catch regressions in accuracy, latency, and cost before they hit production.
Key Concepts
- Pipeline: An automated workflow defined in YAML, stored in your repo (typically azure-pipelines.yml at the root).
- Stages → Jobs → Steps: A three-level hierarchy. A stage groups related work, a job runs on a single agent, and steps are the individual commands. Most simple pipelines use a single implicit stage with one job.
- Variable Groups: Reusable, project-scoped collections of variables and secrets defined in Pipelines → Library. Linked to a pipeline via variables: - group:.
- Service Connections: Azure DevOps integrations that authenticate to external systems: Git providers, Docker registries, Azure resources, secret managers.
- Triggers: How pipelines get kicked off. CI pushes (trigger:) and PR validation (pr:) are separate top-level blocks; cron schedules live under schedules:.
Prerequisites & Assumptions
This guide assumes:
- An Azure DevOps organization and project with Pipelines enabled. Your platform team has set up project access.
- Your repository is connected. Either it lives in Azure Repos, or you’ve created a GitHub / Bitbucket / GitLab service connection so Azure Pipelines can clone it and post status checks back.
- Microsoft-hosted agents are available (the default). Python 3.12 is preinstalled on the ubuntu-latest image. Self-hosted agent pools also work as long as Python 3.12+ is on the agent.
- A Variable Group exists. Navigate to Pipelines → Library → + Variable group and create one named arize-experiments with ARIZE_API_KEY, ARIZE_SPACE_ID, ARIZE_DATASET_ID, and OPENAI_API_KEY. Click the lock icon next to each value to mark it as a secret. The pipeline YAML below references this group by name.
🔑 Secrets behave differently than in GitHub Actions. Azure Pipelines does not automatically inject secret variables into the step’s environment. You must explicitly map them via an env: block on each step that needs them, or they won’t be visible to your script. The example below shows the pattern.
Coming from Jenkins or GitHub Actions? Three things to know up front: (1) Azure DevOps uses a three-level hierarchy (stages → jobs → steps) rather than two-level, though small pipelines can omit stages. (2) Secret variables require explicit env: mapping per step (see callout above). (3) ubuntu-latest now resolves to Ubuntu 24.04 (the cutover happened in March 2025). Pin to ubuntu-24.04 explicitly if you want stability across future image rollovers. The Python script that runs your experiment is identical; no changes are needed.
Setting Up Your First Experiment Pipeline
Create an azure-pipelines.yml
Place an azure-pipelines.yml at the root of your repository (or anywhere — you’ll point to it when creating the pipeline in the Azure DevOps UI). Then in Azure DevOps go to Pipelines → New pipeline, select your repo, and choose Existing Azure Pipelines YAML file.
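As a minimal sketch of such a file, assuming the arize-experiments Variable Group from the prerequisites: the script name run_experiment.py, the requirements.txt file, and the artifact name experiment-results are hypothetical placeholders, so substitute whatever your repository actually uses.

```yaml
# azure-pipelines.yml (illustrative sketch, not a definitive implementation)
# Hypothetical names: run_experiment.py, requirements.txt, experiment-results.

trigger:
  branches:
    include:
      - main                      # CI trigger: run on pushes to main

pool:
  vmImage: ubuntu-latest          # Microsoft-hosted Linux agent

variables:
  - group: arize-experiments      # Variable Group created in Pipelines → Library

steps:
  - task: UsePythonVersion@0
    displayName: Use Python 3.12
    inputs:
      versionSpec: '3.12'

  - script: |
      python -m pip install --upgrade pip
      pip install -r requirements.txt
    displayName: Install dependencies

  - script: python run_experiment.py
    displayName: Run Arize experiment
    env:
      # Secrets are not injected automatically; map each one explicitly.
      ARIZE_API_KEY: $(ARIZE_API_KEY)
      ARIZE_SPACE_ID: $(ARIZE_SPACE_ID)
      ARIZE_DATASET_ID: $(ARIZE_DATASET_ID)
      OPENAI_API_KEY: $(OPENAI_API_KEY)

  - publish: $(System.DefaultWorkingDirectory)/experiment_results.json
    artifact: experiment-results
    displayName: Publish experiment results
    condition: always()           # keep the artifact even if the script failed
```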
Breakdown
- trigger: is the CI trigger block. It fires when commits are pushed to main. PR-only pipelines set trigger: none and use pr: instead (omitting trigger: entirely enables the default CI trigger on every branch).
- pool.vmImage: ubuntu-latest runs the job on a Microsoft-hosted Linux agent. It currently maps to Ubuntu 24.04. Pin to ubuntu-24.04 explicitly if you want to avoid being moved by future Microsoft rollovers.
- variables: - group: pulls in the arize-experiments Variable Group. Secret values from the group are masked in logs automatically; non-secrets become normal pipeline variables.
- task: UsePythonVersion@0 selects the Python version. 3.12 is already on the image, but pinning here makes the choice explicit and survives future image changes.
- script: is shorthand for the CmdLine@2 task, which runs Bash on Linux agents (bash: is the shorthand for Bash@3). It is equivalent to sh in Jenkins or run: in GitHub Actions.
- env: on the run step is the required mapping from Variable Group secrets to environment variables. Without this block your script can’t see ARIZE_API_KEY even though the Variable Group is loaded.
- publish: stores experiment_results.json as a pipeline artifact. condition: always() keeps the artifact even when the script exits nonzero (useful when the experiment “fails” on a regression you want to inspect).
Self-hosted agent? Drop the vmImage line and use pool: name: <your-pool-name>. Make sure Python 3.12+ is on the agent or that UsePythonVersion@0 can install it (the task supports the Python tool installer on agents that allow downloads).
Trigger Options
Azure DevOps splits triggers across three top-level blocks: trigger: for CI pushes, pr: for PR validation, and schedules: for cron. Path filtering on trigger: and pr: is pipeline-level (schedules: filters by branch only), the same posture as Harness and cleaner than Jenkins’ stage-level changeset.
1. Webhook (Pull Request)
The most common setup. Azure Pipelines runs the YAML on every PR open / update against a target branch, posts the status as a check, and blocks merging if you’ve configured branch policies to require it.
Path filters are pipeline-level. If nothing in copilot/search/** or copilot/experiments/** changed, the pipeline doesn’t start at all: no skipped stages, no no-op builds. This matches Harness payloadConditions and is stricter than Jenkins, which evaluates changeset after the build has already started.
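A pr: block with the path filters described above might look like the following sketch. The target branch main is an assumption; the two paths are the ones named in the text.

```yaml
# PR validation trigger with pipeline-level path filters (illustrative sketch).
pr:
  branches:
    include:
      - main                       # assumed target branch
  paths:
    include:
      - copilot/search/**
      - copilot/experiments/**
```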
PR triggers depend on where the repo lives. The pr: block in YAML is honored only for GitHub and Bitbucket Cloud repos. When the repo lives in Azure Repos, the pr: block is ignored, and PR validation must be configured as a build validation branch policy on the target branch. Microsoft’s trigger documentation covers this gotcha.
2. Webhook (CI / Push)
Fires on every push to a matching branch. Combine with paths: to scope tightly.
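A push trigger scoped by branch and path could be sketched as follows. The branch and path values are illustrative assumptions, not values prescribed by this guide.

```yaml
# CI push trigger scoped to one branch and one directory (illustrative sketch).
trigger:
  branches:
    include:
      - main                       # assumed branch
  paths:
    include:
      - copilot/experiments/**     # assumed path; only changes here start a run
```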
3. Scheduled (Cron)
always: true matters. Without it, a scheduled run only fires when there have been new commits since the last scheduled run. For nightly evals against a fixed dataset you almost always want it to run regardless.
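The schedule described above can be sketched like this. The cron expression (06:00 UTC daily), the display name, and the branch are assumptions chosen for illustration.

```yaml
# Nightly scheduled run (illustrative sketch). Cron times are evaluated in UTC.
schedules:
  - cron: '0 6 * * *'
    displayName: Nightly experiment run
    branches:
      include:
        - main                     # assumed branch
    always: true                   # run even when there are no new commits
```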
4. Pipeline Chaining
Trigger this pipeline after another one finishes, which is useful when experiments should only run on a green build.
5. Manual or Parameterized Runs
Set trigger: none and pr: none to make the pipeline manual-only. Omitting the blocks is not enough, because Azure Pipelines then falls back to its default triggers. Add parameters: to expose inputs in the Run pipeline dialog and the REST API.
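A manual-only pipeline with one runtime parameter might look like the sketch below. The parameter name model, its allowed values, and the run_experiment.py script with its --model flag are hypothetical.

```yaml
# Manual-only pipeline with a runtime parameter (illustrative sketch).
trigger: none
pr: none

parameters:
  - name: model                    # hypothetical parameter
    displayName: Model to evaluate
    type: string
    default: model-a
    values:
      - model-a
      - model-b

pool:
  vmImage: ubuntu-latest

variables:
  - group: arize-experiments

steps:
  # run_experiment.py and its --model flag are hypothetical.
  - script: python run_experiment.py --model "${{ parameters.model }}"
    displayName: Run experiment for selected model
    env:
      ARIZE_API_KEY: $(ARIZE_API_KEY)
      ARIZE_SPACE_ID: $(ARIZE_SPACE_ID)
```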
More Mature Patterns
Once the basics are working, these patterns become relevant as your experiment workflows grow.
Parallel Evaluation Runs
Run experiments against multiple models or datasets simultaneously using strategy.matrix:. Each leg gets its own job and Microsoft-hosted agent.
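A matrix over two models could be sketched as follows. The leg names, the MODEL_NAME variable, the model identifiers, and the run_experiment.py script with its --model flag are all hypothetical.

```yaml
# Fan out one job per model with strategy.matrix (illustrative sketch).
jobs:
  - job: evaluate
    strategy:
      matrix:
        model_a:
          MODEL_NAME: model-a      # hypothetical model identifier
        model_b:
          MODEL_NAME: model-b
      maxParallel: 2               # cap on concurrently running legs
    pool:
      vmImage: ubuntu-latest
    variables:
      - group: arize-experiments
    steps:
      - script: python run_experiment.py --model "$(MODEL_NAME)"
        displayName: Run experiment for $(MODEL_NAME)
        env:
          ARIZE_API_KEY: $(ARIZE_API_KEY)
          ARIZE_SPACE_ID: $(ARIZE_SPACE_ID)
```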
Pipeline Templates
If multiple repos need the same experiment setup (install deps, configure credentials, run the script), extract it into a YAML template and reference it via extends:. Templates can live alongside the pipeline or in a dedicated repository surfaced through resources.repositories.
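A consuming pipeline that extends a template from a dedicated repository might look like this sketch. The repository name MyProject/pipeline-templates, the template file experiment-pipeline.yml, and the pythonVersion parameter are hypothetical.

```yaml
# Consuming pipeline that extends a shared template (illustrative sketch).
resources:
  repositories:
    - repository: templates                 # alias referenced after the @ below
      type: git                             # an Azure Repos Git repository
      name: MyProject/pipeline-templates    # hypothetical project/repo

extends:
  template: experiment-pipeline.yml@templates   # hypothetical template file
  parameters:
    pythonVersion: '3.12'                   # hypothetical template parameter
```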
Variable Groups Linked to Azure Key Vault
For Azure-native orgs, link the Variable Group to an Azure Key Vault so secrets are managed centrally and rotated outside Azure DevOps. In Pipelines → Library → Variable group, toggle Link secrets from an Azure key vault and pick your subscription and vault. Only secret names are stored in the group; values are pulled from Key Vault at runtime. See Microsoft’s guide for the full setup.
For ad-hoc fetches (a single secret, no Variable Group), use the AzureKeyVault@2 task directly:
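A single-secret fetch with the task might be sketched as below. The service connection name, vault name, secret name ARIZE-API-KEY, and the run_experiment.py script are hypothetical. Key Vault secret names cannot contain underscores, so the sketch maps the dashed name to the environment variable the script expects.

```yaml
# Fetch one secret from Azure Key Vault at runtime (illustrative sketch).
steps:
  - task: AzureKeyVault@2
    displayName: Fetch Arize API key
    inputs:
      azureSubscription: my-azure-connection   # hypothetical service connection
      KeyVaultName: my-key-vault               # hypothetical vault
      SecretsFilter: ARIZE-API-KEY             # hypothetical secret name
      RunAsPreJob: false

  - script: python run_experiment.py           # hypothetical script
    displayName: Run Arize experiment
    env:
      # The fetched secret becomes a pipeline variable named after the secret.
      ARIZE_API_KEY: $(ARIZE-API-KEY)
```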
Workload Identity (OIDC) service connections are the modern way to authenticate the AzureKeyVault@2 task to Azure — no client secrets to rotate. Worth setting up if your org runs anything else on Azure. See Microsoft’s workload identity guide.
Environments and Approvals for Promotion Gates
Azure Pipelines Environments let you require manual approval before a stage runs. This pairs naturally with experiments-as-gates: run the experiment in one stage, gate prompt or model promotion in the next.
Approvals are configured on the environment itself (Pipelines → Environments → production-prompts → Approvals and checks), not in YAML. This keeps the approver list outside the repo and editable by platform owners without a code change.
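The two-stage gate can be sketched as follows, assuming an environment named production-prompts with an approval check already configured in the UI. The stage and job names and both scripts (run_experiment.py, promote_prompt.py) are hypothetical.

```yaml
# Experiment stage followed by an approval-gated promotion stage (illustrative sketch).
stages:
  - stage: Evaluate
    jobs:
      - job: run_experiment
        pool:
          vmImage: ubuntu-latest
        variables:
          - group: arize-experiments
        steps:
          - script: python run_experiment.py     # hypothetical script
            env:
              ARIZE_API_KEY: $(ARIZE_API_KEY)
              ARIZE_SPACE_ID: $(ARIZE_SPACE_ID)

  - stage: Promote
    dependsOn: Evaluate
    condition: succeeded()          # only promote after a passing experiment
    jobs:
      - deployment: promote_prompt
        environment: production-prompts   # approvals on this environment pause the job
        pool:
          vmImage: ubuntu-latest
        strategy:
          runOnce:
            deploy:
              steps:
                - checkout: self          # deployment jobs do not check out the repo by default
                - script: python promote_prompt.py   # hypothetical script
```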
Notifications
Azure DevOps has built-in Service Hooks for Slack and Microsoft Teams. Configure them at the project level under Project settings → Service hooks for a specific event (e.g., “Run state changed → Failed”) and pipeline. No YAML changes needed for the hook itself.
For inline messaging from a step (richer payloads, custom routing), post directly to a webhook with curl:
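A failure notification step might look like this sketch, assuming a Slack incoming webhook whose URL is stored as a secret variable. The variable name SLACK_WEBHOOK_URL and the message text are hypothetical.

```yaml
# Post a message to a webhook when an earlier step fails (illustrative sketch).
steps:
  - script: |
      curl -sS -X POST \
        -H 'Content-Type: application/json' \
        --data "{\"text\": \"Arize experiment failed in run $(Build.BuildNumber)\"}" \
        "$SLACK_WEBHOOK_URL"
    displayName: Notify on failure
    condition: failed()             # run only if a previous step failed
    env:
      SLACK_WEBHOOK_URL: $(SLACK_WEBHOOK_URL)   # hypothetical secret variable
```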
PR Status Checks and Comments
When the pipeline is triggered by a PR, Azure DevOps automatically posts a status check back to the Git provider, with the same UX as GitHub Actions checks. For Azure Repos this is built-in; for GitHub repos it requires the GitHub service connection to have the right scopes.
To post the experiment summary as an actual PR comment, use the Azure DevOps CLI for Azure Repos, or the GitHubComment@0 task for GitHub:
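For the GitHub case, a comment step could be sketched as below. The service connection name and the comment text are hypothetical, and the sketch assumes the task resolves the PR number itself when id is left unset in a PR-triggered run; check the task reference before relying on that.

```yaml
# Comment on the triggering GitHub PR (illustrative sketch).
steps:
  - task: GitHubComment@0
    displayName: Comment experiment summary on PR
    condition: eq(variables['Build.Reason'], 'PullRequest')
    inputs:
      gitHubConnection: my-github-connection    # hypothetical service connection
      repositoryName: $(Build.Repository.Name)
      # id is left unset on the assumption that the task targets the triggering PR.
      comment: |
        Arize experiment finished for build $(Build.BuildNumber).
        See the published pipeline artifact for experiment_results.json.
```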