
This guide assumes you have traces in Arize AX and are looking to run an evaluation to measure your application performance.
To add evaluations you can set up online evaluations as a task to run automatically, or you can follow the steps below to generate evaluations and log them to Arize AX:
1. Install the Arize AX SDK
2. Import your spans in code
3. Run a custom evaluator using Phoenix
4. Log evaluations back to Arize AX
Install dependencies and set up keys
Add your ARIZE_API_KEY and SPACE_ID from your Space Settings page (shown below) to the variables in the cell below.
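A minimal sketch of that setup cell is shown below. The package list and the OPENAI_API_KEY variable are assumptions (they presume an OpenAI model is used as the LLM judge later in this guide); adapt them to your stack.

```python
# Minimal setup sketch. The package list below is an assumption for this
# walkthrough (Arize SDK, Phoenix evals, an OpenAI judge model, nest_asyncio);
# check the Arize AX docs for the exact install command for your environment.
# pip install -q arize arize-phoenix openai nest_asyncio

import os

# Copy these values from your Space Settings page in Arize AX.
SPACE_ID = "YOUR_SPACE_ID"              # placeholder
ARIZE_API_KEY = "YOUR_ARIZE_API_KEY"    # placeholder

# Only needed if you use an OpenAI model as the LLM judge below.
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
```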

Import your spans in code
Once you have traces in Arize AX, you can visit the LLM Tracing tab to see your traces and export them in code. Clicking the export button gives you boilerplate code to copy and paste into your evaluator.
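For reference, a sketch of what that exported boilerplate typically looks like is below, assuming the ArizeExportClient interface from the Arize Python SDK; prefer the exact code the export button generates for your project.

```python
# Sketch of exporting spans into a pandas dataframe. Prefer the boilerplate
# generated by the export button in the LLM Tracing tab; the client and
# arguments below are assumptions and may differ for your project.
from datetime import datetime, timedelta

from arize.exporter import ArizeExportClient
from arize.utils.types import Environments

export_client = ArizeExportClient(api_key=ARIZE_API_KEY)

traces_dataframe = export_client.export_model_to_df(
    space_id=SPACE_ID,
    model_id="YOUR_PROJECT_NAME",       # project whose traces you want to evaluate
    environment=Environments.TRACING,
    start_time=datetime.now() - timedelta(days=7),
    end_time=datetime.now(),
)
```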
Run a custom evaluator using Phoenix
Create a prompt template for the LLM to judge the quality of your responses. You can use any of the Arize AX Evaluator Templates or create your own. Below is an example which judges the positivity or negativity of the LLM output.
Use the llm_classify function to run the evaluation with your custom template against the dataframe of traces you exported above. We also apply nest_asyncio so the evaluations run concurrently (useful if you are running multiple evaluations).
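A sketch of such an evaluator is below; the template text, the {output} column name, the rails, and the judge model (gpt-4o-mini via Phoenix's OpenAIModel) are illustrative assumptions, not the only valid choices.

```python
# Sketch of a positivity/negativity evaluator run with Phoenix's llm_classify.
# The template text, the {output} column name, the rails, and the judge model
# are illustrative assumptions; adapt them to your traces.
import nest_asyncio
from phoenix.evals import OpenAIModel, llm_classify

nest_asyncio.apply()  # allows the async eval loop to run inside a notebook

# Prompt template for the LLM judge. {output} is assumed to be a column in
# traces_dataframe containing the LLM response text.
POSITIVITY_TEMPLATE = """You are evaluating the tone of an LLM response.

[Response]: {output}

Classify the response as "positive" or "negative". Respond with a single word."""

rails = ["positive", "negative"]  # the only labels the judge may return

evals_dataframe = llm_classify(
    traces_dataframe,                        # spans exported from Arize AX above
    model=OpenAIModel(model="gpt-4o-mini"),
    template=POSITIVITY_TEMPLATE,
    rails=rails,
    provide_explanation=True,                # adds an explanation column per row
)
```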
Log evaluations back to Arize AX
Use the log_evaluations_sync function, part of our Python SDK, to attach evaluations you've run to traces. The code below (see the sketch after the column list) assumes that you have already completed an evaluation run and have the evals_dataframe object, and that you have a traces_dataframe object from which to get the span_id needed to attach the evals.
The evals_dataframe requires four columns, which should be auto-generated for you based on the evaluation you ran using Phoenix. The <eval_name> must be alphanumeric and cannot have hyphens or spaces:
eval.<eval_name>.label
eval.<eval_name>.score
eval.<eval_name>.explanation
context.span_id
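Putting it together, here is a minimal sketch under the assumptions above, using "tone" as the illustrative <eval_name> and deriving a numeric score from the label; the Client constructor and the log_evaluations_sync call follow the Arize Python SDK, but check the arguments against your SDK version.

```python
# Sketch of attaching evals to their spans and logging them to Arize AX.
# "tone" is the illustrative <eval_name>; the project name is a placeholder,
# and deriving the score from the label is an assumption for this example.
from arize.pandas.logger import Client

arize_client = Client(space_id=SPACE_ID, api_key=ARIZE_API_KEY)

# llm_classify returns one row per input row, so the span ids from the traces
# dataframe can be copied over directly.
evals_dataframe["context.span_id"] = traces_dataframe["context.span_id"].values

# Derive a numeric score from the label (positive -> 1.0, negative -> 0.0).
evals_dataframe["score"] = (evals_dataframe["label"] == "positive").astype(float)

# Rename the Phoenix output columns to Arize AX's eval.<eval_name>.* convention.
evals_dataframe = evals_dataframe.rename(
    columns={
        "label": "eval.tone.label",
        "score": "eval.tone.score",
        "explanation": "eval.tone.explanation",
    }
)

# Synchronously log the evaluations, attached to the original spans.
arize_client.log_evaluations_sync(evals_dataframe, "YOUR_PROJECT_NAME")
```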