
How to: Prompts

Guides on how to do prompt engineering with Phoenix

Getting Started

  • Configure AI Providers - how to configure API keys for OpenAI, Anthropic, Gemini, and more.

Prompt Management

Organize and manage prompts with Phoenix to streamline your development workflow

Prompt management is currently available on a feature branch only and will be released in the next major version.

  • Create a prompt - how to create, update, and track prompt changes

  • Test a prompt - how to test changes to a prompt in the playground and in the notebook

  • Tag a prompt - how to mark certain prompt versions as ready for deployment

  • Using a prompt - how to integrate prompts into your code and experiments

Playground

Iterate on prompts and models in the prompt playground

  • Using the Playground - how to set up the playground and how to test prompt changes via datasets and experiments.

Configure AI Providers

Phoenix natively integrates with OpenAI, Azure OpenAI, Anthropic, and Google AI Studio (Gemini) to make it easy to test changes to your prompts. In addition to the above, since many AI providers (DeepSeek, Ollama) can be used directly with the OpenAI client, you can talk to any OpenAI-compatible LLM provider.

Credentials

To securely provide your API keys, you have two options. One is to store them in your browser in local storage. Alternatively, you can set them as environment variables on the server side. If both are set at the same time, the credential set in the browser will take precedence.

Option 1: Store API Keys in the Browser

API keys can be entered in the playground application via the API Keys dropdown menu. This option stores API keys in the browser. Simply navigate to settings and set your API keys.

Option 2: Set Environment Variables on Server Side

Available on self-hosted Phoenix

If the following variables are set in the server environment, they'll be used at API invocation time.

  • OpenAI: OPENAI_API_KEY (https://platform.openai.com/)

  • Azure OpenAI: AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, OPENAI_API_VERSION (https://azure.microsoft.com/en-us/products/ai-services/openai-service/)

  • Anthropic: ANTHROPIC_API_KEY (https://console.anthropic.com/)

  • Gemini: GEMINI_API_KEY or GOOGLE_API_KEY (https://aistudio.google.com/)
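For example, on a self-hosted deployment you might set the keys before starting Phoenix. A minimal sketch (the key values are placeholders, and launch_app is just one of several ways to start the server):

import os

# Placeholders: substitute your real provider keys before starting the server
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."

import phoenix as px
px.launch_app()  # the playground can now call these providers server-side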

Using OpenAI Compatible LLMs

Option 1: Configure the base URL in the prompt playground

Since you can configure the base URL for the OpenAI client, you can use the prompt playground with a variety of OpenAI Client compatible LLMs such as Ollama, DeepSeek, and more.

If you are using one of these providers, you will have to set the OpenAI API key to that provider's API key for it to work.

Simply insert the URL for the OpenAI client compatible LLM provider

OpenAI client compatible providers include:

  • DeepSeek: base URL https://api.deepseek.com (docs: https://api-docs.deepseek.com/)

  • Ollama: base URL http://localhost:11434/v1/ (docs: https://github.com/ollama/ollama/blob/main/docs/openai.md)

Option 2: Server side configuration of the OpenAI base URL

Optionally, the server can be configured with the OPENAI_BASE_URL environment variable to target any OpenAI-compatible REST API.

For app.phoenix.arize.com, this may fail due to security reasons. In that case, you'd see a Connection Error appear.

If there is an LLM endpoint you would like to use, reach out to phoenix-support@arize.com.
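As a sketch, a self-hosted server could be pointed at a local Ollama endpoint before startup (the values here are illustrative):

import os

# Illustrative: route the playground's OpenAI client to a local Ollama server
os.environ["OPENAI_BASE_URL"] = "http://localhost:11434/v1/"
os.environ["OPENAI_API_KEY"] = "ollama"  # placeholder; Ollama ignores the key value

import phoenix as px
px.launch_app()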


Using the Playground

General guidelines on how to use Phoenix's prompt playground

Setup

To get started, first Configure AI Providers. In the playground view, create a valid prompt for the LLM and click Run in the top right (or press mod + enter).

If successful, you should see the LLM output stream into the Output section of the UI.

Pick an LLM and set up the API key for that provider to get started

Prompt Editor

The prompt editor (typically on the left side of the screen) is where you define the prompt. You select the template language (mustache or f-string) on the toolbar. Whenever you type a variable placeholder in the prompt (say {{question}} for mustache), the variable to fill will show up in the inputs section. Input variables must either be filled in by hand or via a dataset (where each row has key / value pairs for the input).

Use the template language to create prompt template variables that can be applied during runtime

Model Configuration

Every prompt instance can be configured to use a specific LLM and set of invocation parameters. Click on the model configuration button at the top of the prompt editor and configure your LLM of choice. Click on the "save as default" option to make your configuration sticky across playground sessions.

Switch models and modify invocation params

Comparing Prompts

The Prompt Playground offers the capability to compare multiple prompt variants directly within the playground. Simply click the + Compare button at the top of the first prompt to create duplicate instances. Each prompt variant manages its own independent template, model, and parameters. This allows you to quickly compare prompts (labeled A, B, C, and D in the UI) and run experiments to determine which prompt and model configuration is optimal for the given task.

Compare multiple different prompt variants at once

Using Datasets with Prompts

Phoenix lets you run a prompt (or multiple prompts) on a dataset. Simply load a dataset containing the input variables you want to use in your prompt template. When you click Run, Phoenix will apply each configured prompt to every example in the dataset, invoking the LLM for all possible prompt-example combinations. The result of your playground runs will be tracked as an experiment under the loaded dataset (see Playground Traces).

Each example's input is used to fill the prompt template

Playground Traces

All invocations of an LLM via the playground are recorded for analysis, annotations, evaluations, and dataset curation.

If you simply run an LLM in the playground using the free-form inputs (i.e. not using a dataset), your spans will be recorded in a project aptly titled "playground".

All free-form playground runs are recorded under the playground project

If, however, you run a prompt over dataset examples, the outputs and spans from your playground runs will be captured as an experiment. Each experiment will be named according to the prompt you ran the experiment over.

If you run over a dataset, the outputs and traces are tracked as a dataset experiment
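If you want to pull those playground spans back out for analysis or dataset curation, one option is the span dataframe API; a sketch, assuming the default "playground" project name:

import phoenix as px

# Fetch spans recorded by free-form playground runs
# (assumes the default "playground" project name)
spans_df = px.Client().get_spans_dataframe(project_name="playground")
print(spans_df.head())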

Tag a prompt

How to deploy prompts to different environments safely

Prompts in Phoenix are versioned in a linear history, creating a comprehensive audit trail of all modifications. Each change is tracked, allowing you to:

  • Review the complete history of a prompt

  • Understand who made specific changes

  • Revert to previous versions if needed

Creating a Tag

When you are ready to deploy a prompt to a certain environment (let's say staging), the best thing to do is to tag a specific version of your prompt as ready. By default, Phoenix offers three tags: production, staging, and development, but you can create your own tags as well.

Each tag can include an optional description to provide additional context about its purpose or significance. Tags are unique per prompt, meaning you cannot have two tags with the same name for the same prompt.

Creating a custom tag

It can be helpful to have custom tags to track different versions of a prompt. For example, if you want to tag a certain prompt as the one used in your v0 release, you can create a custom tag with that name to keep track.

When creating a custom tag, you can provide:

  • A name for the tag (must be a valid identifier)

  • An optional description to provide context about the tag's purpose

Use custom tags to track releases or maybe just an arbitrary milestone

Pulling a prompt by tag

Once a prompt version is tagged, you can pull this version of the prompt into any environment that you would like (an application, an experiment). Similar to git tags, prompt version tags let you create a "release" of a prompt (e.g. pushing a prompt to staging).

You can retrieve a prompt version by:

  • Using the tag name directly (e.g., "production", "staging", "development")

  • Using a custom tag name

  • Using the latest version (which will return the most recent version regardless of tags)

For full details on how to use prompts in code, see Using a prompt

Listing tags

You can list all tags associated with a specific prompt version. The list is paginated, allowing you to efficiently browse through large numbers of tags. Each tag in the list includes:

  • The tag's unique identifier

  • The tag's name

  • The tag's description (if provided)

This is particularly useful when you need to:

  • Review all tags associated with a prompt version

  • Verify which version is currently tagged for a specific environment

  • Track the history of tag changes for a prompt version

Using the Client

Tag Naming Rules

Tag names must be valid identifiers: lowercase letters, numbers, hyphens, and underscores, starting and ending with a letter or number.

Examples: staging, production-v1, release-2024
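As an illustration of the rule (this helper is hypothetical, not part of the client), a tag name can be checked with a regular expression:

import re

# Lowercase letters, numbers, hyphens, and underscores,
# starting and ending with a letter or number
TAG_NAME = re.compile(r"^[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?$")

assert TAG_NAME.match("production-v1")
assert TAG_NAME.match("release-2024")
assert not TAG_NAME.match("-staging")  # cannot start with a hyphen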

Creating and Managing Tags

from phoenix.client import Client

# Create a tag for a prompt version
Client().prompts.tags.create(
    prompt_version_id="version-123",
    name="production",
    description="Ready for production environment"
)

# List tags for a prompt version
tags = Client().prompts.tags.list(prompt_version_id="version-123")
for tag in tags:
    print(f"Tag: {tag.name}, Description: {tag.description}")

# Get a prompt version by tag
prompt_version = Client().prompts.get(
    prompt_identifier="my-prompt",
    tag="production"
)
The same operations are available on the async client:

from phoenix.client import AsyncClient

# Create a tag for a prompt version
await AsyncClient().prompts.tags.create(
    prompt_version_id="version-123",
    name="production",
    description="Ready for production environment"
)

# List tags for a prompt version
tags = await AsyncClient().prompts.tags.list(prompt_version_id="version-123")
for tag in tags:
    print(f"Tag: {tag.name}, Description: {tag.description}")

# Get a prompt version by tag
prompt_version = await AsyncClient().prompts.get(
    prompt_identifier="my-prompt",
    tag="production"
)

Test a prompt

Testing your prompts before you ship them is vital to deploying reliable AI applications

Testing in the Playground

Testing a prompt in the playground

The Playground is a fast and efficient way to refine prompt variations. You can load previous prompts and validate their performance by applying different variables.

Each single-run test in the Playground is recorded as a span in the Playground project, allowing you to revisit and analyze LLM invocations later. These spans can be added to datasets or reloaded for further testing.

Testing a prompt over a dataset

The ideal way to test a prompt is to construct a golden dataset where the dataset examples contain the variables to be applied to the prompt in the inputs and the outputs contain the ideal answer you want from the LLM. This way you can run a given prompt over N examples all at once and compare the synthesized answers against the golden answers.

Playground integrates with datasets and experiments to help you iterate and incrementally improve your prompts. Experiment runs are automatically recorded and available for subsequent evaluation to help you understand how changes to your prompts, LLM model, or invocation parameters affect performance.
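As a sketch of constructing such a golden dataset with the Phoenix client (the dataframe columns and dataset name here are illustrative):

import pandas as pd
import phoenix as px

# Inputs hold the prompt template variables; outputs hold the golden answers
df = pd.DataFrame(
    {
        "topic": ["Sports", "Politics"],
        "article": ["<article text>", "<article text>"],
        "ideal_summary": ["<golden answer>", "<golden answer>"],
    }
)

px.Client().upload_dataset(
    dataset_name="article-summaries-golden",
    dataframe=df,
    input_keys=["topic", "article"],
    output_keys=["ideal_summary"],
)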

Testing prompt variations side-by-side

Prompt Playground supports side-by-side comparisons of multiple prompt variants. Click + Compare to add a new variant. Whether using Span Replay or testing prompts over a Dataset, the Playground processes inputs through each variant and displays the results for easy comparison.

Testing multiple prompts simultaneously

Testing a prompt using code

Sometimes you may want to test a prompt and run evaluations on it in code. This can be particularly useful when custom manipulation is needed (e.g. you are trying to iterate on a system prompt across a variety of different chat messages). 🚧 This tutorial is coming soon
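Until then, here is a minimal sketch of the pattern, assuming a prompt named "my-prompt" whose template uses a {{question}} variable already exists in Phoenix and OPENAI_API_KEY is set:

from openai import OpenAI
from phoenix.client import Client

# Pull the latest version of the prompt (assumes "my-prompt" exists in Phoenix)
prompt = Client().prompts.get(prompt_identifier="my-prompt")

# A couple of hand-written test inputs for the prompt's variables
test_inputs = [
    {"question": "What is Phoenix?"},
    {"question": "How do I version a prompt?"},
]

oai_client = OpenAI()
for variables in test_inputs:
    resp = oai_client.chat.completions.create(**prompt.format(variables=variables))
    print(variables, "->", resp.choices[0].message.content)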

Using a prompt

Once you have tagged a version of a prompt as ready (e.g. "staging") you can pull a prompt into your code base and use it to prompt an LLM.

ℹ️ A caution about using prompts inside your application code

When integrating Phoenix prompts into your application, it's important to understand that prompts are treated as code and are stored externally from your primary codebase. This architectural decision introduces several considerations:

Key Implementation Impacts

  • Network dependencies for prompt retrieval

  • Additional debugging complexity

  • External system dependencies

Current Status

The Phoenix team is actively implementing safeguards to minimize these risks through:

  • Caching mechanisms

  • Fallback systems

Best Practices

If you choose to implement Phoenix prompts in your application, ensure you:

  1. Implement robust caching strategies

  2. Develop comprehensive fallback mechanisms

  3. Consider the impact on your application's reliability requirements
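A minimal sketch of the first two practices, layered over the client API shown elsewhere in this guide (the cache policy and fallback prompt are yours to choose):

from phoenix.client import Client

client = Client()
_prompt_cache = {}  # simple in-process cache; consider a TTL cache in production

# Hypothetical fallback: a prompt version bundled with your application
FALLBACK_PROMPT = None

def get_prompt_safely(name: str, tag: str = "production"):
    """Fetch a tagged prompt, falling back to the last good copy if Phoenix is unreachable."""
    try:
        prompt = client.prompts.get(prompt_identifier=name, tag=tag)
        _prompt_cache[(name, tag)] = prompt
        return prompt
    except Exception:
        # Network/server failure: prefer the last good copy, then the bundled fallback
        return _prompt_cache.get((name, tag), FALLBACK_PROMPT)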

If you have any feedback on the above improvements, please let us know at https://github.com/Arize-ai/phoenix/issues/6290

To use prompts in your code you will need to install the phoenix client library.

For Python:

pip install arize-phoenix-client

For JavaScript / TypeScript:

npm install @arizeai/phoenix-client

Pulling a prompt

There are three major ways to pull prompts: pull by name or ID (latest), pull by version ID, and pull by tag.

Pulling a prompt by Name or ID

Pulling a prompt by name or ID (e.g. the identifier) is the easiest way to pull a prompt. Note that since a name or ID doesn't specify a specific version, you will always get the latest version of a prompt. For this reason we only recommend doing this during development.

Note that prompt names and IDs are synonymous.

from phoenix.client import Client

# Initialize a phoenix client with your phoenix endpoint
# By default it will read from your environment variables
client = Client(
 # endpoint="https://my-phoenix.com",
)

# Pulling a prompt by name
prompt_name = "my-prompt-name"
client.prompts.get(prompt_identifier=prompt_name)

import { getPrompt } from "@arizeai/phoenix-client/prompts";

const prompt = await getPrompt({ name: "my-prompt" });
// ^ the latest version of the prompt named "my-prompt"

const promptById = await getPrompt({ promptId: "a1234" })
// ^ the latest version of the prompt with Id "a1234"

Pulling a prompt by Version ID

Pulling a prompt by version retrieves the content of a prompt at a particular point in time. The version can never change, nor be deleted, so you can reasonably rely on it in production-like use cases.

The ID of a specific prompt version can be found in the prompt history.

from phoenix.client import Client

# Initialize a phoenix client with your phoenix endpoint
# By default it will read from your environment variables
client = Client(
 # endpoint="https://my-phoenix.com",
)

# The version ID can be found in the versions tab in the UI
prompt = client.prompts.get(prompt_version_id="UHJvbXB0VmVyc2lvbjoy")
print(prompt.id)
prompt.dumps()

import { getPrompt } from "@arizeai/phoenix-client/prompts";

const promptByVersionId = await getPrompt({ versionId: "b5678" })
// ^ the prompt version with version Id "b5678"

Pulling a prompt by Tag

Pulling a prompt by tag is most useful when you want a particular version of a prompt to be automatically used in a specific environment (say "staging"). To pull prompts by tag, you must first tag a prompt version in the UI (see Tag a prompt).

Note that tags are unique per prompt so it must be paired with the prompt_identifier

A Prompt pulled in this way can be automatically updated in your application by simply moving the "staging" tag from one prompt version to another.

You can control the prompt version tags in the UI.

from phoenix.client import Client

# Initialize a phoenix client with your phoenix endpoint
# By default it will read from your environment variables
client = Client(
 # endpoint="https://my-phoenix.com",
)

# Since tags don't uniquely identify a prompt version
#  it must be paired with the prompt identifier (e.g. name)
prompt = client.prompts.get(prompt_identifier="my-prompt-name", tag="staging")
print(prompt.id)
prompt.dumps()

import { getPrompt } from "@arizeai/phoenix-client/prompts";

const promptByTag = await getPrompt({ tag: "staging", name: "my-prompt" });
// ^ the specific prompt version tagged "staging", for prompt "my-prompt"

Using a prompt

The phoenix clients support formatting the prompt with variables, and providing the messages, model information, tools, and response format (when applicable).

The Phoenix Client libraries make it simple to transform prompts to the SDK that you are using (no proxying necessary!)

Both the Python and TypeScript SDKs support transforming your prompts to a variety of SDKs (no proprietary SDK necessary).

  • Python - support for OpenAI, Anthropic, Gemini

  • TypeScript - support for OpenAI, Anthropic, and the Vercel AI SDK

from openai import OpenAI

prompt_vars = {"topic": "Sports", "article": "Surrey have signed Australia all-rounder Moises Henriques for this summer's NatWest T20 Blast. Henriques will join Surrey immediately after the Indian Premier League season concludes at the end of next month and will be with them throughout their Blast campaign and also as overseas cover for Kumar Sangakkara - depending on the veteran Sri Lanka batsman's Test commitments in the second half of the summer. Australian all-rounder Moises Henriques has signed a deal to play in the T20 Blast for Surrey . Henriques, pictured in the Big Bash (left) and in ODI action for Australia (right), will join after the IPL . Twenty-eight-year-old Henriques, capped by his country in all formats but not selected for the forthcoming Ashes, said: 'I'm really looking forward to playing for Surrey this season. It's a club with a proud history and an exciting squad, and I hope to play my part in achieving success this summer. 'I've seen some of the names that are coming to England to be involved in the NatWest T20 Blast this summer, so am looking forward to testing myself against some of the best players in the world.' Surrey director of cricket Alec Stewart added: 'Moises is a fine all-round cricketer and will add great depth to our squad.'"}
formatted_prompt = prompt.format(variables=prompt_vars)

# Make a request with your Prompt
oai_client = OpenAI()
resp = oai_client.chat.completions.create(**formatted_prompt)
import { getPrompt, toSDK } from "@arizeai/phoenix-client/prompts";
import OpenAI from "openai";

const openai = new OpenAI()
const prompt = await getPrompt({ name: "my-prompt" });

// openaiParameters is fully typed, and safe to use directly in the openai client
const openaiParameters = toSDK({
  // sdk does not have to match the provider saved in your prompt
  // if it differs, we will apply a best effort conversion between providers automatically
  sdk: "openai",
  prompt,
  // variables within your prompt template can be replaced across messages
  variables: { question: "How do I write 'Hello World' in JavaScript?" }
});

const response = await openai.chat.completions.create({
  ...openaiParameters,
  // you can still override any of the invocation parameters as needed
  // for example, you can change the model or stream the response
  model: "gpt-4o-mini",
  stream: false
})

Create a prompt

Store and track prompt versions in Phoenix

Prompts in Phoenix can be created using the playground as well as via the phoenix-client libraries.

Using the Playground

Navigate to Prompts in the navigation and click the add prompt button in the top right. This will navigate you to the Playground.

The playground is like an IDE where you will develop your prompt. The prompt section on the right lets you add more messages, change the template format (f-string or mustache), and set an output schema (JSON mode).

Compose a prompt

To the right you can enter sample inputs for your prompt variables and run your prompt against a model. Make sure that you have an API key set for the LLM provider of your choosing.

Save the prompt

To save the prompt, click the save button in the header of the prompt on the right. Name the prompt using alphanumeric characters (e.g. `my-first-prompt`) with no spaces. The model configuration you selected in the Playground will be saved with the prompt. When you re-open the prompt, the model and configuration will be loaded along with the prompt.

Once you are satisfied with your prompt in the playground, you can name it and save it

View your prompts

You just created your first prompt in Phoenix! You can view and search for prompts by navigating to Prompts in the UI.

Prompts can be loaded back into the Playground at any time by clicking on "open in playground"

You can quickly load in the latest version of a prompt into the playground

To view the details of a prompt, click on the prompt name. You will be taken to the prompt details view. The prompt details view shows all the parts of the prompt that have been saved (e.g. the model used, the invocation parameters, etc.)

The details of a prompt shows everything that is saved about a prompt

Making edits to a prompt

Once you've created a prompt, you probably need to make tweaks over time. The best way to make tweaks to a prompt is using the playground. Depending on how destructive a change you are making, you might want to just create a new prompt version or clone the prompt.

Editing a prompt in the playground

To make edits to a prompt, click on Edit in Playground in the top right of the prompt details view.

Iterate on prompts in the playground and save when you are happy with the prompt

When you are happy with your prompt, click save. You will be asked to provide a description of the changes you made to the prompt. This description will show up in the history of the prompt for others to understand what you did.

Cloning a prompt

In some cases, you may need to modify a prompt without altering its original version. To achieve this, you can clone a prompt, similar to forking a repository in Git.

Cloning a prompt allows you to experiment with changes while preserving the history of the main prompt. Once you have made and reviewed your modifications, you can choose to either keep the cloned version as a separate prompt or merge your changes back into the main prompt. To do this, simply load the cloned prompt in the playground and save it as the main prompt.

This approach ensures that your edits are flexible and reversible, preventing unintended modifications to the original prompt.

Adding labels and metadata

🚧 Prompt labels and metadata are still under construction.

Using the Phoenix Client

Starting with prompts, Phoenix has a dedicated client that lets you programmatically create, retrieve, and tag prompts. Make sure you have installed the appropriate phoenix-client before proceeding.

The phoenix-client packages for both Python and TypeScript are very early in their development and may not have every feature you might be looking for. Please drop us an issue if there's an enhancement you'd like to see: https://github.com/Arize-ai/phoenix/issues

Compose a Prompt

Creating a prompt in code can be useful if you want a programmatic way to sync prompts with the Phoenix server.

Below is an example prompt for summarizing articles as bullet points. Use the Phoenix client to store the prompt in the Phoenix server. The name of the prompt is an identifier with lowercase alphanumeric characters plus hyphens and underscores (no spaces).

import phoenix as px
from phoenix.client.types import PromptVersion

content = """\
You're an expert educator in {{ topic }}. Summarize the following article
in a few concise bullet points that are easy for beginners to understand.

{{ article }}
"""

prompt_name = "article-bullet-summarizer"
prompt = px.Client().prompts.create(
    name=prompt_name,
    version=PromptVersion(
        [{"role": "user", "content": content}],
        model_name="gpt-4o-mini",
    ),
)

A prompt stored in the database can be retrieved later by its name. By default the latest version is fetched. A specific version ID or a tag can also be used to retrieve a specific version.

prompt = px.Client().prompts.get(prompt_identifier=prompt_name)

If a version is tagged with, e.g. "production", it can be retrieved as follows.

prompt = px.Client().prompts.get(prompt_identifier=prompt_name, tag="production")

The same prompt can be created in TypeScript as follows.

import { createPrompt, promptVersion } from "@arizeai/phoenix-client";

const promptTemplate = `
You're an expert educator in {{ topic }}. Summarize the following article
in a few concise bullet points that are easy for beginners to understand.

{{ article }}
`;

const version = createPrompt({
  name: "article-bullet-summarizer",
  version: promptVersion({
    modelProvider: "OPENAI",
    modelName: "gpt-3.5-turbo",
    template: [
      {
        role: "user",
        content: promptTemplate,
      },
    ],
  }),
});

As in Python, a prompt stored in the database can be retrieved later by its name; by default the latest version is fetched.

import { getPrompt } from "@arizeai/phoenix-client/prompts";

const prompt = await getPrompt({ name: "article-bullet-summarizer" });
// ^ you now have a strongly-typed prompt object, in the Phoenix SDK Prompt type

If a version is tagged with, e.g. "production", it can be retrieved as follows.

const promptByTag = await getPrompt({ tag: "production", name: "article-bullet-summarizer" });
// ^ you can optionally specify a tag to filter by
