Trace as Inferences

If you are already storing your LLM application data in tables or files, Arize supports an alternative way to log LLM application data as inferences.

A step-by-step guide to ingesting LLM application data

The following code example provides a brief overview of uploading prompts, embeddings, and other model parameters. Run the Colab above for a more detailed walkthrough of LLM ingestion.

Example Row

| Column | Example Value |
| --- | --- |
| prompt_text | How often does Arize query the table for table import jobs? |
| prompt_vector | [ 0.00393428 -0.00417591 -0.00854287... |
| response_text | Arize will regularly sync new data from your data source with Arize... |
| openai_relevance_0 | irrelevant |
| retrieval_text_0 | Arize will attempt a dry run to validate your job for any access... |
| text_similarity_0 | 0.86539755255... |
| text_similarity_1 | 0.8653975525... |
| user_feedback | NaN |
| prediction_ts | 2023-04-05 20:33:22.006650000 |
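As a sketch, the example row above can be assembled into a pandas dataframe before logging. The long text and vector values below are truncated and illustrative, not the full original data.

```python
import numpy as np
import pandas as pd

# A minimal sketch of the example row as a dataframe; values are
# truncated/illustrative versions of the table above.
test_dataframe = pd.DataFrame({
    "prediction_ts": pd.to_datetime(["2023-04-05 20:33:22.006650"]),
    "prompt_text": ["How often does Arize query the table for table import jobs?"],
    "prompt_vector": [np.array([0.00393428, -0.00417591, -0.00854287])],
    "response_text": ["Arize will regularly sync new data from your data source..."],
    "openai_relevance_0": ["irrelevant"],
    "retrieval_text_0": ["Arize will attempt a dry run to validate your job..."],
    "text_similarity_0": [0.86539755],
    "text_similarity_1": [0.86539755],
    "user_feedback": [np.nan],
})
print(test_dataframe.shape)
```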

# Import the Arize client and schema types from the arize pandas SDK
from arize.pandas.logger import Client
from arize.utils.types import (
    EmbeddingColumnNames,
    Environments,
    ModelTypes,
    Schema,
    TypedColumns,
)

# Declare prompt and response columns
prompt_columns=EmbeddingColumnNames(
    vector_column_name="prompt_vector",
    data_column_name="prompt_text"
)

response_columns="response_text"

# feature & tag columns can be optionally defined with typing:
tag_columns = TypedColumns(
    to_float=["text_similarity_0", "text_similarity_1"],
    inferred=["openai_relevance_0", "retrieval_text_1"],
)

# Define the Schema, including embedding information
schema = Schema(
    timestamp_column_name="prediction_ts",
    actual_label_column_name="user_feedback",
    tag_column_names=tag_columns,
    prompt_column_names=prompt_columns,
    response_column_names=response_columns
)

# Log the dataframe with the schema mapping
response = arize_client.log(
    model_id="search-and-retrieval-with-corpus-dataset",
    model_version="v1",
    model_type=ModelTypes.GENERATIVE_LLM,
    environment=Environments.PRODUCTION,
    dataframe=test_dataframe,
    schema=schema,
)

Case-Specific LLM Ingestion

Prompt & Response

Upload LLM prompt and responses via the prompt_column_names and response_column_names fields.

Prompt & Response without Embeddings

Upload prompts and responses without embeddings vectors using the relevant column name for your prompt and/or response text.

The following examples include both prompt and response information. However, you can send either a prompt or a response if you do not have both.

# Declare prompt & response text columns
prompt_columns="document"
response_columns="summary"
# Define the Schema
schema = Schema(
    ...
    prompt_column_names=prompt_columns,
    response_column_names=response_columns,
)

Prompt & Response with Embeddings

Upload prompt and responses with embedding vectors using the EmbeddingColumnNames object to define the prompt_column_names and response_column_names in your model schema.

  • The vector_column_name should match the column name representing your embedding vectors.

⚠️ Note: The embedding vector is the dense vector representation of the unstructured input. *Embedding features are not sparse vectors.*

  • The data_column_name should match the column name representing the raw text associated with the vector stored.

The data_column_name is typically used for NLP use cases. The column can contain either strings (full sentences) or lists of strings (token arrays).
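As a quick sketch, both accepted data column types look like this in a dataframe (column names here are illustrative):

```python
import pandas as pd

# The data column can hold full sentences or token arrays
df = pd.DataFrame({
    "prompt": ["How often does Arize query the table?"],            # full sentence
    "prompt_tokens": [["How", "often", "does", "Arize", "query"]],  # token array
})
print(type(df["prompt"][0]).__name__, type(df["prompt_tokens"][0]).__name__)
```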

# Declare prompt & response embedding columns
prompt_columns=EmbeddingColumnNames(
    vector_column_name="prompt_vector", # optional
    data_column_name="prompt"
)
response_columns=EmbeddingColumnNames(
    vector_column_name="response_vector", # optional
    data_column_name="response"
)
# Define the Schema
schema = Schema(
    ...
    prompt_column_names=prompt_columns,
    response_column_names=response_columns,
)
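Since embedding vectors must be dense and consistently sized, a quick sanity check before logging can catch mixed dimensionalities. This is an illustrative pre-flight check, not part of the Arize SDK:

```python
import pandas as pd

# Verify every vector in an embedding column has the same dimensionality;
# column names and values are illustrative.
df = pd.DataFrame({
    "prompt_vector": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    "response_vector": [[0.9, 0.8, 0.7], [0.6, 0.5, 0.4]],
})
for col in ["prompt_vector", "response_vector"]:
    dims = df[col].map(len).unique()
    assert len(dims) == 1, f"{col} has mixed dimensions: {dims}"
    print(col, dims[0])
```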

Prompt Playground

Upload prompt templates and their versions using the PromptTemplateColumnNames object.

  • PromptTemplateColumnNames: The field that groups prompt templates with their versions

  • template_column_name: The field that contains the prompt template in string format

  • template_version_column_name: The field that defines the template version

Example fields:

The template_column_name variables are represented via double curly braces: {{variable_name}}.

Given the context of '{{retrieval_text_0}} + {{retrieval_text_1}}', and based on the frequently asked questions from our users, answer the user query as follows: '{{user_query}}'. Follow the instructions here exactly: '{{instruction}}'.

The template_version_column_name field enables you to filter by version in Arize.

# Declare prompt template columns
prompt_template_columns = PromptTemplateColumnNames(
        template_column_name="prompt_template",
        template_version_column_name="prompt_template_name"
)
# Define the Schema
schema = Schema(
    ...
    prompt_template_column_names=prompt_template_columns,
)
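As a sketch, a {{variable}}-style template like the example above can be rendered in plain Python; the substitution helper and variable values below are illustrative, not part of the Arize SDK:

```python
import re

# Render a {{variable}}-style prompt template with a regex substitution
template = "Given the context of '{{retrieval_text_0}}', answer: '{{user_query}}'"
values = {
    "retrieval_text_0": "Arize regularly syncs new data from your data source",
    "user_query": "How often does Arize query the table?",
}
rendered = re.sub(r"\{\{(\w+)\}\}", lambda m: values[m.group(1)], template)
print(rendered)
```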

Learn more about prompt engineering here.

LLM Configuration Parameters

Track and monitor original and modified LLMs with the LLMConfigColumnNames object.

  • LLMConfigColumnNames: This field groups the LLM with its hyperparameters

  • model_column_name: This field contains the LLM names used to produce responses (e.g. gpt-3.5-turbo or gpt-4).

  • params_column_name: This field contains the hyperparameters used to configure the LLM. The contents of the column must be a well-formatted JSON string (e.g. {"max_tokens": 500, "presence_penalty": 0.66, "temperature": 0.28}).

# Declare LLM config columns
llm_config_columns = LLMConfigColumnNames(
        model_column_name="llm_config_model_name",
        params_column_name="llm_params",
)
# Define the Schema
schema = Schema(
    ...
    llm_config_column_names=llm_config_columns,
)
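A well-formed JSON params column can be built with json.dumps rather than hand-written strings, which avoids invalid single-quoted JSON. The parameter names mirror the example above; the values are illustrative:

```python
import json
import pandas as pd

# Populate the params column with well-formed JSON strings
df = pd.DataFrame({"llm_config_model_name": ["gpt-4", "gpt-3.5-turbo"]})
df["llm_params"] = [
    json.dumps({"max_tokens": 500, "presence_penalty": 0.66, "temperature": 0.28}),
    json.dumps({"max_tokens": 256, "temperature": 0.0}),
]
print(json.loads(df.loc[0, "llm_params"])["temperature"])
```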

Track Token Usage

Track token usage and response latency from the LLM inference run with the LLMRunMetadataColumnNames field.

  • LLMRunMetadataColumnNames: This field groups together the run metadata

llm_run_metadata = LLMRunMetadataColumnNames(
    total_token_count_column_name="total_tokens_used",
    prompt_token_count_column_name="prompt_tokens_used",
    response_token_count_column_name="response_tokens_used",
    response_latency_ms_column_name="response_latency",
)
# Define the Schema
schema = Schema(
    ...
    llm_run_metadata_column_names=llm_run_metadata,
)
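As a sketch, the metadata columns can be sanity-checked so that totals equal prompt plus response token counts. Column names mirror the LLMRunMetadataColumnNames mapping above; the values are made up:

```python
import pandas as pd

# Derive total token counts from prompt and response counts
df = pd.DataFrame({
    "prompt_tokens_used": [120, 64],
    "response_tokens_used": [330, 96],
    "response_latency": [812.5, 410.0],  # milliseconds
})
df["total_tokens_used"] = df["prompt_tokens_used"] + df["response_tokens_used"]
print(df["total_tokens_used"].tolist())
```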

Retrieval Debugging with Knowledge Base (Corpus) Data

Upload your deployed application's knowledge base (or corpus) of documents with the CorpusSchema object.

# Logging the Corpus dataset
response = arize_client.log(  
        dataframe=corpus_df, # Refers to the above dataframe with the example row 
        model_id="search-and-retrieval-with-corpus-dataset",
        model_type=ModelTypes.GENERATIVE_LLM,
        environment=Environments.CORPUS,
        schema=CorpusSchema(
            document_id_column_name='document_id',
            document_text_embedding_column_names=EmbeddingColumnNames(
                vector_column_name='text_vector',
                data_column_name='text'
            ),
            document_version_column_name='document_version'
        ),
)
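An illustrative corpus dataframe matching the CorpusSchema column mapping above might look like this; the documents and vectors are made up for the sketch:

```python
import pandas as pd

# A toy corpus dataframe with the columns CorpusSchema expects above
corpus_df = pd.DataFrame({
    "document_id": ["doc-1", "doc-2"],
    "text": ["Arize supports table import jobs.", "Embeddings are dense vectors."],
    "text_vector": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    "document_version": ["v1", "v1"],
})
print(list(corpus_df.columns))
```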

Learn more about how Corpus datasets are used here.
