Trace as Inferences

If you are already storing your LLM application data in tables or files, Arize supports an alternative way to log LLM application data as inferences.

A step-by-step guide to ingesting LLM application data

The following code example provides a brief overview of uploading prompts, embeddings, and other model parameters. Run the Colab above for a more detailed walkthrough of LLM ingestion.

Example Row

| Column | Example Value |
| --- | --- |
| prompt_text | How often does Arize query the table for table import jobs? |
| prompt_vector | [ 0.00393428 -0.00417591 -0.00854287... |
| response_text | Arize will regularly sync new data from your data source with Arize... |
| openai_relevance_0 | irrelevant |
| retrieval_text_0 | Arize will attempt a dry run to validate your job for any access... |
| text_similarity_0 | 0.86539755255... |
| text_similarity_1 | 0.8653975525... |
| user_feedback | NaN |
| prediction_ts | 2023-04-05 20:33:22.006650000 |
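As a sketch, the example row above can be assembled into a pandas dataframe before logging. The long text and vector values below are truncated and illustrative, not the full original data.

```python
import numpy as np
import pandas as pd

# A minimal sketch of the example row as a dataframe; values are
# truncated/illustrative versions of the table above.
test_dataframe = pd.DataFrame({
    "prediction_ts": pd.to_datetime(["2023-04-05 20:33:22.006650"]),
    "prompt_text": ["How often does Arize query the table for table import jobs?"],
    "prompt_vector": [np.array([0.00393428, -0.00417591, -0.00854287])],
    "response_text": ["Arize will regularly sync new data from your data source..."],
    "openai_relevance_0": ["irrelevant"],
    "retrieval_text_0": ["Arize will attempt a dry run to validate your job..."],
    "text_similarity_0": [0.86539755],
    "text_similarity_1": [0.86539755],
    "user_feedback": [np.nan],
})
print(test_dataframe.shape)
```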

# Import the Arize client and schema types from the arize pandas SDK
from arize.pandas.logger import Client
from arize.utils.types import (
    EmbeddingColumnNames,
    Environments,
    ModelTypes,
    Schema,
    TypedColumns,
)

# Declare prompt and response columns
prompt_columns=EmbeddingColumnNames(
    vector_column_name="prompt_vector",
    data_column_name="prompt_text"
)

response_columns="response_text"

# feature & tag columns can be optionally defined with typing:
tag_columns = TypedColumns(
    to_float=["text_similarity_0", "text_similarity_1"],
    inferred=["openai_relevance_0", "retrieval_text_1"],
)

# Define the Schema, including embedding information
schema = Schema(
    timestamp_column_name="prediction_ts",
    actual_label_column_name="user_feedback",
    tag_column_names=tag_columns,
    prompt_column_names=prompt_columns,
    response_column_names=response_columns
)

# Log the dataframe with the schema mapping
response = arize_client.log(
    model_id="search-and-retrieval-with-corpus-dataset",
    model_version="v1",
    model_type=ModelTypes.GENERATIVE_LLM,
    environment=Environments.PRODUCTION,
    dataframe=test_dataframe,
    schema=schema,
)

Case-Specific LLM Ingestion

Prompt & Response

Upload LLM prompt and responses via the prompt_column_names and response_column_names fields.

Prompt & Response without Embeddings

Upload prompts and responses without embeddings vectors using the relevant column name for your prompt and/or response text.

The following examples include both prompt and response information. However, you can send either a prompt or a response if you do not have both.

# Declare prompt & response text columns
prompt_columns="document"
response_columns="summary"
# Define the Schema
schema = Schema(
    ...
    prompt_column_names=prompt_columns,
    response_column_names=response_columns,
)

Prompt & Response with Embeddings

Upload prompt and responses with embedding vectors using the EmbeddingColumnNames object to define the prompt_column_names and response_column_names in your model schema.

  • The vector_column_name should match the column name representing your embedding vectors.

⚠️ Note: The embedding vector is the dense vector representation of the unstructured input. *Embedding features are not sparse vectors.*

  • The data_column_name should match the column name representing the raw text associated with the vector stored.

The data_column_name is typically used for NLP use cases. The column can contain either strings (full sentences) or lists of strings (token arrays).
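As a quick sketch, both accepted data column types look like this in a dataframe (column names here are illustrative):

```python
import pandas as pd

# The data column can hold full sentences or token arrays
df = pd.DataFrame({
    "prompt": ["How often does Arize query the table?"],            # full sentence
    "prompt_tokens": [["How", "often", "does", "Arize", "query"]],  # token array
})
print(type(df["prompt"][0]).__name__, type(df["prompt_tokens"][0]).__name__)
```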

# Declare prompt & response embedding columns
prompt_columns=EmbeddingColumnNames(
    vector_column_name="prompt_vector", # optional
    data_column_name="prompt"
)
response_columns=EmbeddingColumnNames(
    vector_column_name="response_vector", # optional
    data_column_name="response"
)
# Define the Schema
schema = Schema(
    ...
    prompt_column_names=prompt_columns,
    response_column_names=response_columns,
)
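Since embedding vectors must be dense and consistently sized, a quick sanity check before logging can catch mixed dimensionalities. This is an illustrative pre-flight check, not part of the Arize SDK:

```python
import pandas as pd

# Verify every vector in an embedding column has the same dimensionality;
# column names and values are illustrative.
df = pd.DataFrame({
    "prompt_vector": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    "response_vector": [[0.9, 0.8, 0.7], [0.6, 0.5, 0.4]],
})
for col in ["prompt_vector", "response_vector"]:
    dims = df[col].map(len).unique()
    assert len(dims) == 1, f"{col} has mixed dimensions: {dims}"
    print(col, dims[0])
```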

Prompt Playground

Upload prompt templates and their versions using the PromptTemplateColumnNames object.

  • PromptTemplateColumnNames: The field that groups prompt templates with their versions

  • template_column_name: The field that contains the prompt template in string format

  • template_version_column_name: The field that defines the template version

Example fields:

The template_column_name variables are represented via double curly braces: {{variable_name}}.

Given the context of '{{retrieval_text_0}} + {{retrieval_text_1}}', and based on the frequently asked questions from our users, answer the user query as follows: '{{user_query}}'. Follow the instructions here exactly: '{{instruction}}'.

The template_version_column_name field enables you to filter by version in Arize.

# Declare prompt template columns
prompt_template_columns = PromptTemplateColumnNames(
        template_column_name="prompt_template",
        template_version_column_name="prompt_template_name"
)
# Define the Schema
schema = Schema(
    ...
    prompt_template_column_names=prompt_template_columns,
)
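As a sketch, a {{variable}}-style template like the example above can be rendered in plain Python; the substitution helper and variable values below are illustrative, not part of the Arize SDK:

```python
import re

# Render a {{variable}}-style prompt template with a regex substitution
template = "Given the context of '{{retrieval_text_0}}', answer: '{{user_query}}'"
values = {
    "retrieval_text_0": "Arize regularly syncs new data from your data source",
    "user_query": "How often does Arize query the table?",
}
rendered = re.sub(r"\{\{(\w+)\}\}", lambda m: values[m.group(1)], template)
print(rendered)
```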

Learn more about prompt engineering here.

LLM Configuration Parameters

Track and monitor original and modified LLMs with the LLMConfigColumnNames object.

  • LLMConfigColumnNames: This field groups the LLM with its hyperparameters

  • model_column_name: This field contains the LLM names used to produce responses (e.g. gpt-3.5-turbo or gpt-4).

  • params_column_name: This field contains the hyperparameters used to configure the LLM. The contents of the column must be a well-formatted JSON string (e.g. {"max_tokens": 500, "presence_penalty": 0.66, "temperature": 0.28}).

# Declare LLM config columns
llm_config_columns = LLMConfigColumnNames(
        model_column_name="llm_config_model_name",
        params_column_name="llm_params",
)
# Define the Schema
schema = Schema(
    ...
    llm_config_column_names=llm_config_columns,
)
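A well-formed JSON params column can be built with json.dumps rather than hand-written strings, which avoids invalid single-quoted JSON. The parameter names mirror the example above; the values are illustrative:

```python
import json
import pandas as pd

# Populate the params column with well-formed JSON strings
df = pd.DataFrame({"llm_config_model_name": ["gpt-4", "gpt-3.5-turbo"]})
df["llm_params"] = [
    json.dumps({"max_tokens": 500, "presence_penalty": 0.66, "temperature": 0.28}),
    json.dumps({"max_tokens": 256, "temperature": 0.0}),
]
print(json.loads(df.loc[0, "llm_params"])["temperature"])
```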

Track Token Usage

Track token usage and response latency from the LLM inference run with the LLMRunMetadataColumnNames field.

  • LLMRunMetadataColumnNames: This field groups together the run metadata

llm_run_metadata = LLMRunMetadataColumnNames(
    total_token_count_column_name="total_tokens_used",
    prompt_token_count_column_name="prompt_tokens_used",
    response_token_count_column_name="response_tokens_used",
    response_latency_ms_column_name="response_latency",
)
# Define the Schema
schema = Schema(
    ...
    llm_run_metadata_column_names=llm_run_metadata,
)
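As a sketch, the metadata columns can be sanity-checked so that totals equal prompt plus response token counts. Column names mirror the LLMRunMetadataColumnNames mapping above; the values are made up:

```python
import pandas as pd

# Derive total token counts from prompt and response counts
df = pd.DataFrame({
    "prompt_tokens_used": [120, 64],
    "response_tokens_used": [330, 96],
    "response_latency": [812.5, 410.0],  # milliseconds
})
df["total_tokens_used"] = df["prompt_tokens_used"] + df["response_tokens_used"]
print(df["total_tokens_used"].tolist())
```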

Retrieval Debugging with Knowledge Base (Corpus) Data

Upload your deployed application's knowledge base (or corpus) of documents with the CorpusSchema object.

# Logging the Corpus dataset
response = arize_client.log(  
        dataframe=corpus_df, # Refers to the above dataframe with the example row 
        model_id="search-and-retrieval-with-corpus-dataset",
        model_type=ModelTypes.GENERATIVE_LLM,
        environment=Environments.CORPUS,
        schema=CorpusSchema(
            document_id_column_name='document_id',
            document_text_embedding_column_names=EmbeddingColumnNames(
                vector_column_name='text_vector',
                data_column_name='text'
            ),
            document_version_column_name='document_version'
        ),
)
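An illustrative corpus dataframe matching the CorpusSchema column mapping above might look like this; the documents and vectors are made up for the sketch:

```python
import pandas as pd

# A toy corpus dataframe with the columns CorpusSchema expects above
corpus_df = pd.DataFrame({
    "document_id": ["doc-1", "doc-2"],
    "text": ["Arize supports table import jobs.", "Embeddings are dense vectors."],
    "text_vector": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    "document_version": ["v1", "v1"],
})
print(list(corpus_df.columns))
```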

Learn more about how Corpus datasets are used here.
