The Arize Schema class organizes the column names that hold model data in your pandas DataFrame and maps them to Arize.
Import and initialize Arize Schema from arize.utils.types
| Parameter | Data Type | Expected Type In Column | Description |
|---|---|---|---|
| prediction_id_column_name | str | Contents must be a string limited to 128 characters | (Optional) A unique string to identify a prediction event. Required to match a prediction to delayed actuals or feature importances in Arize. If the column is not provided, Arize will generate a random prediction id. |
| feature_column_names | List[str] or TypedColumns | Feature values can be int, float, string, or list of strings | (Optional) Column names for features. If TypedColumns is used, the columns will be cast to the provided types prior to logging. |
| embedding_feature_column_names | Dict[str, EmbeddingColumnNames] | Learn more here | (Optional) Dictionary mapping embedding display names to EmbeddingColumnNames objects |
| timestamp_column_name | str | The content of this column must be int Unix timestamps in seconds | (Optional) Column name for timestamps |
| prediction_label_column_name | str | The content of this column must be convertible to string | (Optional) Column name for categorical prediction values |
| prediction_score_column_name | str | The content of this column must be int/float. For Multi-Class models, the content of this column must be a dictionary mapping class name to int/float prediction scores. | (Optional) Column name for numeric prediction values |
| actual_label_column_name | str | The content of this column must be convertible to string | (Optional) Column name for categorical ground truth values |
| actual_score_column_name | str | The content of this column must be int/float. For Multi-Class models, the content of this column must be a dictionary mapping class name to int/float actual scores. | (Optional) Column name for numeric ground truth values |
| tag_column_names | List[str] or TypedColumns | Tag values can be int, float, or string. Limited to 1k values | (Optional) Column names for tags. If TypedColumns is used, the columns will be cast to the provided types prior to logging. |
| shap_values_column_names | Dict[str, str] | The content of this column must be int/float | (Optional) Dictionary of key-value pairs where each key is a feature column name and each value is the corresponding SHAP value column name. For example, if your dataframe contains feature columns feat1, feat2, feat3, ... and corresponding SHAP value columns feat1_shap, feat2_shap, feat3_shap, ..., set shap_values_column_names = {"feat1": "feat1_shap", "feat2": "feat2_shap", "feat3": "feat3_shap"} |
| prediction_group_id_column_name | str | The content of this column must be a string limited to 128 characters | (Required*) Column name for ranking groups or lists in ranking models. *For ranking models only |
| rank_column_name | str | The content of this column must be an integer between 1 and 100 | (Required*) Column name for the rank of each element in its group or list. *For ranking models only |
| relevance_score_column_name | str | The content of this column must be int/float | (Required*) Column name for ranking model numeric ground truth values. *For ranking models only |
| relevance_labels_column_name | str | The content of this column must be a string | (Required*) Column name for ranking model categorical ground truth values. *For ranking models only |
| object_detection_prediction_column_names | ObjectDetectionColumnNames | Learn more here | ObjectDetectionColumnNames object containing information defining the predicted bounding boxes’ coordinates, categories, and scores. |
| object_detection_actual_column_names | ObjectDetectionColumnNames | Learn more here | ObjectDetectionColumnNames object containing information defining the actual bounding boxes’ coordinates, categories, and scores. |
| prompt_column_names | EmbeddingColumnNames | Learn more here | EmbeddingColumnNames object containing the embedding vector data (required) and raw text (optional) for the input text your model acts on |
| response_column_names | EmbeddingColumnNames | Learn more here | EmbeddingColumnNames object containing the embedding vector data (required) and raw text (optional) for the text your model generates |
| prompt_template_column_names | PromptTemplateColumnNames | Learn more here | PromptTemplateColumnNames object containing the prompt template and prompt template version, both optional |
| llm_config_column_names | LLMConfigColumnNames | Learn more here | LLMConfigColumnNames object containing the LLM model name (optional) and its hyper-parameters (optional) used at inference time |
| llm_run_metadata_column_names | LLMRunMetadataColumnNames | Learn more here | LLMRunMetadataColumnNames object containing metadata about the LLM inference, e.g., token counts and response latency |
| retrieved_document_ids_column_name | str | The contents of this column must be a list of entries convertible to strings | (Optional) Column name for retrieved document ids |
| multi_class_threshold_scores_column_name | str | Contents of this column must be a dictionary mapping string class names to float scores. Learn more here | (Optional) Column name used only for Multi-Label Multi-Class models. It determines the minimum prediction value for a class to be considered a positive prediction. |
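
The shap_values_column_names row describes a dictionary from feature column names to SHAP value column names. When the SHAP columns follow a regular naming pattern, the dictionary can be derived rather than typed by hand. The snippet below is a minimal sketch in plain Python. The helper build_shap_mapping and the "_shap" suffix are illustrative assumptions taken from the example in the table, not part of the Arize SDK.

```python
def build_shap_mapping(feature_names, suffix="_shap"):
    """Map each feature column name to its SHAP value column name.

    Hypothetical helper: assumes SHAP columns are named <feature><suffix>.
    """
    return {name: f"{name}{suffix}" for name in feature_names}


# Feature names taken from the example in the parameter table.
feature_column_names = ["feat1", "feat2", "feat3"]
shap_values_column_names = build_shap_mapping(feature_column_names)
print(shap_values_column_names)
```

The resulting dictionary is what would be passed as shap_values_column_names, alongside the same list as feature_column_names.
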
Code Example
| prediction id | feature_1 | feature_2 | tag_1 | tag_2 | prediction_ts | prediction_label | actual_label | embedding |
|---|---|---|---|---|---|---|---|---|
| 1fcd50f4689 | ca | [ca, ak] | female | 25 | 1637538845 | No Claims | No Claims | [1.27346, -0.2138, …] |
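
A Schema for the example row above might look like the following sketch. It uses only the parameter names listed in the table. The row is held as a plain dict so the snippet has no pandas dependency; in practice it would be a row of your dataframe. The helper referenced_columns, the embedding display name "embedding", and the guarded import (so the snippet still runs where the Arize SDK is not installed) are assumptions made for illustration. The embedding vector is truncated in the example, so only the two values shown are used.

```python
# Example row from the table above, as a plain dict.
row = {
    "prediction id": "1fcd50f4689",
    "feature_1": "ca",
    "feature_2": ["ca", "ak"],
    "tag_1": "female",
    "tag_2": 25,
    "prediction_ts": 1637538845,  # int Unix timestamp in seconds
    "prediction_label": "No Claims",
    "actual_label": "No Claims",
    "embedding": [1.27346, -0.2138],  # remaining values elided in the source
}

# Schema arguments that map dataframe column names to Arize fields.
schema_kwargs = {
    "prediction_id_column_name": "prediction id",
    "feature_column_names": ["feature_1", "feature_2"],
    "tag_column_names": ["tag_1", "tag_2"],
    "timestamp_column_name": "prediction_ts",
    "prediction_label_column_name": "prediction_label",
    "actual_label_column_name": "actual_label",
}


def referenced_columns(kwargs):
    """Hypothetical helper: flatten str and list values into column names."""
    columns = []
    for value in kwargs.values():
        if isinstance(value, str):
            columns.append(value)
        else:
            columns.extend(value)
    return columns


# Sanity check: every column the schema references exists in the data.
missing = [c for c in referenced_columns(schema_kwargs) if c not in row]
print("missing columns:", missing)

try:
    from arize.utils.types import EmbeddingColumnNames, Schema
except ImportError:
    Schema = None  # SDK not installed; skip constructing the Schema

if Schema is not None:
    schema = Schema(
        **schema_kwargs,
        embedding_feature_column_names={
            # Key is the display name shown in Arize (assumed here).
            "embedding": EmbeddingColumnNames(vector_column_name="embedding"),
        },
    )
```

Checking for missing columns before constructing the Schema surfaces typos in column names early, before any data is logged.
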