Log predictions and actuals for classification, regression, ranking, and object detection models. Monitor drift, performance, and data quality.

Key Capabilities

  • Stream or batch logging
  • Embedding features for drift detection
  • SHAP values for explainability
  • Tags and metadata for segmentation
  • Support for delayed actuals
  • Export data for offline analysis
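Several of these capabilities, notably delayed actuals, hinge on the prediction ID acting as a join key: actuals logged later are matched back to the original prediction. A minimal sketch of that pairing in plain Python (the data and helper are hypothetical, not part of the SDK):

```python
# Predictions logged at serving time, keyed by prediction_id.
predictions = {
    "pred-001": {"label": "not fraud", "score": 0.85},
    "pred-002": {"label": "fraud", "score": 0.91},
}

# Actuals arrive later (e.g. after a chargeback window closes).
delayed_actuals = {
    "pred-001": "fraud",
    "pred-003": "not fraud",  # no matching prediction: cannot be joined
}

def join_actuals(predictions, actuals):
    """Pair each delayed actual with its original prediction by ID."""
    joined, unmatched = {}, []
    for pred_id, actual in actuals.items():
        if pred_id in predictions:
            joined[pred_id] = {**predictions[pred_id], "actual": actual}
        else:
            unmatched.append(pred_id)
    return joined, unmatched

joined, unmatched = join_actuals(predictions, delayed_actuals)
```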

Stream Logging

Log individual predictions in real time as they occur in production. The examples on this page assume `client` is an already-initialized Arize client.
from arize.types import ModelTypes, Environments

response = client.ml.log_stream(
    space_id="your-space-id",
    model_name="fraud-detection-v1",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    prediction_id="unique-prediction-id",
    prediction_label=("not fraud", 0.85),
    actual_label=("fraud", 1.0),
    features={
        "transaction_amount": 150.0,
        "merchant_category": "online_retail",
        "user_age": 32,
    },
    embedding_features={
        "user_embedding": ([0.1, 0.2, ...], "user_123"),
    },
)

print(f"Logged prediction: {response.status}")
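Because the prediction ID is what delayed actuals join against, it must be unique per prediction. One common pattern (a sketch; the SDK imposes no particular format) is a UUID:

```python
import uuid

def new_prediction_id() -> str:
    """Generate a collision-resistant prediction ID."""
    return str(uuid.uuid4())

# A batch of generated IDs should contain no duplicates.
ids = {new_prediction_id() for _ in range(1000)}
```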

With Tags and Metadata

Add custom tags for segmentation and filtering.
response = client.ml.log_stream(
    space_id="your-space-id",
    model_name="fraud-detection-v1",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    prediction_id="unique-prediction-id",
    prediction_label=("not fraud", 0.85),
    features={"transaction_amount": 150.0},
    tags=["high-risk", "new-merchant"],
)
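Downstream, tags become segment filters: segmentation amounts to selecting the records whose tag set contains the tag of interest. An illustrative sketch with stand-in records (not an SDK call):

```python
# Stand-in for logged predictions and their tags.
records = [
    {"prediction_id": "p1", "tags": ["high-risk", "new-merchant"]},
    {"prediction_id": "p2", "tags": ["low-risk"]},
    {"prediction_id": "p3", "tags": ["high-risk"]},
]

def segment(records, tag):
    """Return only the records carrying the given tag."""
    return [r for r in records if tag in r["tags"]]

high_risk = segment(records, "high-risk")
```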

Batch Logging

Log predictions in bulk from historical data or batch processing pipelines.
from arize.types import Schema, EmbeddingColumnNames
import pandas as pd

# Define schema to map DataFrame columns
schema = Schema(
    prediction_id_column_name="prediction_id",
    timestamp_column_name="prediction_ts",
    prediction_label_column_name="predicted_label",
    actual_label_column_name="actual_label",
    feature_column_names=["feature_1", "feature_2", "feature_3"],
    embedding_feature_column_names={
        "text_embedding": EmbeddingColumnNames(
            vector_column_name="text_vector",
            link_to_data_column_name="text_content",
        ),
    },
)

# Log batch data
response = client.ml.log_batch(
    space_id="your-space-id",
    model_name="fraud-detection-v1",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    dataframe=prod_df,
    schema=schema,
    model_version="1.2.0",
)

print(f"Logged {response.record_count} predictions")
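The DataFrame's column names must line up exactly with the names declared in the Schema. A small sketch of a frame matching the schema above, with synthetic values and a quick check that no declared column is missing:

```python
from datetime import datetime, timezone
import pandas as pd

# Synthetic frame whose columns mirror the Schema definition above.
prod_df = pd.DataFrame({
    "prediction_id": ["p1", "p2"],
    "prediction_ts": [datetime.now(timezone.utc)] * 2,
    "predicted_label": ["not fraud", "fraud"],
    "actual_label": ["not fraud", "fraud"],
    "feature_1": [150.0, 72.5],
    "feature_2": ["online_retail", "grocery"],
    "feature_3": [32, 57],
    "text_vector": [[0.1, 0.2], [0.3, 0.4]],
    "text_content": ["order note a", "order note b"],
})

# Every column named in the Schema must exist in the frame.
schema_columns = {
    "prediction_id", "prediction_ts", "predicted_label", "actual_label",
    "feature_1", "feature_2", "feature_3", "text_vector", "text_content",
}
missing = schema_columns - set(prod_df.columns)
```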

With SHAP Values

Include SHAP values for model explainability.
schema = Schema(
    prediction_id_column_name="prediction_id",
    prediction_label_column_name="predicted_label",
    feature_column_names=["feature_1", "feature_2"],
    shap_values_column_names={
        "feature_1": "shap_feature_1",
        "feature_2": "shap_feature_2",
    },
)

response = client.ml.log_batch(
    space_id="your-space-id",
    model_name="fraud-detection-v1",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    dataframe=df_with_shap,
    schema=schema,
)
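In the example above each SHAP column is simply the feature name prefixed with `shap_`; for models with many features, that mapping can be generated rather than written out by hand (the prefix is this example's convention, not an SDK requirement):

```python
feature_columns = ["feature_1", "feature_2"]

# One SHAP column per feature, following the shap_<feature> naming above.
shap_values_column_names = {f: f"shap_{f}" for f in feature_columns}
```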

Export Data

Export ML model data for offline analysis, custom processing, or archival.
from datetime import datetime
from arize.types import Environments

start_time = datetime.strptime("2024-01-01", "%Y-%m-%d")
end_time = datetime.strptime("2026-01-01", "%Y-%m-%d")

# Export to DataFrame
df = client.ml.export_to_df(
    space_id="your-space-id",
    model_name="fraud-detection-v1",
    environment=Environments.PRODUCTION,
    model_version="1.2.0",
    start_time=start_time,
    end_time=end_time,
)

print(f"Exported {len(df)} records")
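Once in a DataFrame, the export can feed any offline analysis, for instance accuracy per segment. The column names below are illustrative stand-ins; the real export schema may differ:

```python
import pandas as pd

# Illustrative stand-in for an exported frame.
df = pd.DataFrame({
    "predicted_label": ["fraud", "not fraud", "fraud", "not fraud"],
    "actual_label": ["fraud", "fraud", "fraud", "not fraud"],
    "merchant_category": ["online_retail", "online_retail",
                          "grocery", "grocery"],
})

# Per-segment accuracy: fraction of rows where prediction matched actual.
df["correct"] = df["predicted_label"] == df["actual_label"]
accuracy_by_segment = df.groupby("merchant_category")["correct"].mean()
```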

Export to Parquet

client.ml.export_to_parquet(
    space_id="your-space-id",
    model_name="fraud-detection-v1",
    environment=Environments.PRODUCTION,
    start_time=start_time,
    end_time=end_time,
    output_path="./model_data_export.parquet",
)

Export capabilities:
  • Time-range filtering
  • DataFrame or Parquet output
  • Efficient Arrow Flight transport for large exports
  • Progress bars for long-running exports
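For very large exports, one approach is to split the time range into smaller windows and export each window in its own call. A sketch with stdlib datetime (chunking is the caller's choice here, not a documented SDK feature):

```python
from datetime import datetime, timedelta

def time_windows(start, end, days=30):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    step = timedelta(days=days)
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end

# Each window could then be passed as start_time/end_time to an export call.
windows = list(time_windows(datetime(2024, 1, 1), datetime(2024, 3, 15)))
```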

Supported Model Types

| Model Type | Use Case |
| --- | --- |
| SCORE_CATEGORICAL, MULTI_CLASS | Multi-class classification |
| BINARY_CLASSIFICATION | Binary classification |
| NUMERIC, REGRESSION | Regression tasks |
| RANKING | Ranking and recommendation systems |
| OBJECT_DETECTION | Computer vision object detection |
| GENERATIVE_LLM | Use client.spans instead for LLMs |

Supported Environments

| Environment | Description |
| --- | --- |
| PRODUCTION | Live production traffic |
| TRAINING | Training dataset |
| VALIDATION | Validation/test dataset |
| TRACING | For LLM traces (use client.spans) |