# Similarity Search

> The Similarity Search feature allows you to find items that are similar to a set of reference embeddings using cosine similarity. This feature supports both image and text embeddings.

<Frame caption="Using embeddings similarity search in production workflows">
  <iframe src="https://cdn.iframe.ly/M7f0XUZ" width={1000} height={400} allowFullScreen scrolling="no" allow="accelerometer *; clipboard-write *; encrypted-media *; gyroscope *; picture-in-picture *; web-share *;" />
</Frame>

### Key Concepts

* **Reference Embedding**: The embedding vector that serves as the baseline for similarity comparisons. Select the column containing these vectors, representing the characteristics or features you are interested in matching.

* **Search Embedding**: The column containing embedding vectors of items to be compared against the reference embedding using cosine similarity.

* **Threshold**: A user-defined value that determines the minimum similarity score required for an item to be considered similar to the reference embeddings.
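To make these three concepts concrete, the following is a minimal, self-contained sketch of a thresholded cosine similarity search in plain Python. It is illustrative only and is not Arize's implementation: the helper names `cosine_similarity` and `find_similar`, the sample vectors, and the choice to treat the threshold as inclusive (`>=`) are all assumptions made for this example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def find_similar(
    reference: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (id, score) pairs scoring at or above the threshold, best first.

    Treating the threshold as inclusive is an assumption of this sketch.
    """
    scored = [(cid, cosine_similarity(reference, vec)) for cid, vec in candidates.items()]
    kept = [(cid, score) for cid, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: one reference embedding and three search embeddings.
reference = [1.0, 0.0]
candidates = {"a": [2.0, 0.0], "b": [1.0, 1.0], "c": [0.0, 3.0]}

matches = find_similar(reference, candidates, threshold=0.7)
for item_id, score in matches:
    print(item_id, round(score, 4))
```

Because cosine similarity depends only on the angle between vectors, `"a"` matches the reference exactly even though its magnitude differs, while the orthogonal vector `"c"` scores 0 and is filtered out.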

## Performing Similarity Search

**Selecting an Embedding Cell Directly**

* Hover over an embedding column in the table view and click the “Find Similar” button.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/6d3a5920-image.jpeg" />
</Frame>

* Select points in UMAP and then press the “Find Similar” button.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/acbbd417-image.jpeg" />
</Frame>

* Press the “Find Similar” button in dimension details after selecting an embedding or row.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/2d6d4de6-image.jpeg" />
</Frame>

Any selection automatically updates the reference object with the prediction ID and the name of the embedding column.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/63450d65-image.jpeg" />
</Frame>

### Additional Features

**Multiple Embeddings**

* Add multiple items from any of the entry points.

* When multiple embeddings are selected, their vectors will be averaged to form the reference embedding.
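The averaging step can be sketched as follows. This assumes a plain element-wise arithmetic mean over the selected vectors, with no normalization before or after; whether the product applies any additional processing is not stated here. The function name `average_embeddings` and the sample vectors are hypothetical.

```python
from typing import List, Sequence


def average_embeddings(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise arithmetic mean of equal-length vectors."""
    if not vectors:
        raise ValueError("at least one vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(dim)]


# Hypothetical selection of two embeddings from the same column.
selected = [
    [1.0, 0.0, 2.0],
    [3.0, 4.0, 0.0],
]

reference_embedding = average_embeddings(selected)
print(reference_embedding)
```

The dimension check mirrors the requirement that all reference points come from the same embedding column, since vectors from different columns may not share a dimension.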

**Limitations**

* The search column and the reference column can be different, but all reference points must come from the same column. Adding a reference point from a different column triggers an error modal.

* Similarity search is only supported in performance tracing and embedding views.

## Programmatic Export

### How it Works

1. **Define Reference Embeddings**: Specify the embeddings you want to use as references. Ensure that all reference embeddings are in the same column.

2. **Set Search Parameters**: Define the search embedding column and the similarity threshold.

3. **Execute the Search**: Use the provided API to perform the similarity search and retrieve the results.

#### Prerequisites

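Make sure you have version 7.18.1 or later of the `arize` package installed (and below 8.0.0). The `%pip` prefix is for notebooks; in a shell, use `pip` instead:

```bash theme={null}
%pip install -q "arize>=7.18.1,<8.0.0"
```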

### Code Example

```python theme={null}
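from datetime import datetime, timedelta, timezone

from arize.exporter import ArizeExportClient
from arize.utils.types import Environments, SimilarityReference, SimilaritySearchParams

ARIZE_API_KEY = ""  # your Arize API key
client = ArizeExportClient(api_key=ARIZE_API_KEY)

# Placeholder values: replace with your own space, model, and time range
dev_space_id = ""
dev_model_id = ""
end_time = datetime.now(timezone.utc)
start_time = end_time - timedelta(days=7)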

# Establish references
similarity_references = [
    SimilarityReference(
        prediction_id="pred_1",
        reference_column_name="image_vector",
    ),
    SimilarityReference(
        prediction_id="pred_2",
        reference_column_name="image_vector",
    ),
]

# Define search parameters
search_column_name = "image_vector"
threshold = 0.8

# Execute similarity search
df = client.export_model_to_df(
    model_id=dev_model_id,
    start_time=start_time,
    end_time=end_time,
    environment=Environments.PRODUCTION,
    space_id=dev_space_id,
    similarity_search_params=SimilaritySearchParams(
        references=similarity_references,
        search_column_name=search_column_name,
        threshold=threshold
    )
)
```
