
# Databricks

> Learn how to set up an import job using Databricks

### Step 1 - Generate a Token

If you don't already have one, generate a Personal Access Token (PAT). You will use it to authenticate the API calls later in this step, including the call that generates a token for your service principal.

Navigate to your Workspace and click "User Settings"

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/75712400-image.jpeg" />
</Frame>

Click "Generate new token"

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/2ec4742b-image.jpeg" />
</Frame>

Take note of your PAT

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/7015d172-image.jpeg" />
</Frame>

<Tabs>
  <Tab title="With Unity Catalog">
    1. Navigate to your Workspace and click "Admin Settings"

    <Frame caption="">
      <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/a9bf7273-image.jpeg" />
    </Frame>

    2. In the "Service Principals" tab, click "Add Service Principal"

    <Frame caption="">
      <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/c65b2ce7-image.jpeg" />
    </Frame>
  </Tab>

  <Tab title="Without Unity Catalog">
    1. Click on "User Management" on `accounts.cloud.databricks.com`

           <Frame caption="">
             <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/e7104c65-image.jpeg" />
           </Frame>

    2. Create a Service Principal

           <Frame caption="">
             <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/2c0695d4-image.jpeg" />
           </Frame>

    3. Take note of the Application ID

           <Frame caption="">
             <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/cfad8d33-image.jpeg" />
           </Frame>

    4. Run the following `curl` command to create the service principal in your workspace, where `${DATABRICKS_HOST}` is the workspace URL, `${DATABRICKS_TOKEN}` is the PAT you created above, and `${APPLICATION_ID}` is the Application ID of the service principal you just created.

    ```bash theme={null}
    curl -X POST \
    "${DATABRICKS_HOST}/api/2.0/preview/scim/v2/ServicePrincipals" \
    --header "Content-type: application/json" \
    --header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
    --data "{
      \"displayName\": \"displayName\",
      \"externalId\": \"externalId\",
      \"applicationId\": \"${APPLICATION_ID}\",
      \"id\": \"id\",
      \"active\": true
    }"
    ```
  </Tab>
</Tabs>

Click the service principal, enable “Databricks SQL access” and “Workspace access”, then click “Update”.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/290b4994-image.jpeg" />
</Frame>

Navigate to "Admin Settings" > "Workspace Settings". Search for *Personal Access Tokens*

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/7fecfd35-image.jpeg" />
</Frame>

Click "Permission Settings" and grant "Can Use" to the service principal you just created.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/132c032a-image.jpeg" />
</Frame>

With your **Token** (PAT) and **Application ID**, run the following `curl` command. Fill in the environment variables with your own values (`${DATABRICKS_HOST}` is the URL of your workspace).

```bash theme={null}
curl -X POST \
"${DATABRICKS_HOST}/api/2.0/token-management/on-behalf-of/tokens" \
--header "Content-type: application/json" \
--header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
--data "{\"application_id\": \"${APPLICATION_ID}\" }"
```

Save the **token\_value** from the response. **This is the Token you will use to complete the remaining setup in Arize later.**
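If you would rather capture the token in a script than copy it by hand, the following is a minimal sketch. It assumes the response is JSON with a `token_value` string field, as described above. The helper name `extract_token_value` and the variable `ARIZE_DATABRICKS_TOKEN` are hypothetical, and the text match is not a full JSON parser.

```bash
# Hypothetical helper: print the value of the "token_value" field from a JSON
# response read on stdin. This is a plain-text match, not a real JSON parser,
# so it assumes the value contains no escaped double quotes.
extract_token_value() {
  sed -n 's/.*"token_value"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example usage (needs network access and the variables from the step above):
#   ARIZE_DATABRICKS_TOKEN=$(curl -s -X POST \
#     "${DATABRICKS_HOST}/api/2.0/token-management/on-behalf-of/tokens" \
#     --header "Content-type: application/json" \
#     --header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
#     --data "{\"application_id\": \"${APPLICATION_ID}\" }" | extract_token_value)
```

If `jq` is installed, piping the response through `jq -r '.token_value'` is a more robust alternative to the `sed` match.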

### Step 2 - Grant Access To Your Table

Go to the Data Explorer (in the left drawer) and click the catalog that contains the table or view you want to grant access to.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/97e1a81d-image.jpeg" />
</Frame>

Click “Permissions” and grant “USE CATALOG” and “USE SCHEMA” to your service principal. Click “Grant”.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/66801865-image.jpeg" />
</Frame>

Go to the view/table, click “Permissions”, and grant “SELECT” on it to your service principal.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/510c30a0-image.jpeg" />
</Frame>
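The catalog, schema, and table permissions above can also be expressed as SQL `GRANT` statements. The sketch below only prints the statements so you can review them and paste them into the SQL editor. It assumes a Unity Catalog workspace and that the service principal is referenced by its Application ID. The function name `print_arize_grants` and the example object names are hypothetical. For a view, or for a `hive_metastore` catalog, check the Databricks `GRANT` reference for the matching syntax.

```bash
# Hypothetical helper: print GRANT statements for a catalog, schema, table,
# and service principal (identified here by its Application ID).
# Nothing is executed against Databricks; the output is plain SQL text.
print_arize_grants() {
  catalog=$1 schema=$2 table=$3 principal=$4
  printf 'GRANT USE CATALOG ON CATALOG `%s` TO `%s`;\n' "$catalog" "$principal"
  printf 'GRANT USE SCHEMA ON SCHEMA `%s`.`%s` TO `%s`;\n' "$catalog" "$schema" "$principal"
  printf 'GRANT SELECT ON TABLE `%s`.`%s`.`%s` TO `%s`;\n' "$catalog" "$schema" "$table" "$principal"
}

# Example usage with placeholder names:
#   print_arize_grants my_catalog my_schema my_table "${APPLICATION_ID}"
```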

Go to "SQL Warehouses" > \[YOUR\_WAREHOUSE\_NAME] and click "Permissions". Grant the "Can Use" permission to your service principal.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/4ba81674-image.jpeg" />
</Frame>

### Step 3 - Start the Data Upload Wizard

Navigate to the 'Upload Data' page from the left navigation bar in the Arize platform. From there, select the 'Databricks' card or navigate to the Data Warehouse tab to begin **a new table import job.**

**Storage Selection: Databricks**

<Frame caption="Select Databricks from Table Options">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/a2231c39-image.jpeg" />
</Frame>

Input Hostname, Endpoint, Port, and Token (from Step 1)

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/fd1a9387-image.jpeg" />
</Frame>

You can find Hostname, Endpoint, and Port in your Workspace

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/a4943ebd-image.jpeg" />
</Frame>
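If you still have the workspace URL from Step 1 in `${DATABRICKS_HOST}`, a small helper can reduce it to a bare hostname. This is a sketch under the assumption that the Hostname field expects the host without the `https://` scheme or any path, so compare the result with the value shown in your Workspace. The function name `workspace_hostname` is hypothetical.

```bash
# Hypothetical helper: strip the URL scheme and anything after the host
# (path or query string) from a workspace URL, leaving only the hostname.
workspace_hostname() {
  printf '%s\n' "$1" | sed -e 's|^[a-zA-Z]*://||' -e 's|/.*$||'
}

# Example usage:
#   workspace_hostname "${DATABRICKS_HOST}"
```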

You can find the Table ID in your Workspace as well

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/4fd181b5-image.jpeg" />
</Frame>

<Info>
  If you have issues granting permissions please reach out to [support@arize.com](mailto:support@arize.com)
</Info>

### Step 4 - Grant Access To Your Catalog, Schema, or Table

Tag your Catalog/Schema/Table with the `arize_ingestion_key` and the provided label value using the steps below. For more details, see docs on [Table\_tags](https://docs.databricks.com/en/sql/language-manual/information-schema/table_tags.html#table_tags) for Databricks.

**In Arize UI:** Copy `arize_ingestion_key` value

<Frame caption="copy arize_ingestion_key value">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/e3b42d20-image.jpeg" />
</Frame>

<Info>
  You can grant access to [**a Table**](/ax/machine-learning/machine-learning/integrations-ml/databricks#granting-access-to-a-table), [**a Schema**](/ax/machine-learning/machine-learning/integrations-ml/databricks#granting-access-to-a-schema), or [**a Catalog**](/ax/machine-learning/machine-learning/integrations-ml/databricks#granting-access-to-a-catalog) in Databricks
</Info>

<AccordionGroup>
  <Accordion title="Granting Access to A Table (via apply tags feature)">
    1. Navigate to your Workspace > Catalog, click on the **Table** to grant access to

    2. Click the **Add tags** button underneath the **Table** name

    ![](https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/c0bf3703-image.jpeg)

    3. In the pop-up that opens, enter **arize\_ingestion\_key** in the **Key** field and paste the copied tag value in the **Value** field

    ![](https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/21ddbd02-image.jpeg)
  </Accordion>

  <Accordion title="Granting Access to A Schema (via apply tags feature)">
    1. Navigate to your Workspace > Catalog, click on the **Schema** to grant access to

    2. Click the **Add tags** button underneath the **Schema** name

    ![](https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/2649c93a-image.jpeg)

    3. In the pop-up that opens, enter **arize\_ingestion\_key** in the **Key** field and paste the copied tag value in the **Value** field

    ![](https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/8ab70c96-image.jpeg)
  </Accordion>

  <Accordion title="Granting Access to A Catalog (via apply tags feature)">
    1. Navigate to your Workspace > Catalog, and click on the **Catalog** to grant access to

    2. Click the **Add tags** button underneath the **Catalog** name

    ![](https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/3d858d36-image.jpeg)

    3. In the pop-up that opens, enter **arize\_ingestion\_key** in the **Key** field and paste the copied tag value in the **Value** field

    ![](https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/00305cd7-image.jpeg)
  </Accordion>

  <Accordion title="Granting Access to A Table (via adding key value pairs in table properties)">
    If you are using a built-in catalog such as `hive_metastore`, or an older version of Databricks, you may not be able to apply `table_tags`, `schema_tags`, or `catalog_tags`. As a workaround, set `arize_ingestion_key` as a table property so that access can still be validated.

    1. Navigate to your SQL editor in your workspace and run the following SQL query:

    ```sql
    ALTER TABLE table_name SET TBLPROPERTIES ('arize_ingestion_key' = 'key');
    ```

    ![](https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/0ff3ce3e-image.jpeg)

    2. To confirm that the `arize_ingestion_key` has been applied to your table, run the following SQL command:

    ```sql
    SHOW TBLPROPERTIES table_name;
    ```

    Look for `arize_ingestion_key` in the results. It should be listed among the key-value pairs returned by the query.

    ![](https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/011803aa-image.jpeg)
  </Accordion>
</AccordionGroup>
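In Unity Catalog workspaces, the same tag can also be applied in SQL with `SET TAGS` instead of through the UI. The sketch below only prints the statement for the level you choose so you can paste it into the SQL editor. The function name `print_arize_tag_statement` and the example object names are hypothetical, and the tag value is the one you copied from the Arize UI.

```bash
# Hypothetical helper: print an ALTER ... SET TAGS statement that applies the
# arize_ingestion_key tag to a table, schema, or catalog.
#   $1 = level (table | schema | catalog)
#   $2 = object name, for example my_catalog.my_schema.my_table
#   $3 = tag value copied from the Arize UI
print_arize_tag_statement() {
  level=$1 name=$2 value=$3
  case "$level" in
    table)   keyword=TABLE ;;
    schema)  keyword=SCHEMA ;;
    catalog) keyword=CATALOG ;;
    *) echo "level must be table, schema, or catalog" >&2; return 1 ;;
  esac
  printf "ALTER %s %s SET TAGS ('arize_ingestion_key' = '%s');\n" "$keyword" "$name" "$value"
}

# Example usage with placeholder names:
#   print_arize_tag_statement table my_catalog.my_schema.my_table "<copied value>"
```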

### Step 5 - Configure Your Model And Define Your Table’s Schema

Match your model schema to your model type, then define the schema through the form inputs or a JSON schema.

<Frame caption="Set up model configurations">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/1c5938c6-image.jpeg" />
</Frame>

<Frame caption="Map your table using a form">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/6ca62695-image.jpeg" />
</Frame>

<Frame caption="Map your table using a JSON schema">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/f39899d4-image.jpeg" />
</Frame>

Learn more about Schema fields [here](/ax/machine-learning/machine-learning/concepts-ml/model-schema-reference#list-of-model-schema-fields-for-data-ingestion-integrations).

Once finished, Arize will begin querying your table and ingesting your records as model inferences.

### Step 6 - Add Model Data To The Table Or View

Arize will run queries to ingest records from your table based on your configured **refresh interval**.

### Step 7 - Check your Table Import Job

Arize will attempt a dry run to validate your job for any access, schema or record-level errors. If the dry run is successful, you may then create the import job.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/1abf1022-image.jpeg" />
</Frame>

After creating a job following a successful dry run, you will be taken to the 'Job Status' tab where you can see the status of your import jobs.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/240ace7f-image.jpeg" />
</Frame>

You can view the job details and import progress by clicking on the job ID, which reveals more information about the job.

### Step 8 - Troubleshooting An Import Job

An import job may run into a few problems. Use the dry run and job details UI to troubleshoot and quickly resolve data ingestion issues.

#### Validation Errors

If there is an error validating a file or table against the model schema, Arize will surface an **actionable** error message. From there, click on the 'Fix Schema' button to adjust your model schema.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/8b2a6365-image.jpeg" />
</Frame>

#### Dry Run File/Table Passes But The Job Fails

If your dry run is successful but your job fails, click on the job ID to view the **job details**. This reveals information such as the file path or query ID, the last import job, potential errors, and error locations.

<Frame caption="">
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/e41ba027-image.jpeg" />
</Frame>

Once you've identified the job failure point, append the edited row to the end of your table with an updated [change\_timestamp](/ax/machine-learning/machine-learning/integrations-ml/google-bigquery/google-bigquery-faq#how-do-i-update-fix-a-row-that-failed-to-ingest) value.
