# Access Control Best Practices

> Secure Data Fabric synced data with cloud IAM policies and Identity Provider integration

## Understanding the Access Boundary

Arize enforces role-based access control (RBAC) for all UI and API interactions — users can only view and manage Data Fabric connectors for spaces they have access to. However, once data is written to your cloud storage bucket, access is governed entirely by your cloud provider's IAM policies.

This means two layers of access control matter:

* **Arize RBAC** controls who can create, modify, and view Data Fabric connectors
* **Cloud IAM** (AWS IAM or GCP IAM) controls who can read the exported Parquet files in your bucket

If these two layers aren't aligned, users could bypass Arize permissions by accessing the bucket directly.

## Use Separate Buckets or Prefixes Per Space

Each Data Fabric connector is scoped to a single Arize space. To create clean IAM boundaries, mirror this structure in your cloud storage layout — use a dedicated bucket or prefix for each space.

```
your-bucket/
├── team-alpha/arize-data-fabric/...    ← Space A connector
├── team-beta/arize-data-fabric/...     ← Space B connector
```

This allows you to write IAM policies that grant each team access only to their space's data, without needing to parse file contents or table metadata.
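One detail worth checking when laying out prefixes: prefix matching in object storage is plain string matching, so a prefix of `team-a` would also match keys under `team-alpha/`. The following is a minimal sketch, assuming hypothetical space names and the folder layout shown above, that derives a slash-terminated prefix per space and flags prefixes that contain one another. The function names are illustrative and are not part of any Arize API.

```python
from itertools import combinations
from typing import Optional

# Hypothetical mapping of Arize space name -> team folder; adjust to your layout.
SPACE_FOLDERS = {
    "Space A": "team-alpha",
    "Space B": "team-beta",
}


def space_prefix(folder: str) -> str:
    """Return a slash-terminated prefix so 'team-a' cannot match 'team-alpha/'."""
    return folder.strip("/") + "/"


def overlapping_prefixes(folders: dict) -> list:
    """Return pairs of spaces where one prefix contains the other."""
    prefixes = {space: space_prefix(f) for space, f in folders.items()}
    return [
        (a, b)
        for a, b in combinations(prefixes, 2)
        if prefixes[a].startswith(prefixes[b]) or prefixes[b].startswith(prefixes[a])
    ]


def owning_space(key: str, folders: dict) -> Optional[str]:
    """Return the space whose prefix contains the object key, if any."""
    for space, folder in folders.items():
        if key.startswith(space_prefix(folder)):
            return space
    return None
```

An empty result from `overlapping_prefixes` suggests that a prefix-scoped IAM rule for one space cannot accidentally cover another space's data.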

## Sync Access via Your Identity Provider

Arize supports [SAML SSO with role mappings](/ax/security-and-settings/sso-and-rbac) that map Identity Provider (IdP) attributes to Arize organizations, spaces, and roles. You can use the same IdP groups to control cloud storage access, which makes the IdP the single source of truth for both Arize and the bucket that holds your synced data.

The pattern:

1. **Define IdP groups** for each team (e.g., `team-alpha` in Okta or Entra ID)
2. **Map groups to Arize spaces** via SAML role mappings — members of `team-alpha` get access to Space A
3. **Map the same groups to cloud IAM** — members of `team-alpha` get read access to `your-bucket/team-alpha/*`

Adding or removing a user from the IdP group updates access in both Arize and your cloud storage simultaneously.
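The three steps above amount to keeping one mapping and deriving both sides from it. Here is a minimal sketch of that data model, with hypothetical group, space, and prefix names. It does not call Arize, your IdP, or a cloud provider; it only illustrates why a single record per team keeps the two layers aligned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamAccess:
    idp_group: str      # group name in the IdP (hypothetical)
    arize_space: str    # space granted through the SAML role mapping
    bucket_prefix: str  # prefix granted through cloud IAM


TEAMS = [
    TeamAccess("team-alpha", "Space A", "your-bucket/team-alpha/"),
    TeamAccess("team-beta", "Space B", "your-bucket/team-beta/"),
]


def effective_access(user_groups: set) -> dict:
    """Derive Arize spaces and bucket prefixes from IdP group membership alone."""
    granted = [t for t in TEAMS if t.idp_group in user_groups]
    return {
        "spaces": [t.arize_space for t in granted],
        "prefixes": [t.bucket_prefix for t in granted],
    }


print(effective_access({"team-alpha"}))  # member of team-alpha
print(effective_access(set()))           # removed from every group
```

Because both lists come from the same record, a membership change cannot grant a space without its prefix, or the reverse.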

## Provider-Specific IAM Guidance

<Tabs>
  <Tab title="Amazon S3">
    ### Scope IAM policies to prefixes

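    Create an IAM policy for each team that restricts S3 access to their Data Fabric prefix. Listing and reading need separate statements: `s3:ListBucket` is a bucket-level action that the `s3:prefix` condition key narrows, while `s3:GetObject` is an object-level action that the resource ARN narrows. Attaching the `s3:prefix` condition to `s3:GetObject` would block reads, because that key is present only on list requests.

    ```json theme={null}
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "ListTeamPrefix",
          "Effect": "Allow",
          "Action": "s3:ListBucket",
          "Resource": "arn:aws:s3:::your-bucket",
          "Condition": {
            "StringLike": {
              "s3:prefix": ["team-alpha/*"]
            }
          }
        },
        {
          "Sid": "ReadTeamObjects",
          "Effect": "Allow",
          "Action": "s3:GetObject",
          "Resource": "arn:aws:s3:::your-bucket/team-alpha/*"
        }
      ]
    }
    ```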

    <Tip>
      Arize's own `arize-data-fabric` service role needs write access across all prefixes to perform syncs. The policies above are for **human and analytics consumers** who should only have read access to their team's data.
    </Tip>

    ### Use IAM Identity Center with IdP groups

    If you use AWS IAM Identity Center (formerly AWS SSO) with your IdP, you can create permission sets scoped to each team's prefix and assign them to the corresponding IdP group. This ensures that group membership in your IdP controls both Arize space access and S3 access.

    ### Bucket tagging for ownership

    Data Fabric validates that each S3 bucket has an `arize-ingestion-key` tag that matches the space's key. This prevents a connector from writing to a bucket it doesn't own. Ensure this tag is set correctly on every Data Fabric bucket.
  </Tab>

  <Tab title="Google Cloud Storage">
    ### Scope IAM bindings to prefixes

    Use IAM Conditions to restrict access to specific prefixes within your bucket. IAM Conditions on a bucket require uniform bucket-level access to be enabled:

    ```bash theme={null}
    gcloud storage buckets add-iam-policy-binding gs://your-bucket \
        --member="group:team-alpha@your-domain.com" \
        --role="roles/storage.objectViewer" \
        --condition="expression=resource.name.startsWith('projects/_/buckets/your-bucket/objects/team-alpha/'),title=team-alpha-prefix-only"
    ```

    <Tip>
      Arize's `arize-data-fabric` service account needs write access across all prefixes to perform syncs. The bindings above are for **human and analytics consumers** who should only have read access to their team's data.
    </Tip>

    ### Use Workload Identity Federation with IdP groups

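    If you federate your IdP with GCP, you can map IdP groups to GCP principals and assign prefix-scoped IAM bindings to each group. Workload Identity Federation covers external workloads such as analytics jobs; for human users who sign in through the IdP, Google Cloud offers Workforce Identity Federation. Either way, your IdP stays the single control plane for both Arize and GCS access.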

    ### Bucket labels for ownership

    Data Fabric validates that each GCS bucket has an `arize-ingestion-key` label that matches the space's key. This prevents a connector from writing to a bucket it doesn't own. Ensure this label is set correctly on every Data Fabric bucket.
  </Tab>
</Tabs>
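With more than a handful of teams, writing each policy by hand becomes error-prone. The following is a minimal sketch, assuming the `your-bucket` layout and hypothetical team folders used on this page, that generates a read-only S3 policy document and the `--condition` value for the `gcloud` binding for each team. The S3 policy uses one statement for listing and one for reads, since `s3:prefix` applies only to list requests. The helper names are illustrative; the script only produces text and makes no cloud calls.

```python
import json


def s3_read_policy(bucket: str, prefix: str) -> dict:
    """Build a read-only policy for one team prefix (prefix must end with '/')."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, narrowed by the s3:prefix key.
                "Sid": "ListTeamPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                # Reads are object-level, narrowed by the resource ARN.
                "Sid": "ReadTeamObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


def gcs_condition(bucket: str, prefix: str, title: str) -> str:
    """Build the --condition value for a prefix-scoped GCS IAM binding."""
    expression = (
        f"resource.name.startsWith('projects/_/buckets/{bucket}/objects/{prefix}')"
    )
    return f"expression={expression},title={title}"


for team in ["team-alpha", "team-beta"]:  # hypothetical team folders
    print(json.dumps(s3_read_policy("your-bucket", f"{team}/"), indent=2))
    print(gcs_condition("your-bucket", f"{team}/", f"{team}-prefix-only"))
```

Generating both artifacts from the same bucket and prefix values keeps the AWS and GCP rules consistent with the storage layout, and the output can be reviewed before it is applied through your usual infrastructure tooling.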

## Optional: Query-Time Access Control

If teams query Iceberg tables directly through a data warehouse, consider adding an additional layer of access control at the query engine:

* **AWS Lake Formation** — enforce column-level and row-level permissions when querying via Athena or Redshift Spectrum
* **BigQuery IAM** — control access to external tables independently of the underlying GCS bucket
* **Snowflake RBAC** — use Snowflake roles and grants to restrict which teams can query specific external tables

This provides defense in depth: even if a user has bucket-level read access, the query engine can further restrict what data they can see.
