Access Control Best Practices

Understanding the Access Boundary

Arize enforces role-based access control (RBAC) for all UI and API interactions — users can only view and manage Data Fabric connectors for spaces they have access to. However, once data is written to your cloud storage bucket, access is governed entirely by your cloud provider’s IAM policies. This means two layers of access control matter:

Arize RBAC controls who can create, modify, and view Data Fabric connectors
Cloud IAM (AWS IAM or GCP IAM) controls who can read the exported Parquet files in your bucket

If these two layers aren’t aligned, users could bypass Arize permissions by accessing the bucket directly.
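
One way to spot misalignment is to compare who can reach a space in Arize with who can read that space's data in the bucket. The sketch below assumes you have already exported both membership lists by whatever means you have available; the function name and the input shape are hypothetical and are not part of any Arize or cloud provider API.

```python
def find_bypass_risks(arize_space_members, bucket_prefix_readers):
    """Return, per space, users who can read the bucket data but lack Arize access.

    Both arguments are hypothetical exports: dicts mapping a space name to a
    set of user identifiers (for example, email addresses).
    """
    risks = {}
    for space, readers in bucket_prefix_readers.items():
        allowed = arize_space_members.get(space, set())
        extra = set(readers) - set(allowed)
        if extra:
            risks[space] = sorted(extra)
    return risks


# Illustrative data only.
arize = {"space-a": {"ana@example.com", "bo@example.com"}}
bucket = {"space-a": {"ana@example.com", "cy@example.com"}}
print(find_bypass_risks(arize, bucket))
```

Any user reported here can read exported data for a space they cannot see in Arize, which is the gap the rest of this page aims to close.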

Use Separate Buckets or Prefixes Per Space

Each Data Fabric connector is scoped to a single Arize space. To create clean IAM boundaries, mirror this structure in your cloud storage layout — use a dedicated bucket or prefix for each space.

your-bucket/
├── team-alpha/arize-data-fabric/...    ← Space A connector
└── team-beta/arize-data-fabric/...     ← Space B connector

This allows you to write IAM policies that grant each team access only to their space’s data, without needing to parse file contents or table metadata.
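
As a rough illustration of why this layout keeps policies simple, the sketch below derives a connector prefix from a team name and recovers the owning team from an object key using only the path. The helper names and the example file name are hypothetical; the `arize-data-fabric` path segment follows the layout shown above.

```python
def connector_prefix(team: str) -> str:
    """Build the storage prefix for a team's connector, following the layout above."""
    if not team or "/" in team:
        raise ValueError("team must be a non-empty name without '/'")
    return f"{team}/arize-data-fabric/"


def owning_team(object_key: str) -> str:
    """Recover the team from an object key by reading its first path segment."""
    return object_key.split("/", 1)[0]


print(connector_prefix("team-alpha"))
# The file name below is made up for illustration.
print(owning_team("team-beta/arize-data-fabric/part-0.parquet"))
```

Because ownership is encoded in the first path segment, an IAM rule only needs a prefix match and never has to inspect the files themselves.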

Sync Access via Your Identity Provider

Arize supports SAML SSO with role mappings that map Identity Provider (IdP) attributes to Arize organizations, spaces, and roles. You can use the same IdP groups to control cloud storage access, creating a single source of truth for both Arize and your data warehouse. The pattern:

1. Define IdP groups for each team (e.g., team-alpha in Okta or Entra ID)
2. Map groups to Arize spaces via SAML role mappings, so members of team-alpha get access to Space A
3. Map the same groups to cloud IAM, so members of team-alpha get read access to your-bucket/team-alpha/*

Adding a user to or removing a user from the IdP group then updates their access in both Arize and your cloud storage from a single change.
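
The pattern can be pictured as one mapping table that both systems read from. The following is a minimal sketch, not an Arize or IdP API: the `TeamAccess` record, the `TEAMS` table, and the team-beta row are all hypothetical and exist only to show the idea of a single source of truth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamAccess:
    idp_group: str      # group name in the IdP
    arize_space: str    # space granted through SAML role mappings
    bucket_prefix: str  # prefix granted through cloud IAM


# Hypothetical mapping table; in practice this lives in your IdP and IAM configuration.
TEAMS = [
    TeamAccess("team-alpha", "Space A", "team-alpha/"),
    TeamAccess("team-beta", "Space B", "team-beta/"),
]


def access_for_groups(groups):
    """Resolve a user's IdP groups to (Arize spaces, readable bucket prefixes)."""
    matched = [t for t in TEAMS if t.idp_group in groups]
    spaces = sorted(t.arize_space for t in matched)
    prefixes = sorted(t.bucket_prefix for t in matched)
    return spaces, prefixes


print(access_for_groups({"team-alpha"}))
```

Since both results come from the same group membership, a user cannot end up with a bucket prefix that has no matching Arize space.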

Provider-Specific IAM Guidance

The guidance below covers Amazon S3 first, then Google Cloud Storage.

Amazon S3

Scope IAM policies to prefixes

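Create an IAM policy for each team that restricts S3 access to their Data Fabric prefix. The s3:prefix condition key is only present on ListBucket requests, so listing and object reads go in separate statements; attaching the condition to s3:GetObject would cause those reads to be denied:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListTeamAlphaPrefix",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::your-bucket",
      "Condition": {
        "StringLike": {
          "s3:prefix": ["team-alpha/*"]
        }
      }
    },
    {
      "Sid": "ReadTeamAlphaObjects",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::your-bucket/team-alpha/*"
    }
  ]
}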
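
If you manage many teams, a policy of this kind can be generated rather than written by hand. The sketch below is one possible approach, with a hypothetical function name. It emits one statement for listing, restricted by the `s3:prefix` condition key, and one for object reads, restricted by the resource ARN, because `s3:prefix` is only evaluated for `s3:ListBucket` requests.

```python
import json


def team_read_policy(bucket: str, team: str) -> dict:
    """Build a read-only policy limited to one team's prefix in one bucket."""
    prefix = f"{team}/"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, narrowed by the requested prefix.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                # Reads are object-level actions, narrowed by the object ARN.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


print(json.dumps(team_read_policy("your-bucket", "team-alpha"), indent=2))
```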

Arize’s own arize-data-fabric service role needs write access across all prefixes to perform syncs. The policies above are for human and analytics consumers who should only have read access to their team’s data.

Use IAM Identity Center with IdP groups

If you use AWS IAM Identity Center (formerly AWS SSO) with your IdP, you can create permission sets scoped to each team’s prefix and assign them to the corresponding IdP group. This ensures that group membership in your IdP controls both Arize space access and S3 access.

Bucket tagging for ownership

Data Fabric validates that each S3 bucket has an arize-ingestion-key tag that matches the space’s key. This prevents a connector from writing to a bucket it doesn’t own. Ensure this tag is set correctly on every Data Fabric bucket.
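
You can run a similar check yourself before creating a connector. This is a sketch of a local pre-check under stated assumptions, not how Data Fabric performs its validation: the function name and the example key value are hypothetical, and the tag list uses the `{"Key": ..., "Value": ...}` shape that S3 returns for bucket tagging.

```python
TAG_KEY = "arize-ingestion-key"


def bucket_tag_matches(tag_set, expected_value):
    """Return True if the bucket's tags contain arize-ingestion-key with the expected value.

    tag_set is a list of {"Key": ..., "Value": ...} dicts.
    """
    tags = {t["Key"]: t["Value"] for t in tag_set}
    return tags.get(TAG_KEY) == expected_value


# With boto3 installed, the tag list could be fetched like this:
#   tag_set = boto3.client("s3").get_bucket_tagging(Bucket="your-bucket")["TagSet"]
example_tags = [
    {"Key": "owner", "Value": "team-alpha"},
    {"Key": "arize-ingestion-key", "Value": "example-space-key"},  # made-up value
]
print(bucket_tag_matches(example_tags, "example-space-key"))
```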

Google Cloud Storage

Scope IAM bindings to prefixes

Use IAM Conditions to restrict access to specific prefixes within your bucket. IAM Conditions can only be set on buckets that have uniform bucket-level access enabled:

gcloud storage buckets add-iam-policy-binding gs://your-bucket \
    --member="group:team-alpha@your-domain.com" \
    --role="roles/storage.objectViewer" \
    --condition="expression=resource.name.startsWith('projects/_/buckets/your-bucket/objects/team-alpha/'),title=team-alpha-prefix-only"
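
To apply the same binding to several teams, the command arguments can be built programmatically. The sketch below only assembles the argument list for the command shown above; the helper names are hypothetical, and it assumes each team's group address follows the `team@your-domain.com` pattern used in the example.

```python
import shlex


def prefix_condition(bucket: str, team: str) -> str:
    """Build the --condition value that limits a binding to one team's prefix."""
    resource_prefix = f"projects/_/buckets/{bucket}/objects/{team}/"
    expression = f"resource.name.startsWith('{resource_prefix}')"
    return f"expression={expression},title={team}-prefix-only"


def binding_command(bucket: str, team: str, domain: str) -> list:
    """Assemble the gcloud arguments for a prefix-scoped objectViewer binding."""
    return [
        "gcloud", "storage", "buckets", "add-iam-policy-binding", f"gs://{bucket}",
        f"--member=group:{team}@{domain}",
        "--role=roles/storage.objectViewer",
        f"--condition={prefix_condition(bucket, team)}",
    ]


# Print the command for review; pass the list to subprocess.run to execute it.
print(shlex.join(binding_command("your-bucket", "team-alpha", "your-domain.com")))
```

Keeping the arguments as a list avoids shell-quoting problems with the single quotes inside the condition expression.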

Arize’s arize-data-fabric service account needs write access across all prefixes to perform syncs. The bindings above are for human and analytics consumers who should only have read access to their team’s data.

Use Workload Identity Federation with IdP groups

If you federate your IdP with GCP via Workload Identity Federation, you can map IdP groups to GCP principals and assign prefix-scoped IAM bindings to each group. This keeps your IdP as the single control plane for both Arize and GCS access.

Bucket labels for ownership

Data Fabric validates that each GCS bucket has an arize-ingestion-key label that matches the space’s key. This prevents a connector from writing to a bucket it doesn’t own. Ensure this label is set correctly on every Data Fabric bucket.
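
When you operate several Data Fabric buckets, it can help to audit the label across all of them at once. The following sketch assumes you have already collected each bucket's labels as a plain dict and know the key expected for each; the function name, bucket names, and key values are hypothetical, and this is not how Data Fabric performs its own validation.

```python
LABEL_KEY = "arize-ingestion-key"


def buckets_with_wrong_label(bucket_labels, expected_keys):
    """Return bucket names whose arize-ingestion-key label is missing or mismatched.

    bucket_labels: dict of bucket name -> dict of that bucket's labels.
    expected_keys: dict of bucket name -> expected label value.
    """
    problems = []
    for bucket, expected in expected_keys.items():
        actual = bucket_labels.get(bucket, {}).get(LABEL_KEY)
        if actual != expected:
            problems.append(bucket)
    return sorted(problems)


# Made-up inventory for illustration.
labels = {
    "alpha-bucket": {"arize-ingestion-key": "key-a"},
    "beta-bucket": {"env": "prod"},
}
expected = {"alpha-bucket": "key-a", "beta-bucket": "key-b"}
print(buckets_with_wrong_label(labels, expected))
```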

Optional: Query-Time Access Control

If teams query Iceberg tables directly through a data warehouse, consider adding an additional layer of access control at the query engine:

AWS Lake Formation — enforce column-level and row-level permissions when querying via Athena or Redshift Spectrum
BigQuery IAM — control access to external tables independently of the underlying GCS bucket
Snowflake RBAC — use Snowflake roles and grants to restrict which teams can query specific external tables

This provides defense in depth: even if a user has bucket-level read access, the query engine can further restrict what data they can see.
