Understanding the Access Boundary

Arize enforces role-based access control (RBAC) for all UI and API interactions — users can only view and manage Data Fabric connectors for spaces they have access to. However, once data is written to your cloud storage bucket, access is governed entirely by your cloud provider’s IAM policies. This means two layers of access control matter:
  • Arize RBAC controls who can create, modify, and view Data Fabric connectors
  • Cloud IAM (AWS IAM or GCP IAM) controls who can read the exported Parquet files in your bucket
If these two layers aren’t aligned, a user without access to a space in Arize could still read that space’s exported data by accessing the bucket directly.

Use Separate Buckets or Prefixes Per Space

Each Data Fabric connector is scoped to a single Arize space. To create clean IAM boundaries, mirror this structure in your cloud storage layout — use a dedicated bucket or prefix for each space.
your-bucket/
├── team-alpha/arize-data-fabric/...    ← Space A connector
├── team-beta/arize-data-fabric/...     ← Space B connector
This allows you to write IAM policies that grant each team access only to their space’s data, without needing to parse file contents or table metadata.
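As a minimal sketch of this layout, the helpers below derive a space's prefix and check whether an object key falls under it. The function names are hypothetical and the folder names are taken from the example tree; none of this is part of an Arize API.

```python
# Hypothetical helpers; the layout mirrors the example tree above.
DATA_FABRIC_DIR = "arize-data-fabric"  # folder name taken from the example layout


def space_prefix(team: str) -> str:
    """Return the key prefix that holds one space's Data Fabric export."""
    return f"{team}/{DATA_FABRIC_DIR}/"


def belongs_to_space(object_key: str, team: str) -> bool:
    """Return True if the object key sits under the given team's prefix."""
    return object_key.startswith(space_prefix(team))


print(space_prefix("team-alpha"))
print(belongs_to_space("team-beta/arize-data-fabric/part-0.parquet", "team-alpha"))
```

Because ownership is encoded in the key itself, a check like this (or the equivalent IAM resource pattern) never has to open a file to decide who may read it.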

Sync Access via Your Identity Provider

Arize supports SAML SSO with role mappings that map Identity Provider (IdP) attributes to Arize organizations, spaces, and roles. You can use the same IdP groups to control cloud storage access, creating a single source of truth for both Arize and your data warehouse. The pattern:
  1. Define IdP groups for each team (e.g., team-alpha in Okta or Entra ID)
  2. Map groups to Arize spaces via SAML role mappings — members of team-alpha get access to Space A
  3. Map the same groups to cloud IAM — members of team-alpha get read access to your-bucket/team-alpha/*
Adding or removing a user from the IdP group then changes their access in both Arize and your cloud storage, so neither system needs a separate permission update.
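The pattern can be sketched as a single record per IdP group from which both bindings are derived. This is illustrative only: the `TeamAccess` class and `access_for_group` function are hypothetical names, and the sketch assumes the storage prefix is named after the IdP group, as in the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamAccess:
    """What one IdP group should map to on each side (illustrative only)."""
    idp_group: str
    arize_space: str
    s3_read_resource: str


def access_for_group(idp_group: str, arize_space: str,
                     bucket: str = "your-bucket") -> TeamAccess:
    # Assumption: the storage prefix is named after the IdP group.
    return TeamAccess(
        idp_group=idp_group,
        arize_space=arize_space,
        s3_read_resource=f"{bucket}/{idp_group}/*",
    )


mapping = access_for_group("team-alpha", "Space A")
print(mapping)
```

Keeping both targets in one record makes it harder for the Arize mapping and the cloud IAM mapping to drift apart.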

Provider-Specific IAM Guidance

Scope IAM policies to prefixes

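Create an IAM policy for each team that restricts S3 access to their Data Fabric prefix. Listing and reading are granted in separate statements: s3:ListBucket applies to the bucket and is limited by the s3:prefix condition, while s3:GetObject applies to objects and is limited by the object ARN. Combining them in one statement would attach the s3:prefix condition to s3:GetObject, where that key is not present, so reads would be denied.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListTeamAlphaPrefix",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::your-bucket",
      "Condition": {
        "StringLike": {
          "s3:prefix": ["team-alpha/*"]
        }
      }
    },
    {
      "Sid": "ReadTeamAlphaObjects",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::your-bucket/team-alpha/*"
    }
  ]
}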
Arize’s own arize-data-fabric service role needs write access across all prefixes to perform syncs. Per-team policies like the one above are for human and analytics consumers, who should have only read access to their team’s data.
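With more than a couple of teams, it may be easier to generate these policies than to maintain them by hand. The sketch below builds one read-only policy per team; `team_read_policy` is a hypothetical helper, and the bucket and team names are the placeholders used above. It grants listing and reading in separate statements because `s3:prefix` is a condition key for list requests rather than for object reads.

```python
import json


def team_read_policy(bucket: str, team: str) -> dict:
    """Build a read-only policy limited to one team's prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, limited by the requested prefix.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{team}/*"]}},
            },
            {
                # Reading is an object-level action, limited by the object ARN.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{team}/*",
            },
        ],
    }


for team in ("team-alpha", "team-beta"):
    print(json.dumps(team_read_policy(bucket="your-bucket", team=team), indent=2))
```

Each printed document can be attached to the role or permission set for the matching team.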

Use IAM Identity Center with IdP groups

If you use AWS IAM Identity Center (formerly AWS SSO) with your IdP, you can create permission sets scoped to each team’s prefix and assign them to the corresponding IdP group. This ensures that group membership in your IdP controls both Arize space access and S3 access.

Bucket tagging for ownership

Data Fabric validates that each S3 bucket has an arize-ingestion-key tag that matches the space’s key. This prevents a connector from writing to a bucket it doesn’t own. Ensure this tag is set correctly on every Data Fabric bucket.
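A quick pre-flight check for this tag could look like the sketch below. The function name and the sample value are hypothetical; the only detail taken from the text is the `arize-ingestion-key` tag name. The input uses the `[{"Key": ..., "Value": ...}]` shape that S3 returns for bucket tags.

```python
TAG_KEY = "arize-ingestion-key"  # tag name from the text above


def has_matching_ingestion_tag(tag_set: list[dict], expected_value: str) -> bool:
    """Check a bucket's tags for the expected arize-ingestion-key value."""
    return any(
        tag.get("Key") == TAG_KEY and tag.get("Value") == expected_value
        for tag in tag_set
    )


# Placeholder value; substitute the key for your own space.
tags = [{"Key": "arize-ingestion-key", "Value": "example-space-key"}]
print(has_matching_ingestion_tag(tags, "example-space-key"))
```

With boto3, the tag set for a real bucket can be read via `get_bucket_tagging(Bucket=...)["TagSet"]` on an S3 client and passed to this function.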

Optional: Query-Time Access Control

If teams query Iceberg tables directly through a data warehouse, consider adding another layer of access control at the query engine:
  • AWS Lake Formation — enforce column-level and row-level permissions when querying via Athena or Redshift Spectrum
  • BigQuery IAM — control access to external tables independently of the underlying GCS bucket
  • Snowflake RBAC — use Snowflake roles and grants to restrict which teams can query specific external tables
This provides defense in depth: even if a user has bucket-level read access, the query engine can further restrict what data they can see.