Notebook Tutorial

Blog: How To Improve AI Agent Security with Microsoft’s AI Red Teaming Agent

The AI Red Teaming Agent helps organizations proactively find safety risks in generative AI systems while models and applications are still being designed and developed. Traditional red teaming exploits the cyber kill chain to test a system for security vulnerabilities. With the rise of generative AI, the term AI red teaming has come to describe probing for the novel risks, both content-related and security-related, that these systems present: it simulates an adversarial user who is trying to make your AI system misbehave in a particular way. The AI Red Teaming Agent combines the AI red teaming capabilities of PyRIT (Python Risk Identification Tool), Microsoft’s open-source framework, with Microsoft Foundry’s Risk and Safety Evaluations to help you assess safety issues automatically.

1. Create the Azure AI Red Teaming Agent

# Set up the red teaming agent
import os

# Azure imports
from azure.identity import DefaultAzureCredential
from azure.ai.evaluation.red_team import RedTeam, RiskCategory, AttackStrategy

# Set up environment variables
os.environ["AZURE_SUBSCRIPTION_ID"] = ""
os.environ["AZURE_RESOURCE_GROUP"] = ""
os.environ["AZURE_PROJECT_NAME"] = ""
os.environ["PROJECT_ENDPOINT"] = ""
os.environ["ARIZE_SPACE_ID"] = ""
os.environ["ARIZE_API_KEY"] = ""
os.environ["PROJECT_NAME"] = "red-team-violence-examples"


## Option 1: using an Azure AI Foundry Hub project
azure_ai_project = {
    "subscription_id": os.environ["AZURE_SUBSCRIPTION_ID"],
    "resource_group_name": os.environ["AZURE_RESOURCE_GROUP"],
    "project_name": os.environ["AZURE_PROJECT_NAME"],
}
## Option 2: using the project endpoint instead
## (this assignment overrides Option 1; keep only the one you need)
azure_ai_project = os.environ["PROJECT_ENDPOINT"]

# Instantiate your AI Red Teaming Agent
red_team_agent = RedTeam(
    azure_ai_project=azure_ai_project, # required
    credential=DefaultAzureCredential() # required
)
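
Because the environment variables above are set to empty strings as placeholders, it can help to check that they have been filled in before constructing the agent. The following is a minimal sketch, not part of the Azure SDK: `missing_settings` is a hypothetical helper, and the list of required names assumes the endpoint-based configuration (swap in the subscription, resource group, and project name variables if you use a Hub project).

```python
import os

# Assumed set of variables for the endpoint-based setup in this tutorial
REQUIRED_VARS = ["PROJECT_ENDPOINT", "ARIZE_SPACE_ID", "ARIZE_API_KEY", "PROJECT_NAME"]

def missing_settings(names, env=None):
    """Return the names whose value is unset or blank."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]

missing = missing_settings(REQUIRED_VARS)
if missing:
    print("Fill in these environment variables first:", ", ".join(missing))
```

Running this check before `RedTeam(...)` surfaces a blank setting immediately, rather than as an authentication or connection error partway through a scan.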

Optionally, you can specify which content risk categories to cover with the risk_categories parameter and set the number of prompts per risk category with the num_objectives parameter.

# Specify the risk categories and the number of attack objectives per risk category for the AI Red Teaming Agent to cover
red_team_agent = RedTeam(
    azure_ai_project=azure_ai_project, # required
    credential=DefaultAzureCredential(), # required
    risk_categories=[ # optional, defaults to all four risk categories
        RiskCategory.Violence,
        RiskCategory.HateUnfairness,
        RiskCategory.Sexual,
        RiskCategory.SelfHarm
    ], 
    num_objectives=10, # optional, defaults to 10
)
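
As a rough way to size a scan, the number of baseline prompts can be estimated from these two parameters. This is a sketch under the assumption that each selected category receives exactly num_objectives prompts; `baseline_prompt_count` is a hypothetical helper, and any attack strategies you add to a scan (none are configured in this tutorial) may increase the total.

```python
# Hypothetical helper for estimating scan size; not part of the Azure SDK.
def baseline_prompt_count(risk_categories, num_objectives=10):
    """Estimate baseline prompts: num_objectives for each distinct category."""
    return len(set(risk_categories)) * num_objectives

categories = ["Violence", "HateUnfairness", "Sexual", "SelfHarm"]
print(baseline_prompt_count(categories))                        # 40
print(baseline_prompt_count(categories[:1], num_objectives=5))  # 5
```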
     

2. Trace your Agent


# Enable tracing for OpenAI calls
from arize.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

# Register a tracer provider that exports spans to your Arize space
tracer_provider = register(
    space_id=os.environ["ARIZE_SPACE_ID"],
    api_key=os.environ["ARIZE_API_KEY"],
    project_name=os.environ["PROJECT_NAME"],
)

# Instrument the OpenAI client so each completion call is traced
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

3. Create a target for the red teaming agent

# Set up a callback function to pass to the red teaming agent scan
import openai

# Define a simple callback function that simulates a chatbot
def simple_callback(query: str) -> str:
    # Insert your LLM or agent here
    openai_client = openai.OpenAI()
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": "You are a helpful AI assistant. Always maintain a polite and professional tone. Provide concise answers.",
            },
            {"role": "user", "content": query},
        ],
        max_tokens=100,
    )
    # Return the assistant's reply text so the scan can evaluate it
    return response.choices[0].message.content
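
Before handing a callback to a scan, it can be worth confirming offline that it accepts a string and returns a non-empty string, which is what the `query: str -> str` signature above promises. The sketch below is an illustration only: `check_callback` and `echo_callback` are hypothetical names, and the echo function stands in for a real model call so the check runs without network access.

```python
from typing import Callable, List

def check_callback(callback: Callable[[str], str], prompts: List[str]) -> List[str]:
    """Call the callback on each prompt and return a list of problems found."""
    problems = []
    for prompt in prompts:
        reply = callback(prompt)
        if not isinstance(reply, str):
            problems.append(f"{prompt!r}: expected str, got {type(reply).__name__}")
        elif not reply.strip():
            problems.append(f"{prompt!r}: empty reply")
    return problems

# Stand-in target that needs no API key; replace with your own callback
def echo_callback(query: str) -> str:
    return f"You said: {query}"

print(check_callback(echo_callback, ["hello", "what can you do?"]))  # []
```

A callback that forgets its return statement yields `None` for every prompt, which this check reports as a type problem.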

4. Run the red teaming scan

You can optionally pass an output_path to scan to save a JSON scorecard of the results when the scan finishes, for use in your own reporting tool or compliance platform.

red_team_result = await red_team_agent.scan(target=simple_callback)
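
The top-level `await` above works in a notebook, which already runs an event loop. In a plain Python script, one common approach is to wrap the call in a coroutine and start it with `asyncio.run`. The sketch below shows only that pattern: `fake_scan` is a hypothetical stand-in for `red_team_agent.scan`, so the example runs without Azure credentials.

```python
import asyncio

async def fake_scan(target):
    """Stand-in for red_team_agent.scan: calls the target once and returns a dict."""
    await asyncio.sleep(0)  # yield to the event loop, as a real async call would
    return {"reply": target("hello")}

async def main():
    # In a real script this line would be: await red_team_agent.scan(target=simple_callback)
    return await fake_scan(target=lambda query: query.upper())

# asyncio.run starts an event loop, runs main() to completion, and closes the loop.
# Do not use it inside a notebook cell, where a loop is already running.
result = asyncio.run(main())
print(result)  # {'reply': 'HELLO'}
```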

5. Now view the traces from your red teaming scans in Arize!
