Prompt injection

Prompt injection is an attack where malicious or untrusted text attempts to override system instructions, change tool behavior, leak data, or manipulate an agent's decisions. It often appears inside user input, retrieved documents, web pages, emails, tickets, or tool outputs.

Agents are especially exposed because they read untrusted content and can take actions. Defenses include instruction hierarchy, input isolation, tool permissions, retrieval sanitization, policy checks, and eval suites that include adversarial examples.

Docs

Learn

Insights

Company

Docs

Learn

Insights

Company

What Is Prompt Injection?