Glossary of AI Terminology

What is LLM Jailbreaking?

LLM Jailbreaking

LLM jailbreaking refers to escaping the guardrails and safeguards of an LLM application or foundation model. These methods exploit vulnerabilities in the model’s design or prompt engineering to elicit responses that the model would normally be restricted from generating. Jailbreaking sometimes leads to the dissemination of harmful or unintended content.

Example:

Prominent LLM jailbreaks have led to a car getting sold for $1 and disturbing replies in healthcare.

LLM Jailbreaking

Bi-weekly AI Research Paper Readings

Stay on top of emerging trends and frameworks.