Glossary of AI Terminology

What Is Toxicity?

Toxicity is a measure of whether a model output contains abusive, hateful, harassing, or otherwise harmful language. Toxicity evals can be run with classifiers, LLM judges, policy models, or human review.
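As a rough illustration of the shape such an eval can take, here is a minimal sketch in Python. Everything in it is an assumption made for the example: `toy_toxicity_score`, `PLACEHOLDER_TERMS`, `run_toxicity_eval`, and the 0.2 threshold are hypothetical, and the word-list scorer is only a stand-in for a real classifier, LLM judge, policy model, or human label.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical placeholder terms. A real eval would replace this word list
# with a trained classifier, an LLM judge, a policy model, or human review.
PLACEHOLDER_TERMS = {"idiot", "stupid", "worthless"}


def toy_toxicity_score(text: str) -> float:
    """Return the fraction of words found in the placeholder list (0.0 to 1.0)."""
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in PLACEHOLDER_TERMS)
    return hits / len(words)


@dataclass
class ToxicityResult:
    output: str
    score: float
    flagged: bool


def run_toxicity_eval(
    outputs: List[str],
    scorer: Callable[[str], float] = toy_toxicity_score,
    threshold: float = 0.2,  # assumed cutoff, chosen only for this example
) -> List[ToxicityResult]:
    """Score each output and flag those at or above the threshold."""
    results = []
    for output in outputs:
        score = scorer(output)
        results.append(ToxicityResult(output, score, score >= threshold))
    return results


if __name__ == "__main__":
    samples = ["You are a worthless idiot.", "The capital of France is Paris."]
    for result in run_toxicity_eval(samples):
        print(f"{result.score:.2f} flagged={result.flagged} {result.output!r}")
```

Because the scorer is passed in as a function, the same loop could wrap a classifier call or a judge prompt without changing how results are collected.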

For developers, toxicity should be treated as a safety signal, not a general quality score. A non-toxic answer can still be wrong, biased, ungrounded, or policy-violating.
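One way to keep that separation visible is to store toxicity as its own field beside the other checks rather than folding it into a single score. The sketch below is illustrative only: `EvalRecord`, its fields, and both helper functions are hypothetical names, and how each field gets populated is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalRecord:
    """Hypothetical per-output record; each field comes from a separate check."""
    toxic: bool
    correct: bool
    biased: bool
    grounded: bool
    policy_compliant: bool


def passes_toxicity_check(record: EvalRecord) -> bool:
    """Toxicity alone: a safety signal, not a verdict on overall quality."""
    return not record.toxic


def passes_all_checks(record: EvalRecord) -> bool:
    """An output is acceptable only if every independent check passes."""
    return (
        not record.toxic
        and record.correct
        and not record.biased
        and record.grounded
        and record.policy_compliant
    )


if __name__ == "__main__":
    # A polite answer that is factually wrong: clean on toxicity, still a failure.
    polite_but_wrong = EvalRecord(
        toxic=False, correct=False, biased=False, grounded=True, policy_compliant=True
    )
    print(passes_toxicity_check(polite_but_wrong))
    print(passes_all_checks(polite_but_wrong))
```

Keeping the fields separate makes it possible to report that an output failed on correctness while passing on toxicity, instead of hiding both behind one number.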
