Labeling is the process of attaching human- or machine-generated judgments to examples. In human labeling, reviewers mark each example for correctness, relevance, safety, preference, task success, or other qualities.
Human labels are especially valuable for calibrating LLM judges and building ground truth. Label quality depends on clear rubrics, reviewer training, agreement checks between reviewers, and an escalation path for ambiguous cases.
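One way to make the agreement check concrete is to have two reviewers label the same examples and compute a chance-corrected agreement score. The sketch below uses Cohen's kappa, which is a common choice but an assumption here, since the text does not name a specific metric. The function names, the `pass`/`fail` labels, and the sample data are all hypothetical. Items where the reviewers disagree are returned as candidates for escalation.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two reviewers on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both reviewers must label the same non-empty set of items")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if each reviewer labeled independently
    # according to their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both reviewers used a single identical label for everything.
        return 1.0
    return (observed - expected) / (1 - expected)


def disagreements(labels_a, labels_b):
    """Indices of items the two reviewers labeled differently."""
    return [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b]


# Hypothetical labels from two reviewers on six examples.
reviewer_a = ["pass", "pass", "fail", "pass", "fail", "fail"]
reviewer_b = ["pass", "fail", "fail", "pass", "fail", "pass"]

print(round(cohens_kappa(reviewer_a, reviewer_b), 2))  # 0.33
print(disagreements(reviewer_a, reviewer_b))           # [1, 5]
```

A low score like this one usually points to an unclear rubric or uneven reviewer training rather than careless reviewers, so the disagreeing items are worth reading before any labels are used as ground truth.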