Calculates the Levenshtein (edit) distance between two strings — the minimum number of single-character insertions, deletions, or substitutions needed to transform one string into the other. A score of 0 means the strings are identical; higher scores indicate more differences.
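The evaluator's own implementation isn't shown in this doc; the following is a minimal sketch of the standard Wagner–Fischer dynamic-programming algorithm, with a `case_sensitive` flag mirroring the evaluator's parameter:

```python
def levenshtein(expected: str, actual: str, case_sensitive: bool = True) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn `expected` into `actual`."""
    if not case_sensitive:
        expected, actual = expected.lower(), actual.lower()
    n, m = len(expected), len(actual)
    # dp[i][j] = edit distance between the first i chars of `expected`
    # and the first j chars of `actual`.
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i  # delete all i characters
    for j in range(m + 1):
        dp[0][j] = j  # insert all j characters
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if expected[i - 1] == actual[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return dp[n][m]

print(levenshtein("kitten", "sitting"))                     # 3
print(levenshtein("Hello", "hello"))                        # 1
print(levenshtein("Hello", "hello", case_sensitive=False))  # 0
```

The classic example, `"kitten"` to `"sitting"`, scores 3: substitute `k`→`s`, substitute `e`→`i`, and insert a trailing `g`.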

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `expected` | string | Yes | | The reference string |
| `actual` | string | Yes | | The string to evaluate |
| `case_sensitive` | boolean | No | `true` | Whether the comparison is case-sensitive |

Output

| Property | Value | Description |
| --- | --- | --- |
| score | Integer ≥ 0 | Number of edits required; 0 = identical strings |
| Optimization | Minimize | Lower scores are better |

Configuring Inputs

Each evaluator parameter can be set to either a path (a JSONPath expression that extracts a value from the evaluation parameters) or a literal (a fixed value typed directly). Use paths to pull from dataset inputs, task outputs, reference data, or metadata. See Input Mapping for full details on mapping modes, resolution order, and examples.
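As an illustration only (the exact configuration syntax is defined in Input Mapping, and the `path`/`literal` keys and JSONPath expressions here are hypothetical), a mapping might pull `expected` and `actual` from evaluation data while pinning `case_sensitive` to a fixed value:

```json
{
  "expected": { "path": "$.reference.answer" },
  "actual": { "path": "$.output" },
  "case_sensitive": { "literal": false }
}
```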

Usage Examples

- **Answer closeness:** A QA model where small paraphrasing is acceptable but significant divergence is not. Actual receives the model's text response; Expected receives the reference answer from your dataset, typically a path like `reference.answer`. Comparing average edit distance across experiment runs shows whether prompt changes are moving outputs closer to the reference.
- **Entity extraction quality:** A pipeline that extracts a specific named value (a product name, location, or identifier). Actual is the extracted value from the model's output, often a nested path like `output.entity` if the response is structured JSON. Expected is the ground-truth value per example in your dataset. Edit distance reveals whether extraction is improving as you iterate on prompts or model configuration.
- **Comparative prompt evaluation:** Two prompt variants tested against the same dataset. Actual receives the response field from each run; Expected stays fixed, pointing to the same reference column. The variant with the lower average Levenshtein score is closer to the reference outputs.
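The comparative workflow can be sketched in a few lines. The run data below is invented for illustration, and the compact edit-distance helper is the standard dynamic-programming algorithm (kept to one rolling row), not the evaluator's internal code:

```python
def levenshtein(a: str, b: str) -> int:
    """Standard dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# Hypothetical reference answers and the outputs of two prompt variants.
references = ["Paris", "blue whale", "1969"]
variant_a = ["Paris", "the blue whale", "1969"]
variant_b = ["paris, France", "a whale", "in 1969"]

def mean_score(outputs):
    return sum(levenshtein(o, r) for o, r in zip(outputs, references)) / len(references)

print(mean_score(variant_a))  # lower average means closer to the references
print(mean_score(variant_b))
```

Here variant A's outputs differ from the references only by small insertions, so its average score is lower than variant B's.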

Notes

The algorithm runs in O(n×m) time, where n and m are the lengths of the two strings. Performance degrades quadratically on very long inputs. Keep inputs under a few thousand characters for predictable evaluation times.

See Also