
Glossary · Validation

Eval (evaluation)

Systematic benchmarking of an agent's output quality and accuracy, including detection of regressions when prompts change.
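As an illustration only, an eval can be sketched as a fixed set of cases scored against two prompt versions. Everything named here (`EvalCase`, `accuracy`, `regressions`, `agent_v1`, `agent_v2`) is hypothetical and not taken from any particular framework. The two agents are toy stand-ins for a real model call, and exact-match scoring is an assumption; real evals often use looser or model-graded scoring.

```python
from dataclasses import dataclass
from typing import Callable

Agent = Callable[[str], str]


@dataclass(frozen=True)
class EvalCase:
    """One fixed input with the answer we expect (hypothetical structure)."""
    input_text: str
    expected: str


def passes(agent: Agent, case: EvalCase) -> bool:
    # Exact-match scoring after trimming whitespace; an assumed metric.
    return agent(case.input_text).strip() == case.expected


def accuracy(agent: Agent, cases: list[EvalCase]) -> float:
    """Fraction of cases the agent answers correctly."""
    if not cases:
        raise ValueError("need at least one eval case")
    return sum(passes(agent, c) for c in cases) / len(cases)


def regressions(baseline: Agent, candidate: Agent,
                cases: list[EvalCase]) -> list[EvalCase]:
    """Cases the baseline passes but the candidate fails."""
    return [c for c in cases
            if passes(baseline, c) and not passes(candidate, c)]


# Toy stand-ins for the same agent under two prompt versions.
def agent_v1(text: str) -> str:
    return text.upper()


def agent_v2(text: str) -> str:
    # Simulated prompt change that breaks inputs of five or more characters.
    return text.upper() if len(text) < 5 else text


CASES = [
    EvalCase("abc", "ABC"),
    EvalCase("hello", "HELLO"),
    EvalCase("hi", "HI"),
]

if __name__ == "__main__":
    print("v1 accuracy:", accuracy(agent_v1, CASES))
    print("v2 accuracy:", accuracy(agent_v2, CASES))
    print("regressed:", [c.input_text for c in regressions(agent_v1, agent_v2, CASES)])
```

Keeping the case set fixed across prompt versions is what makes the scores comparable: a drop in accuracy, or a non-empty regression list, can then be attributed to the prompt change rather than to different inputs.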
