Holdout scenarios and the train/test wall

Holdout scenarios are the trust mechanism at the heart of the dark factory, borrowed
directly from how ML models are evaluated. Acceptance criteria are written as
plain-language behavioural scenarios and kept in a location the coding agent cannot
read. A separate evaluator plans and executes the calls needed to exercise the
described behaviour, and an LLM judges whether the output satisfied it. The spec is
the training data and the scenarios are the held-out test set, so the agent never
sees the test: when it fails, it gets only a terse failure message, not the scenario
text, and therefore cannot game the gate. Students learn why this isolation is what
makes the gate meaningful, along with a practical bonus: because the evaluator
interprets plain-English scenarios dynamically, there is no brittle step-definition
glue to rot, as there is in traditional BDD.
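
To make the wall concrete, here is a minimal Python sketch of the information flow described above. Everything in it is an illustrative assumption rather than the dark factory's actual implementation: the names `Scenario`, `HoldoutEvaluator`, and `feedback_for_agent` are hypothetical, and the `stub_exercise` and `stub_judge` functions are simple stand-ins for the real evaluator calls and the LLM judge. The point it tries to show is only that scenario text stays on the evaluator's side and the agent receives an opaque identifier plus a pass/fail signal.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Scenario:
    scenario_id: str  # opaque id; the only part the agent may ever see
    text: str         # plain-language behaviour; stays behind the wall


@dataclass(frozen=True)
class Verdict:
    scenario_id: str
    passed: bool


# Hypothetical signatures (assumptions, not a real API):
Exerciser = Callable[[str], str]    # scenario text -> observed output
Judge = Callable[[str, str], bool]  # (scenario text, observed output) -> pass?


class HoldoutEvaluator:
    """Owns the held-out scenarios; the coding agent never gets a reference to it."""

    def __init__(self, scenarios: List[Scenario], exercise: Exerciser, judge: Judge):
        self._scenarios = list(scenarios)
        self._exercise = exercise
        self._judge = judge

    def run(self) -> List[Verdict]:
        verdicts = []
        for scenario in self._scenarios:
            observed = self._exercise(scenario.text)
            verdicts.append(
                Verdict(scenario.scenario_id, self._judge(scenario.text, observed))
            )
        return verdicts


def feedback_for_agent(verdicts: List[Verdict]) -> List[str]:
    """Build the terse messages that cross the wall: id and outcome only."""
    return [f"scenario {v.scenario_id}: FAILED" for v in verdicts if not v.passed]


def stub_exercise(scenario_text: str) -> str:
    # Stand-in for the evaluator planning and executing real calls.
    if "Checking out" in scenario_text:
        return "order placed"  # deliberately wrong behaviour for the demo
    return "cart total: 1 item"


def stub_judge(scenario_text: str, observed: str) -> bool:
    # Stand-in for an LLM judge; a keyword check is used only to keep this runnable.
    expected = "rejected" if "rejected" in scenario_text else "1 item"
    return expected in observed


scenarios = [
    Scenario("S1", "Adding an item to an empty cart shows a total of 1 item"),
    Scenario("S2", "Checking out with an empty cart is rejected"),
]

evaluator = HoldoutEvaluator(scenarios, stub_exercise, stub_judge)
feedback = feedback_for_agent(evaluator.run())
print(feedback)
```

Note that `feedback_for_agent` takes only `Verdict` objects, which carry no scenario text, so under these assumptions there is no path by which the wording of a held-out scenario could leak into what the agent reads.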
