Dark factory architecture — the four layers and generator/evaluator isolation

The reference architecture has four layers: an inputs layer owned by humans
(specs, holdout scenarios, instruction files, linter rules); an autonomous code
generation layer (the agent reads spec plus repo knowledge, generates, builds,
tests, self-reviews, opens a change); an isolated validation layer (standard CI
first — build, unit tests, static analysis — then an LLM-driven evaluator run
against an ephemeral deployment); and a merge & deploy layer that hands off to
the existing CI/CD pipeline. The same shape recurs in vendor write-ups as
planner → generator → evaluator → deployment. Students learn the single
non-negotiable invariant: the generation layer and the validation layer must be
completely isolated, because an agent that can see how it will be judged optimises
for the judge rather than the goal. Without that wall there is no quality gate, only
theatre.
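
The isolation invariant can be made concrete with a minimal Python sketch. Every name here (`Inputs`, `GeneratorView`, `run_pipeline`, and the callables passed in) is hypothetical and chosen for illustration, not taken from any vendor's implementation. The sketch also assumes that the holdout scenarios are the part of the inputs layer that must stay hidden from the generator, and that CI and the evaluator each reduce to a pass/fail result.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Inputs:
    """Human-owned inputs layer (field names are illustrative)."""
    spec: str
    instruction_files: tuple[str, ...]
    linter_rules: tuple[str, ...]
    holdout_scenarios: tuple[str, ...]  # assumption: never shown to the generator


@dataclass(frozen=True)
class GeneratorView:
    """The only slice of the inputs the generation layer receives."""
    spec: str
    instruction_files: tuple[str, ...]
    linter_rules: tuple[str, ...]


@dataclass(frozen=True)
class Change:
    """Whatever the generator proposes, reduced here to a diff string."""
    diff: str


def generator_view(inputs: Inputs) -> GeneratorView:
    # The isolation wall: holdout_scenarios is deliberately not copied over.
    return GeneratorView(inputs.spec, inputs.instruction_files, inputs.linter_rules)


def run_pipeline(
    inputs: Inputs,
    generate: Callable[[GeneratorView], Change],
    run_ci: Callable[[Change], bool],
    evaluate: Callable[[Change, tuple[str, ...]], bool],
    merge_and_deploy: Callable[[Change], None],
) -> str:
    # Layer 2: generation sees only the restricted view.
    change = generate(generator_view(inputs))
    # Layer 3a: standard CI runs first (build, unit tests, static analysis).
    if not run_ci(change):
        return "rejected: ci"
    # Layer 3b: the evaluator alone receives the holdout scenarios.
    if not evaluate(change, inputs.holdout_scenarios):
        return "rejected: evaluator"
    # Layer 4: hand off to the existing CI/CD pipeline.
    merge_and_deploy(change)
    return "merged"
```

The point of the design is that the wall is structural rather than a matter of convention: `generate` is never passed anything that contains the holdout scenarios, so it cannot tune its output to them. In a real system the same separation would also have to hold at the process, credential, and repository level, which this sketch does not attempt to model.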
