EasySpecs Labs

Build your first Dark Factory in one day.

AI in the editor made you faster, but the backlog still grows. In one day in Barcelona, learn to leverage your Harness and stand up the basics of a Dark Factory — so you can build tools that build tools for clients or employers.

  • 24 June 2026
  • 9:00–19:00
  • Terrace of Pier07, Barcelona

Limited to 10 participants

Individual speed is not organizational velocity

Faster typing does not mean faster delivery. Agents flood you with pull requests and review becomes the bottleneck. The real climb is orchestrating systems and operating the factory — not prompting alone.

Logistics

Event at a glance

One day, one cohort, one terrace in Barcelona — everything you need before you book.

  • Date

    24 June 2026

  • Time

    9:00 – 19:00 (Barcelona local · 1 h lunch break)

  • Venue

    Terrace of Pier07, Barcelona

  • Price

    €600 per person (VAT included)

  • Format

    In person only — individual ticket

  • Instructors

    2 instructors

Diploma

Diploma pathway — online exam 10 days after the workshop

The room

Laptops open, whiteboards full, and a small cohort working through real agentic engineering — this is the day you are booking.

Whiteboard sessions on agent orchestration, harnessing, and architecture.

You build the factory, not every line of code

  • 1

    A coding agent

    You trigger it, watch it, and review every diff. Faster typing — same bottleneck.

  • 2

    A Dark Factory

    Agents run the lifecycle behind deterministic gates. You steer intent and rules.

  • 3

    Your job moves up

    Design the system that builds the software — factory, Harness, and tool-for-tools — for client or employer delivery.

Why own the floor

Benefits of building and working with your own Dark Factory

When the factory is yours — not a vendor dashboard — compounding, delivery, and leverage stay in your hands whether you freelance, join a team, or ship for an employer.

  • Compound returns on every project

    Instruction files, evaluators, and gates you own get sharper with each run. The floor learns — you are not starting from zero on the next engagement.

  • Throughput without review burnout

    Deterministic gates and isolated evaluators absorb volume so you steer intent and rules instead of drowning in pull requests.

  • A portable asset across clients and jobs

    Your Harness, rules, and factory layers travel with you. Reusable infrastructure beats throwaway prompts when you change employer or client.

  • Swap models, keep the floor

    Context engineering and Harness patterns sit above any provider. Change models without rebuilding the system you depend on.

  • Operate the factory, not every ticket

    Your work shifts to design, verification, and the compound step — the layer that scales when agents handle the typing.

  • Ship with proof, not hope

    Verification engineering and holdout scenarios mean autonomous runs finish when checks pass — credible delivery for clients and teams.

What you will learn

One intensive day on the EasySpecs Content Blocks that matter for Dark Factory delivery — Harness, compound engineering, loops, verification, and production gates.

  1. Pillar 01

    Harness engineering

    Instruction files, skills, hooks, MCP, and team `.claude/` infrastructure — the body around the model that makes agents repeatable.

  2. Pillar 02

    Dark Factory & compound engineering

    Factory layers, generator/evaluator isolation, holdout scenarios, and the compound loop that makes each run easier than the last.

  3. Pillar 03

    Loops, workflows & verification

    Dynamic workflows, agentic loops with verifiable goals, and verification engineering so autonomous runs know when they are done.

  4. Pillar 04

    Observability, guardrails & context

    Traces and failure modes, layered defense and accountability, plus context engineering so agents stay grounded in real codebases.

Hands-on all day on your laptop with your preferred Harness and your API credits.

Workshop depth

What we cover in one day

The modules below preview scope and depth, and featured concepts include full write-ups. There are no fixed syllabus hours: this is the material we draw from live in Barcelona.

6 modules · 69 concepts in scope

Workshop modules

EB-10 13 topics

Compound Engineering & the Dark Factory

Two opposing answers to the same question: what does engineering become when agents write the code? Compound engineering keeps the human at the high-leverage ends of the loop and makes the system measurably smarter every cycle. The dark factory removes the human from the deployment path entirely and lets specs flow to production in the dark. This module is the synthesis of the whole course — it composes the harness (EB-2), evaluation (EB-6), guardrails and audit (EB-8), and memory (EB-5) into a working methodology and a working pipeline, and it closes the human-in-the-loop thread that runs through EB-1, EB-7, EB-8, and EB-9.

Featured deep dives

What a dark factory is — and how it differs from a harness

The term comes from "lights-out" manufacturing — a plant that runs with no humans on
the floor, the canonical example being a robotics maker whose robots build robots.
Applied to software, a dark factory is a pipeline that plans, writes, tests, and
ships code with no human reviewing a single line on the path to production. Students
learn the one precise distinction that separates it from the agent harness of EB-2:
the harness keeps a human review step before deployment; the dark factory removes
it. The defining characteristic is the absence of human code review in the
deployment path — everything downstream of merge is unchanged. The phrase is often
used loosely, so the module insists on this precision.

Primary sources

  • MindStudio — What Is a Dark Factory? The AI Coding Pattern That Ships Code Autonomously
  • MindStudio — What Is a Dark Factory Codebase?

Dark factory architecture — the four layers and generator/evaluator isolation

The reference architecture has four layers: an inputs layer owned by humans
(specs, holdout scenarios, instruction files, linter rules); an autonomous code
generation layer (the agent reads spec plus repo knowledge, generates, builds,
tests, self-reviews, opens a change); an isolated validation layer (standard CI
first — build, unit tests, static analysis — then an LLM-driven evaluator run
against an ephemeral deployment); and a merge & deploy layer that hands off to
the existing CI/CD pipeline. The same shape recurs in vendor write-ups as
planner → generator → evaluator → deployment. Students learn the single
non-negotiable invariant: the generation layer and the validation layer must be
completely isolated, because an agent that can see how it will be judged optimises
for the judge rather than the goal. Without that wall there is no quality gate, only
theatre.
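
The isolation invariant above can be illustrated with a minimal sketch. Every name
here (Inputs, generate, run_ci, evaluate, pipeline) is hypothetical, and the keyword
match merely stands in for an LLM-driven evaluator. The point is only the data flow:
the generation layer receives the spec and nothing else.

```python
from dataclasses import dataclass

# All names in this sketch are hypothetical; none come from a real tool.

@dataclass(frozen=True)
class Inputs:
    spec: str                   # visible to the generation layer
    holdout_scenarios: tuple    # visible to the validation layer only

@dataclass(frozen=True)
class Change:
    diff: str

def generate(spec: str) -> Change:
    """Generation layer: receives the spec, never the holdout scenarios."""
    return Change(diff=f"implements: {spec}")

def run_ci(change: Change) -> bool:
    """Validation, step 1: stand-in for build, unit tests, static analysis."""
    return bool(change.diff)

def evaluate(change: Change, scenarios: tuple) -> bool:
    """Validation, step 2: toy stand-in for the LLM-driven evaluator."""
    return all(keyword in change.diff for keyword in scenarios)

def pipeline(inputs: Inputs) -> str:
    change = generate(inputs.spec)  # only the spec crosses this boundary
    if not run_ci(change):
        return "rejected: ci"
    if not evaluate(change, inputs.holdout_scenarios):
        return "rejected: evaluator"
    return "handed off to merge & deploy"

accepted = pipeline(Inputs("export invoices as CSV", ("CSV",)))
rejected = pipeline(Inputs("export invoices as CSV", ("PDF",)))
```

Because generate takes only the spec as an argument, the wall is enforced by the
function signature rather than by convention.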

Primary sources

  • HackerNoon — The Dark Factory Pattern
  • MindStudio — How to Build an AI Dark Factory

Holdout scenarios and the train/test wall

The trust mechanism at the heart of the dark factory, borrowed directly from how ML
models are evaluated. Acceptance criteria are written as plain-language behavioural
scenarios kept in a location the coding agent cannot read; a separate
evaluator plans and executes the calls needed to exercise the described behaviour,
and an LLM judges whether the output satisfied it. The spec is the training data,
the scenarios are the held-out test set, and the agent never sees the test — when it
fails it gets only a terse failure message, not the scenario text, so it cannot game
the gate. Students learn why this isolation is what makes the gate meaningful, and a
practical bonus: because the evaluator interprets plain-English scenarios
dynamically, there is no brittle step-definition glue to rot as in traditional BDD.
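
One way to picture the terse failure message is the sketch below. It assumes a
hypothetical Scenario record and a judge callable that stands in for the LLM
evaluator. What matters is that the verdict returned to the coding agent carries a
count, not the scenario text.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Scenario:
    scenario_id: str
    text: str       # plain-language behaviour; never shown to the coding agent

@dataclass(frozen=True)
class Verdict:
    passed: bool
    message: str    # the only thing the coding agent receives

def evaluate(scenarios: list, judge: Callable[[Scenario], bool]) -> Verdict:
    """Run every holdout scenario and report only how many failed."""
    failed = [s for s in scenarios if not judge(s)]
    if not failed:
        return Verdict(True, "all holdout scenarios passed")
    return Verdict(False, f"{len(failed)} holdout scenario(s) failed")

scenarios = [
    Scenario("S1", "A refund over 100 EUR requires manager approval"),
    Scenario("S2", "A cancelled order sends exactly one email"),
]
# Hypothetical judge: pretend S1 is not satisfied by the current build.
verdict = evaluate(scenarios, judge=lambda s: s.scenario_id != "S1")
```

Here verdict.message says that one scenario failed without revealing which
behaviour was tested, so the agent has nothing to optimise against except the spec.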

Primary sources

  • HackerNoon — The Dark Factory Pattern
  • StrongDM — The Software Factory (via Simon Willison)

Also covered in the workshop

  • The compound principle — work that makes future work easier
  • The compound loop — plan, work, review, compound
  • Levels of agentic autonomy
  • Codifying learnings — instruction files, solution docs, the compound step
  • Spec-driven input — plans and specs as the primary artifact
  • Building the compound system — parallel review, the 50/50 rule, agent-native parity
  • Phased rollout and trust gates
  • Dark-factory failure modes and maintenance agents
  • Economics, leverage, and organisational redesign
  • Identity, autonomy, and the human as metacognitive controller

EB-11 9 topics

Dynamic Workflows, Loop Engineering & Verification Engineering

The 2026 frontier of how agentic coding work is driven and trusted. Three closely linked ideas: the harness can now write itself per task (dynamic workflows); the unit of instruction is shifting from the one-shot prompt to a self-running cycle with a goal and a stop condition (loop engineering); and the bottleneck moves from generating code to proving it is correct (verification engineering). This module is the moving edge of the course — it operationalises the harness (EB-2), multi-agent coordination (EB-7), evaluation (EB-6), and the dark factory (EB-10) into the practices teams are adopting right now. Because the tooling here changes monthly, every topic is taught concept-first so it survives the next release.

Featured deep dives

Dynamic workflows — when the harness writes itself

A dynamic workflow is a harness an agent composes on the fly for the task in front of
it, rather than the single fixed harness it normally runs in. Students learn the core
mechanic: from a natural-language request the agent writes an orchestration script,
and a separate runtime executes that script in the background, spinning up tens to
hundreds of parallel subagents whose intermediate results live in script variables
rather than the main context window — so the conversation stays responsive and the
plan stays on track no matter how large the task. The motivating example is a
multi-hundred-thousand-line language port completed in days with the test suite kept
green. This is the harness fundamentals of EB-2 and the orchestrator pattern of EB-7,
now generated automatically and run at scale.
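
A rough sketch of what such an orchestration script might look like, using only
the Python standard library. The run_subagent function is a hypothetical stand-in
for launching one subagent; a real runtime would do far more. The sketch shows the
mechanic described above: intermediate results stay in script variables and only a
short summary goes back to the conversation.

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(task: str) -> str:
    """Hypothetical stand-in for launching one subagent on one task."""
    return f"done: {task}"

def orchestrate(tasks: list, max_parallel: int = 8) -> dict:
    # Intermediate results live in this script's variables,
    # not in the main context window.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return dict(zip(tasks, pool.map(run_subagent, tasks)))

def summarise(results: dict) -> str:
    # Only this one-line summary would be returned to the conversation.
    finished = sum(1 for r in results.values() if r.startswith("done"))
    return f"{finished}/{len(results)} tasks finished"

results = orchestrate([f"port module {i}" for i in range(20)])
summary = summarise(results)
```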

Primary sources

  • Claude — Introducing dynamic workflows in Claude Code
  • InfoQ — Claude Code Adds Dynamic Workflows for Parallel Agent Coordination

Loop engineering — designing loops that prompt the agent

The practical discipline behind the slogan that you should stop prompting agents and
start designing the loops that prompt them. Students learn the canonical cycle —
plan, change, validate, observe, revise — and how to wire it: project instructions as
reusable rules, tools and connectors for the agent to act through, work isolation
(branch, worktree, container) so a run cannot damage production, and a built-in
verification step. They learn the bundled /loop-style skill pattern (e.g. a loop
that babysits open PRs, auto-fixes build breaks, and addresses review comments in a
worktree) and the principle that loops calling sharp, named skills get cheaper over
time while loops that re-derive everything do not — a direct tie to the compound
step of EB-10.
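
The cycle above can be sketched as a small driver function. The plan, change, and
validate callables are hypothetical placeholders for an agent call, an edit made in
an isolated workspace, and the built-in verification step. The iteration cap is an
assumption added so the loop always terminates.

```python
from typing import Callable

def run_loop(plan: Callable[[list], str],
             change: Callable[[str], str],
             validate: Callable[[str], tuple],
             max_iterations: int = 5) -> tuple:
    """Return (succeeded, iterations_used)."""
    observations: list = []
    for iteration in range(1, max_iterations + 1):
        step = plan(observations)            # plan, revised from observations
        candidate = change(step)             # change, in an isolated workspace
        ok, feedback = validate(candidate)   # validate
        if ok:
            return True, iteration
        observations.append(feedback)        # observe; the next plan() revises
    return False, max_iterations

# Toy stand-ins: the third attempt is the first one that passes validation.
def plan(observations: list) -> str:
    return f"attempt {len(observations) + 1}"

def change(step: str) -> str:
    return step

def validate(candidate: str) -> tuple:
    ok = candidate == "attempt 3"
    feedback = "" if ok else f"{candidate} failed checks"
    return ok, feedback

outcome = run_loop(plan, change, validate)
```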

Primary sources

  • explainx.ai — Loop Engineering: Coding Agent Loops (2026 Guide)
  • Louis Bouchard — Loop Engineering Explained

Verification engineering — making "done" machine-checkable

If a loop or workflow runs until a goal is met, the whole approach is only as good as
the check that defines "met". Verification engineering is the discipline of building
that signal. Students learn the spectrum — unit tests, type checks, linters,
benchmark suites, security scans, and LLM-as-judge — and the central rule taught
across the course: the approach only works where the validation signal is strong
(tests, type checks, clear acceptance criteria are friendly terrain; vague UX polish
is not). They learn to choose deterministic checks over model judgement wherever
possible, and how a code-plus-LLM split handles hard constraints with code and soft
constraints with a judge. This is the EB-6 evaluation discipline and the EB-10 holdout
gate, reframed as the thing that drives autonomous work rather than just measuring
it after the fact.
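
As an illustration of the code-plus-LLM split, the sketch below runs deterministic
hard checks first and consults a judge only if they all pass. The soft_judge
callable and the 0.7 threshold are assumptions for this example; a real judge
would be a model call returning some score.

```python
from typing import Callable

def verify(artifact: str,
           hard_checks: dict,
           soft_judge: Callable[[str], float],
           soft_threshold: float = 0.7) -> tuple:
    """Return (done, reason). Hard constraints are code, soft ones are judged."""
    for name, check in hard_checks.items():
        if not check(artifact):              # deterministic, cheap, runs first
            return False, f"hard check failed: {name}"
    score = soft_judge(artifact)             # model judgement, only if needed
    if score < soft_threshold:
        return False, f"judge score {score:.2f} below {soft_threshold:.2f}"
    return True, "done"

hard_checks = {
    "non_empty": lambda a: bool(a.strip()),
    "no_todo": lambda a: "TODO" not in a,
}

passed = verify("def f(): return 1", hard_checks, soft_judge=lambda a: 0.9)
blocked = verify("TODO: write f", hard_checks, soft_judge=lambda a: 1.0)
soft_fail = verify("def f(): return 1", hard_checks, soft_judge=lambda a: 0.5)
```

Ordering the checks this way means a high judge score can never override a failed
hard constraint.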

Primary sources

  • Verdent — AI Coding Agents 2026: verification & self-correction
  • DEV — How AI Coding Agents Finally Got Good: RLVR & Verifiable Rewards

Also covered in the workshop

  • The agentic loop — trigger, goal, verify, stop
  • Dynamic-workflow orchestration patterns
  • Loop control — termination, no-progress detection, cost governance
  • Verification engineering in depth — verifiers, adversarial agents, the isolation wall
  • Verifiable rewards & the training-side view (RLVR)
  • When loops and workflows fail — reward hacking, verifier gaming, and scale limits

EB-2 16 topics

Harness Engineering Core

The harness is the body around the model's brain — the orchestration loop, the tool layer, state, permissions, and every configuration surface that turns a chat model into an autonomous engineering agent. In current practice the harness has become the critical infrastructure layer, and the same model can perform very differently depending on the harness it runs in.

Featured deep dives

Harness fundamentals — the agent loop, what the harness provides

The harness is everything wrapping the model. Students learn the core agent loop
(the model proposes an action, the harness executes it and feeds back the result,
repeat) and the services a harness must provide around that loop: filesystem and
shell access, the tool layer, permission handling, and state. This is the
conceptual anchor for the entire module — every later topic is a component of, or a
variation on, the harness.
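
The core loop can be sketched in a few lines. This is a toy, not any particular
harness: the action format, the "finish" convention, and the scripted model are all
assumptions made for the example.

```python
from typing import Callable

def agent_loop(model: Callable[[list], dict],
               tools: dict,
               allowed: set,
               max_steps: int = 10) -> str:
    state: list = []                          # state: history fed back to the model
    for _ in range(max_steps):
        action = model(state)                 # the model proposes an action
        name, arg = action["tool"], action["arg"]
        if name == "finish":
            return arg
        if name in allowed and name in tools: # permission handling
            result = tools[name](arg)         # the tool layer executes it
        else:
            result = f"denied: {name}"
        state.append((name, arg, result))     # the result is fed back
    return "stopped: step limit reached"

# Hypothetical scripted "model": tries a forbidden tool, then a permitted one.
SCRIPT = [
    {"tool": "shell", "arg": "rm -rf build"},
    {"tool": "read_file", "arg": "README.md"},
]

def scripted_model(state: list) -> dict:
    if len(state) < len(SCRIPT):
        return SCRIPT[len(state)]
    return {"tool": "finish", "arg": state[-1][2]}

tools = {"read_file": lambda path: f"contents of {path}",
         "shell": lambda cmd: "ran"}
answer = agent_loop(scripted_model, tools, allowed={"read_file"})
```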

Primary sources

  • Requesty — Agentic Coding Tools Compared (2026)
  • LM Po — Mastering Agentic Coding in Claude

Skills — SKILL.md, frontmatter, progressive disclosure

Skills package reusable behaviours as a folder containing a SKILL.md file plus
optional supporting scripts and templates. Frontmatter controls invocation: a skill
can be triggered manually, auto-invoked by the model when its description matches
the task, or both. The key technique is progressive disclosure — keep the
SKILL.md lean and move examples, edge cases, and templates into separate files
that are loaded only when relevant, keeping the context window clean. Skills follow
an open standard that works across multiple tools, and modern tooling unifies them
with slash commands so that every skill also exposes a command interface.
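
Progressive disclosure can be pictured with the sketch below, which builds a
throwaway skill folder and reads it in two stages. The frontmatter keys (name,
description), the simple "key: value" parsing, and the examples.md file name are
assumptions for illustration, not a statement of how any tool loads skills.

```python
from pathlib import Path
import tempfile

def read_frontmatter(skill_md: Path) -> dict:
    """Return only the 'key: value' pairs between the leading '---' markers."""
    meta: dict = {}
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return meta
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta

def load_supporting_file(skill_dir: Path, name: str) -> str:
    """Called only when the task needs it, so it stays out of context until then."""
    return (skill_dir / name).read_text(encoding="utf-8")

with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "release-notes"
    skill_dir.mkdir()
    (skill_dir / "SKILL.md").write_text(
        "---\nname: release-notes\n"
        "description: Draft release notes from merged PRs\n---\n"
        "Keep this file lean. See examples.md for edge cases.\n",
        encoding="utf-8")
    (skill_dir / "examples.md").write_text(
        "Example: breaking changes go first.\n", encoding="utf-8")

    index_entry = read_frontmatter(skill_dir / "SKILL.md")     # stage 1: cheap
    examples = load_supporting_file(skill_dir, "examples.md")  # stage 2: on demand
```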

Primary sources

  • Claude Code Docs — Extend Claude with skills
  • Shashank Mishra — Skills, Subagents, Hooks and Plugins
  • alexop.dev — Understanding Claude Code's Full Stack

Also covered in the workshop

  • Interactive vs. non-interactive development
  • The "+16pt harness effect"
  • Memory / instruction files (e.g. CLAUDE.md)
  • Slash commands & argument substitution
  • Hooks — lifecycle events, deterministic enforcement
  • Sub-agents — context isolation, model assignment
  • MCP — the tool/integration layer
  • Framework vs. runtime vs. harness
  • Other harnesses — CLI agents (Claude Code, Codex, Antigravity)
  • Open-source CLI / BYOK — OpenCode, Aider, Cline
  • SDK/library harnesses you build on — LangGraph + DeepAgents
  • IDE-integrated harnesses — Cursor, Copilot, Devin Desktop
  • Fixed-vendor vs. BYOK as an architecture decision
  • Designing a harness configuration; `.claude/` as team infrastructure

EB-6 7 topics

Evaluation, Failures & Tuning

Agents fail differently from traditional software, usually silently, so measuring them requires a discipline of its own. This module is the reference for the course's structured format: every topic below is split into Theory, Use cases, and Practical exercises (concept-check → applied), and the block closes with a put-into-practice capstone.

Featured deep dives

Observability pillars — traces, tool calls, decisions, failures

Theory. Why standard monitoring is insufficient for agents, which build their
own path at runtime rather than following fixed code. Students learn the pillars of
agent observability — execution traces, tool calls, decision steps, and failures — and
why you must read an agent's process, not just its final output.

Use cases. A support agent that returns a confident wrong answer and the
trace reveals it called the wrong tool; a coding agent whose final diff looks fine but
whose trace shows it never ran the tests; cost spikes traced to a single runaway
sub-agent. Each shows why the process, not just the output, is the unit of analysis.

Practical exercises.

  • Concept-check: given a short annotated agent transcript, label each step as a
    trace event, tool call, decision, or failure, and identify where it first went wrong.
  • Applied: turn on tracing for a provided sample agent, run one task, and read the
    resulting trace to narrate, in your own words, the path the agent actually took.
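
The concept-check exercise can be mirrored in a few lines of code. The Step record
and the four kind labels are a hypothetical encoding of the pillars named above,
and the sample trace is invented to reproduce the "never ran the tests" use case.

```python
from dataclasses import dataclass
from typing import Optional

KINDS = {"trace_event", "tool_call", "decision", "failure"}

@dataclass(frozen=True)
class Step:
    index: int
    kind: str      # one of KINDS
    detail: str

def first_failure(steps: list) -> Optional[Step]:
    """Return the first step labelled as a failure, if any."""
    for step in steps:
        if step.kind not in KINDS:
            raise ValueError(f"unknown step kind: {step.kind}")
        if step.kind == "failure":
            return step
    return None

def ran_tool(steps: list, tool_name: str) -> bool:
    """Check the process, not the output: was this tool ever called?"""
    return any(s.kind == "tool_call" and s.detail == tool_name for s in steps)

# Invented trace: the final diff may look fine, yet the tests were never run.
trace = [
    Step(0, "decision", "fix the parser bug"),
    Step(1, "tool_call", "edit_file"),
    Step(2, "decision", "declare the task done"),
]
tests_were_run = ran_tool(trace, "run_tests")
explicit_failure = first_failure(trace)
```

No step in this trace is labelled a failure, which is exactly why the problem is
silent: it only shows up when you ask what the agent did not do.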

Primary sources

  • LangChain — AI Agent Observability
  • TrueFoundry — AI Agent Observability

Also covered in the workshop

  • The six agent failure modes
  • Diagnostic loop — trace → cluster → root cause → eval
  • Why agents fail silently
  • Production traces → test cases
  • Tooling landscape — LangSmith, Langfuse, Phoenix, Braintrust
  • Block capstone — put into practice

EB-8 5 topics

Guardrails & Accountability

The less a human watches, the more the guardrails must carry the load. This module is the safety and governance counterweight to the autonomy taught everywhere else in the course.

Featured deep dives

Layered defense — model / app / tool / human

Guardrails work in layers, each catching what the others miss: model-level safety
for content, application-level validation for domain errors, tool-level permissions
for unauthorised actions, and human oversight for judgment calls. Students learn that
no single layer is sufficient and how the layers compose.
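
How the layers compose can be sketched as an ordered list of checks, each returning
a reason to block or nothing. The request fields, the permitted tool set, and the
1000 approval threshold are all invented for this example.

```python
from typing import Optional

ALLOWED_TOOLS = {"refund"}      # hypothetical tool-level permission list
APPROVAL_THRESHOLD = 1000       # hypothetical amount that needs a human

def model_layer(req: dict) -> Optional[str]:
    return "unsafe content" if req.get("flagged_content") else None

def app_layer(req: dict) -> Optional[str]:
    return "amount must be positive" if req.get("amount", 0) <= 0 else None

def tool_layer(req: dict) -> Optional[str]:
    return None if req.get("tool") in ALLOWED_TOOLS else "tool not permitted"

def human_layer(req: dict) -> Optional[str]:
    needs_human = req.get("amount", 0) > APPROVAL_THRESHOLD
    return "needs human approval" if needs_human and not req.get("approved") else None

LAYERS = [("model", model_layer), ("application", app_layer),
          ("tool", tool_layer), ("human", human_layer)]

def guard(req: dict) -> str:
    """Run every layer in order; the first one with an objection blocks."""
    for name, check in LAYERS:
        reason = check(req)
        if reason:
            return f"blocked at {name} layer: {reason}"
    return "allowed"

small_refund = guard({"tool": "refund", "amount": 50})
wrong_tool = guard({"tool": "delete_account", "amount": 50})
large_refund = guard({"tool": "refund", "amount": 5000})
```

Each request above passes at least one layer and is still judged by the others,
which is the sense in which each layer catches what the rest miss.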

Primary sources

  • Medium — When AI Agents Break Compliance
  • Toloka — Essential AI agent guardrails

Also covered in the workshop

  • Human-in-the-loop — pre / post / conditional approvals
  • Least-privilege & approved-API whitelisting
  • Audit trails
  • Compliance frameworks — ISO 42001, NIST AI RMF, EU AI Act, GDPR/HIPAA

EB-1 19 topics

Context Engineering

The discipline of providing the right information, tools, and memory, in the right format, before the model reasons. If the model is the engine, context is the fuel mixture — and getting it wrong is the single most common cause of unreliable agents.

Also covered in the workshop

  • Foundations — prompt vs. context engineering, the attention budget, the "right altitude"
  • Chain-of-Thought & self-consistency
  • ReAct — the thought–action–observation loop
  • The four context strategies — Write / Select / Compress / Isolate
  • RAG basics — embeddings, vector search, chunking
  • Prompt structure & XML tagging
  • Plan-and-Execute & ReWOO
  • Choosing a reasoning pattern & hybrids (LATS)
  • Reflexion — verbal self-improvement
  • Tree of Thoughts & Graph-of-Thoughts
  • Context rot, compaction, clear vs. summarise
  • Multi-agent context isolation
  • Advanced RAG / retrieval depth
  • Cognitive Load Theory ↔ the context window — bounded workspaces & chunking
  • Load types & the token economy — intrinsic / extraneous / germane
  • Attention degradation — context saturation & attentional residue
  • Cognitive overload as an attack vector
  • Metacognition — the human divergence & human-as-controller
  • Metacognitive prompting & confidence calibration

What to bring

  • Laptop and charger
  • Your preferred Harness installed (e.g. Cursor, OpenCode, or VS Code with the EasySpecs extension)
  • Funded API credits for hands-on agent runs
  • Git access and network connectivity for exercises

The schedule includes a 1-hour lunch break. Lunch is not provided, so plan to eat locally in Barcelona.

Diploma & certification

Workshop attendance enrolls you in the diploma pathway — the diploma is not automatic on attendance alone.

  • Online exam opens 10 calendar days after the workshop (from 4 July 2026).
  • You must pass the exam to receive the diploma.
  • We will email exam access details after the workshop.

Agentic Coding Coaching

Workshops are coached by senior EasySpecs practitioners — product, structured requirements, and platform architecture at the table with your team, not junior facilitators reading slides.

  • Xesca Alabart

    Lead coach — product & requirements · CEO, EasySpecs

    Xesca combines product leadership with requirements engineering, helping teams turn business intent into explicit objectives and acceptance signals agents can work against—not vague chat prompts. She facilitates mixed rooms of engineering, product, and design while keeping a sharp definition of what “done” means.

    • Aligns engineering, product, and design without diluting technical depth
    • Brings structured requirements practice into AI-assisted delivery
    • Based in Barcelona; delivers in English, Spanish, or Catalan as needed
  • Carlos Guirao Capistany

    Lead coach — platform & agent workflows · CTO, EasySpecs

    Carlos architects EasySpecs’ platform and AI systems with an emphasis on safe, reviewable change in mature codebases. He focuses on boundaries, APIs, and verification so agentic engineering produces diffs your team can trust—not opaque churn.

    • Systems and integration architecture for complex products
    • Hands-on with agent workflows, tools, and how they land in your repository
    • Leads the technical spine behind Application Mapping and agent-ready context

Questions

Who is this for?
Developers and technical builders who want to ship agentic tooling for clients or employers: freelancers, ICs, and anyone paying for their own seat. Companies can register multiple people as separate tickets.
Is this in person only?
Yes. The workshop runs on the terrace of Pier07 in Barcelona. There is no remote stream for this cohort.
What Harness and credits do I need?
Bring your laptop with a Harness you already use (Cursor, OpenCode, VS Code + EasySpecs extension, or equivalent) and funded API credits for hands-on exercises.
How does the diploma work?
Attendance enrolls you in the pathway. The online exam opens on 4 July 2026, 10 calendar days after the workshop. You receive the diploma only after passing the exam.
How is this different from other EasySpecs Labs offers?
This is a one-day Dark Factory + Harness workshop (€600). For a shorter agentic coding day or multi-day team training on your codebase, see our team training overview.
Are we locked into one AI vendor?
No. The factory is built on context engineering and Harness patterns — you can swap models underneath without rebuilding the floor.

Your seat is one click away

24 June 2026 · Terrace of Pier07 · €600 VAT included · Max 10 participants

Reserve seat — €600