Tooling landscape — LangSmith, Langfuse, Phoenix, Braintrust
Theory. A practical survey of the observability and evaluation platforms,
organised by what distinguishes them: framework coupling, evaluation depth,
self-hosting, and attribution (knowing which agent and model version produced a
given output, and at what cost). Students learn to choose a tool based on the full
debugging lifecycle, not just trace inspection.
Use cases. Matching tool to need: a regulated team for which self-hosting is
mandatory; a team that needs cost attribution per agent; a team that wants tight
LangGraph coupling versus one that wants vendor-neutral OpenTelemetry.
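
To make the cost-attribution use case concrete, the following is a minimal, vendor-neutral sketch of what "cost per agent" means once trace data has been exported. It does not use the API of any of the platforms above. The `Span` record, its field names, the model names, and the prices in `PRICE_PER_1K_TOKENS` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical prices in USD per 1,000 tokens; real prices vary by provider.
PRICE_PER_1K_TOKENS = {
    "model-a-v1": {"input": 0.001, "output": 0.002},
    "model-b-v2": {"input": 0.010, "output": 0.030},
}


@dataclass(frozen=True)
class Span:
    """One LLM call as it might appear in an exported trace (assumed shape)."""
    agent: str
    model_version: str
    input_tokens: int
    output_tokens: int


def span_cost(span: Span) -> float:
    """Cost of a single call, derived from token counts and the price table."""
    price = PRICE_PER_1K_TOKENS[span.model_version]
    return (
        span.input_tokens * price["input"]
        + span.output_tokens * price["output"]
    ) / 1000


def cost_by_agent(spans: list[Span]) -> dict[tuple[str, str], float]:
    """Sum cost per (agent, model_version) pair, which is the attribution key."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for span in spans:
        totals[(span.agent, span.model_version)] += span_cost(span)
    return dict(totals)


if __name__ == "__main__":
    spans = [
        Span("planner", "model-a-v1", input_tokens=1000, output_tokens=500),
        Span("planner", "model-a-v1", input_tokens=2000, output_tokens=0),
        Span("writer", "model-b-v2", input_tokens=1000, output_tokens=1000),
    ]
    for (agent, model), usd in sorted(cost_by_agent(spans).items()):
        print(f"{agent:10s} {model:12s} ${usd:.4f}")
```

When comparing platforms for this use case, the question is whether the tool records these three fields (agent, model version, token usage) on every span and lets you group by them, or whether you would have to export the data and run an aggregation like this yourself.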
Practical exercises.