What's next for agent-trace

agent-trace shipped two months ago as a debugging tool. People started asking how to run it in production. v0.4.0 and v0.5.0 are the answer.

I shipped agent-trace two months ago as a debugging tool. You wrap your agent, it captures every tool call, and you replay the session later. Simple.

What I did not expect: people started asking how to run it in production.

Not "how do I debug my agent." How do I run this in CI. How do I send traces from a container. How do I get this into Datadog. How do I instrument an agent I did not write.

That gap is what v0.4.0 and v0.5.0 are about.

What shipped

agent-trace (pip install agent-strace or uv tool install agent-strace) captures the full session of any AI agent: every tool call, every LLM request, every file read and write, every error. Not just the LLM calls. Everything around them.

Three integration paths:

  • Claude Code hooks for full session capture, no code changes
  • MCP stdio proxy that wraps any MCP client transparently
  • Python decorator for custom agents and raw API calls

Exports to Honeycomb, Datadog, Grafana Tempo, and Jaeger via OTLP. Zero dependencies. Stdlib only.
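
To make the decorator path concrete, here is a minimal stdlib-only sketch of the general technique. The names `trace_tool` and `EVENTS` are hypothetical and are not the agent-trace API; the real tool writes to `.agent-traces/`, while this sketch keeps events in memory.

```python
import functools
import time

# Hypothetical in-memory event log, used only for this sketch.
EVENTS = []

def trace_tool(func):
    """Record each call to `func` as one event (illustrative helper)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        event = {"tool": func.__name__, "args": repr(args), "kwargs": repr(kwargs)}
        start = time.monotonic()
        try:
            result = func(*args, **kwargs)
            event["status"] = "ok"
            return result
        except Exception as exc:
            # Record the failure, then let the caller see the original exception.
            event["status"] = "error"
            event["error"] = repr(exc)
            raise
        finally:
            event["duration_s"] = time.monotonic() - start
            EVENTS.append(event)
    return wrapper

@trace_tool
def add(a, b):
    return a + b
```

Calling `add(2, 3)` returns the result as usual and appends one event describing the call. Errors are recorded and re-raised, so wrapping a tool does not change its behavior.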

Since shipping: 50+ stars, a VS Code extension, and a steady stream of questions I did not anticipate.

Coming in v0.4.0: watchdog mode

Background agents run for hours. Two things go wrong.

The first: the agent loops, retries, or over-reasons and burns through a budget before finishing. No signal. The bill arrives later.

The second: the agent stalls waiting on a tool or a rate limit. The process is alive. Nothing is happening. Also no signal.

Both cases share the same gap: when the session ends badly, there is no record of what the agent was doing at the moment it failed. Not in a format a human or a recovery agent can act on.

v0.4.0 adds --timeout and --budget flags to agent-strace watch (#98). When either limit is hit, the agent process is terminated and a post-mortem is written to the session directory:

agent-strace watch --timeout 30m --budget 5.00 -- python my_agent.py

The post-mortem captures the last tool call, the last LLM response, tokens used, cost at death, and a recovery_context field formatted for a follow-up agent to read. It is not a new tool. It is a feature of the existing session capture, because the data is already there.
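
The watchdog loop can be sketched in a few lines of stdlib Python. This is an illustration under assumptions, not the v0.4.0 implementation: the function name `watch`, the `get_cost` callback, the `post_mortem.json` filename, and the fields written are all hypothetical, and the real post-mortem carries more (last tool call, last LLM response, recovery_context).

```python
import json
import subprocess
import time
from pathlib import Path

def watch(cmd, timeout_s, budget_usd, session_dir, get_cost, poll_s=0.1):
    """Run `cmd`; stop it and write a post-mortem if a limit is exceeded.

    `get_cost` is a hypothetical callable returning the spend so far in USD.
    Returns the post-mortem path, or None if the process finished within limits.
    """
    start = time.monotonic()
    proc = subprocess.Popen(cmd)
    while proc.poll() is None:
        elapsed = time.monotonic() - start
        cost = get_cost()
        reason = None
        if elapsed > timeout_s:
            reason = "timeout"
        elif cost > budget_usd:
            reason = "budget"
        if reason is not None:
            proc.terminate()  # ask politely first
            try:
                proc.wait(timeout=5)
            except subprocess.TimeoutExpired:
                proc.kill()  # force-stop if it ignores the request
                proc.wait()
            post_mortem = {
                "reason": reason,
                "elapsed_s": round(elapsed, 2),
                "cost_usd": cost,
                "command": list(cmd),
            }
            out = Path(session_dir) / "post_mortem.json"
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_text(json.dumps(post_mortem, indent=2))
            return out
        time.sleep(poll_s)
    return None
```

Both limits are checked in the same polling loop, so a stalled process and a runaway one end up with the same kind of record.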

Coming in v0.5.0: production integration

The local workflow works. The production workflow does not, for three reasons.

Traces are local. Agents running in containers or CI have no way to send traces anywhere. Each agent writes its own .agent-traces/ directory that nobody else reads.

Integration requires code changes. The decorator and MCP proxy work for new projects. Existing agents built on LangGraph, CrewAI, or the OpenAI Agents SDK need every tool and LLM call wrapped manually. Most teams will not do that.

OTLP export does not map to standard conventions. The OpenTelemetry GenAI semantic conventions (gen_ai.*) are now the standard attribute set for AI spans. Datadog, Grafana, and Honeycomb all understand them natively. Without this mapping, agent-trace traces land as unrecognized custom spans. The data is there. The backends do not know what to do with it.

v0.5.0 fixes all three (#99).

A server-side collector (agent-strace server) receives events from any agent over the network. Point an agent at it with one environment variable. No code changes. (#101)
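
As a rough picture of the collector pattern, the sketch below uses only the standard library. Everything named here is an assumption for illustration: the `AGENT_TRACE_ENDPOINT` variable, the JSON-over-HTTP wire format, and the in-memory `RECEIVED` list are hypothetical and may not match what agent-strace server does.

```python
import json
import os
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

RECEIVED = []  # in-memory store; a real collector would persist events
_LOCK = threading.Lock()

class CollectorHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            self.send_response(400)
            self.end_headers()
            return
        with _LOCK:
            RECEIVED.append(event)
        self.send_response(204)
        self.end_headers()

    def log_message(self, *args):
        pass  # silence per-request logging

def start_collector(port=0):
    """Start the collector on localhost in a background thread (port 0 = any free port)."""
    server = HTTPServer(("127.0.0.1", port), CollectorHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def send_event(event):
    """POST `event` to the collector named by the hypothetical AGENT_TRACE_ENDPOINT variable."""
    endpoint = os.environ.get("AGENT_TRACE_ENDPOINT")
    if not endpoint:
        return False  # no collector configured; stay local-only
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Ignore system proxy settings so this local sketch talks to the server directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req, timeout=5) as resp:
        return resp.status == 204
```

The agent side only reads an environment variable, which is what makes this kind of setup possible without code changes: when the variable is unset, nothing leaves the machine.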

Auto-instrumentation patches the frameworks at import time. One call instruments an entire LangGraph or OpenAI Agents SDK application without touching the agent code (#102):

from agent_trace.integrations import instrument_langchain
instrument_langchain()
# everything below this line is traced automatically
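
The general mechanism behind this kind of auto-instrumentation is monkey-patching: replacing a framework method with a wrapper that records the call and then delegates to the original. A minimal sketch, assuming a stand-in class (`FakeTool` is not a real LangChain type, and `instrument` and `CALLS` are hypothetical names):

```python
import functools

CALLS = []  # recorded calls, kept in memory for this sketch

def instrument(cls, method_name):
    """Replace cls.method_name with a wrapper that records every call."""
    original = getattr(cls, method_name)
    if getattr(original, "_traced", False):
        return  # already patched; keep instrumentation idempotent

    @functools.wraps(original)
    def traced(self, *args, **kwargs):
        CALLS.append({
            "target": f"{cls.__name__}.{method_name}",
            "args": args,
            "kwargs": kwargs,
        })
        return original(self, *args, **kwargs)

    traced._traced = True
    setattr(cls, method_name, traced)

class FakeTool:
    """Stand-in for a framework class."""
    def run(self, query):
        return query.upper()

instrument(FakeTool, "run")
```

After the patch, every `FakeTool().run(...)` call is recorded, including calls made by code that never imported the tracer. The idempotency check matters because an instrument function may be called more than once in a large application.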

And the OTLP export gets mapped to gen_ai.* conventions, so traces land correctly in whatever backend the team already uses. (#100)
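
A mapping of this kind might look like the sketch below. The input event shape (`model`, `input_tokens`, `output_tokens`, `tool_name`) is a hypothetical stand-in for whatever agent-trace records internally. The attribute names on the right are taken from the OpenTelemetry GenAI semantic conventions, which are still evolving, so they should be checked against the current spec.

```python
# Hypothetical event keys on the left, gen_ai.* attribute names on the right.
FIELD_TO_ATTRIBUTE = {
    "model": "gen_ai.request.model",
    "input_tokens": "gen_ai.usage.input_tokens",
    "output_tokens": "gen_ai.usage.output_tokens",
    "tool_name": "gen_ai.tool.name",
}

def to_gen_ai_attributes(event):
    """Translate one event dict into a dict of gen_ai.* span attributes."""
    # Tool executions and chat-style LLM calls use different operation names.
    operation = "execute_tool" if "tool_name" in event else "chat"
    attrs = {"gen_ai.operation.name": operation}
    for key, attr in FIELD_TO_ATTRIBUTE.items():
        if key in event:
            attrs[attr] = event[key]
    return attrs
```

Fields without a gen_ai.* equivalent are dropped in this sketch; an exporter could instead keep them as custom attributes next to the standard ones.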

None of this changes the local workflow. The .agent-traces/ format is unchanged. The CLI commands are unchanged. The zero-dependencies constraint holds for the core package. Integrations ship as optional extras.

The shift

agent-trace started as a developer tool. A better strace. Something you run when an agent does something unexpected and you want to know why.

That is still what it is. But the production integration work makes it something else too: the observability layer you drop into any agent system, regardless of framework, regardless of where it runs, without rewriting the agent.

LangSmith works if you are on LangChain. Datadog LLM Observability works if you already pay for Datadog. agent-trace works with anything, self-hosted, no vendor required.

That is the direction. The issues are open at github.com/Siddhant-K-code/agent-trace if you want to follow along or contribute. The full v0.5.0 tracking issue is #99.


agent-trace is on PyPI: pip install agent-strace or uv tool install agent-strace. The VS Code extension is on Open VSX and the VS Code Marketplace.
