I’ve been experimenting heavily with different agentic workflows lately, specifically trying to tackle large, messy legacy codebases. You know the type: sprawling, undocumented, and fragile.
My initial instinct was to go big. I wanted to build the “perfect” system.
One thing I came to appreciate early on was the concept of specialized agents. It just makes sense, right? You have a Spec Architect to design the solution, an Implementer to write the code, and a Verifier to check the work, all coordinated by an Orchestrator agent that manages the state and hands off tasks.
It felt proper. It felt like engineering.
The Context Trap
The main driver behind this multi-agent architecture was context management.
If you’ve worked with LLMs on large projects, you know that context engineering is the single most consequential aspect of setting up these workflows for success. You can’t just dump a million lines of code into a prompt and expect a miracle.
So, inspired by frameworks like the mesh agent and swarm-style coding demos I’d seen, I built a system where each agent had specific instructions (standard agent.md files) and the orchestrator decided who to call and when.
These agents used a folder system within my Obsidian vault to log progress, findings, and learnings for the next agent in the chain. The orchestrator would run, reconstruct the state from these logs, pass ticket specs to the architect, divide the task, and so on.
Ideally, the task would be completed, or marked as completed (we all know how that goes), and I would just need to do a final manual verification.
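To make the log-driven hand-off concrete, here is a minimal sketch of how an orchestrator could rebuild its state from a folder of notes and pick the next agent. The role names, the one-file-per-role layout, and the function names are all assumptions for illustration, not the exact setup I ran.

```python
from pathlib import Path
from typing import List, Optional

# Hypothetical role order; the real routing rules were more involved.
PIPELINE = ["architect", "implementer", "verifier"]


def reconstruct_state(log_dir: Path) -> List[str]:
    """Return the roles that have already written a log file, in pipeline order."""
    logged = {path.stem for path in log_dir.glob("*.md")}
    return [role for role in PIPELINE if role in logged]


def next_agent(log_dir: Path) -> Optional[str]:
    """Pick the first role in the pipeline that has not logged anything yet."""
    done = reconstruct_state(log_dir)
    for role in PIPELINE:
        if role not in done:
            return role
    return None  # every role has reported; time for manual verification
```

Even in this toy form, the orchestrator's view of the world is only as good as the logs it reads, which is where the trouble started.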
When “Fancy” Becomes Friction
While this approach felt sophisticated, the reality was a bit different.
I ended up with long-running processes that were surprisingly hard to track. LLMs tend to be verbose, and over time the state files and logs kept growing. The “context” I was trying so hard to manage became a noisy mess of its own.
I found myself spending more time debugging the orchestration logic and managing the “guardrails” for the agents than actually fixing the legacy code.
The Pivot to Simplicity
Frustrated by the friction, I asked myself a simple question: Why not just try a single agent?
Could I instruct a single agent to go from start to finish on its own?
The immediate concern, of course, was the context window. If a run goes on too long, say past 60-70% of the context window, in my experience the model starts to hallucinate and lose the thread.
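The 60-70% rule of thumb can be turned into a simple check. This is only a sketch: the function names and the default threshold of 0.6 are my own choices, and the token counts are assumed to come from whatever usage numbers your model provider reports.

```python
def context_usage(used_tokens: int, window_tokens: int) -> float:
    """Fraction of the context window consumed so far."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens


def should_wrap_up(used_tokens: int, window_tokens: int, threshold: float = 0.6) -> bool:
    """True once usage reaches the threshold, i.e. time to hand off and stop."""
    return context_usage(used_tokens, window_tokens) >= threshold
```

For example, `should_wrap_up(130_000, 200_000)` is true because 65% of the window is already used.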
So I did some research and decided to try a different kind of constraint.
Instead of splitting the roles (architect vs. coder), I split the work.
I instructed the LLM to never take on a unit of work that would take a senior engineer more than 40 minutes.
That was the key.
I added further instructions for it to clearly define the work for the next run before signing off. It would do a focused chunk of work, document the state, and then stop.
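Here is a rough sketch of what that loop looks like when written down as code. The `HandoffNote` fields, the section headings, and the prompt wording are illustrative assumptions rather than my exact instructions; the only details taken from my setup are the 40-minute cap and the requirement to define the next run before stopping.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_MINUTES = 40  # time-box, measured in senior-engineer effort


@dataclass
class HandoffNote:
    """What one run leaves behind for the next (hypothetical structure)."""
    completed: List[str] = field(default_factory=list)
    current_state: str = ""
    next_run: List[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        lines = ["## Completed"]
        lines += [f"- {item}" for item in self.completed]
        lines += ["## Current state", self.current_state, "## Next run"]
        lines += [f"- {item}" for item in self.next_run]
        return "\n".join(lines)


def build_prompt(task: str, previous: Optional[HandoffNote] = None) -> str:
    """Assemble the instructions for a single time-boxed run."""
    parts = [
        f"Task: {task}",
        f"Never take on work that would take a senior engineer more than {MAX_MINUTES} minutes.",
        "Before signing off, document the current state and define the work for the next run.",
    ]
    if previous is not None:
        parts.append("Notes from the previous run:\n" + previous.to_markdown())
    return "\n\n".join(parts)
```

Each run starts from the task plus the previous note only, so the context stays small no matter how many runs the overall job takes.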
Why Less Was More
This turned out to be a much more manageable and, honestly, efficient way to conduct my work.
- Easier to Trace: Reading through a single, linear trace was far simpler than jumping between multiple agent logs.
- Better Quality: By time-boxing the agent, the context remained fresh. Issues were addressed as they surfaced, rather than compounding through a chain of hand-offs.
- Closer to Desired Result: The output was much closer to what I actually wanted.
The Wild West of AI Standards
I’m not entirely sure which approach is objectively “superior.” For other problems, a swarm might be the answer. But for refactoring legacy code, the single, constrained agent won out for me.
The field of AI is evolving so rapidly that it’s genuinely exciting. One of the coolest things about it is that we are establishing the rules as we go.
There really aren’t established best practices yet, even though it feels like hundreds of new “standards” get advertised every week on Twitter or Reddit.
My advice? Don’t just follow the hype of the latest “autonomous swarm” framework. Experiment. Try the simple thing first. You might find that a single, well-instructed agent is all you really need.