Why agentic coding fails in legacy codebases (and how to make it work)

May 3, 2026

An agent that ships features in a clean Next.js project will fall over inside a 12-year-old PHP estate with three frameworks layered on top of each other. The reason isn’t model intelligence. It’s context. Legacy codebases hide their rules in places agents cannot see, and the only way to make agentic coding work in them is to surface those rules deliberately.

Where agents lose context in legacy code

In a freshly scaffolded modern project, the conventions are visible. The framework dictates folder structure. Linters enforce style. Test files sit next to source files. The agent reads the codebase, infers the rules, and writes code that matches.

Legacy codebases work nothing like that. Conventions live in tribal knowledge: ‘we never call this function from outside the billing module’, ‘this looks abstracted but you have to update three other places when you add a field’, ‘the validator runs twice on this path because of an old bug nobody fixed.’ The agent sees the code but not the rules. So it writes plausible-looking code that breaks something three modules away.

The CLAUDE.md / .cursorrules / AGENTS.md fix

The highest-leverage step you can take before turning an agent loose on a legacy codebase is writing a single context file. Different tools call it different things (CLAUDE.md, .cursorrules, AGENTS.md), but the structure is the same. It documents the implicit rules of the codebase in plain English: where to put new files, which patterns to follow, which functions are dangerous, what the testing strategy is, and what ‘done’ means.

We typically draft this file in 30–60 minutes by interviewing the most senior engineer on the team. ‘What would you tell a contractor on day one that’s not in any document?’ That conversation is the gold. Once the file exists, every agent session in that repo gets dramatically better. We have measured 40–60% reductions in ‘agent did something wrong’ incidents from this single change.
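As a rough illustration of the five topics a context file covers, the sketch below models them as a typed structure and renders them to a markdown file. The interface, section titles, and example rules are all hypothetical; none of the tools named above requires this layout, and the real content comes from the interview, not from a template.

```typescript
// Hypothetical shape for a context file. The five fields mirror the topics
// named in the article; no tool mandates this structure.
interface ContextFile {
  fileLocations: string[];
  patterns: string[];
  dangerousFunctions: string[];
  testingStrategy: string[];
  definitionOfDone: string[];
}

// Render one section as a markdown heading followed by a bullet list.
function renderSection(title: string, items: string[]): string {
  const bullets = items.map((item) => `- ${item}`).join("\n");
  return `## ${title}\n\n${bullets}\n`;
}

// Assemble the whole file. Section titles are illustrative.
function renderContextFile(ctx: ContextFile): string {
  return [
    "# Project rules for coding agents\n",
    renderSection("Where new files go", ctx.fileLocations),
    renderSection("Patterns to follow", ctx.patterns),
    renderSection("Dangerous functions", ctx.dangerousFunctions),
    renderSection("Testing strategy", ctx.testingStrategy),
    renderSection("Definition of done", ctx.definitionOfDone),
  ].join("\n");
}

// Example entries are invented placeholders in the spirit of the
// tribal-knowledge rules quoted earlier in the article.
const exampleContext: ContextFile = {
  fileLocations: ["New billing code goes in the billing module, nowhere else."],
  patterns: ["Adding a field means updating three other places; list them here."],
  dangerousFunctions: ["Billing internals must never be called from outside the billing module."],
  testingStrategy: ["The validator runs twice on this path; tests must tolerate that."],
  definitionOfDone: ["Tests pass and a senior engineer has reviewed the diff."],
};

const contextFileText = renderContextFile(exampleContext);
console.log(contextFileText);
```

Writing the output to CLAUDE.md, .cursorrules, or AGENTS.md is then a matter of which tool the team uses. The value of a structure like this is only that it makes a missing section obvious.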

Boundary-first refactors: carve out modern islands

When the codebase is too tangled for even a context file to make agents productive, the next move is to carve out a modern island and let the agent work there. The island can be a new microservice, a new module behind an interface, or a clean rewrite of a single bounded context. Inside the island, conventions are tight and modern. At the boundary, an interface translates between the old world and the new.

This is not always feasible. Sometimes the legacy is the only place the work can happen. But where it is feasible, the productivity gap is enormous. An agent that writes 200 lines of TypeScript in a clean module is shipping a feature; an agent that writes 200 lines inside a 50,000-line PHP file is generating bugs.
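The translating interface at the island boundary might look something like the sketch below. Everything in it is assumed for illustration: the invoice domain, the legacy row layout, and every name are hypothetical, and the legacy lookup is injected as a plain function so the example runs without a real legacy system.

```typescript
// Hypothetical legacy record: terse column names, numbers stored as strings,
// "Y"/"N" flags. Illustrative only.
interface LegacyInvoiceRow {
  inv_id: string;
  cust_no: string;
  amt_cents: string;
  paid_flag: "Y" | "N";
}

// Clean model used inside the island.
interface Invoice {
  id: string;
  customerId: string;
  amountCents: number;
  paid: boolean;
}

// The boundary. Island code depends on this interface and nothing legacy.
interface InvoiceGateway {
  findById(id: string): Invoice | undefined;
}

// Translate one legacy row, failing loudly on a malformed amount
// instead of letting NaN leak into the island.
function toInvoice(row: LegacyInvoiceRow): Invoice {
  const amountCents = Number.parseInt(row.amt_cents, 10);
  if (Number.isNaN(amountCents)) {
    throw new Error(`Invoice ${row.inv_id} has a non-numeric amount`);
  }
  return {
    id: row.inv_id,
    customerId: row.cust_no,
    amountCents,
    paid: row.paid_flag === "Y",
  };
}

// Adapter that wraps whatever lookup the legacy side exposes.
class LegacyInvoiceGateway implements InvoiceGateway {
  private readonly fetchRow: (id: string) => LegacyInvoiceRow | undefined;

  constructor(fetchRow: (id: string) => LegacyInvoiceRow | undefined) {
    this.fetchRow = fetchRow;
  }

  findById(id: string): Invoice | undefined {
    const row = this.fetchRow(id);
    return row === undefined ? undefined : toInvoice(row);
  }
}

// Island code: sees only the clean model.
function outstandingCents(gateway: InvoiceGateway, ids: string[]): number {
  let total = 0;
  for (const id of ids) {
    const invoice = gateway.findById(id);
    if (invoice !== undefined && !invoice.paid) {
      total += invoice.amountCents;
    }
  }
  return total;
}

// In-memory stand-in for the legacy lookup.
const legacyRows = new Map<string, LegacyInvoiceRow>([
  ["A1", { inv_id: "A1", cust_no: "C9", amt_cents: "1250", paid_flag: "N" }],
  ["A2", { inv_id: "A2", cust_no: "C9", amt_cents: "400", paid_flag: "Y" }],
]);

const gateway = new LegacyInvoiceGateway((id) => legacyRows.get(id));
// 1250: only A1 is unpaid, and the unknown id is skipped.
const outstanding = outstandingCents(gateway, ["A1", "A2", "missing"]);
console.log(outstanding);
```

The point of the design is that the agent only ever edits code on the island side of `InvoiceGateway`, where the types spell out the rules. The legacy quirks are confined to one small translation function that a human can review once.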

What still doesn’t work in 2026

Some categories of legacy work remain genuinely hard for agents and probably will for another year. The first is testing strategies based on integration tests against a shared mutable database. Agents struggle with the implicit ordering and the test-data setup. The second is code that depends on undocumented runtime behaviour — race conditions, cron interactions, distributed locks that exist only by convention.

The third is anything where the language itself is poorly represented in training data. Agentic coding is excellent at TypeScript, Python, Go, and Java. It is mediocre at COBOL, Visual Basic 6, and obscure DSLs. We tell clients honestly: if your legacy is in a language with thin model support, agentic coding will help with planning and review, not with raw generation.

When to call us

We do this often: come into a legacy codebase, write the context file, identify the boundary islands, and either teach the in-house team to drive agents productively from there or take the rescue work over the line ourselves. It usually takes a triage week to scope and 4–8 weeks to ship the first measurable acceleration.

If you are looking at a legacy codebase wondering whether agentic coding can move the needle, the answer is almost always yes — but only after the right context plumbing is in place. Project rescue and team training are the two services that handle this end-to-end.

apiadmin