
Agentic development is context management on two fronts

Joonas Pajunen · Technology, AI

Agentic development on a laptop

Agentic software development, at its core, is about managing information in order to steer an agent towards a goal. At the individual and tool level, it is about managing the context sent to the LLM through your chosen harness (Claude Code, Codex, Opencode, etc.). At the team and organisational level, it is about transferring information and managing comprehension.

Once agents get going, you will worry about comprehension debt in addition to technical debt. First, let's work through context management to understand why comprehension is its team-level equivalent.

Context management

The single most important concept to understand when doing agentic software development is how to keep the context window as relevant and fresh as possible. One human analogy is your working memory; you keep it focused on the task at hand instead of filling it with nonsense while working.

An LLM (Large Language Model, like a GPT or Claude Opus) takes in text and spits out text. Multimodal LLMs work with other media too, but we, as software developers, are only interested in the text output. Harnesses let LLMs use tools. Tools make it possible for the agent to make actual changes in the file system. The harness also allows the LLM to manage its own context by reading the file system and other sources of information, like the Internet and 3rd party services.
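To make the harness's role concrete, here is a minimal sketch of the agent loop. Every name here (`call_llm`, the tool dictionary, the action shape) is hypothetical; real harnesses implement this far more robustly, but the shape is the same: the model proposes a tool call, the harness executes it, and the result flows back into the context.

```python
# Toy agent loop (illustrative, not any real harness's API). The model
# proposes either a tool call or a final answer; tool results re-enter
# the context, which is exactly what makes context management matter.

def agent_loop(goal, call_llm, tools, max_turns=10):
    messages = [("user", goal)]
    for _ in range(max_turns):
        action = call_llm(messages)            # model decides the next step
        if action["type"] == "final":
            return action["text"]
        tool = tools[action["tool"]]           # e.g. a file-reading tool
        result = tool(**action["args"])        # the actual side effect
        messages.append(("tool", result))      # result grows the context
    return "max turns reached"
```

Each loop iteration appends to `messages`, which is why long tool-heavy sessions fill the window so quickly.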

The chief task of the agentic engineer is to provide the most accurate context by utilising the agent harness. The usual and most obvious way is to plan comprehensively; by planning, I mean planning the task and the feature. Planning a product roadmap is out of scope for context management; it may be useful background information, but we are not talking about waterfalls here.

Planning

A good plan gives the agent a goal and often a list of steps to follow in order to reach it. Sometimes, to generate a plan, the agent needs to research the current state of the codebase and existing related solutions. The user doesn't need to write these plans by hand; they can leverage the agent's capabilities to do research and planning before implementation.

The research–plan–implementation flow forces the agent and the user to create the best possible context for the task at hand. Claude Code, at least, suggests clearing the current context before undertaking a plan. This is the way.
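What comes out of the planning phase is typically just a markdown file the agent then follows. A hypothetical sketch, purely illustrative (the file name, feature, and headings are made up, not a convention of any particular harness):

```markdown
# Plan: add rate limiting to the public API

## Research findings
- Requests are handled in a single handler module; no middleware layer yet.
- The project already depends on Redis for caching.

## Steps
1. Add a token-bucket middleware with a per-client limit.
2. Wire it into the router before the auth check.
3. Add unit tests for burst and steady-state traffic.

## Out of scope
- Per-endpoint limits (follow-up task).
```

A plan like this doubles as documentation later, which matters for the team-level story below.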

Context management tips

What goes into context?

  • Your current chat session and its whole content
  • Your AGENTS.md and CLAUDE.md files' contents, which are basically the "project memory" that is shared between sessions
  • Skill, MCP, subagent, etc. "extension" definitions
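As an example of that "project memory", a minimal CLAUDE.md (or AGENTS.md) might look like the following; the contents are entirely illustrative:

```markdown
# Project memory

## Conventions
- TypeScript strict mode; avoid `any`.
- Run the test suite before declaring a task done.

## Architecture notes
- The api/ directory talks to the backend; never import it from ui/.
```

Everything in this file is prepended to every session, so keep it short and high-signal; it spends context budget on every single turn.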

Agent harnesses manage context automatically: when usage reaches the upper bounds, they compact the context by summarising the current chat. Usually, by that point, it's far too late to keep performance optimal. Some say performance degrades after roughly half of the context window is used, but these things change as the tools and models develop. Everything is moving rapidly, and the harnesses are likely getting better at optimising context automatically.
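To illustrate the idea, here is a toy sketch of proactive compaction: summarise early, at a chosen fraction of the window, instead of waiting for the harness's last-minute automatic compaction. The word-count token estimate and the thresholds are assumptions for the sketch, not how any real harness counts tokens.

```python
# Toy proactive compaction (illustrative only). Messages are (role, text)
# pairs; tokens are crudely approximated as whitespace-separated words.

def estimate_tokens(messages):
    # rough approximation: one token per word
    return sum(len(text.split()) for _, text in messages)

def maybe_compact(messages, summarize, window_tokens=200_000, compact_at=0.5):
    """Replace older messages with a summary once half the budget is spent."""
    if estimate_tokens(messages) < window_tokens * compact_at:
        return messages
    head, tail = messages[:-4], messages[-4:]   # keep recent turns verbatim
    summary = summarize(head)
    return [("system", f"Summary of earlier conversation: {summary}")] + tail
```

The design choice mirrored here is the one in the tips below: compacting at your own chosen point keeps the summary under your control, rather than accepting whatever the harness salvages at the last moment.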

Currently, you need to actively manage context. Voluntarily compact and clean up. Things like:

  • (Claude Code) "double esc" to discard debugging messages and go back in time in the session
  • (Claude Code) Run `/compact` to compact manually, before the harness does it for you
  • Delegate selected work to sub-agents (because they have their own context windows)
  • Scrap dead ends and start fresh with new ideas yourself
  • Manually hand off compactions and plans to other agents/models
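The sub-agent point deserves a sketch: the delegated work burns through a fresh context window, and only a short result re-enters the parent's context. The API below is hypothetical, not Claude Code's actual interface.

```python
# Hypothetical sub-agent delegation: the sub-agent starts with an empty
# context, does its (possibly long) work there, and hands back only a
# short summary for the parent to keep.

def run_subagent(task, run_llm):
    messages = [("user", task)]   # fresh window: no parent history
    return run_llm(messages)      # however many turns this took,
                                  # the parent only sees the return value

parent_context = [("user", "Refactor the billing module")]
summary = run_subagent(
    "List every caller of charge_card()",
    run_llm=lambda msgs: "charge_card() is called from 3 modules: ...",
)
parent_context.append(("assistant", summary))  # one line, not the whole search
```

The parent pays a handful of tokens for work that may have consumed an entire window elsewhere.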

In the end, managing context is a little like managing people: the clearer you are about expectations and goals, the better the performance. Free management practice with entities whose jimmies aren't easily rustled!

The implications of context management work

If most development work becomes managing context for agents, what happens to the programming work done by people? What happens to the ability to evaluate generated code? I believe developers should stay in touch with how code and syntax work. This means different things for different projects, industries, and levels of seriousness, to put it vaguely. Sometimes core business logic needs to be fully human-written or human-tested. Sometimes every line needs to be carefully evaluated. To do this, the people responsible need to know how code works. They still must remember the craft!

Having said that, in some cases all code will be written by machines, and humans will need to evaluate and own the work. Taking automation a step further, AI will soon be responsible for reviewing code as well; perhaps that too will reach a good-enough point soon. However, ultimate responsibility for the outcomes still rests with a human. Whether the responsible person reviews code line by line or not, there will be more material to review and take in. This creates a bottleneck in how much people can comprehend.

Comprehension debt and managing information within a team

Teamwork is planning what to do and then following those plans. It is about sharing information. When people build software together, one of the most important parts is reviewing each other's work.

The value of code reviews

Code reviews are not always straightforward, and opinions may differ. Reviewing is also perhaps one of the least glamorous parts of the profession, however important we collectively declare it to be. It has often been a bottleneck, even before developers were turbocharged with code-generating AI!

It naturally follows that the code-review bottleneck worsens. We solve it partially by throwing more AI tools at the problem to do the reviews automatically. The explicit goal of these reviews is to discover potential bugs, obvious errors, suspect architectural decisions, or perhaps one of those more mysterious "code smells". AI can definitely help with those.

But the implicit effect of code reviews is increased shared ownership and understanding of the code base. Increased comprehension of what goes on in the depths of the code. This implicit goal will become the explicit goal in the future, as AI tools enable the generation of more code and better descriptions of changes in the reviews (pull requests).

More code, more documentation

The planning required by context management makes developers write (or generate) more documentation than ever! Humans can understand the plans too, and since they don't have to type in so much code anymore, they have more time and capacity to read the plans. And after the plan has been implemented, the code review can include another set of beautiful prose about the changes. And, you know, why stick to only text when you have multimodal AIs available to deliver the message?

Whatever tricks you pull here, the mind can only absorb so much. Teams have a limit where coordination becomes too time-consuming. So eventually, they will begin to accumulate comprehension debt, which often indicates the presence of technical debt as well. At that point, the team needs to slow down, perhaps stop. Stop, reorient, continue.

Full automation, human in the loop

The takeaway: I recommend you utilise these wonderful tools also after the code is generated, tested, committed, and pushed. Make sure your colleagues understand what you and your favourite AI sidekick just accomplished.

When you add comprehension-increasing steps into your workflow, you are one step closer to building almost fully autonomous, but human-steered agentic pipelines. The machines will take in context, hone it based on rules and conventions, bake plans, implement them, and create debriefs for you and your team to comprehend.