Agents and AI coding

01 Intro

When opencode was released I started using agentic coding in my day-to-day work. I had previously tried Claude Code for some personal projects, but opencode was the first terminal-based AI agent I tried that worked well with my company’s internal LLM proxy. Recently I shared my experience with agents and AI coding with some colleagues. This post is a write-up based on my preparation for that session.

02 What are agents?

There are many different definitions of “agent”. I’m happy with the one Simon Willison settled on:

An LLM agent runs tools in a loop to achieve a goal.

Let’s break this down:

  • LLM: The language model provides reasoning and decision-making
  • Tools: Functions the agent can call, such as web search, file search, creating and editing files, and running shell commands
  • Loop: The iterative cycle of reasoning → tool use → evaluation → next action
  • Goal: A bounded objective that provides a stopping condition
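
To make the definition concrete, here is a minimal sketch of such a loop in Python. The llm callable, its decision format, and the single read_file tool are hypothetical stand-ins rather than any particular agent's API; real coding agents ship a much richer tool set and more careful error handling.

```python
# Minimal "tools in a loop" sketch. `llm` is an injected callable standing in
# for a real model API; it returns either {"answer": ...} or
# {"tool": name, "args": {...}} (a made-up format, for illustration only).

def read_file(path: str) -> str:
    # Tool: one example; real agents also expose search, edit, shell, etc.
    with open(path) as f:
        return f.read()

TOOLS = {"read_file": read_file}

def run_agent(goal: str, llm, max_steps: int = 20) -> str:
    messages = [{"role": "user", "content": goal}]        # Goal: a bounded objective
    for _ in range(max_steps):                            # Loop: reason -> act -> evaluate
        decision = llm(messages, tools=list(TOOLS))       # LLM: decides the next action
        if "answer" in decision:                          # stopping condition reached
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])       # Tools: execute the call
        messages.append({"role": "tool", "content": str(result)})  # feed the result back in
    raise RuntimeError("no answer within the step limit")
```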

03 Evolution of AI coding

Jason Liu and Beyang Liu describe three distinct eras in the evolution of AI coding:

  1. the autocomplete era
  2. the RAG chat era
  3. the agentic era

The latest transition, from the RAG chat era to the agentic era, comes with an inversion of how some of the context is provided to the LLM. In the RAG chat era, a RAG system first retrieved context with a similarity search on top of the original query and then passed that context to the LLM to generate a response. In the agentic era, the LLM itself decides which tools to use and which context to fetch. But this does not completely relieve the engineer from thinking about context: while the LLM is able to fetch context on its own, left unchecked, an agent tends to pollute its context window with irrelevant information.
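
As a rough sketch of that inversion (search and llm are injected placeholders here, not a specific library), the difference in data flow looks something like this:

```python
def rag_chat(query: str, search, llm) -> str:
    # RAG chat era: the pipeline retrieves context up front; the model only answers.
    context = "\n\n".join(search(query, top_k=5))   # similarity search over the corpus
    return llm(f"Context:\n{context}\n\nQuestion: {query}")

def agentic_chat(query: str, llm) -> str:
    # Agentic era: the model drives. Retrieval happens inside the loop, on demand,
    # whenever the model chooses to call a search or file-reading tool.
    return run_agent(goal=query, llm=llm)   # run_agent: the loop sketched in section 02
```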

04 Context Engineering

That’s where context engineering comes in.

As Dex Horthy from HumanLayer, one of the most vocal advocates of “context engineering”, says:

Everything is context engineering.

LLMs just turn inputs into outputs. Everything that goes into the LLM is the context. To get good output, you need good input.

Creating great context means being intentional about the prompt you give to the LLM, the additional documents that are retrieved, and any chat history, including tool calls and results.
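
As a rough illustration of what actually reaches the model (the structure below is mine, not any particular provider's API), all of these pieces end up in one flat sequence and compete for the same limited context window:

```python
def build_context(system_prompt, agents_md, history, retrieved_files, user_message):
    # Everything the model will see for the next step, in one list of messages.
    return [
        {"role": "system", "content": system_prompt + "\n\n" + agents_md},  # instructions
        *history,                                    # prior turns, tool calls, and tool results
        *[{"role": "user", "content": doc} for doc in retrieved_files],     # retrieved documents
        {"role": "user", "content": user_message},   # the prompt you actually wrote
    ]
```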

How can we do great context engineering in practice when using agents for coding?

Context engineering is the delicate art and science of filling the context window with just the right information for the next step.

Andrej Karpathy

05 My workflow

What works well is tackling an implementation in phases. Along the way, I create artifacts (markdown files) to help transition between phases. This approach prevents the history of a previous phase from polluting the context window of the next one: the input to one phase should only be the outcome of the previous phase (a rough sketch of this hand-off follows the list below). The phases in my workflow (and many others’) are research, plan, and implement. How you got to the result of the research is not relevant when planning, and how you decided on a specific plan is not relevant during implementation.

  1. Research: During the research phase, the goal is to understand the codebase, the data flow, potential problems and their causes. Sometimes I skip this phase or include some minor research in the planning phase.
  2. Plan: During the plan phase, the goal is to decide on an approach to implement. I usually start by asking for possible implementation approaches, explore them, challenge the provided suggestions, clear up misconceptions and then decide on an approach for which I let the agent write a plan.
  3. Implement: Once the plan is set, the implementation can commence. I usually do this in small iterations, following the plan and reviewing code changes along the way. When possible, I start by writing failing tests and then continue by making the tests pass.
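
Here is a rough sketch of that hand-off. The agent callable, the prompts, and the file names are hypothetical; in practice I drive each phase manually from the coding agent's chat interface and review the artifacts in between.

```python
from pathlib import Path

def research(agent, ticket: str) -> Path:
    # Phase 1: understand the codebase; the only output is a markdown artifact.
    notes = agent(f"Research the codebase for: {ticket}. "
                  "Summarize the data flow, relevant files, and likely causes.")
    out = Path("research.md")
    out.write_text(notes)
    return out

def plan(agent, research_md: Path) -> Path:
    # Phase 2: fresh session; only the research artifact carries over,
    # not the conversation that produced it.
    text = agent("Propose and compare approaches, then write an implementation plan "
                 "based on this research:\n" + research_md.read_text())
    out = Path("plan.md")
    out.write_text(text)
    return out

def implement(agent, plan_md: Path) -> None:
    # Phase 3: fresh session again; the plan is the only input.
    agent("Implement this plan in small steps, writing failing tests first:\n"
          + plan_md.read_text())
```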

What do you do when the result of the implementation is wrong? Here’s Dex Horthy again:

Implementation is compiling the spec to code.

If your compiled program is wrong, you don’t change the assembly, you rewrite the code and recompile.

If your code is wrong, don’t resteer live, go fix the plan and restart the implementation.

Put more generally, the impact hierarchy for coding agents explains what you should spend human effort on.

06 Impact Hierarchy for Coding Agents

In keeping with my pattern of freely copying from Dex and the folks at HumanLayer, here’s the impact hierarchy for coding agents they propose.

Level of Abstraction | Error                            | Impact                     | Problem
Core Infrastructure  | 1 Bad Line of Agent Instructions | 100,000+ Bad Lines of Code | Core Infrastructure
Specification        | 1 Bad Line of Specification      | 10,000+ Bad Lines of Code  | Wrong Problem
Research             | 1 Bad Line of Research           | 1,000+ Bad Lines of Code   | Misunderstanding the System
Plan                 | 1 Bad Line of Plan               | 10-100 Bad Lines of Code   | Wrong Solution
Implementation       | 1 Bad Line of Code               | 1 Bad Line of Code         | 1 Bad Line of Code

At the very top, with the most impact, we have a bad line in your agent instructions, e.g. CLAUDE.md or AGENTS.md. It affects every phase of your workflow and every session. I started my AI coding journey without an AGENTS.md file to get a feel for how an agent behaves without one. Over the past months I’ve been slowly adding instructions and monitoring how they affect the coding agent.

The remaining levels of abstraction align closely with the phases of the workflow described above. The specification is the input to the research phase, the research is the input to the plan phase, and the plan is the input to the implementation phase. As you step down these levels of abstraction, the impact lessens.

07 Conclusion

I’m still working on getting a better understanding of what works when coding with AI. For example, I’m interested in ways to streamline the hand-off between phases; right now this is a very manual process for me, and I’m not reusing any prompts. I’m also curious about how to effectively use intentional compaction to keep the context window relevant. What helps me is reading about others’ experiences with AI agents. Here are some posts that I’ve found helpful¹:

  • Dex Horthy on getting AI to work in complex codebases.
  • Calvin French-Owen on how different AI coding tools shift your “thinking budget” between providing the right context, planning, implementation, and review.
  • Atharva Raykar on the Nilenso blog on AI-assisted coding for teams that can’t get away with vibes.
  • Sean Goedecke on how being good at code review translates to being effective with AI coding agents, emphasizing the importance of structural thinking over nitpicky line-by-line fixes.
  • Thomas Dohmke on how developers are evolving through distinct stages of AI adoption, from skeptic to strategist, and how the role is shifting from writing code to orchestrating and verifying AI-generated work.
  • Vicky Boykis on her favorite use-case for AI: writing logs.
  • Armin Ronacher on agentic coding things that didn’t work.
  • Conrad Irwin on the Zed blog on why LLMs can’t really build software.
  • Peter Steinberger on his AI coding workflow.

Bonus: Shreya Shankar on writing in the age of LLMs. Not directly related to coding but still relevant for software engineers.

If you’re new to AI coding agents, I’d recommend downloading a coding agent and using it for most (if not all) of your coding work for a week. I like opencode for its defaults and the ease with which you can switch between models, even across providers. But any other coding agent, such as Claude Code or Codex, will give you a similar experience.

Footnotes

  1. For the blogs that offer an RSS feed, I use NetNewsWire to subscribe.