Agentic Coding: A 6-Month Shift

Note from Writer:

Hi, I’m Jeremy. I enjoy writing simple, elegant, and performant code in my day‑to‑day work. Too much software is slow, frustrating, and poorly designed. Since agentic coding tools are now the dominant medium for software production, I am exploring ways to incorporate the basic principles of software engineering back into these tools so that people who want to learn from them don’t get bogged down in the modern dogma of software engineering.

It’s amazing how far agentic coding has come in the past six months. I used to be a non-believer, but now I’m cautiously optimistic, maybe even of the opinion that not learning to use these tools now risks falling behind in the software engineering space.

September 2025: Early Agentic Engineering

Let me take you back in time, all the way back to the ancient times of agentic engineering: September 2025. The SOTA model at the time was Sonnet 4; many people were already claiming it was the second coming of Christ and that software engineers were doomed.

SG Tech Day and RePAR

Our company, still enthusiastic about AI in general, decided to host a brief internal event, “SG Tech Day”, for the more adept practitioners to introduce us to a new workflow: RePAR (Research, Plan, Action, Review). It essentially asks you to do plan mode; at the time that was still a new concept in agentic engineering, so this was supposed to be helpful.

What is RePAR?

Research lets the agent read through the entire codebase and produce a markdown document on the project’s architecture and purpose, not too dissimilar to AGENTS.md.

Plan is essentially the “Plan mode” that tools like Claude Code and OpenCode have; the only difference is that the plan lived in a markdown file you could reference.

Action is simply: alright, go implement the plan.

And Review is pretty self-explanatory: have the agent review what it’s done and report back to you.

Task 1: Cross-Project Changes and Plan Mode Friction

The day arrives: three tasks, carefully curated from the backlogs of the participating departments to showcase the abilities and wonders of this new workflow. Immediately, things weren’t looking too good.

The first task would require changes across three different projects.

A frontend built with Vue3, statically served from an ASP.NET server; a separate backend service that parses the data from the frontend and persists it in Mongo; and finally a common “Models” project, which is our company’s general convention for keeping data structures consistent.

First thing it needs to do: based on the requirements, offhand I would say the existing model needs at least three more fields. Reading through the plan, it knows it has to change the model, adding... wait, six extra fields?

Let me take a closer look. Hmm, a little over-engineered, but still acceptable. Why not.

Alright, some controller methods for the frontend to call, a separate service class to handle this new requirement specifically, yeah, sure. The plan looks decent; let’s get to work.

Reading through its thought process in Cursor (our company’s preferred agentic engineering solution at the time), Sonnet 4 immediately hits a roadblock: it can’t edit the model that it was supposed to.

Not surprising: it comes from a NuGet package published by the other “Models” project I mentioned earlier. Great, it can at least tell from the namespace that it’s from another project.

But wait, why are you referencing the project directly instead of publishing a new version of the NuGet package..?

Let me stop you right there and correct you, and... it’s struggling with .NET’s wonderfully idiomatic commands to get things done.

All good, I’ll just do this manually and have it continue, surely that’s the only complication for this task.

I’m getting annoyed. Why is it making all these stupid assumptions that aren’t working out? Oh wait, I approved those assumptions in the plan.

Which, I guess, is the inherent problem in this workflow. That super long, gaudy, robotic plan I had to read through seemed coherent and made sense, but in a large enough system I can’t maintain every small detail of the codebase in my head, and things the plan assumed existed for some reason did not actually exist. The crux is that I assumed they existed too, because they were in that damned, way-too-long plan, and it made sense for them to exist.

Time’s up. It looks like nobody got a working solution for this task despite having a completely working local dev environment and a shiny new workflow. That’s kinda awkward.

Task 2: Frontend Build Success (Despite Slow Planning)

Task 2: a backend is provided with every API ready; the only thing we need to do is build a frontend against a set of requirements.

Not too complicated. The backend already serves the raw data; it’s just some parsing and selectors to update the backend with the new values.

Do I even need a Plan for this? Welp it’s the theme of this event so might as well.

Damn, this planning is incredibly slow, and reading the plan is slow as well, not to mention the plan might be filled with flaws like the first one was.

Gosh, for that first task I would have finished it alone by the time it was done planning.

Hmm, the plan looks good, and I know the API it’s calling actually exists. Let’s just go.

Let me go get some refreshments outside and check back in on it.

Hmm some spaghetti, fish and an eclair, coffee will help me get through this as well alright let’s get back to my seat.

What’s this? It’s done, alright go ahead and run it.

Wow everything works, UI looks infinitely more elegant than anything I would ever bother doing as well.

I look around and I see everyone experiencing something similar, the mood is slightly lifted and maybe this event wouldn’t be a disaster after all.

Task 3: The One That Didn’t Happen

Task 3, the most complicated of all. This would require extremely careful reading of the plan and a caffeinated eye on the thinking process of the model to not let it stray too far.

Oh but what’s this? We’re out of time? Great!

Day 2: Mini Hackathon Outcomes

Day 2, a mini hackathon. Build something that could help automate some tasks that are not possible programmatically or whatever, BUT do it with RePAR.

I won’t delve into the details too much, but naturally every team got something working, and each product looked elegant on the surface.

Consensus? When it doesn’t have to work within an already established system, it’s very good; it breaks down on larger systems and tasks.

Learnings: only use it for small to medium tasks, and review the plan carefully.

Why do I feel like I’ve heard this a million times already?

A Shift Toward Positivity

Despite the general negativity in this part, I promise this is actually a positive series on agentic engineering, as you are about to see.

After this experience, I used agentic coding in a very limited capacity: small, easily verifiable changes where I didn’t have to think too hard about the code produced, and larger tasks that were by nature sandboxed, where the agent could expand as it pleased.

Even limited to this scope, I did not always expect the outcome to be functional; still, it was nice to see it come up with a solution I was happy with from time to time.

Fast forward to the present day, and it is now my expectation that it produces something not only functional but also usable and carefully considered.

Re-Entry Point: Shipping at Inference Speed

What piqued my interest again was a viral post by Peter Steinberger on shipping at inference speed. I confess that at this point I was pretty out of touch with the latest and greatest in agentic coding, so it got me a little interested again.

The biggest hook for me was him mentioning not needing “Plan” mode when engaging with Codex 5.2. As you can probably tell from my rant above, I have a distaste for plan mode in general. I also failed to mention, and haven’t seen anybody else mention, the unsettling feeling of trying to robotically inject “order and stability” into a nondeterministic workflow like agentic coding.

Codex 5.2/5.3: More Organic Collaboration

The first thing I noticed was that the eagerness to just “make changes” was gone. When I was using Cursor, it felt like plan mode was all but mandatory to stop the agent from diving in and making changes to code it didn’t yet fully understand.

Now, however, it felt like I was organically working through the solution with a colleague, and before long a plan naturally formed by itself. I guess it also helps that the agents themselves are good planners, unlike me. Let’s start building, then.

At first, I was a little annoyed at how long Codex took to do anything. It’s constantly reading and reading and takes forever to generate anything, but the kicker is that what it generates is almost always of decent to high quality.

This is not to say that you don’t have to review the code, or that you don’t have to maintain a mental model of your system’s architecture. That part is still mandatory, and please don’t delude yourself into thinking that skimming through the code is enough; it almost always isn’t.

Productivity Trick: Parallel Codex Instances

So if I have to carefully review every single line of code, and the model now reads a lot of code, which makes it slow, how am I supposed to be more productive with it? Turns out the answer is shockingly simple: just open more instances of Codex.

I know it can be nice to have some downtime while the agent goes off and does its own thing, but it’s quite a revelatory feeling to have multiple Codex instances doing their thing while you plow through the changes and proposed changes and issue new prompts.

Reinforcement: George Hotz and OpenCode

Days later, seeing George Hotz, a highly respected engineer, show how he uses agentic coding, which is just three OpenCode instances running Opus 4.5, reaffirmed my sense that this is probably what being productive with agentic engineering will look like.

And on that same stream he reminded us of a very simple truth that I believe many people have forgotten in this craze of “Opus is already better than me at coding! It can one-shot this generic program that has millions of examples on GitHub in 5 minutes!!!”

The Simple Truth: You Still Have to Be a Good Engineer

The simple truth is that you still have to be a good engineer who takes pride in the craft to make good use of these tools. Say, for example, you were trying to build a game engine from scratch. You heard that popular engines like Unreal Engine are built in C++, so you prompt the agent that you want to make a game engine just like Unreal Engine using C++.

The agent will probably ask you a few questions and suggest some things, but there are architectural decisions it won’t make for you. Do I keep input and window management simple with something like GLFW or SDL, or do I opt for something native like the Win32 API and build the engine around producing outputs that would fit easily with any native solution on Windows, Mac, and Linux?

What about the graphics API? DirectX 12 is pretty popular. Oh, but wait, that’s native to Windows; am I going to take on the burden of potentially maintaining three separate graphics APIs for this project? These agents, at the end of the day, still produce what is most “popular” today and won’t consider what your engine is for. It turns out I was talking about a simple 2D game, where OpenGL 4.1 would serve me just fine as the sole graphics API. And maybe an even simpler question: do I even want the game to be cross-platform?

You as the engineer still have to decide what’s best for the system at the end of the day and it’s important that you know your craft well enough to fight back against any chatbot psychosis that may occur when generating a plan/solution with the agent.

Transition: Opinionated Guidance Ahead

So after that long recap of how agentic coding went, for me, from a party trick to genuine doubt about whether a capable engineer using it can actually be more productive than without it, I would like to share a few highly opinionated ways of going about agentic engineering.

Opinionated Practices

I like to open Codex in the base folder where all the relevant projects are stored. I can easily reference other projects with @, and I find that it readily goes off and draws connections between projects even when I have failed to mention them explicitly.

Don’t ask for too much in one prompt. These agents can still make mistakes, and once they go astray, steering them back may not be hard, but it’s definitely annoying. Also, whether you like it or not, that mistake now lives in the context until it’s compacted away.

The CLI is my preferred environment for agentic coding, which is not to say Cursor is bad. Ideally you wouldn’t be making any manual changes of your own. Plus, it always feels like a dice roll whether the agent has noticed that you changed something it wrote, or whether it’s operating on the assumption that its code is still in place. It’s far too much trouble to let that happen, and having the IDE right there easily leads you into this trap.

I no longer prefer small iterative commits, even though that is “good software engineering practice”. If I don’t like what it did earlier, I tell it to manually revert its changes. Why? Because it is now in the context that I do not want that. Otherwise, am I expecting it to constantly monitor the git log to figure out that I reverted something, or to explicitly prompt it that I reverted a commit whose contents it may not fully understand?

My Config

```toml
tool_output_token_limit = "25000"
model_auto_compact_token_limit = "233000"
web_search = "live"

[notice.model_migrations]
"gpt-5.2-codex" = "gpt-5.3-codex"

[features]
ghost_commit = false
unified_exec = true
shell_snapshot = true
```

AGENTS.md

# Global coding style (performance-first, low-indirection)

## Core defaults

- Prefer **simple, direct code** over abstractions.
- Minimize **indirection** (layers, services, classes) unless it materially improves reuse or correctness.
- If you introduce a new abstraction, state the concrete reuse/benefit in a short comment.

## Inlining vs reuse

- Prefer **inlining** when logic is not reused (or unlikely to be reused).
- Extract a helper only when at least one of these is true:
  - It is reused in 2+ places (or will be very soon).
  - It meaningfully improves correctness (single point of truth) or testability.
  - It reduces complexity in a way that clearly outweighs extra indirection.

## Data modeling: “fat structs” over fragmented designs

- Prefer **fat structs / cohesive data objects** that keep related data together.
- Avoid splitting a cohesive model into multiple classes/services just to enforce separation.
- Prefer passing a single cohesive struct/object over passing many loosely related parameters.
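The “fat struct” rule above can be sketched roughly as follows. This is a hypothetical illustration; `OrderLine` and `lineTotal` are made-up names, just to show one cohesive object standing in for a pile of loose parameters or several tiny classes:

```typescript
// Hypothetical sketch: one cohesive object holding related data,
// instead of many loosely related parameters or a split-up model.
interface OrderLine {
  sku: string;
  quantity: number;
  unitPrice: number;
  discount: number; // fraction: 0.1 means 10% off
}

// The whole struct travels together; no parameter soup, and no
// separate "pricing service" just to enforce separation.
function lineTotal(line: OrderLine): number {
  return line.quantity * line.unitPrice * (1 - line.discount);
}
```

Callers pass the one object around, so adding a field later touches the struct, not every signature in the call chain.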

## Collections & lookup strategy

- Prefer **arrays/vectors/slices** over dictionaries/maps when the data size is small or bounded.
- Prefer **linear search on small arrays** over hash maps (lower overhead, better locality).
- Prefer **array indexing** for enum/integer identifiers (dense keys).
- Even for string identifiers: if the dataset is small, prefer an **array of (key, value)** with string comparison over a dictionary/map (avoids pointer indirection / hashing overhead).
- Use dictionaries/maps when:
  - The dataset is large or unbounded, or
  - Lookups dominate and are performance-proven to require hashing, or
  - Keys are sparse/unpredictable and an array would be wasteful.
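A minimal sketch of the small-array lookup rule, assuming a tiny, bounded status table (the table and names here are hypothetical):

```typescript
// Hypothetical sketch: a small, bounded lookup table kept as a flat
// array of (key, value) pairs with linear search, instead of a Map.
// For a handful of entries the scan avoids hashing overhead and the
// data stays contiguous.
type Entry = { key: string; value: string };

const STATUS_LABELS: Entry[] = [
  { key: "open", value: "Open" },
  { key: "in_progress", value: "In Progress" },
  { key: "closed", value: "Closed" },
];

function statusLabel(key: string): string | undefined {
  // O(n) scan, but n is tiny and bounded.
  for (const e of STATUS_LABELS) {
    if (e.key === key) return e.value;
  }
  return undefined;
}
```

If this table ever grew large or unbounded, the rules above say to switch to a map at that point, not before.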

## Allocation discipline (default: allocation-free)

- Prefer **allocation-free** approaches whenever practical.
- Avoid per-iteration heap allocations in hot paths.
- Reuse buffers/containers; clear and reuse rather than reallocate.
- Prefer APIs that accept an output buffer or allow reserving capacity up front.
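The buffer-reuse rule can be sketched like this; `scratch` and `sumOfSquares` are hypothetical names, and the sketch assumes inputs fit in the preallocated buffer:

```typescript
// Hypothetical sketch: one scratch buffer reused across calls instead
// of allocating a fresh array on every invocation of a hot-path function.
const scratch = new Float64Array(1024); // allocated once, then reused

function sumOfSquares(values: number[]): number {
  const n = Math.min(values.length, scratch.length);
  // Write into the reused buffer rather than building a new array.
  for (let i = 0; i < n; i++) scratch[i] = values[i] * values[i];
  let total = 0;
  for (let i = 0; i < n; i++) total += scratch[i];
  return total;
}
```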

## Data transformations

- When transforming data, prefer allocating the **final result once** and writing into it directly.
- Avoid patterns like:
  - building an intermediate “request” object that duplicates fields,
  - then copying the same fields into a “result” object.
- If sizing is unknown, use a two-step approach:
  1. Compute the final size/count (cheap pass), then
  2. allocate once and fill.
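The two-step approach above can be sketched as follows; `evenSquares` is a hypothetical transformation, chosen only because its output size isn’t known up front:

```typescript
// Hypothetical sketch of the two-step pattern: a cheap counting pass to
// size the result, then a single allocation filled directly, with no
// intermediate objects duplicating fields along the way.
function evenSquares(values: number[]): Int32Array {
  // Pass 1: count how many outputs we will produce.
  let count = 0;
  for (const v of values) if (v % 2 === 0) count++;
  // Pass 2: allocate the final result once and write into it.
  const out = new Int32Array(count);
  let i = 0;
  for (const v of values) if (v % 2 === 0) out[i++] = v * v;
  return out;
}
```

Two cheap passes over the input usually beat one pass that repeatedly grows or copies the output.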

## When uncertain

- Default to the above rules; optimize for fewer allocations and less indirection.
- If there is a tradeoff, explain it briefly and choose the simplest design that meets performance goals.

End

That’s where I’m at right now: still skeptical in the ways that matter, but no longer skeptical that this is a real workflow shift.

In the next post (if any) I hope to share more about Peter Steinberger’s project OpenClaw, and how it’s helping to automate most of the mundane tasks in this theme and accelerate my workflow.
