
by Max Kanat-Alexander
Date: October 2023
AI agents are force multipliers that expose every structural flaw in your engineering organization. This summary explains why the best AI strategy is actually just world-class developer experience.
This episode answers: how can the principles of developer experience tell us what will stay valuable no matter what happens, and what do we need to fix to get the maximum possible value from AI agents?
AI agents are like high-performance race cars. Put them on the dirt road of a legacy codebase and they will go no faster than a minivan. Max Kanat-Alexander, a veteran of developer experience at Capital One, argues that the only way to win with AI is to fix the environment around it.
"Don't fight the training set."
"The agent cannot read your mind."
"Writing code has become reading code."
Podcast Link: Click here to listen

How's everybody doing? Still awake? Okay, great. So like the robot voice said, I have been doing developer experience for a very long time, and I have never in my life seen anything like the last 12 months. About every two to three weeks, software engineers have been making this face on the screen. And if you work in developer experience, the problem is even worse. You're like this guy on the screen every few weeks. You're like, "Oh yeah, yeah, yeah. Here's the new hotness." And then somebody else comes up and they're like, "Well, can I use the new new hotness?" People have been doing that for years. I've been working in developer experience for a long time. Somebody always shows up asking, "Can I use this tool that came out yesterday?" And you're like, "No, of course not." And now we're like, "Uh, maybe yes." What this leads to overall is that the future is super hard to predict right now.
So I think a lot of people, a lot of CTOs, a lot of people who work in developer experience and care about helping developers, are asking themselves this question: are all of my investments going to go to waste? What could I invest in now such that, when I look back at the end of 2026, I'll say, "I sure am glad I invested in that for my developers"? And I think a lot of people have just decided, "Well, I don't know. I guess it's just coding agents, and I guess they'll fix every single thing about my entire company by themselves." Which, look, they're amazing, they're transformative, but they're not the only thing you need to invest in as a software engineering organization.
We can clarify this by asking ourselves two questions. First: how can we use our understanding of the principles of developer experience to know what's going to be valuable no matter what happens? And second: what do we need to do to get the maximum possible value from AI agents? That is, what would we need to fix at all levels outside of the agents to make sure that the agents and our developers can be as effective as possible? This isn't a minor question. These are the sorts of things that could make or break you as a software business going into the future.
So let's talk about some of the things that I think are no-regrets investments, ones that will help both our human beings and our agents. In general, one of the framings I think about here is inputs to the agents: things around the agents that help them be more effective. One of the biggest is the development environment. What are the tools you use to build your code? What package manager do you use? What linters do you run? Those sorts of things.
You want to use the industry-standard tools in the same way the industry uses them, and ideally in the same way the outside world uses them, because that's what's in the training set. And look, yes, you can write instruction files and try your best to fight the training set, to make the agent do something unnatural and unholy with some crazy amalgamation or modification you've made of those developer tools. Maybe you invented your own package manager. You probably should not do that. You should probably undo it and go back to the way the outside world does software development, because then you are not fighting the training set.
It also means things like: you can't use obscure programming languages anymore. Look, I'm a programming language nerd. I love those things. I do not use them anymore in my day-to-day agentic software development work. As an enthusiast, I do still sometimes go and code in them, but not in my real work anymore.
People ask me sometimes: does that mean we're never going to have any new tools again, because we'll always be dependent on the tools the model already knows? Probably not, because like I said, there are still going to be enthusiasts. But I would like to make a point: the thing I'm talking about has always been a real problem. There's always some developer at the company who comes up to you and says, "Can I use this technology that came out last week and has never been vetted in an enterprise to run my 100,000-queries-per-second service that serves a billion users?" And I'm like, "No. You can't do that now, and you couldn't do it yesterday. It's still the same."
Another one: in order to take an action today, agents need either a CLI or an API for that action. Yes, there's computer use. You can make them write Playwright and orchestrate a browser. But why? If you could have a CLI that the agent can just execute in the format it understands most natively, which is text interaction, why would you choose to do something else, especially in an area where accuracy matters dramatically and where that accuracy dramatically influences the effectiveness of the agent?
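As a minimal illustration of that point, here is a sketch (the tool name, endpoint, and fields are hypothetical, not from the talk) of exposing an internal action as a plain CLI that an agent can invoke and whose output it can read as text:

```python
#!/usr/bin/env python3
"""deploy_status: a hypothetical CLI wrapper around an internal API.

The point is not this specific tool; it's that the agent gets a plain
text interface with explicit exit codes and readable error messages,
instead of having to drive a browser UI."""
import argparse
import json
import sys
import urllib.request


def main() -> int:
    parser = argparse.ArgumentParser(description="Show deployment status for a service.")
    parser.add_argument("service", help="name of the service to check")
    parser.add_argument("--env", default="staging", choices=["staging", "prod"])
    args = parser.parse_args()

    # Hypothetical internal endpoint; in a real org this would be your own API.
    url = f"https://deploys.internal.example.com/{args.env}/{args.service}"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            status = json.load(resp)
    except Exception as exc:
        # A clear, actionable error the agent can read and act on.
        print(f"ERROR: could not fetch status for '{args.service}' in {args.env}: {exc}",
              file=sys.stderr)
        return 1

    print(f"{args.service} ({args.env}): version={status['version']} healthy={status['healthy']}")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```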
One of the most important things you can invest in is validation. Any kind of objective, deterministic validation you give an agent will increase its capabilities. Yes, sometimes you can create this with the agent itself; I'll talk about that in a second. But it doesn't really matter how you get it or where you get it from. You just need to think about: how do I have high-quality validation that produces very clear error messages? This is the same thing you always wanted, by the way, in your tests and your linters. But it's even more important for agents, because an agent cannot divine what you mean by "500 Internal Server Error" with no other message. It needs a way to actually understand what the problem was and what it should do about it.
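As a rough sketch of what "clear error messages" means in practice (the field names and function are illustrative, not from the talk), compare an opaque failure with a validation function that tells the caller, human or agent, exactly what to fix:

```python
def validate_order(order: dict) -> None:
    """Raise ValueError with an actionable message instead of failing opaquely."""
    required = {"customer_id": str, "quantity": int}
    for field, expected_type in required.items():
        if field not in order:
            raise ValueError(
                f"order is missing required field '{field}'; "
                f"expected a value of type {expected_type.__name__}"
            )
        if not isinstance(order[field], expected_type):
            raise ValueError(
                f"order field '{field}' has type {type(order[field]).__name__}, "
                f"expected {expected_type.__name__}"
            )
    if order["quantity"] <= 0:
        raise ValueError(f"order 'quantity' must be positive, got {order['quantity']}")


# An agent can act on this:
#   ValueError: order field 'quantity' has type str, expected int
# It cannot act on a bare "500 Internal Server Error".
```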
However, there is a problem here. You think, okay, I'll just get the agent to do it; it'll write my tests and then I'll be fine. But have you ever asked an agent to write a test on a completely untestable codebase? It does something like what's happening on the screen here. It will write a test that says, "Hey boss, I pushed the button and the button pushed successfully. Test passed."
So there is a larger problem that a lot of enterprises in particular have, which is that there are many legacy codebases that either were not designed with testing in mind or were not designed with high-quality testing in mind. Maybe they just have some very high-level end-to-end tests, and they don't have great unit tests that the agent can run iteratively in a loop and that produce actionable and useful errors.
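For contrast, here is a small hypothetical example of the difference. The first test only proves "the button pushed successfully"; the second is the kind of fast, specific unit test an agent can run in a loop and learn from when it fails (the function under test is a stand-in, defined here only to keep the example runnable):

```python
# apply_discount is a stand-in function used only for illustration.
def apply_discount(order_total: float, coupon: str) -> float:
    return order_total * 0.9 if coupon == "SAVE10" else order_total


# Rubber-stamp test: passes as long as the call doesn't crash.
def test_apply_discount_runs():
    apply_discount(order_total=100.0, coupon="SAVE10")  # no assertion on the result


# Useful test: a specific expected value and a failure message that explains
# what went wrong, which an agent can act on in its next iteration.
def test_apply_discount_takes_ten_percent_off():
    result = apply_discount(order_total=100.0, coupon="SAVE10")
    assert result == 90.0, (
        f"SAVE10 should take 10% off a 100.0 order (expected 90.0), got {result}"
    )
```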
Another thing you can invest in that will be perennially valuable, both to humans and to agents, is the structure of your systems and the structure of your codebases. Agents work better on better-structured codebases. For those of you who have never worked in a large enterprise and seen very old legacy codebases, you might not be familiar with what I'm talking about. But for those who have, you know that there are codebases no human being could reason about successfully, because the information necessary to reason about the codebase isn't in the codebase, and its structure makes it impossible to reason about just by looking at it.
Yes, agents can do the same thing human beings do in that case, which is go through an iterative process of running the thing and seeing what breaks. But that decreases the capability of the agent enormously compared to being able to just look at the code and reason about it, the exact same way human capability is decreased. And of course, like I said, this all has to lead up to being testable. If the only thing I can do with your codebase is push a button and learn whether the button pushed successfully, without seeing the explosion behind it, if there's no way to get that information out of the codebase from the tests, then the agent isn't going to be able to get it either, unless it goes and refactors the code first, or you do.
And you know, there's a lot of talk about documentation. There has always been a lot of talk about documentation in the field of developer experience. People go back and forth about it. Engineers hate writing documentation, and its value is often debated: what kind of documentation do you or don't you want? But here's the thing, and let's take this purely in the context of the agent. The agent cannot read your mind. It did not attend your verbal meeting that had no transcript.
Now, there are many companies in the world that depend on that sort of tribal knowledge to understand what the requirements for the system are, why the code is being written, what specification we're writing towards, because those things are not written down. That sounds blatantly obvious, but a lot of things are already fundamentally written down: if the code is comprehensible, as it should be after all the other steps we've covered so far, you don't need to re-explain what's in the code. So there's probably a whole class of documentation we may not need anymore; you can just ask the agent, "Hey, tell me about the structure of this codebase overall," and it'll do it. But it will never be able to know why you wrote the code unless that's written down somewhere, and it can't know about things that happen outside the program. What is the shape of the data that comes in from this URL parameter, for example? If you have already written the code, there's a validator, and that does explain it. But if you haven't written the code yet, the agent doesn't know what comes in from the outside world.
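As one small, hypothetical illustration of "write it down where the agent can see it": the expected shape of externally supplied data can live in code as a typed structure plus a parser, which then serves as both validation and documentation (the endpoint and field names below are invented for the example):

```python
from dataclasses import dataclass


@dataclass
class SearchRequest:
    """Shape of the data arriving on a hypothetical /search endpoint.

    Writing this down in code documents what the outside world sends us,
    which neither a human nor an agent could otherwise infer from the codebase.
    """
    query: str           # free-text search string, never empty
    page: int = 1        # 1-based page index
    page_size: int = 20  # between 1 and 100


def parse_search_request(params: dict) -> SearchRequest:
    query = params.get("q", "").strip()
    if not query:
        raise ValueError("missing or empty 'q' parameter: a search query is required")
    page = int(params.get("page", 1))
    page_size = int(params.get("page_size", 20))
    if not 1 <= page_size <= 100:
        raise ValueError(f"'page_size' must be between 1 and 100, got {page_size}")
    return SearchRequest(query=query, page=page, page_size=page_size)
```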
So basically, anything that can't be in the code or isn't in the code needs to be written down somewhere the agent can access. Now, we've covered a few technical aspects of things we need to improve, but there's a point about software development in general that has always been true, and you've heard it: we spend more time reading code than writing it. The difference today is that writing code has become reading code. Even when we are "writing" code now, we spend more time reading it than actually typing things into the terminal. What that means is that every software engineer's primary job becomes code review.
In addition, as anybody who has worked in a shop that has deeply adopted agentic coding knows, we generate far more PRs than ever before, which has made code review itself, the big, formal kind, a bottleneck. So one of the things we need to do is figure out how to improve code review velocity, both for the big code reviews, where you send a PR and somebody writes comments on it and you go back and forth, and for the iterative process of working with the agent. How do you speed up a person's ability to look at code and know what to do with it?
The principles are pretty similar for both, but the exact way you implement them is a little different. What you care about most is making each individual response fast. You don't actually want to shorten the whole timeline of code review, because code review is a quality process. It's the same with agent iteration: what you want is to get to the right result. You don't want to say, "Well, I guess I've hit my five-minute time limit, so I'm going to check in this garbage that doesn't work." What you do want is for the iterations to be fast: not just the agent's iterations, but the human's response time to the agent. And for that to happen, people have to get very good at doing code reviews, at knowing what the next step is to do with a lot of code.
At the big code review level, one thing I see, and I think it's a sort of social disease that has infected a lot of companies, is that when people want PR reviews, they just send a Slack message to a team channel and say, "Hey, could one of the ten of you review my PR?" And you know what that means: one person does all of those reviews. That's what really happens. When you look at the code review stats of teams like that, there's one person with fifty reviews and the others have three, two, five, seven, because there's just one person who is super responsive. But what that means is that if you start generating dramatically more PRs, that one person cannot handle the load. You have to distribute it, and really the only way to distribute it is to assign reviews to specific individuals, have a system that distributes them among those individuals, and then set SLOs with some mechanism of enforcement.
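A minimal sketch of what "a system that distributes reviews" could look like, assuming nothing more than a list of teammates and a PR number (the names, pool, and SLO value here are hypothetical, not a specific tool):

```python
from datetime import datetime, timedelta, timezone

TEAM = ["alice", "bob", "carol", "dan"]  # hypothetical reviewer pool
REVIEW_SLO = timedelta(hours=4)          # hypothetical response-time SLO


def assign_reviewer(pr_number: int, team: list[str] = TEAM) -> str:
    """Deterministically spread PRs across the team instead of relying on
    whoever happens to answer the Slack channel first."""
    return team[pr_number % len(team)]


def slo_breached(assigned_at: datetime, now: datetime | None = None) -> bool:
    """True if the assigned reviewer has been silent longer than the SLO,
    i.e. the point at which some escalation mechanism should kick in."""
    now = now or datetime.now(timezone.utc)
    return now - assigned_at > REVIEW_SLO


# Example: PR 1042 goes to a specific person, not "one of the ten of you".
print(assign_reviewer(1042))  # -> "carol"
```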
Another thing that GitHub, for example, is not very good at today is making it clear whose turn it is to take action. I left a bunch of comments on your PR. You responded to one of my comments. Should I come back now? Oh wait, now you pushed a new change. Should I come back now? No, now you've responded to more comments. What I rely on mostly is people telling me in Slack, "I'm ready for you to review my PR again," which is a terrible and inefficient system.
Another thing you have to think about a lot is the quality of code reviews. I mean this, once again, both for the individual developers working with the agent and for the people in the code review pipeline. You have to keep holding a high bar. I know people have other opinions about this, and yes, depending on how long you expect your software to live, you might not need as much software design. Look, the goal of software design is not perfection; the goal is good enough, and better than what you had before. But sometimes "good enough" for a very long-lived system is a much higher bar than people expect. And if you don't have a process capable of rejecting things that shouldn't go in, you will very likely see decreasing productivity gains from your agentic coders over time, as the system becomes harder and harder for both the agent and the human to work with.
The problem is this: in many companies, the people who are the best code reviewers spend none of their time doing code review. They spend all their time in meetings, doing high-level reviews, doing strategy. And so we aren't teaching junior engineers to be better software engineers and better code reviewers. We have to have some mechanism that lets the people who are best at this teach it through apprenticeship. If somebody has a better way of doing this than doing code reviews with people, I would love to know, because in the 20-plus years I've been doing this, I have never found a way to teach people to be good code reviewers other than doing good code reviews with them.
Now, if you don't do the things I've talked about, what is the danger? The danger is that you take a bad codebase with a confusing environment and you give it to an agent, or to a developer working with that agent. The agent produces some level of nonsense, the developer experiences some level of frustration, and depending on how persistent they are, at some point they give up and just send their PR off for review: "I think it works." Then, if you have low-quality code reviews or code reviewers who are overwhelmed, they go, "I don't know what to do with this. I guess it's okay." And you get lots and lots of bad rubber-stamp PRs going in, and you fall into a vicious cycle. My prediction is that if you are in this cycle, your agent productivity will decrease consistently through the year.
On the other hand, we live in an amazing time where, if we increase the ability of the agents to help us, we get into a virtuous cycle instead, where we accelerate more and more. And yes, some of these things sound like very expensive, fundamental investments, but I think now is the time to make them, because now is when you will see the biggest differentiation in your business, in terms of software engineering velocity, between companies that can do these things and companies that structurally can't.
Key Takeaways:
If you look at all of these things, there's one lesson, one principle, that we can take away from them, and it covers even more than what I've talked about here: what's good for humans is good for AI. The great thing about this is that it means that when we invest in these things, we will help our developers no matter what. Even if we sometimes miss on helping the agent, we are guaranteed to help the humans. Thank you very much.