AI Engineer
December 26, 2025

How Claude Code Works - Jared Zoneraich, PromptLayer

How Claude Code Kills the Scaffolding

Jared Zoneraich, PromptLayer


This summary is for engineers and founders building agentic systems. It explains why the shift from rigid logic to simple loops is behind the current explosion in AI coding agents.

  • Why did autonomous coding agents suddenly stop sucking?
  • Why is Bash the only tool your agent actually needs?
  • How do sub-agents solve the "stupid model" context problem?

Jared Zoneraich, founder of PromptLayer, explains the architectural shift behind Claude Code. He argues that we are moving away from over-engineered scaffolding toward a "less is more" philosophy.

The Death of the DAG

"Give it tools and then get out of the way."
  • Simple Master Loops: The winning architecture is a basic while loop that calls tools until the task is done. This replaces hundreds of rigid nodes with model flexibility.
  • Model Trust: Modern models are now good enough to follow instructions without deterministic enforcement. You can stop building complex guardrails and let the model explore.
  • Scaffolding Obsolescence: Any complex code you write to fix current model flaws will be useless in six months. Focus on the core loop instead of temporary fixes.

The Bash Supremacy

"Bash is the universal adapter."
  • Tool Minimalism: Claude Code relies on a few core tools like read and edit while centering on the shell. This allows the model to create and run its own scripts.
  • Training Data: Models are best at Bash because it is the most common human terminal language. Using standard shell commands maximizes the model's inherent reasoning.

Context is the Enemy

"The longer the context, the stupider the agent."
  • Sub-agent Forks: Claude Code uses tasks to spin off specific problems into separate contexts. This keeps the main reasoning loop clean and sharp.
  • Context Compression: Systems now summarize or drop middle tokens when capacity hits 90 percent. Managing what the model "remembers" is the primary engineering challenge.

Actionable Takeaways

  • The Macro Shift: the transition from prompt engineering to context engineering, where the goal is keeping the model's workspace small and relevant.
  • The Tactical Edge: Replace your complex classification prompts with a single Bash tool. Let the agent write its own Python scripts to handle data transformations.
  • The Bottom Line: The winners in the agent space will not be those with the most complex logic. They will be the ones who build the best tools for the model to use.


So, welcome to the last workshop. You made it. Congrats. Out of like 800 people, you're the last ones standing, the very dedicated engineers. So this one's a weird one. I got in trouble with Anthropic on this one, obviously because of the title. I actually gave him the title and asked if he wanted to change it, and he said no, he'd just roll with it. It's kind of funny. So yeah, this is not officially endorsed by Anthropic, but we're hackers, right? And Jared is super dedicated. The other thing I really enjoy is featuring notable New York AI people. So don't take this as if this talk is the only thing Jared does; he has a whole startup that you should definitely ask him about. I'm just really excited to feature more content from local people. So yeah, Jared, take it away. Thank you very much.

[Jared]: Thank you very much. And what an amazing conference. Very sad we're ending it, but hopefully this will be a good ending. My name is Jared, and this will be a talk on how Claude Code works. Again, not affiliated with Anthropic. They don't pay me. I would take money, but they don't. We're going to talk about a few other coding agents as well. The high-level goal is this: I'm personally a big user of all the coding agents, as is everyone here. They exploded recently, and as a developer I was curious what changed, what finally made coding agents good. So let's get started.

[Jared]: I'll start with me. I'm Jared; you can find me as Jared Z on X, on Twitter, whatever. I'm building the workbench for AI engineering. My company is called PromptLayer. We're based in New York; you can see our office here, a little building blocked by a few of the other buildings. We're a small team. We launched the product three years ago, which is a long time in AI but a short time for everything else. Our core thesis is that we believe in rigorous prompt engineering and rigorous agent development, and we believe the product team should be involved alongside the engineering team. If you're building AI lawyers, you should have lawyers involved as well as engineers. That's what we do, processing millions of LLM requests a day.

[Jared]: A lot of the insights in this talk come from conversations we have with our customers about how to build coding agents and things like that. And feel free to keep this casual: if you have a question about anything I say, just throw it in. I spend a lot of my time dogfooding the product. The job of a founder is weird these days: it's half kicking off agents and half using my own product to build agents. It feels weird, but it's kind of fun. The last thing I'll add here is that I'm a big enthusiast. We literally rebuilt our engineering org around Claude Code. The hard part about building a platform is that you have to deal with all these edge cases, like "we're uploading data sets here and it doesn't work," and you can die a death by a thousand cuts. So we made a rule for our engineering organization: if you can complete something in less than an hour using Claude Code, just do it. Don't put it through prioritization. We're a small team on purpose, but this has helped us a lot, and I think it's really taken us to the next level. So I'm a big fan, and let's dive into how these things work.

[Jared]: So this, as I was saying, is the goal of the talk. First, why have these things exploded? What was the innovation, the invention, that made coding agents finally work? If you've been around this field for a bit, you know that a lot of these autonomous coding agents sucked at the beginning, and we all tried to use them. Now it's night and day. Then we'll dive into the internals, and lastly, everything in this talk is oriented around how you build your own agents and how you use this to do AI engineering yourself. So let's talk about history for a second. How did we get here? Everybody remembers it started with the workflow of copying and pasting your code back and forth with ChatGPT. That was great, and kind of revolutionary when it happened.

[Jared]: Step two: when Cursor came out, if we all remember, it was not great software at the beginning. It was just a VS Code fork with Command-K, and we all loved it. But we're not going to be doing Command-K anymore. Then we got the Cursor assistant, that little agent back-and-forth, and then Claude Code. And honestly, in the few days since I made this slide, maybe there's a new version we could talk about; at the end I'll get to what's next. But this is how we got here, and Claude Code is this headless, new workflow of not even touching code. For that it has to be really good. So why is it so good? What was the big breakthrough here? Let's try to figure that out.

[Jared]: And again, I'll throw this in one more time: these are all my opinions about what the breakthrough is. Maybe there are other things, but I see two: simple architecture, and then better models, better models, and better models. A lot of the breakthrough is kind of boring in that it's just Anthropic releasing a better model that works better for these kinds of tool calls. But the simple architecture relates to that, so we can dive into it. You'll see the Prompt Wrangler, our company's little mascot; we made a lot of graphics for these slides. Basically, "give it tools and then get out of the way" is the one-liner for the architecture today.

[Jared]: If you've been building on top of LLMs for a while, this has not always been true. Tool calls haven't always existed; they're a newish abstraction over JSON formatting, if you remember GitHub libraries like Jsonformer in the olden days. But give it tools and get out of the way. The models are built for this and are being trained to get better and better at tool calling. The more you want to overoptimize (and every engineer, especially myself, loves to overoptimize), the worse off you are. When you first have an idea for an agent, you'll sit down and say, "I'm going to prevent this hallucination with this prompt, and then this prompt, and then this prompt." Don't do that. Just write a simple loop and get out of the way. Delete scaffolding. "Less scaffolding, more model" is the tagline here. And this is the leaderboard from this week; obviously, these models are getting better and better.

[Jared]: We could have a whole conversation, and I'm sure there have been many, about whether it's slowing down or plateauing. It doesn't really matter for this talk. We know models are getting better, better at tool calling, and better optimized for running autonomously. I think Anthropic calls this the "AGI-pilled" way to think about it: don't try to overengineer around model flaws today, because a lot of them will just get better and you'll have wasted your time. So here's the philosophy of Claude Code, the way I see it: ignore embeddings, ignore classifiers, ignore pattern matching. We had this whole RAG thing (actually, Cursor is bringing back a little bit of RAG, mixing and matching), but I think the genius of Claude Code is that they scrapped all of it and said: we don't need all these fancy paradigms to work around where the model is bad. Let's just make a better model and let it cook, leaning on tool calls and simplifying the tool calls, which is a very important part. Instead of a workflow where the master prompt can branch into three different branches and then four more, there are really just a few simple tool calls, including grep instead of RAG. And that's what the model is trained on; these are very optimized tool-calling models.

[Jared]: This is the Zen of Python, if you're familiar with it, if you do import this in Python. I love this philosophy when it comes to building systems, and I think it's really apt for how Claude Code was built. Simple is better than complex, complex is better than complicated, flat is better than nested. This is the whole talk; this is all you need to know about how Claude Code works, and why it works. We're going back to engineering principles: simple design is better design. That's true when you're building a database schema, and it's also true when you're building autonomous coding agents. So now I'm going to break down the specific parts of this coding agent and why I think they're interesting.

[Jared]: The first is the constitution. A lot of this stuff we now take for granted, even though it only started a few months ago. This is the CLAUDE.md; Codex and others use AGENTS.md. I assume most of you know what it is: it's where you put the instructions for your repo. The interesting thing about it is that it's basically the team saying we don't need to overengineer a system where the model first researches the repo. Cursor 1.0, as you know, builds a vector DB locally to understand the repo and does all this research. Anthropic just said: put a markdown file there. Let the user change it when they need to; let the agent change it when it needs to. Very simple. It goes back to prompt engineering, which I'm a little biased toward because PromptLayer is a prompt engineering platform, but everything is prompt engineering at the end of the day, or context engineering. Everything is: how do you adapt these general-purpose models for your usage? And the simplest answer is the best one here, I think.
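As a concrete illustration, a minimal CLAUDE.md might look like the sketch below. The file name is the real convention; the contents are invented for this example:

```markdown
# CLAUDE.md

## Project
Monorepo: `api/` (Python service), `web/` (TypeScript frontend).

## Commands
- Tests: `make test` (run before claiming a task is done)
- Lint: `make lint`

## Conventions
- Reuse the helpers in `api/errors.py`; don't invent new error types.
```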

[Jared]: This is the core of the system: a simple master loop. And this is actually kind of revolutionary considering how we used to build agents. Everything in Claude Code, and in all the coding agents today (Codex, the new Cursor, AMP, and so on), is just one while loop with tool calls: run the master while loop, call the tools, go back to the master while loop. It's basically four lines. I think they call it nO internally, at least based on my research. While there are tool calls: run the tool, give the tool results to the model, and do it again until there are no tool calls, then ask the user what to do. The first time I used tool calls, it was shocking to me how good the models are at knowing when to keep calling a tool and when to fix their own mistake. That's one of the most interesting things about LLMs: they're really good at fixing mistakes and being flexible. And going back to the theme, the more you lean on the model to explore and figure things out, the better and more robust your system will be as models improve.
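A minimal sketch of that loop, assuming a generic chat-completions-style client; the client object, message shapes, and tool registry here are illustrative, not Claude Code's actual internals:

```python
def master_loop(client, tool_schemas, tool_fns, messages):
    """Run tools until the model stops asking for them, then yield to the user."""
    while True:
        response = client.chat(messages=messages, tools=tool_schemas)
        messages.append(response.message)        # keep the assistant turn
        if not response.tool_calls:              # no tool calls left: task is done
            return response.text                 # hand control back to the user
        for call in response.tool_calls:
            result = tool_fns[call.name](**call.arguments)   # run the tool
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": str(result)})        # feed results back in
```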

[Jared]: These are the core tools in Claude Code today. To be honest, these change every day; they do new releases every few days, but these are the ones I found most interesting to talk about. There could be 15 tomorrow, or it could be down to five. First: read. Yes, they could just do a cat, but we have token limits. If you've used Claude Code a lot, you've seen it sometimes say a file is too big. That's why it's worth building a dedicated read tool. Grep and glob: these are very interesting too, because they go against a lot of the wisdom at the time of using RAG and vectors. I'm not saying RAG has no place, by the way. But in these general-purpose agents, grep is good, and grep is how a human would do it. And that's actually the high-level point here: as I talk about these tools, remember these are all human tasks. We're not making up a brand-new tool for the model to use; we're mimicking the human actions you and I would take at a terminal trying to fix a problem.

[Jared]: Edit. Edit makes sense. The interesting thing to note is that it uses diffs and is not rewriting files most of the time. That's way faster, uses way less context, and causes way fewer issues. If I gave you these slides and asked you to review them, writing out every slide from scratch with your revisions is much harder than just crossing things out on the page. Diffs are a natural way to prevent mistakes. Bash. Bash is the core thing here; you could probably get rid of all the other tools and only have bash. The first time you see Claude Code create a Python file, run the Python file, and then delete the Python file, that's the beauty of why this thing works. So bash is the most important.

[Jared]: Then web search and web fetch. The interesting thing about these is that they move the work to a cheaper, faster model. For example, if you're building an agent on your platform that needs to connect to some list of endpoints, it might be worth pushing that into a sub-tier rather than into the master while loop. That's why this is its own tool. To-dos: we've all seen to-dos. I'll talk about them more later, but they're about keeping the model on track, steerability. And then tasks. Tasks are very interesting: they're context management. How do we run a long process or read a whole file without cluttering the context? Because the biggest enemy here is a full context: when your context is full, the model gets stupid, for lack of a better word. So basically, bash is all you need.
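To make "bash is all you need" concrete, here is what a single shell tool looks like when declared to a chat API. The schema shape is the generic tools format most chat APIs accept; Claude Code's real definitions aren't public in this form, so the names and fields below are an assumption:

```python
BASH_TOOL = {
    "name": "bash",
    "description": "Run a shell command in the sandbox; returns stdout and stderr.",
    "input_schema": {
        "type": "object",
        "properties": {
            "command": {"type": "string", "description": "The command to execute"},
            "timeout_ms": {"type": "integer", "description": "Optional kill timer"},
        },
        "required": ["command"],
    },
}
```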

[Jared]: This is the one thing I want to drill down on. There are two amazing things about bash for coding agents. The first is that it's simple and it does everything; it's very robust. The second, equally important, is that there's so much training data on it, because bash is what we humans use. It's the same reason models are worse at Rust and other less common programming languages: fewer people write them. So bash is really the universal adapter. There are thousands of tools; you can do anything. This is that Python example I gave: I always find it so cool when it does the Python-script thing, or creates tests, which I always have to tell it not to. All the shell tools are there. I find myself using Claude Code to spin up local environments where normally I'd have five commands written down in some file somewhere that go out of date. It's really good at figuring this stuff out and running what you'd want to run, and it specifically lets the model try things.
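The create-run-delete pattern Jared keeps pointing at, spelled out as a toy illustration of what the agent does through its bash tool (this is not Claude Code's code, just the shape of the behavior):

```python
import os
import subprocess
import tempfile

# The agent writes a throwaway script for a one-off computation...
script = "print(sum(range(10)))"
path = os.path.join(tempfile.gettempdir(), "scratch.py")
with open(path, "w") as f:
    f.write(script)

# ...runs it through the shell and captures the output...
out = subprocess.run(["python", path], capture_output=True, text=True).stdout
print(out.strip())  # -> 45

# ...and deletes it so the workspace stays clean.
os.remove(path)
```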

[Jared]: The other part is tool-usage guidance. There's a bit of system prompt that tells it which tool to use and when, and this changes a lot, but these are the edge cases and corners the model gets stuck in. Reading before editing: they actually force that. Using the grep tool instead of grep through bash: if you look at the tool list, there's a dedicated grep tool, and there could be a lot of reasons for that. I think security and sandboxing are big ones, plus the token-limit thing. Running independent operations in parallel, so pushing the model to do that more. And trivial things like quoting paths with spaces. It's just the common stuff. I'm sure they dogfood a lot at Anthropic, find these issues, and say, "All right, we'll throw it in the system prompt."

[Jared]: Okay, let's talk about to-do lists. Again, a very common thing now, but not a common thing before. This is actually, I think, a to-do list from some of my research for this slide deck. The really interesting thing about to-do lists is that they're structured but not structurally enforced. So, here are the rules:

  • One task at a time.
  • Mark them completed.
  • Keep working on the in-progress task if there are blockers or errors, and break tasks up into discrete instructions.

[Jared]: But the most interesting thing to me is that it's not enforced deterministically. It's purely prompt-based, purely in the system prompt, purely because our models are just good at instruction following now. This would not have worked a year ago. It would not have worked two years ago. There are tool descriptions at the top of the system prompt, and the to-dos get injected into the system prompt, but none of it is enforced in actual code. Maybe there are other agents that take the opposite path. I just found it interesting that this makes a big difference for the user, and yet it seems to have been very simple to implement, almost a weekend project someone did that happened to work. I could be wrong about that as well. But yes, it's literally a function call: the first time you ask for something, the reasoning emits this to-do block, and I'll show you the structure on the next slide.

[Jared]: There are IDs in there, some structured schema and determinism, but it's just injected into context. Here's an example of what it could look like. You get a version, an ID, a title for the to-do, and it can attach evidence, a seemingly arbitrary blob of data it can use. The IDs are hashes it can refer back to; the title is something human-readable. It's just another way to structure the data (see the sketch after the list below). In the same way you'd organize your desk before working, this is how we organize the model. I think these are the four benefits we're getting:

  • We're forcing it to plan.
  • We get to resume after crashes or Claude Code failures.
  • UX is a big part of this. As a user, you know how it's going; it's not just running off in a loop for 40 minutes without any signal to you. UX is non-negligible: even if it doesn't make the coding agent better, it makes the agent better for us to use.
  • Steerability: the list keeps the model on track and gives you something to redirect.
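Here is a hypothetical reconstruction of the to-do block described above, using the fields from the talk (version, ID, title, evidence). The exact names and values are guesses for illustration, not a dump of Claude Code's real schema:

```python
todo_block = {
    "version": 3,
    "todos": [
        {
            "id": "a1f9c2",            # hash the model can refer back to
            "title": "Reproduce the failing auth test",  # human-readable
            "status": "in_progress",   # one task in progress at a time
            "evidence": "pytest: 2 failed in tests/test_auth.py",
        },
        {"id": "7be04d", "title": "Fix the fixture setup", "status": "pending"},
    ],
}
```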

[Jared]: Here are two other parts under the hood. The async buffer, which they call h2A: it's the I/O process and how it's decoupled from reasoning, how to manage context so you're not just stuffing everything you see in the terminal back into the model. Again, context is our biggest enemy here; it makes the model stupider. So we need to be a little smart about compaction and summarization. Here you see that when it reaches capacity, it drops the middle and summarizes the head and tail; that's the context compressor. What's the limit? 92 percent, it seems, something like that. And how does it handle long-term storage? That's actually another advantage of bash, in my opinion, and of having a sandbox. I'd even make a prediction: all your ChatGPT windows, all your Claude windows, are going to come with a sandbox in the near future. It's just so much better, because you can store long-term memory there. I do this all the time: I have Claude Code skills for deep research and the like, and I'm always instructing it to save markdown files, because the shorter the context, the quicker and smarter it is.
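A sketch of that compaction step under stated assumptions: keep the head and tail, summarize the middle once usage crosses the threshold. The ~92 percent figure is from the talk; the split points and the summarize call are invented, and it assumes a transcript long enough to have a middle:

```python
def compact(messages, count_tokens, limit, summarize, threshold=0.92):
    """Drop/summarize the middle of the transcript once it nears the limit."""
    if count_tokens(messages) < threshold * limit:
        return messages                          # plenty of room: no-op
    head, middle, tail = messages[:2], messages[2:-10], messages[-10:]
    summary = {"role": "user",
               "content": "Summary of earlier work: " + summarize(middle)}
    return head + [summary] + tail               # head + compressed middle + tail
```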

[Jared]: This is what I'm most excited about: we don't need DAGs like this anymore. I'll give you a real example. Users at PromptLayer building different agents, like customer support agents, were basically all building DAGs like this for the last two, two and a half years. And it was crazy: hundreds of nodes of "if this user wants a refund, route them to this prompt; if they want that, route them over there," with a lot of classifier prompts. The advantage of a DAG is that you can more or less guarantee there won't be hallucinations, or that refunds won't go to people who shouldn't get them. It also largely solves the prompt injection problem: if you're in a prompt that purely classifies into X or Y, injecting doesn't really matter, especially if you throw out the context. With loops we bring that attack vector back, but the major benefit is that we don't have to deal with this web of engineering madness. It's 10x easier to develop these things, 10x more maintainable, and it actually works way better, because our models are just good now. So this is the takeaway: rely on the model. When in doubt, don't try to think through every edge case and every if statement. Just rely on the model to explore and figure it out.

[Jared]: Sometime this week I was running an experiment on our dashboard, trying out browser agents. I wanted to see if adding little titles to all our buttons would help the agent navigate our website automatically. Surprisingly, it made things worse. Maybe I could run it again, and maybe I did something wrong with the test, but the agent navigated PromptLayer worse because it was getting distracted: I was telling it "you have to click this button, then this button," and it didn't know what to do. So it's better to rely on exploration.

[Speaker]: You have a question? Yeah, I'll push back a little bit, please. You'll admit that any scaffolding we create today to resolve the idiosyncrasies of current limitations will be obsolete in three to six months. But even if that's the case, it helps a little bit today. How do you balance that wasted engineering against solving a problem we only have for three months?

[Jared]: It's a great question. Just to repeat it: what is the trade-off between solving the actual problems we have today versus relying on a model that can't do it yet but will be able to in three months, right?

[Jared]: It's case by case; it depends what you're building. If you're building a chatbot for a bank, you probably do want to be more careful. To me, the happy middle ground is to use this agent paradigm of a master while loop and tool calls, but make your tool calls very rigorous. I think it's okay to have a tool call that looks like this DAG, or half of it, in the same way that Claude Code uses read or grep as a tool call. So for the edge cases, put the logic in a structured tool that you can eval and version; I'll talk a little more about that later. But for everything else, for the exploration phase, leave it to the model or add some system prompt. So it's a trade-off and very use-case dependent, but it's a good question.
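One way to make that "rigorous tool call" concrete: the risky action goes through a strictly validated tool you can eval and version, while exploration stays in the loose loop. A sketch assuming Pydantic v2 and an invented refund example; none of these names come from the talk:

```python
from pydantic import BaseModel, Field

class RefundRequest(BaseModel):
    order_id: str = Field(pattern=r"^ord_[a-z0-9]{10}$")
    amount_cents: int = Field(gt=0, le=50_000)   # hard ceiling enforced in code
    reason: str

def issue_refund(raw_args: dict) -> str:
    """Tool entry point: validates strictly, so the model can't over-refund."""
    req = RefundRequest(**raw_args)              # raises on anything malformed
    # ... call the payments API here ...
    return f"refunded {req.amount_cents} cents on {req.order_id}"
```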

[Jared]: Thank you. So, back to Claude Code. We're getting rid of all this stuff. We don't want ML-based intent detection. We don't want ReAct; I mean, it uses ReAct a little, but we don't want ReAct baked into the scaffolding. We don't want classifiers. For a long time we actually built a product at PromptLayer, never released beyond a prototype, for using an ML-based, non-LLM classifier in your prompt pipeline instead of LLMs. A lot of people had success with that approach, but it feels more and more like it won't be that helpful unless cost is a huge concern for you. And even then, the cost advantage of smaller models keeps shrinking as the financial engineering between all these companies pays for our tokens.

[Jared]: Claude also does this smart thing with trigger phrases: you have think, think hard, think harder, and ultrathink, which is my favorite. This lets the reasoning token budget be another parameter that can be adjusted, and the phrases are how we force that adjustment. Alternatively, you could make a tool call for hard planning (some coding agents do this), or let the user specify it and change it on the fly. Next, one of the biggest topics here: sandboxing and permissions. I'll be completely honest, it's the most boring part to me, because I just run YOLO mode half the time. Some people on our team actually dropped their local databases, so you do have to be careful; we obviously don't run YOLO mode with our enterprise customers. It feels like this stuff is going to be solved, but we do need to know how it works a little.

[Jared]: There's a big issue of prompt injection from the internet. If you're connecting an agent that has shell access and doing web fetch, that's a pretty big attack vector. So there's some containerization of that, and there's URL blocking; you can see Claude Code is pretty annoying about "can I fetch from this URL, can I do this," and it puts fetches into a sub-agent. Most of the complex code in Claude Code is in this sandboxing and permission layer. There's a whole pipeline to gate bash commands: the command's prefix determines how it's routed through the sandboxing environment. A lot of other agents work differently here, but this is how Claude Code does it; I'll mention the others at the end.
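A toy sketch of prefix-based gating as described above: read-only prefixes run directly, while anything dangerous or unrecognized asks the user. The prefix lists and policy names are invented for illustration, not Claude Code's actual rules:

```python
SAFE_PREFIXES = ("ls", "cat", "grep", "git status", "git diff")
ALWAYS_ASK = ("rm", "curl", "wget", "git push")

def gate(command: str) -> str:
    cmd = command.strip()
    if cmd.startswith(ALWAYS_ASK):
        return "ask_user"    # destructive or network-touching: confirm first
    if cmd.startswith(SAFE_PREFIXES):
        return "allow"       # read-only: run in the sandbox directly
    return "ask_user"        # default-deny anything unrecognized
```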

[Jared]: The next topic of relevance is sub-agents. This goes back to context management and the problem we keep returning to: the longer the context, the stupider our agent. Sub-agents are an answer to it. You use sub-agents for specific tasks, and the key is that a sub-agent has its own context and feeds back only the results; that's how you avoid clutter. Here are four examples: researcher, docs reader, test runner, code reviewer. In that earlier example, when I added all the tags to our website to help the agent navigate, I obviously used a coding agent to do it, and I said "read our docs first and then do it." It does that in a sub-agent and feeds the information back. The key thing here is the forking of the agent and how results get aggregated back into the main context.

[Jared]: Here's an example, and I think it's actually very interesting; I want to call out a thing or two. Task is what invokes a sub-agent. We give Task two things: a description and a prompt. The description is what the user sees, something like "find default chat context instantiation." The prompt is a long string, which is really interesting because now the coding agent is prompting its own agents. I've actually used this paradigm in agents I've built for our product. The agent can stuff as much information as it wants into that string, and, going back to relying on the model, if the task returns an error, it can stuff in even more information and let the sub-agent solve the problem. It's better to be flexible rather than rigid. If I were building this, I'd consider switching the string to an object, depending on what you're building, to let it pass more structured data.
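What that tool call might look like on the wire, with the two fields from the talk; the values below are invented:

```python
task_call = {
    "name": "Task",
    "arguments": {
        # short, shown to the user in the UI
        "description": "Find default chat context instantiation",
        # long, becomes the sub-agent's own prompt in a fresh context
        "prompt": (
            "Search the repo for where the default chat context is created. "
            "Check factory functions and DI wiring, then report back only "
            "the file paths and the relevant snippets."
        ),
    },
}
```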

[Speaker]: Yes. I can see this prompt is quite a few sentences. Is that in the main agent? Is it taking the context of the main agent, or is there some intermediate step where the sub-agent reads over what the main agent is doing and then generates it?

[Jared]: So the question is: does the task just get the prompt here, or does it also get your chat history? Or: is all of this in the system prompt of the main agent, to inform how it prompts the sub-agent? No, it's not in the system prompt. The tool-call structure of what a Task is lives in the main agent's context, and the description and prompt are generated on the fly. As the model wants to run a task, it generates the description and the prompt. Task is a tool call; tasks can run in parallel, and they return their results. Hopefully that helps.

[Jared]: So, back to the system prompt. There have been some leaks of the Claude Code system prompt, which is what I'm basing this on; you can find it online. Here are some things I noted from it. Concise outputs: don't give anything too long, no "here is..." or "I will...," just do the task the user wants. Pushing it to use tools rather than text explanations: when we've all built coding agents, the model usually says "hey, I want to run this SQL"; no, push it to use the tool. Matching the existing code style, and not adding comments (this one does not work for me). Running commands in parallel extensively, and then the to-dos and so on. There's a lot you can nudge with the system prompt. And I think there's a really interesting point here for the earlier question about the trade-off between DAGs and loops: a lot of these feel like they came from someone using Claude Code and saying, "if only it did this a little less, or that a little more." That's where prompting comes in, because it's so easy to iterate and it's not a hard requirement; it's okay if it says "here is" sometimes. All right, skills.

[Jared]: Skills are great. They're slightly newer; honestly, I only got convinced of them recently. So good. I built these slides with skills. In the context of this talk about architecture, think of a skill as an extendable system prompt. In the same way we don't want to clutter the context, there are many types of tasks where you want a lot more context available on demand, and skills are how we give Claude Code options to tap into more information. Some examples: I have a skill for docs updates that tells it my writing style and my product, so if I want a docs update, I say use that skill, load it in. Editing Microsoft Office files, Word and Excel: I don't use this one, but I've seen a lot of people using it; it kind of decompiles the file, it's really cool. A design style guide is a common one. Deep research: the other day I threw in an article or GitHub repo on how deep research works and said "rebuild this as a Claude Code skill." It works so well, it's amazing.
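For reference, a skill is a folder containing a SKILL.md whose frontmatter description is what the main agent sees when deciding whether to load the full skill. The layout below follows Anthropic's documented convention; the path and contents are invented for illustration:

```markdown
<!-- .claude/skills/docs-updates/SKILL.md -->
---
name: docs-updates
description: Use when editing anything under docs/. Covers writing style and product terminology.
---

Write in second person, present tense. The product name is "PromptLayer", one word.
After edits, run `make docs-preview` and check for broken links.
```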

[Jared]: Unified diffing, I think, is worth its own slide. It's pretty obvious, so there's not too much to say, but it makes everything so much better: it keeps outputs short relative to the token limit, it's faster, and it's less prone to mistakes, like that example of rewriting an essay versus marking it up with a red pen. It's just better. I highly recommend using diffing in any agents you build. Unified diff is a standard; when I looked into these coding agents, some actually built their own slight variations on unified diff, because you don't always need the line numbers, but plain unified diff works.
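Generating one is a stdlib one-liner in Python; this runnable snippet shows the format the talk recommends:

```python
import difflib

old = ["def greet(name):\n", "    print('hi ' + name)\n"]
new = ["def greet(name):\n", "    print(f'hi {name}')\n"]

# Emits standard unified-diff hunks: ---/+++ headers, @@ ranges, +/- lines.
for line in difflib.unified_diff(old, new, fromfile="a/greet.py",
                                 tofile="b/greet.py"):
    print(line, end="")
```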

[Speaker]: A question, to go back to skills. I don't know if anyone's seen it, but Claude Code warns you in yellow text if your CLAUDE.md is greater than 40k characters. So I thought, okay, let me break this down into skills. I spent some time on it, and then Claude ignored all of my skills, so I put them back. What am I missing? Skills feel globally misunderstood, or maybe I'm missing something. Help me understand.

[Jared]: Yeah. So the question was: the CLAUDE.md tells you when it's too long, you move things into skills, and then Claude doesn't recognize the skills and doesn't pick them up when needed. Take that up with the Anthropic team, I'd say. But that may also be the intention of the system prompt: skills need to be invoked, and the agent itself shouldn't just load them all the time, right? It does give a description of each skill to the model; it tells it, okay, here's a one-liner about each skill.
