
By Thariq Shihipar
Date: November 2025
This summary is for builders moving past chat wrappers into autonomous systems that execute complex tasks. Thariq Shihipar explains why the terminal is the ultimate agentic interface and how to build loops that actually work.
Thariq Shihipar from Anthropic argues that the most powerful agents aren't just smart models. They are models with a file system and a terminal. This workshop explains how the Claude Agent SDK uses Unix primitives to build autonomous loops.
"Bash is the most powerful agent tool."
"We use codegen to generate docs, query the web, and take unstructured action."
"If you can verify its work, it is a great candidate for an agent."

Okay, yeah, thanks for joining me. I'm still on West Coast time, so it feels like I'm doing this at 7:00 a.m. But glad to talk to you about the Claude Agent SDK. This is going to be a rough agenda, but we're going to talk about: what is the Claude Agent SDK? Why use it when there are so many other agent frameworks? What is an agent? What is an agent framework? How do you design an agent using the Agent SDK, or just in general? And then I'm going to do some live coding, or Claude is going to do some live coding, on prototyping an agent. I've got some starter code. The whole goal of this is: we've got two hours, we're going to be super collaborative, so ask questions. This is also not going to be a super canned demo, in the sense that we're going to be thinking through things live. I'm not going to have all the answers right away, and I think that'll be a good way of showing that building an agent loop is really very much an art, an intuition.
But before we get started, just curious, a show of hands: how many people have heard of the Claude Agent SDK? Okay, great. Cool. How many have used it or tried it out? Okay, awesome. So, a pretty good show of hands. I'll just get started on the overview of agents. I think this is something people have seen before, but it's still taking some time to really sink in how AI features are evolving. When GPT-3 came out, it was really about single-LLM features, right? You're like, hey, can you categorize this, return a response in one of these categories.
Then we got more workflow-like things. Hey, can you take this email and label it? Or, here's my codebase indexed via RAG, can you give me the next completion or the next file to edit? That's what we'd call a workflow, where you're very structured: given this code, give me code back out. And now we're getting to agents, and the canonical agent to point to is Claude Code. Claude Code is a tool where we don't really restrict what it can do. You're just talking to it in text and it will take a really wide variety of actions. Agents build their own context, decide their own trajectories, and work very, very autonomously.
And as the future goes on, agents will get more and more autonomous. I think we're kind of at a break point where we can start to build these agents. They're not perfect, but it's definitely the right time to get started. So, Claude Code, I'm sure many of you have tried or used. It is, I think, the first true agent, the first time I saw an AI working for 10, 20, 30 minutes. It's a coding agent, and the Claude Agent SDK is actually built on top of Claude Code. The reason we did that is because we found that when we were building agents at Anthropic, we kept rebuilding the same parts over and over again.
To give you a sense of what that looks like: of course, there are the models to start, right? Then in the harness you've got tools, and that's sort of the first obvious step, like let's add some tools to this harness. Later on, we'll give an example of trying to build your own harness from scratch too, what that looks like and how challenging it can be. But tools are not just your own custom tools; they might be tools to interact with your file system, like with Claude Code. Did the volume just go up, or were they not holding it close enough?
Okay, anyways. You've got tools, tools you run in a loop, and then you have the prompts, the core agent prompts and things like that. And then you have the file system. The file system is a way of context engineering that we'll talk more about later. One of the key insights we had through Claude Code was thinking about context as a lot more than just the prompt: it's also the tools, the files, and the scripts the agent can use. Then there are skills, which we've rolled out recently, and we can talk more about skills if that's interesting to you as well. And then things like sub-agents, web search, research, compacting, hooks, memory: there are all these other things around the harness as well, and it ends up being quite a lot.
So the Claude Agent SDK is all of these things packaged up for you to use, and then you have your application on top. To give you a sense of why the Claude Agent SDK: people are already building agents on the SDK. A lot of software agents: software reliability, security triaging, bug finding, site and dashboard builders. These are extremely popular; if you're building one of those, you should absolutely use the SDK. Then office agents, if you're doing any sort of office work, there are tons of examples there. We've got some legal, finance, and healthcare ones too. So yeah, there are tons of people building on top of it.
Okay. So, why the Claude Agent SDK? Why did we do it this way, why did we build it on top of Claude Code? We realized basically that as soon as we put Claude Code out, the engineers started using it, but then the finance people started using it, and the data science people, and the marketing people. We realized that people were using Claude Code for non-coding tasks, and as we were building non-coding agents ourselves, we kept coming back to it. We'll go more into why that just works, why you can use Claude Code for non-coding tasks. Spoiler alert: it's the bash tool.
It was something we saw as an emergent pattern that we wanted to use, and we've built our agents on top of it. These are lessons we've learned from deploying Claude Code that we've baked in: tool use errors, compacting, things like that, stuff that can take a lot of scale to find. What are the best practices? We've baked those into the Claude Agent SDK. As a result, we have a lot of strong opinions on the best way to build agents. I think the Claude Agent SDK is quite opinionated, and I'll talk over some of these opinions and why we chose them. One of the big opinions is that the bash tool is the most powerful agent tool.
So okay, what would I describe as the Anthropic way to build agents? I'm not saying you can only build agents using the API this way, but if you're using our opinionated stack on the Agent SDK, what is it? Roughly: Unix primitives, like bash and the file system. We're going to go over prototyping an agent using Claude Code, and my goal is really to show you what that looks like in real time: why is bash useful, why is the file system useful, why not just use tools? Agents build their own context; you can also make workflows, and we'll talk about that a bit later. And then thinking about code generation for non-coding: we use codegen to generate docs, query the web, do data analysis, take unstructured action.
A lot of this can be pretty counterintuitive to some people, and again, in the prototyping session we'll go over how to use code generation for non-coding agents. And every agent has a container or is hosted locally, because this is Claude Code: it needs a file system, it needs bash, it needs to be able to operate on them. So it's a very, very different architecture. I'm not planning to talk too much about the architecture today, but we can at the end if that's what people are interested in. Or sorry, by architecture I mean hosting architecture, like how do you host an agent and what are the best practices there? We can talk about that at the end.
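To make that concrete, here's roughly what driving an agent from the SDK looks like. This is a minimal sketch assuming the Python claude-agent-sdk package and its query() entry point; the option names are illustrative, so check the current docs before copying it.

```python
# Minimal sketch of invoking an agent via the Claude Agent SDK (Python).
# Assumes the `claude-agent-sdk` package; option names may differ between
# releases, so treat this as illustrative rather than definitive.
import asyncio

from claude_agent_sdk import ClaudeAgentOptions, query

async def main():
    options = ClaudeAgentOptions(
        system_prompt="You are an agent that triages incoming GitHub issues.",
        allowed_tools=["Bash", "Read", "Write", "Grep"],  # Unix primitives do most of the work
        max_turns=20,
    )
    # The SDK runs the full agentic loop for you (tool calls, permissions,
    # compaction) and streams messages back as the agent works.
    async for message in query(prompt="Clone the repo and triage the newest open issue", options=options):
        print(message)

asyncio.run(main())
```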
Yeah, so let me pause there, because I feel like I covered a lot already. Any questions so far on the Agent SDK, on agents, on what you get from it? Can you explain what code generation for non-coding means exactly? Yeah. This is basically: when you ask Claude Code to do a task, let's say you ask it to find the weather in San Francisco and tell you what you should wear, what it might do is start writing a script to fetch a weather API. Maybe it wants that to be reusable, maybe you want to do this pretty often, so it might fetch the weather API, maybe even get your location dynamically based on your IP address, then check the weather, and then maybe call out to a sub-agent to give you recommendations. Maybe there's an API for your closet or wardrobe, right? So that's an example. For any single example we can talk over how you might use codegen; composing APIs is the high-level way to think about it.
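To make that weather example a bit more concrete, here is the kind of throwaway script an agent might generate for it. This is a hedged sketch: the wttr.in endpoint and the wardrobe logic are stand-ins for whatever APIs the agent actually discovers in your environment.

```python
# Sketch of a script an agent might codegen for "what should I wear today?"
# The weather endpoint and the recommendation rules are illustrative.
import json
import urllib.request
from urllib.parse import quote

def fetch_weather(city: str) -> dict:
    # wttr.in exposes a JSON mode; the agent could just as easily have
    # discovered and called a different weather API.
    url = f"https://wttr.in/{quote(city)}?format=j1"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def recommend(weather: dict) -> str:
    temp_c = int(weather["current_condition"][0]["temp_C"])
    if temp_c < 10:
        return "Bring a warm jacket."
    if temp_c < 18:
        return "A light sweater should do."
    return "T-shirt weather."

if __name__ == "__main__":
    print(recommend(fetch_weather("San Francisco")))
```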
Workflow versus agent: for a repetitive task, or a business process that is always the same, would you still prefer to build an agent versus a fully deterministic workflow? Yeah. So, the question was about workflows versus agents, and would you still use the Claude Agent SDK for workflows? Is that right? Yes. I can just tell you what we do internally, basically, and what we do internally is a lot of GitHub automations and Slack automations built on the Claude Agent SDK. For example, we have a bot that triages issues when they come in. That's a pretty workflow-like thing, but we've still found that in order to triage issues, we want it to be able to clone the codebase, and sometimes spin up a Docker container and test it, and things like that. So there are a lot of steps in the middle that need to be quite free-flowing, and then you give structured output at the end. So, yes.
All right, we'll take one more question and then we'll keep going. So, yeah, in the blue. Could you talk about security and guardrails? If you're using the Claude Agent SDK and you lean towards using bash as the all-powerful generic tool, is the onus on the agent builder to make sure you're preventing against common attack vectors, or is that something the model is doing itself? Yeah. So the question was about permissions on the bash tool, or how do you think about permissions and guardrails when you're giving the agent this much power over its environment and the computer; how do you make sure it's aligned? The way we think about this is what we call the Swiss cheese defense: on every layer there are some defenses, and together we hope they block everything.
Obviously on the model layer we do a lot of alignment. We actually just put out a really good paper on reward hacking; I super recommend you check that out. With Claude models, we try to make them very, very aligned. So there's the model alignment behavior, and then there is the harness itself. We have a lot of permissioning and prompting, and we run a parser over the bash tool, for example, so we know fairly reliably what a bash command is actually doing, and that's definitely not something you want to build yourself. And then finally, the last layer is sandboxing. Let's say someone has maliciously taken over your agent, what can it actually do? We've included a sandbox where you can sandbox network requests and sandbox file system operations outside of its working directory.
And ultimately that's what they call the lethal trifecta: the ability to execute code in an environment, change the file system, and exfiltrate data. I might be getting the lethal trifecta a little bit wrong there, but the idea is basically that an attacker still needs to be able to extract your information back out, and if you sandbox the network, that's a good way of cutting that off. If you're hosting on a sandboxed container, like Cloudflare, Modal, E2B, Daytona, all of these sandbox providers have also done some level of security there; you're not hosting it on your personal computer or on a machine with your prod secrets or something. So, lots of different layers there, and we can talk more about hosting in depth. Okay, so now I'm going to talk a little bit about why bash is all you need.
This is like my schtick, you know? I'm just going to keep talking about this until everyone agrees with me. This is something we found at Anthropic, something I discovered once I got here: bash is what makes Claude Code so good. You've probably seen code mode or programmatic tool use, the different ways of composing MCPs; Cloudflare put out a blog post on that, and we put out some blog posts too. The way I think about code mode, or bash, is that bash was the first code mode. The bash tool allows you to store the results of your tool calls to files, store memory, dynamically generate scripts and call them, compose functionality by piping things through tail or grep, and use existing software like ffmpeg or LibreOffice. So there are a lot of interesting and powerful things the bash tool can do.
And think about, again, what made Claude Code so good. If you were designing an agent harness, maybe what you would do is have a search tool and a lint tool and an execute tool, right? And you end up with n tools; every time you thought of a new use case, you're like, I need to add another tool now. Instead, Claude just uses grep. It knows your package manager, so it runs npm run test, or runs test.ts or index.ts or whatever. It can lint, and it can find out how you lint; it can run npm run lint, and if you don't have a linter, it can say, what if I install ESLint for you? So this is, like I said, the first programmatic tool calling, the first code mode. You can do a lot of different actions very, very generically.
To talk about this a little bit in the context of non-coding agents: let's say we have an email agent, and the user asks, okay, how much did I spend on ride sharing this week? It's got one tool call, or generally it's got the ability to search your inbox, so it can run a query, like, search "Uber OR Lyft". Without bash, it searches "Uber OR Lyft", it gets like a hundred emails, and now it's just got to think about it. A good analogy is: imagine someone came to you with a stack of papers and said, hey, how much did I spend on ride sharing this week, can you read through my emails? That would be really hard, right? You need very, very good precision and recall to do it.
Or, with bash: let's say there's a Gmail search script that takes in a query. You can save the results of that query to a file, or pipe them. You can grep for prices, and then add them together. You can check your work too: you can say, okay, let me grab all my prices, store those in a file with line numbers, and then check afterwards, was this actually a price, what does each one correlate to? There's a lot more dynamic work you can do to check your output with the bash tool. This is just a simple example, but hopefully it shows the power of the composability of bash. So I'll pause there. Any questions on bash is all you need, the bash tool, anything I can make a little bit clearer? Do you have stats on how many people use YOLO mode? Stats on YOLO mode, we probably do. Internally we don't use it, but I think that's just because we have a higher security posture.
Yeah, I'm not sure. I can probably pull that. Any other questions on bash? Okay, cool. Just to give you some more examples: let's say you had an email API and you wanted to ask, tell me who emailed me this week. You've got two APIs: an inbox API and a contacts API. There's a way you can do it via bash, and you can also do it via codegen; this is kind of enough bash that it is codegen, because bash is ostensibly a codegen tool. Or let's say you had a video meeting agent, and you wanted to say, find all the moments where the speaker says "quarterly results" in this earnings call. You can use ffmpeg to slice up the video, and you can use jq to start analyzing the information afterward.
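As a rough illustration of that first example, the "who emailed me this week" task is just a composition of the two APIs. The sketch below is hypothetical: inbox_api and contacts_api stand in for whatever wrapper scripts or modules the agent finds (or writes) in its working directory.

```python
# Hypothetical glue script composing an inbox API and a contacts API.
# `inbox_api` and `contacts_api` are stand-in modules, not real libraries.
from collections import Counter

import contacts_api  # hypothetical: look up a contact by email address
import inbox_api     # hypothetical: search the mailbox

# Every message from the last seven days.
messages = inbox_api.search(newer_than_days=7)
senders = Counter(msg["from"] for msg in messages)

# Resolve addresses to names and report who emailed, and how often.
for address, count in senders.most_common():
    name = contacts_api.lookup(address).get("name", address)
    print(f"{name}: {count} email(s)")
```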
So yeah, lots of powerful ways to use bash. Now I'm going to talk a little bit about workflows and agents. You can build both workflows and agents on the Agent SDK. Agents are like Claude Code: if you're building something where you want to talk to it in natural language and have it take action flexibly, that's why you're building an agent. Say you have an agent that talks to your business data and you want to get insights or dashboards, answer questions, or write code; that's an agent. A workflow is more like, well, we do a lot of GitHub Actions, for example, where you define the inputs and outputs very closely. You're like, okay, take in a PR and give me a code review.
And both of these you can use the Agent SDK for. When building workflows, you can use structured outputs; we just released this, you can Google "Agent SDK structured outputs". So you can do both. I'm going to primarily be talking about agents right now, but a lot of what you learn from this is applicable to workflows as well. Wait, show of hands: how many people have designed an agent loop before? Okay, cool. Great. So, the number one thing, the meta-learning for designing an agent loop to me, is just to read the transcripts over and over again. Every time you see the agent running, just read it and figure out: hey, what is it doing? Why is it doing this? Can I help it out somehow?
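Since "read the transcripts" is the core advice here, one simple habit is to persist every message the agent emits into a log you can reread later. A hedged sketch follows, again assuming the Python claude-agent-sdk query() stream; the exact message objects will differ, so the logging here is deliberately crude.

```python
# Sketch: append every streamed agent message to a plain-text transcript
# you can reread later.  Assumes the Python claude-agent-sdk `query()`
# async generator; adapt the logging to the message types you actually get.
import asyncio

from claude_agent_sdk import query

async def run_and_log(prompt: str, log_path: str = "transcript.log"):
    with open(log_path, "a") as log:
        async for message in query(prompt=prompt):
            # Crude but effective: dump the repr so you can study which
            # tools were called and where the agent wasted turns.
            log.write(repr(message) + "\n")

asyncio.run(run_and_log("Label every email in my inbox from last week"))
```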
And we'll do some of that later when we build an agent loop. But here are the three parts to an agent loop: first, gather context; second, take action; third, verify the work. This is not the only way to build an agent, but I think it's a pretty good way to think about it. Gathering context: for Claude Code, it's grepping and finding the files needed; for an email agent, it's finding the relevant emails. Thinking about how the agent finds this context is very important, and I think a lot of people skip that step or underthink it. It can be very, very important. Then taking action: how does it do its work? Does it have the right tools to do it? Code generation and bash are more flexible ways of taking action.
And then verification is another really important step. Basically, what I'd say right now is: if you're thinking of building an agent, think about whether you can verify its work. If you can verify its work, it's a great candidate for an agent. With coding you can verify by linting, and you can at least make sure it compiles, so that's great. If you're doing, let's say, deep research, it's actually a lot harder to verify the work. One way you can do it is by citing sources, so that's a step toward verification, but obviously research is less verifiable than code in some ways, because code has a compile step, and you can also execute it and see what it does. So as we build agents, the ones that are closest to being very general are the ones where the verification step is very strong. I think there was a question here. Yeah.
So when do you generate a plan of the work? Yeah, sorry, the question was when do you generate a plan before you run through it. In Claude Code you don't always generate a plan, but if you want to, you'd insert it between the gathering-context and taking-action steps. Plans help the agent think through things step by step, but they add some latency, so there is some trade-off there. The Agent SDK helps you do some planning as well. Can you make the agent create that to-do list, like be 100% sure it will create the to-do list and run by it? Yeah, so the question was, will the agent create the to-do list? Yes. If you're using the Agent SDK, we have some to-do tools that come with it, so it will maintain and check off to-dos, and you can display that as you go. Any other questions about this right now? Okay, cool.
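To tie back to the verification point from a moment ago: for a coding agent, the verify step can be as simple as running the project's own checks and feeding any failures back into the loop. A minimal sketch, assuming a TypeScript project with tsc and an npm lint script; substitute whatever checks your project actually has.

```python
# Sketch of a "verify the work" pass: run the project's own checks and
# return the error text so it can be fed back to the agent as context.
# The commands here (tsc, npm run lint) are assumptions about the project.
import subprocess

def verify(repo_dir: str) -> str | None:
    """Return None if all checks pass, otherwise the failure output."""
    checks = (["npx", "tsc", "--noEmit"], ["npm", "run", "lint"])
    for cmd in checks:
        result = subprocess.run(cmd, cwd=repo_dir, capture_output=True, text=True)
        if result.returncode != 0:
            return f"$ {' '.join(cmd)}\n{result.stdout}{result.stderr}"
    return None

if __name__ == "__main__":
    errors = verify(".")
    if errors:
        # Hand the failures back to the agent so it can fix its own work.
        print("Verification failed:\n" + errors)
```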
Okay, so I'm going to quickly talk about how you do this stuff, what your tools for doing it are. There are three things you have: tools, bash, and code generation. Traditionally, a lot of people are only thinking about tools, and one of the calls to action here is to think about it more broadly. Tools are extremely structured and very, very reliable. If you want as fast an output as possible, with minimal errors and minimal retries, tools are great. The cons: they're high context usage. If anyone's built an agent with 50 or 100 tools, they take up a lot of context and the model gets a little bit confused. There's no real discoverability of the tools, and they're not composable. And I say tools in the sense of how tools work if you're using the Messages or Completions API right now; of course, there's code mode and programmatic tool calling, so you can blend some of these. Then there's bash. Bash is very composable: static scripts, low context usage. It can take a little bit more discovery time. Let's say you have the Playwright MCP, or sorry, the Playwright CLI, the Playwright bash tool: the agent can run playwright --help to figure out all the things it can do, but it needs to do that every time. It needs to discover what it can do, which is kind of powerful in that it takes away some of the high context usage, but it adds some latency, and there might be slightly lower call rates, just because it needs to spend a little more time finding the tools and what it can do.
But this will definitely improve over time. And then finally, codegen: highly composable, dynamic scripts. It takes the longest to execute, because the scripts need linting, possibly compilation. API design becomes a very interesting step here, and I'll talk more about how to think about API design in an agent. So those are the three tools you have. On using tools: I think you still want some tools, but you want to think about them as atomic actions that your agent usually needs to execute in sequence and that you need a lot of control over. For example, in Claude Code we don't use bash to write a file; we have a write-file tool, because we want the user to be able to see the output and approve it, and we're not really composing write-file with other things. It's a very atomic action. Sending an email is another example. Any sort of destructive or irreversible change is definitely a good fit for a tool.
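For those atomic, irreversible actions, a custom tool is the right shape. Below is a hedged sketch of registering a send_email tool as an in-process MCP server with the Python SDK; the tool and create_sdk_mcp_server helpers, and the mcp__email__send_email naming, are my reading of the SDK and may differ from the version you have, so verify against the docs.

```python
# Hedged sketch: expose an atomic "send_email" tool to the agent.
# Assumes the `tool` / `create_sdk_mcp_server` helpers from the Python
# claude-agent-sdk; check the exact names against your installed version.
from claude_agent_sdk import ClaudeAgentOptions, create_sdk_mcp_server, tool

@tool("send_email", "Send an email on the user's behalf",
      {"to": str, "subject": str, "body": str})
async def send_email(args):
    # Irreversible action: keep it as one permission-gated tool rather than
    # letting the agent improvise it through bash.
    # ... call your mail provider here (hypothetical) ...
    return {"content": [{"type": "text", "text": f"Sent to {args['to']}"}]}

email_server = create_sdk_mcp_server(name="email", version="1.0.0", tools=[send_email])

options = ClaudeAgentOptions(
    mcp_servers={"email": email_server},
    allowed_tools=["mcp__email__send_email"],  # still subject to permission prompts
)
```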
Then we've got bash. That's for composable actions, like searching a folder, using GitHub, linting code and checking for errors, or memory; you can write files to memory, so bash can be your memory system, for example. And then finally you've got code generation, for highly dynamic, very flexible logic: composing APIs, doing data analysis or deep research, reusing patterns. We'll talk more about code generation in a bit. Any questions so far about the SDK loop, or tools versus bash versus codegen? Yeah. I was going to ask, are you going to have any ready-made tools for offloading results, offloading tool call results into the file system? Like, let's say it goes to bash and then the context explodes; does it have a command that tidies everything up?
Or otherwise just long outputs polluting your history. Sure, yeah. I think that's a good common practice. I remember seeing some PRs about this very recently on Claude Code, about handling very long outputs. I think we are moving towards a place where more and more things are just stored in the file system, and this is a good example: storing long outputs over time. Generally, prompting the agent to do this is a good way to think about it. Something I always do now is, whenever I have a tool call, I save the results of the tool call to the file system, so the agent can search across it, and then have the tool call return the path of the result. That helps it recheck its work. So yes.
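A minimal sketch of that pattern: write the full tool result to disk and hand the model only the path plus a short preview. The file layout and helper name here are illustrative, not a built-in SDK feature.

```python
# Sketch of the "store the full result, return the path" pattern for
# keeping long tool outputs out of the context window.  Names are
# illustrative; this is not an SDK helper.
import hashlib
from pathlib import Path

RESULTS_DIR = Path("tool_results")

def offload(result_text: str, label: str) -> str:
    """Persist a long tool result; return a short preview plus its path."""
    RESULTS_DIR.mkdir(exist_ok=True)
    digest = hashlib.sha256(result_text.encode()).hexdigest()[:8]
    path = RESULTS_DIR / f"{label}-{digest}.txt"
    path.write_text(result_text)
    preview = result_text[:200].replace("\n", " ")
    # The agent can later grep or re-read the file instead of carrying
    # the whole payload around in its context.
    return f"Saved {len(result_text)} chars to {path}. Preview: {preview}"
```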
Do you find that you need to use the skills kind of structure to help Claude along to use bash better, or out of the box is that not necessary? Yeah. So, the question was about skills, and do we need skills to use bash better. For context: skills are basically a way of allowing your agent to take on longer, complex tasks and load things in via context. For example, we have a bunch of DOCX skills, and these DOCX skills tell the agent how to do code generation to generate those files. Overall, skills are basically just a collection of files. They're also an example of being very file-system or bash-tool pilled, because they're really just folders that your agent can cd into and read. What we've found skills are really good for is repeatable instructions that need a lot of expertise in them. For example, we released a front-end design skill recently that I really like, and it's really just a very detailed and good prompt on how to do front-end design. But it comes from our best AI front-end engineer, someone who put a lot of thought and iteration into it. So that's one way of using skills.
Quick question. We use that front-end skill, it's pretty good, thanks for publishing it. I want to understand: there are multiple MD files, like CLAUDE.md is also there, and it's also at the user level, and then there are skill files. Is there a priority order? Should some stuff be relegated to CLAUDE.md and some other stuff only go into the SKILL.md? So the question was about SKILL.md versus CLAUDE.md and how to think about that. I will say all of these concepts are so new; even Claude Code was only released eight or nine months ago, and skills were released like two weeks ago, so I won't pretend to know all of the best practices for everything. Generally, skills are a form of progressive context disclosure, and that's a pattern we've talked about a bunch, like with bash, and preferring that over purely normal tool calls. It's a way for the agent to say, okay, I need to do this, let me find out how to do it, and then let me read in this SKILL.md. So you ask it to make a DOCX file, and it cds into the directory, reads how to do it, writes some scripts, and keeps going. There's still some intuition to build around what exactly you define as a skill and how you split it out.
But yeah, lots of best practices to learn there still. So yesterday we talked about the future of skills over time. Do you see these as ultimately becoming part of the model, and skills are just a way to bridge the gap for now? Yeah. So the question was, are skills ultimately part of the model, or are they a way to bridge the gap? I missed Barry and M's talk yesterday, but I think roughly the idea is that the model will get better and better at doing a wide variety of tasks, and skills are the best way to give it out-of-distribution tasks.
But I would broadly say that it's really, really hard, especially if you're not at a lab, to tell exactly where the models are going. My general rule of thumb is that I try to rethink or rewrite my agent code every six months, just because things have probably changed enough that I've baked in some assumptions. The Agent SDK is built, as much as possible, to advance with capabilities: the bash tool will get better and better, and because we're building it on top of Claude Code, as Claude Code evolves you'll get those wins out of the gate. But at the same time, things are so different now than they were a year ago in terms of AI engineering. A general best practice to me is: we can write code ten times faster, so we should be willing to throw out code ten times faster as well. Don't hedge your bets on where the future is going; think about what you can do today that really works. Get market share today and don't be afraid to throw out code later. If you're a startup, this is arguably your largest advantage over competitors. Larger companies have six-month incubation cycles, so they're always stuck in the past of agent capabilities. Your advantage is that you can say, hey, the capabilities are here right now, let me build something that uses them right now.
Any other questions? We're talking about skills and bash. Okay, it seems like there are a lot of skill questions. At the back, you might have to shout. So why would you use a skill versus an API? They look very similar; that Python program could be a package, right? Yeah. The question was, why use a skill versus an API? Good question. These are all forms of progressive disclosure, basically, ways for the agent to figure out what it needs to do, and I'll go over examples where you just have an API in our prototyping session. It's totally use-case dependent; I don't think there's a general rule. Read the transcript and see what your agent wants. If your agent thinks about the API better as an API.ts file or an API.py file, do that, that's great. I think skills are sort of an introduction to thinking about the file system as a way of storing context, and they're a great abstraction, but there are many ways to use the file system. And I should say that for skills you need the bash tool, you need a virtual file system, things like that, so the Agent SDK is basically the only way to really use skills to their full extent right now. So yeah.
Back there. Can we expect a marketplace for skills? Yeah, the question was, can we expect a marketplace for skills. Claude Code has a plug-in marketplace that you can also use with the Agent SDK. We're evolving that over time; it was very much a v0. And by marketplace, I'm not sure if people will be charging for this exactly; it's more just a discovery system, I think. But that exists right now: you can do /plugins in Claude Code and find some. Yep. What's your current thinking about when you're going to reach for the SDK to solve a problem? Yes, the question is when do I use the SDK to solve a problem, if I'm building an agent. My overall belief is that for any agent, the bash tool gives you so much power and flexibility, and using the file system gives you so much power and flexibility, that you can always eke out performance gains with them. In the prototyping part of this talk we're going to look at an example with only tools and an example with bash and the file system, and compare those two. And that's what I mean by being bash-pilled when you build: I just start from the Agent SDK, and I think a lot of people at Anthropic have started doing