
By AI Engineer
Date: October 2023
Quick Insight: This summary is for builders moving beyond brittle AI scripts toward production-grade autonomous agents. It explains how Vercel’s Workflow DevKit provides the durability layer needed to run complex, multi-day tasks on serverless infrastructure.
Building AI agents is easy in a notebook but a nightmare in production. Peter Wielander from Vercel explains how their new Workflow DevKit solves the "reliability gap" by treating agent steps as durable, cached events.
"[Everything you do in a workflow you can resume at any point.]"
"[A workflow could wait for a week and not consume any resources.]"
"[The workflow orchestration layer needs to be deterministic.]"
Podcast Link: Click here to listen

Thank you all for coming. Hello, hello. Got you. I don't know about you, but when I write agents, I like focusing on the capabilities and the features, and I like not thinking about all of the extra effort that goes into getting something that works locally into production. And something that's very useful for that is the workflow pattern. And that's why we developed the Workflow DevKit, which is what we're talking about today. Presumably if you're here, you've had similar issues. Today we are going to turn a coding agent into a workflow-backed coding agent throughout this session. We have an open-source example ready to go. This is on the vercel/examples repository, so you can clone that and check out the vibe coding platform app inside. We're going to be using this app for today's demo. And after we're done, we get first-class observability built in, plus durability and reliability. We get a lot of extra features like resumability, and the DevKit makes it very easy to add human-in-the-loop workflows and similar things.
So if you think about our general agent loop that we've all seen before, we mostly have calls back and forth between an LLM and our tool calls and our backend code, right? Which would include MCP servers, human approval, any kind of async tasks. And the usual way to go about this is to wire up some queues and a database, especially if you are doing long-running agents that might run for hours. If you want to scale and you're running on serverless, for example, you want some kind of reliability layer in between, which is usually filled by queues. Then you'll also need to add a lot of error and retry code, you'll need to store all the messages that people are sending as well as intermediate state, and you'll probably also need to add some kind of observability layer. All of those things we are going to do today using only a single library, which is the Workflow Development Kit. It's open source, it runs with any of your TypeScript frontends or backends, and it can run on any cloud. We're going to be deploying to Vercel today, but this could just as easily run on any of your cloud stacks or any of your custom stacks.
All right, so who here has heard of the workflow pattern or has used a workflow library before? Show of hands. All right, that's less than half. I'm going to quickly explain what a workflow pattern is to make it clear what we're doing, and then in about two minutes we're going to go into the code. A workflow pattern is essentially an orchestration layer that separates your code into steps that can run in isolation, be retried, and have their data persisted, plus an orchestration layer that we call a workflow; different platforms have different names for it. In our case, the workflow part would be the loop that makes the LLM calls and then goes back to the tool calls and back to the LLM calls. And the steps would be our actual tool calls and our LLM calls.
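To make the pattern concrete, here is a minimal self-contained sketch of it: a deterministic orchestrator that only calls steps, where each step's output is persisted in a journal so a replay never re-executes completed work. All names here are illustrative stand-ins, not the Workflow DevKit API.

```typescript
// A minimal sketch of the workflow pattern: a deterministic orchestrator
// replays from a journal of cached step results, so each step runs once.
type Journal = Map<string, unknown>;

let executions = 0; // counts real step executions, for demonstration

// Wrap a function as a "step": on replay, return the cached result
// instead of re-running the side-effecting work.
function step<T>(journal: Journal, key: string, fn: () => T): T {
  if (journal.has(key)) return journal.get(key) as T;
  executions++;
  const result = fn();
  journal.set(key, result); // persist the output before moving on
  return result;
}

// The "workflow" is plain deterministic code that only calls steps.
function codingWorkflow(journal: Journal): string {
  const sandboxId = step(journal, "createSandbox", () => "sbx_123");
  const output = step(journal, "runCommand", () => `built in ${sandboxId}`);
  return output;
}

// The first run executes both steps; a replay (e.g. after a crash)
// rehydrates from the journal and executes nothing.
const journal: Journal = new Map();
const first = codingWorkflow(journal);
const replay = codingWorkflow(journal);
```

Because the orchestrator is deterministic, replaying it with the same journal is guaranteed to reach the same point in the code, which is what makes resumption safe.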
Right, so looking at the agenda for today, we're going to be jumping into the code. We're going to add the Workflow Development Kit, which is going to be quite fast, and then we have a lot of time to talk about the cool additional features it adds, like resumable streams out of the box, how to suspend and resume at any point, and how to add webhooks for human-in-the-loop processes. At the end there's going to be ample time for Q&A, but there is a reason that you're here in the workshop and not watching this online, which is that you can ask questions. Please do so at any point; feel free to raise your hand or just shout out the question. All right, so as I said, we're working off the example repository, and we're going to be working off of the conf base branch. Why this branch? I stripped a bunch of the excess code from the example to make sure that we can focus on the most important parts. And every checkpoint from this workshop will have its own branch. So if you're not coding along directly, you can also check out the steps one by one, look at the diffs, and see what changed in between.
I have already run npm run dev locally on this platform just to show you what it looks like. I'm going to run a simple query. So this is a coding agent, right? It's like a code editor but without the code editing: it can take a prompt, generate some files, and it'll eventually show you an iframe with the finished app that is deployed. So it's mostly UI with a few simple tool calls that we'll look at in a second. The file system and output run on Vercel Sandbox, but you could just as easily run this locally.
Looking at the code, I'm going to go and check out our actual branch. We have one endpoint that accepts our chat messages, right? Then it does some regular model ID checking to see whether the model is supported. And in the end it's going to simply create an agent. Oh yes — what was that branch one more time? The branch was conf. And you can see we'll move on to conf/2- and so on; just look for the numbers and you'll find all the checkpoints. So our main endpoint just accepts some messages and calls the AI SDK agent, which is essentially the same thing as a stream text call. We pass some tools, and internally it'll just loop from stream text call to stream text call, and then it'll stream all of the generated chunks back to the client in a format that is easy for the client to understand. This is all regular AI SDK code that you could replace with a different library if you want; it is mostly there to support the UI. But again, all of the actual agent stuff is very simple and happens here.
Let's also take a look at the tools that we have. We have four tools: create sandbox and get sandbox URL, which are very simple — they just wrap Sandbox.create and get URL — and similarly run command, which essentially wraps the sandbox's run command, and generate files, which will generate a file from a simple prompt. And we're going to take a look at one of these tool calls as an example. We have a prompt that looks somewhat like a markdown file: sort of what to do, what not to do. (And my hotkeys are not working.) Back to the tool call: we also have an input schema, a Zod schema for what the AI is supposed to pass. This is all very standard. And then there's an execute function, which wraps the sandbox run command with some error handling. So that's essentially our entire agent code setup, and then in the frontend we just call useChat from the AI SDK to consume the stream and display things in the UI.
So let's get started adding workflow to this. Any questions before I get started? Cool. All right. Step one is we're going to run npm install for workflow plus its AI helper package, which will give us the latest versions. workflow is the main library, and the helper package provides some wrappers that work well with the Workflow Development Kit. Now that we have this installed — we are running a Next.js app here — we're going to extend the compiler to compile workflow code by using withWorkflow, which we can import from workflow/next, and that'll set up Next.js. Yes, clarifying question: you are in the example's vibe coding directory. Yes.
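Based on what the talk describes, the Next.js wiring would look roughly like the sketch below. The withWorkflow name and the workflow/next import path come from the talk; the exact shape of the config wrapper is an assumption, so treat this as a sketch rather than the definitive setup.

```typescript
// next.config.ts — hypothetical sketch of enabling the workflow compiler.
// withWorkflow wraps the existing Next.js config so the compiler also
// builds "use workflow" / "use step" code into separate bundles.
import { withWorkflow } from "workflow/next";

const nextConfig = {
  // ...your existing Next.js config options stay here unchanged...
};

export default withWorkflow(nextConfig);
```
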
Adding this lets the compiler know to also compile our workflow code separately, which we'll get into more in a second. And then for convenience, we can also add a TypeScript plugin to our tsconfig — same package — and that'll give us better autocompletion for our workflow code. We talked about a workflow having an orchestration layer and a number of steps. What we're going to do first is write the orchestration layer. In our case, that is essentially just the agent, right? It does the loop that calls steps back and forth. We're going to add a new file — you can call it whatever you want — and we're going to take our agent call and move it over there. I'm going to call this our code workflow, which is going to hold all of our workflow code. And then I'm going to auto-complete a bunch of imports. Thank you, AI. So we're just passing most of the arguments that we would otherwise get from here over there. And this completes the refactor: essentially having done nothing but pull some of the workflow code out into its own file.
So this is where it gets interesting. Now that this is a separate function, we can use the "use workflow" directive, which marks it for our compiler as a workflow function. What this does under the hood is compile all of the code related to the function into a separate bundle, and it ensures that there are no imports of anything that would have side effects, because the workflow orchestration layer needs to be deterministic — so it can be rerun in a deterministic fashion and there are no worries about state pollution. Now that we have this, we would need to mark our LLM calls as steps, and because those calls happen inside the agent, this is a little harder to do here. So we ended up writing a durable agent class, which is essentially the same thing as the agent, with a "use step" marker on the actual LLM calls that it does under the hood.
So, now that we have this set up, we're going to await the actual streaming, and let's see if there's anything we need to do. Checking for errors — oh yes, we need a stream to write to. Previously we could just write to the stream that the API handler gave us. Now we're going to have to create a new stream to write to. We export a getWritable function from workflow, which gets a stream implicitly associated with the workflow to write to. We're going to get into that a little more in a second, but for now we'll just pass that to our agent. And we're going to see if this is the right type — presumably not. And then finally, back in our actual API endpoint, we need to call our workflow in a way that the framework understands, which for us is a call to start, with the arguments passed separately. That essentially tells it to start a new workflow on this function, and start can be imported from workflow/api.
Okay, so now we essentially have the workflow fully hooked up, and a lot of this was just pulling out some of the code and adding a directive. A colleague has volunteered to help anyone who's following along and has debugging questions — just reach out. ("I'm on the team as well, so let me know if you're following along; I'll be around to help.") And finally, this start call returns a run instance that has the stream the agent ends up writing to, which we can return to the UI. So this completes our workflow definition. Now, we also said that we would need to mark things as steps. The durable agent class already marked the LLM calls as steps, but our tools right now are not marked as steps. Thankfully, this is very easy: in the execute function for each of these tools, you can just write "use step", and that will let the compiler know that this is a separate chunk of code to execute in a separate instance. If this is deployed to production, it would run in a separate serverless instance, the inputs and outputs would be cached if it already ran, and it would be retried if it failed. So I'm going to go through the other tool calls and also add "use step" to those — thankfully we only have four of them. And that should complete our transformation.
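The caching and retry semantics described above can be sketched in a few lines: a wrapper that memoizes a step's output by its input and retries transient failures. This is a self-contained illustration of what the compiler-generated step runtime would do conceptually, not the DevKit's actual implementation.

```typescript
// Sketch of "use step" semantics: cache the output by input, and retry
// on transient failure, so a step never runs twice to completion.
function durableStep<A, R>(
  fn: (arg: A) => R,
  cache: Map<string, R>,
  maxRetries = 3,
): (arg: A) => R {
  return (arg: A): R => {
    const key = JSON.stringify(arg);
    if (cache.has(key)) return cache.get(key)!; // already ran: reuse output
    let lastError: unknown;
    for (let attempt = 0; attempt < maxRetries; attempt++) {
      try {
        const result = fn(arg);
        cache.set(key, result); // persist before returning
        return result;
      } catch (err) {
        lastError = err; // transient failure: try again
      }
    }
    throw lastError;
  };
}

// A flaky "tool" that fails twice before succeeding.
let calls = 0;
const flaky = (cmd: string): string => {
  calls++;
  if (calls < 3) throw new Error("transient");
  return `ran: ${cmd}`;
};

const cache = new Map<string, string>();
const runCommand = durableStep(flaky, cache);
const out = runCommand("npm run build");    // retried until success
const cached = runCommand("npm run build"); // served from the cache
```

In production the cache would live in durable storage rather than memory, which is what lets a replayed workflow skip completed steps entirely.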
So now we can go and run npm run dev and see if this works as expected. We're going to reload our page, and it seems like nothing changed. Let's actually run a query. And we can see that it's still streaming as expected. So for us developing locally, all we had to do is pull out a function and then add some directives. But now if I deploy this to any adapter — again, Vercel or an AWS adapter, or maybe you have your own — this will run in isolation with durability and all of those good things. And something that's really nice for local development is that if I go into the same folder here and run npx workflow web, which is a CLI call to start a local web UI to inspect our runs, you can see that our run is currently still running. Everything that is marked as a step will have a span here, and you can inspect the inputs, the outputs, and any associated events. And we can see that our workflow just completed, I think — yeah, this gets built in. Yes, and just for clarification: every time you're prompting your vibe coder, that is one instance of the workflow that runs to completion. ("So each one is —") Yeah, exactly. And you could model this in any way you want. You can also model an entire user session as one workflow and have the workflow do a loop: wait for the next query, and then again — we can run code for weeks if we need to, essentially, and I'm going to go into some tools for that in a second.
So now that we have this set up, you can see that on the right side we do not get any sort of helpful feedback. But if I visit this link, I can see that our app has likely been created correctly — or it failed because of some errors; either way we're not getting any output on the right side. The reason this is happening is that we are streaming the agent output to the client, but our tools aren't actually doing any stream calls right now. So what we could do is, similarly, in our tool calls get the writable, which will get the same writable instance as the workflow itself. There is any number of streams you can create and consume in a workflow — you can also tag them with a certain name and then fetch them by that — but this will get the default instance. And once we have a writable, we can connect to it by getting the writer, and now we can write any kind of information to the UI to be consumed. I think we want something like data-create-sandbox — I think that's what I hooked up in the UI — and then for the ID we want the sandbox ID. So this is me just writing a data packet that our UI knows how to consume.
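The data-packet idea above can be sketched like this: a tool writes small tagged JSON packets to a stream, and the UI keys off the packet type to render state. The packet shape and the writer class are illustrative stand-ins, not the real getWritable() API.

```typescript
// Sketch of streaming UI "data packets" from a tool. A real writer would
// push to the workflow's stream (a file locally, Redis in production);
// this stand-in just collects packets so the flow is visible.
type DataPacket = { type: string; id: string; status?: string };

class PacketWriter {
  packets: DataPacket[] = [];
  write(packet: DataPacket): void {
    this.packets.push(packet);
  }
}

// A tool announcing sandbox creation to the UI, then completion.
function createSandboxTool(writer: PacketWriter): string {
  const sandboxId = "sbx_123"; // placeholder for sandbox.create()
  writer.write({ type: "data-create-sandbox", id: sandboxId, status: "loading" });
  // ...the actual sandbox creation work would happen here...
  writer.write({ type: "data-create-sandbox", id: sandboxId, status: "done" });
  return sandboxId;
}

const writer = new PacketWriter();
const id = createSandboxTool(writer);
```

Writing a "loading" packet first and a "done" packet after is exactly the gap the demo hits next: with only the first packet, the UI shows the spinner forever.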
So now that I did this, if I reload the app and start again, we'll see that at least the sandbox create call presumably gets filled in correctly at the start. ("Yeah, you said there are streams that you can create — what do you mean by that?") Right, yes. So a stream: the adapter you use for workflows — in local development this would just be a file; in production this might be a Redis instance — supports the workflow asking it to create a new stream, for example in Redis, and then passing that stream back. So anytime you call getWritable, it'll create a stream, for example in Redis, with the ID of that workflow, and pass it back. Any step can attach to that and any client can attach to that, and on localhost this would be written to a file and read from a file. ("Sorry, what are you setting up right now?") Right. So previously we had an API handler that took some messages, called the agent, and then streamed back messages from that API handler. Now we have an API handler that starts a workflow, and it passes back the stream that this workflow creates.
What this allows us to do also — I think that was not working correctly; I'm going to restart the server just to see if that's the case. Anything else so far? Anyone need help getting set up? Seems good. Good point: something this allows us to do is that the stream is not bound to the API handler. This means that at any point we can resume this stream. If you lose connection to your API handler and then the user reconnects, this stream still exists, and we can reconnect to it to resume the session. This is also part of the durability aspect, where everything you do in a workflow you can resume at any point. I'm going to restart this query and hope that it works this time. Yeah. So, now that I hooked up this data packet, you can see this special UI handling for creating a sandbox works. But even after it's done, it's not showing up as done. This is because we're only writing the initial loading-state packet. So I could go through all of our tools and add more packets and just, you know, make the UI richer. But I'm going to go and check out a different branch — the next step's sleep branch, which already has these — actually, I'll go for the workflow one. Sorry, conf/2-workflow, which already has all of these writer calls populated. There's no difference otherwise. So now that all of our tools have these write calls, the stream should again look the same as it did when we started out in this app.
All right. So, now that we have streams working again, everything is working as expected, we have more observability, and we can deploy this with durability. I talked about resumable streams before. We're going to see if we can get this stream to resume, so we have durable sessions. The only thing we need to do to make that work is to go to our API endpoint, and where we get the run instance, we're also going to return the workflow run ID as additional information. So I can return run.runId, for example. Again, any way you do this is fine — I'm adding it as a header here because we're already returning a stream — but however you pass the ID to the UI, it's something the UI can then use to resume the stream from. From here, the UI should be able to decide whether it has a run ID and whether it should resume a stream. So we're going to go and create a new endpoint: a dynamic [id] folder for the existing run ID, then a stream folder inside it, and we're going to add a route handler.
So this is just Next.js configuration for adding an API route at /chat/[id]/stream, and we're going to auto-complete with AI. What we're essentially doing is we get the ID from the params, and then all we're going to do is call getRun in the workflow API, which gets the run instance, and then we can return the same stream that we return in the other endpoint, just without calling the actual agent — only the stream. We're also taking a start index, which is a very helpful suggestion from the AI: we can get a readable stream from a certain start point, which I think is why it was auto-completed. So if you're trying to resume a stream midway, you can pass which chunk you were on when you initially left off. Now that this is done, I'm going to comment out the things we don't currently need. We need the UI to support this conditional of whether to resume or to start a new chat. So I'm going to go to our chat frontend and pull in some code from a different branch for simplicity — it's on the 4-streams branch — which I'm going to just show for completeness. We already do a useChat call in the UI to consume the stream, and all we added now is a transport layer — this big block here — that has some middleware for the stream. It says that when I'm trying to start this call, I'm going to check first whether we have an existing run ID, and if so, I'm instead going to do a reconnect by calling this different API endpoint. I'm handwaving over this a little bit because it's client-side handling details. If there are more questions about this, please feel free.
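The resume-by-index idea is simple enough to sketch: the client remembers how many chunks it consumed, and the resume endpoint replays from that offset. This is a self-contained simulation; the function and storage names are illustrative, not the getRun()/readable API from the talk.

```typescript
// Sketch of stream resumption by chunk index, as the /chat/[id]/stream
// route would do when the UI reconnects mid-stream.

// Persisted chunks for each run (in production this might live in Redis).
const runChunks = new Map<string, string[]>([
  ["run_1", ["Hello", ", ", "world", "!"]],
]);

// Return a run's chunks starting at a given index.
function resumeStream(runId: string, startIndex: number): string[] {
  const chunks = runChunks.get(runId);
  if (!chunks) throw new Error(`unknown run: ${runId}`);
  return chunks.slice(startIndex);
}

// The client had already consumed 2 chunks before disconnecting.
const resumed = resumeStream("run_1", 2);
```

Because the chunks are persisted by the workflow rather than held by the API handler, any handler instance can serve the resume request.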
All right. So, that gives us resumable streams. And I'm also going to demo what it looks like if we deploy this and see it in production. So I'm going to kick that off, and then we can check out a production preview example. In the meantime, the next thing we're going to talk about is events and resumability. The way workflows run is that every step runs on its own serverless instance in production; the actual workflow orchestration layer is only called very briefly to facilitate the step runs. What this allows us to do is have a workflow suspend for any amount of time. A workflow could wait for a week and not consume any resources. This is built into the Workflow Development Kit in a way where, inside a workflow — anything tagged with "use workflow" — we can simply call sleep for three days, for example, and that will pause the workflow for three days and then resume where we left off. If someone was trying to reconnect to a stream, for example — say this was a sleep of an hour — the stream would just reconnect again to the same endpoint and things would resume from there. So we don't lose anything by losing the instance that runs the code, because we can always restart it and resume from where we left off. And this is useful for AI agents because we can expose it as a tool call: we can give the AI agent a tool that says sleep for any amount of time, and then use it to make an agent that essentially works like a cron job — every day, read my emails and do this thing; that would be a sleep of one day. ("Yes — when the agent goes down, does all the state go down with it?") When it sleeps for three days? No, it's paused. ("But when it gets killed for some reason, where does that state go?") So the way it works is that any step call is cached.
When an input goes to a step call, we register that as an event and we run the step, and if the step completes, we cache the output and record that this step has run to completion. So if we run the agent first, say, and we run a bunch of steps, the state of the workflow function at this point in time is saved, and all of the outputs from all of the step calls are saved. When we restart the workflow from this specific line of code, it'll rehydrate the entire state and just continue from there. And this happens so that, again, we don't have to replay any of the code in a way that does any actual resource consumption. So we can use this to make an agent that is essentially a cron job, and we can use it to make agents that run for weeks or interact with your information over a very long time horizon. And while I've been talking, we have deployed our current app to Vercel. So I can check out this preview branch, for example, and you can see the app is now live online and working just as it usually does. And yes, it works perfectly. And again, I can use the UI to inspect this at any point: if I call workflow web with the backend flag set to Vercel and the preview parameter, for example, that'll let it know where our deployment is to be found, and then that'll spin up the same UI, and now we can check on this run that's running in production, and you can see we're getting the same kind of information here. Oh — I'm not going to cancel the run. I could cancel the run. Let's cancel it. This is to show that the way it works locally is, from a conceptual standpoint, the exact same way it works in production. Which is the UX we are aiming for.
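The zero-cost suspension described above can be sketched as follows: instead of keeping a process alive, the orchestrator records a wake-up time in its journal and exits; a scheduler later restarts the workflow, which replays past the sleep and continues. This is a self-contained illustration, not the DevKit's sleep implementation.

```typescript
// Sketch of durable sleep: the first encounter records a wake-up time and
// signals "suspend"; a later replay sees the time has passed and continues.
type Suspension = { wakeAt: number };

function durableSleep(
  journal: Map<string, number>,
  key: string,
  ms: number,
  now: number,
): Suspension | null {
  if (!journal.has(key)) journal.set(key, now + ms); // first encounter
  const wakeAt = journal.get(key)!;
  return now < wakeAt ? { wakeAt } : null; // null means: keep going
}

const journal = new Map<string, number>();
const DAY = 24 * 60 * 60 * 1000;

// First run at t=0: suspend for three days, consuming no resources.
const first = durableSleep(journal, "sleep-1", 3 * DAY, 0);
// Replay at t=3 days: the sleep has elapsed, execution continues.
const later = durableSleep(journal, "sleep-1", 3 * DAY, 3 * DAY);
```

The key point is that no process waits during the three days; the scheduler simply doesn't restart the workflow until the recorded wake time.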
All right. I talked a little bit about sleep and suspend. Let's go and write this sleep tool call. It's going to be very simple. I'm going to copy an existing tool and write a sleep tool call — I'm just going to call it sleep.ts. We're going to trim down the input schema to be something like a timeout in milliseconds, and the actual run command to be none of this and instead just call sleep, because sleep is already a step that we export from the workflow library. We don't need to mark this function as "use step" — but let's see if this — oh, this should be a number. There you go. ("Can you say that again? Why don't you need the "use step"?") Oh, so this is already a step that we export from workflow; the observability will also show it as a step, which we'll see in a second. And this should just work, assuming the prompt is good, which we're going to modify to say something like: only use this tool if the user directs you to do so. All right. Need a double quote here. There we go. And now that this sleep call is set up, that should be all that we need to do. We'll call it run sleep command — the sleep tool — and we're going to add it to our tools list. I think I confused our compiler a little bit, or at least TypeScript. This seems to work great.
Okay, now we have the tool. And we also want the UI to display when it's sleeping. So I'm going to add another function to log the sleep. The reason we're doing this is that we cannot write to a stream directly from a workflow, because then it wouldn't be deterministic anymore — every rerun of the workflow would write to the stream again. ("Yeah, so I'm trying to run the project. I had to create a Vercel API key for the AI gateway. I did that, and I'm getting an error that says a header is missing from the request.") Do you have the OIDC option enabled in the project settings? We skipped this in the beginning — oh yes. Because this code uses the sandbox, you would need to log in for that. My mistake — this should be running locally. We'll have a branch that doesn't use the sandbox for after the talk; for now, I'll just sort it out afterwards. It's fine. So here I'm just going to add another call to the writable — let's see, we're going to need the tool call ID — and now this just writes to the stream, and that should allow us to show it in the UI correctly. Let me see if I configured the UI to correctly interpret this packet. There is no data-sleep type, which I think might — wait. Yes. All right. So, now that I have this, I can go start our app again.
And so it loads. We can try out the second prompt here, which is: sleep for 30 seconds and then return — just to show that it's going to correctly interpret the sleep call and then sleep. It's not showing the data packet here, sadly, but we can go to the web UI and see that it's engaging in the sleep call, and this is going to return after 30 seconds. All right, so that's sleep, and there's one final feature that I want to show you, which is webhooks and the ability to resume from webhooks easily. Implementing webhooks is usually quite difficult, or at least a headache. In our case — I'm going to check out the conf/5-hooks branch — I can show you that, in the same fashion as we did sleep, we can add a new tool. I'll just show you where the actual tool call is: just a log call, and then we create a webhook, which is a function we export from workflow. We can then log the webhook URL to the client or anywhere else, and await the webhook, and this will suspend for as long as necessary until someone hits this URL. Then let's see — the server is running, and hopefully I can show you this running after a reload.
"Wait for human approval before starting, and then build a Pokémon index." Let's see if it handles this correctly. I've been changing branches, so I might need to restart my server. The way this works under the hood is that we create a URL, and we suspend the workflow until a call comes into that endpoint. And this comes with — I'm going to run this query — this comes with a lot of extra features. I could also respond if I wanted to: this is a full API request handler, so I could respond with a response object and treat it as a regular API endpoint. I could also check the body against a result schema, for example, and then only resume once it matches. So this gives you full control. But the nice thing is it hooks up the URL internally, and you can see that it's paused, waiting for a human to click on this link. If you're running on localhost, it's a localhost link; running in production, it will be whatever your deployment URL is. ("Yes — about both sleep and human approval: a workflow is purely steps, and steps always run to completion, right? So sleep is a step — it's not a suspension of the execution, it's a step?") No — we model it as a step for the observability and for how you call it, but it is an internal feature that completely suspends the workflow and all steps; nothing is running while it sleeps. You can also run a sleep and another step in parallel and Promise.all them if you want. It works as a step call in the sense that it's an execution that takes a certain amount of time, and you can use promise/await syntax to model that, but again, it completely suspends unless there is anything else running at the time — and the same for the webhook. It's modeled as a step for the observability, but it completely suspends unless you have other code running at the time.
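The webhook suspension pattern can be sketched as a promise that an incoming HTTP call resolves: the workflow awaits it and is effectively suspended until the URL is hit. This is a self-contained simulation with hypothetical names, not the DevKit's webhook function.

```typescript
// Sketch of the webhook pattern: create a unique URL, await it, and resume
// the workflow when a request arrives at that URL.
type Webhook = {
  url: string;
  promise: Promise<unknown>;
  deliver: (body: unknown) => void; // stands in for the HTTP endpoint firing
};

let counter = 0;
function createWebhookStub(baseUrl: string): Webhook {
  const token = `hook_${++counter}`;
  let resolve!: (body: unknown) => void;
  const promise = new Promise<unknown>((r) => { resolve = r; });
  return { url: `${baseUrl}/webhooks/${token}`, promise, deliver: resolve };
}

// The workflow logs the URL for a human, then suspends on the await.
async function approvalWorkflow(hook: Webhook): Promise<string> {
  const body = (await hook.promise) as { approved: boolean };
  return body.approved ? "resumed: approved" : "resumed: rejected";
}

const hook = createWebhookStub("http://localhost:3000");
const pending = approvalWorkflow(hook);
// Simulate a human clicking the link / an HTTP POST arriving:
hook.deliver({ approved: true });
```

In the real system the await point would be journaled like a step, so the suspended workflow consumes no resources until the request arrives.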
("So just for my understanding: if you have an agent running with a workflow, it keeps running. You connect to it again, say through another session, and you call sleep in that session — does the previous one, whatever it was doing, just go down?") So if you have two sessions — let's say we have a coding session, it already built an app, and then it's sleeping for a week, and then we reconnect to the stream — ("No, the thing is: let's say I kick off a workflow and it's calculating the digits of pi, it just keeps on going, but I connect to the same sandbox and then I call sleep. Will it stop calculating pi?") So let's see how you would code this in a workflow. You have a sandbox, you can connect to the sandbox, you connect again to the sandbox, and some thread calls sleep — does the whole sandbox go down? The sandbox here is Vercel Sandbox, which you can just imagine as an EC2 instance. It's a helper for us to spin up an instance to run this coding agent — to run the code and store the files. If you built this differently, you wouldn't have to use the sandbox, and the sleep call doesn't happen as a bash call, for example. ("So there are two different things, right? An orchestration thing, and then when you're actually in the box you can call sleep in the sandbox?") Right, so there are two different sleeps: there is the sleep you could run from a terminal in the sandbox as a terminal command, and there is sleep from the workflow, which suspends the workflow. Yeah. So we have these features for webhooks, and we can see that after I clicked on the URL, it resumed and then coded me a Pokédex. That is all of the features we're going to cover in this session, and I think we have ample time for Q&A — about 20 minutes at least. Please go ahead. ("How would I spin up a Claude Code session with this — a Claude Code session remotely?")
...run it and kick it off as an agent doing certain stuff? Is that possible, and can I then orchestrate those as agents?

That is possible. Claude Code, if you're talking about the terminal app, doesn't use the workflow features internally, so it's hard to isolate where the orchestration boundaries are. You could write your own version of Claude Code, or take the Claude Code source code and add workflow and step annotations to the calls, and that would then run as a workflow in the cloud.

[Audience] So there's no way to say: okay, here are my steps, spin up Claude Code, type this command, and wait? Anything like that would be a Vercel workflow, but how would I actually bootstrap it? Claude Code is one command in the terminal, right?

I think there's some confusion about where things run. For the coding agent here, if the agent runs mkdir to create a folder, that mkdir command runs in a step, but it runs against a sandbox, the sandbox being a VM. The VM's state is not managed by the workflow itself. So if you call Claude Code on the VM, that's essentially treated like an SSH session. But if you run any agents or steps within the workflow, those steps are going to be resumable and observable through the workflow pattern.

[Audience] Another question: how do I control what my agent has access to, going out to the internet and doing stuff?

This would be whatever you're already doing for the agent. In the end, you're going to be doing tool calls and stream calls to the LLM provider, right?
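The VM-versus-workflow boundary described in that answer can be sketched as follows. `Sandbox` here is a stand-in class, not the real Vercel Sandbox API: the workflow wraps each sandbox command in a step, so the commands become observable (and, in the real system, retryable and resumable), while the VM's own state, such as its files, lives outside the workflow.

```typescript
// Sketch of the boundary discussed above: the workflow wraps each sandbox
// command in a step, so commands show up in the workflow's observability,
// while the VM's state (files, shells) is managed by the sandbox itself.
// `Sandbox`, `step`, and `codingAgent` are all invented for illustration.

class Sandbox {
  files: string[] = [];
  // Stand-in for executing a command on the VM.
  async exec(cmd: string): Promise<string> {
    const [prog, arg] = cmd.split(" ");
    if (prog === "mkdir" && arg) {
      this.files.push(arg);
      return `created ${arg}`;
    }
    return `ran: ${cmd}`;
  }
}

const trace: string[] = [];
async function step<T>(name: string, fn: () => Promise<T>): Promise<T> {
  trace.push(name); // this is what the workflow UI would surface
  return fn();
}

async function codingAgent(box: Sandbox) {
  // Each command is a step: visible in the trace and, in the real
  // system, replayed from cache if the workflow resumes.
  await step("mkdir", () => box.exec("mkdir pokedex"));
  await step("scaffold", () => box.exec("npm create next-app"));
  return box.files;
}

codingAgent(new Sandbox()).then((f) => console.log(f)); // ["pokedex"]
```

Running Claude Code directly on the VM would bypass `step` entirely, which is exactly why it behaves like an SSH session: the workflow never sees those commands.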
That is in your code already, presumably, and whatever you're already using to control permissions there applies: your tool calls, for example. If your tool call allows you to delete a resource in S3, then the agent can do that.

[Audience] So I can write whatever code I want in the usual way, and it's my job to implement the permissions? It's not that it comes with guardrails built in?

Yeah, it's all in the sandboxes. Workflows is a general orchestration layer for durable execution; it doesn't provide a sandbox for running code, third-party code, or agent code, or for creating files. That's what the sandbox is for, because every sandbox instantiation is a new VM that lasts only as long as your session.

[Audience] If I'm running workflows and creating a lot of agent workflows, how does that get queued up on your system? How does that get run? Are there rate limits or concurrency controls we can use?

Yes. This goes into some of the patterns that are all going to be supported, and for the most part are supported right now. If you're deploying to Vercel, for example, as usual with Next.js, every deploy is a separate live URL, and calling it spins up a serverless instance. Your workflows are bound to the deployment. Something very nice you get here: if an agent runs for a week but you deploy five times during that week, those new deploys are isolated from the original workflow. The original workflow runs to completion, and any new workflow runs on the new deployment. We'll also allow upgrading between those. So say you have a workflow that runs for a year, because every month it gives you a summary of so and so, right?
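The deployment-binding behavior can be illustrated with a small sketch. Everything here (`deploy`, `startRun`, `executeStep`, the version ids) is invented to show the idea: each run records the deployment it started on and keeps resolving code from that deployment, while new runs pick up the latest deploy.

```typescript
// Illustration of deployment pinning as described above (invented names):
// each workflow run records the deployment it started on and keeps
// executing that deployment's code, even after newer deploys land.

type Deployment = { id: string; stepCode: (x: number) => number };

const deployments: Deployment[] = [];
function deploy(d: Deployment) { deployments.push(d); }
function latest(): Deployment { return deployments[deployments.length - 1]; }

type Run = { deploymentId: string };

function startRun(): Run {
  return { deploymentId: latest().id }; // pinned at start time
}

function executeStep(run: Run, x: number): number {
  // A pinned run resolves its code from the original deployment.
  const d = deployments.find((dep) => dep.id === run.deploymentId)!;
  return d.stepCode(x);
}

deploy({ id: "v1", stepCode: (x) => x + 1 });
const longRun = startRun();           // a week-long run, pinned to v1
deploy({ id: "v2", stepCode: (x) => x * 10 });
const newRun = startRun();            // picks up v2

console.log(executeStep(longRun, 5)); // 6: still v1 behavior
console.log(executeStep(newRun, 5));  // 50: new deploy's behavior
```

The upgrade path mentioned next (checking step signatures before moving a run to new code) would amount to rewriting a run's `deploymentId` only when the old and new step shapes are compatible.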
But you have new code, and you want the workflow to take its current state and use the new code going forward. There's going to be an upgrade button in the UI that checks for compatibility between the old workflow and the new one, by checking all of the step signatures and all of the existing events, and then you can upgrade the workflow. Or, today, you can already cancel and rerun with the new workflow.

[Audience] Is there a timeout for those workflow steps?

Oh yes. If you're doing serverless, whatever platform you're on, whether it's Lambda or something else, your serverless functions are going to have timeouts. The nice thing is that every step runs in its own serverless function, so the timeouts only apply to individual steps. If one of your steps runs the risk of taking more than five minutes, or fifteen, depending on the platform, you can split it into two steps. And if it does hit the timeout, it'll fail and retry; maybe the retry will be faster, and you'll see in the UI that this step is being retried after 15 minutes.
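The step-splitting advice can be sketched like this. The `step` helper and both workflow shapes are invented for illustration; the point is that since each step gets its own serverless invocation, two smaller steps each get a fresh timeout budget, and a failure in the second half retries only that half.

```typescript
// Sketch of the timeout advice above: instead of one step that risks
// exceeding the platform's function timeout, split the work into two
// steps, each running in its own (fresh) serverless invocation.
// `step` is a stand-in for the real API.

async function step<T>(name: string, fn: () => Promise<T>): Promise<T> {
  return fn();
}

// Risky: one step doing all of the work could hit a 5-15 minute
// platform timeout if `items` is large.
async function processAllAtOnce(items: number[]): Promise<number> {
  return step("process-all", async () =>
    items.map((n) => n * 2).reduce((a, b) => a + b, 0));
}

// Safer: split into two steps. In the real system, a timeout in the
// second half retries only that half; the first half's result is
// replayed from the step cache.
async function processInHalves(items: number[]): Promise<number> {
  const mid = Math.ceil(items.length / 2);
  const first = await step("process-first-half", async () =>
    items.slice(0, mid).map((n) => n * 2).reduce((a, b) => a + b, 0));
  const second = await step("process-second-half", async () =>
    items.slice(mid).map((n) => n * 2).reduce((a, b) => a + b, 0));
  return first + second;
}

processInHalves([1, 2, 3, 4]).then((v) => console.log(v)); // 20
```

Both functions compute the same result; only the step granularity, and therefore the per-invocation timeout exposure, differs.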