
by The Opentensor Foundation | Bittensor TAO
Chutes (SN64) is building a decentralized AI compute and agent platform on Bittensor. Its distributed incentives and Trusted Execution Environments (TEEs) deliver superior, private, and cost-effective AI services, challenging overcapitalized centralized players.
In this episode, John, Algory, Florian, and Vate from Chutes (SN64) detail their journey from free compute to a revenue-generating powerhouse. Their decentralized AI inference and agent hosting is financially sustainable, secure, and efficient, challenging traditional centralized models.
"We became the biggest open-source model provider in all of OpenRouter... at our peak... 160 billion tokens which is just a massive number."
"OpenAI covers 1.5% of its expenses with revenue; Chutes covers 30-40% in just a year."
"I tell you I definitely won't be trusting any trusted execution environments. Unless Aphine miners have been trying to break them."
Podcast Link: Click here to listen

which is the top subnet on Bittensor in a variety of ways. A team that I've worked with (John never sleeps), and Affine, which I've been building of late, is also built on top of Chutes.
I want us to talk pretty high level in this conversation. This is the first Novelty Search of 2026 for us. Where do you want to start? There are so many things since we last had you up on stage.
Yeah, I guess what was it like February probably since the last time we... No, no, no, not February. We've only had you up once. Well, I think we did the pre-announcement one briefly, but anyways, a lot has happened since then.
Yeah, I think we registered on, what was it, December 21st of '24? So we're just over a year old now. One of the things we wanted to do with Chutes is sort of bring Bittensor into the broader AI community.
The first thing we had to do is basically build a system that would meet the AI world kind of where they are, and the easiest way to do that is basically inference and compute. So the first step was basically proving that it could be done.
When we launched, we offered all the models for free. Targon had some for free as well, and we used that to gain some traction and get listed on OpenRouter and things like that. So the first step was: can we even build the thing, and does it work?
We proved yes, it does work, and actually we proved it so well that, as far as I know, we became the biggest open-source model provider in all of OpenRouter. I mean, at our peak, when everything was still free, we did on our top day something like 160 billion tokens, which is just a massive number.
I mean, and that was all Bittensor. It was all permissionless miners, a decentralized network; everything just worked, and it was magical. Then of course we had to get to the point where, okay, we've proven that it can work, but now we need to actually make it sustainable.
So starting I think our first day of actually having any revenue at all was March 8th. And I think our total revenue for that day was like 200 bucks or something. And it's been a long slog from there. Lots of ups and downs, lots of market fluctuations, let's say.
Like you said, these bear markets are sort of a stress test, but it really makes us focus on achieving that goal of sustainability even faster. So we're continuing down our mission of bringing AI, or sorry, bringing Bittensor into the broader community: general-purpose compute, inference, the most transparent, definitively secure open-source inference on the planet, basically, between the TEE stuff and having our entire stack be open source.
We're pumping all of that money directly back into the ecosystem so that we can actually build a flywheel that can maintain itself indefinitely and eventually, hopefully, become fully autonomous.
I don't want to talk too much, because we have just a ton of high-level overview stuff that everyone wants to talk about. Algo is going to be talking about our revenue numbers, our partnerships, subscription growth, all that kind of stuff.
Continuing down the path of meeting people where they are, we're going to be talking about sandboxes, providing secure CPU compute services, and building easy ways for people to use the existing tools they already have, like Clawdbot or OpenClaw or Moltbot, whatever you want to call it, stuff like that.
One of our largest use cases is roleplaying. The numbers are basically: coding and roleplay are 90% of our traffic, something like that. Complex here has built an amazing product that we're going to showcase there as well. So just a ton and ton of stuff.
I'm going to stop talking because people have heard it too much. There's another angle here too, which I hope you go over, John, which is how much of Bittensor is built on Chutes.
Yeah, I think that's one of the major focuses that Algo is going to be talking about here. I don't know if he wants to dox himself or not, but hey, how's it going? But yeah, it's amazing, and that's one of the key things about Chutes: it's not just providing compute.
I mean, we have a ton of additional, specifically built mechanisms in place to help the subnets achieve what they want to achieve, whether it's blocking network connections except for specifically whitelisted domains (so that you can have video downloads but still offline predictions), or custom validation so people don't just embed entire datasets into their model weights or something like that with trust_remote_code to extract the answer.
I mean, we build a lot of solutions into Chutes to make sure that all the subnets that would like to use Chutes are successful without having to do all that legwork themselves. But yeah, definitely something we're going to talk about here shortly.
It does keep us busy, though, I will say that. Busy is something we've always got going on for us. So thank you, John, for the brief introduction. I go by Algory around here; Tai is my first name.
I'm just going to give a little intro here. You can see on Vate's screen our little slide cards, but just a quick outline of some of our revenue trends, and then some introduction into the sales side, where we've been working with some new clients and integrating new aggregators and providers to work with Chutes to get our models out to the general public in a variety of ways.
Then some of the projects that we've engaged with, and then, as you mentioned, our cross-subnet integrations and how we're functioning within Bittensor as a whole. So to begin with, you should see on Vate's screen there I've got a nice chart just to show our overall revenue growth from the beginning to now.
It's kind of gone through about three phases, the way we've broken it down. Our initial phase, starting from March when monetization was first enabled up until about June of 2025, was pretty small: low daily revenue, but, you know, building.
So we're looking at $100 to $500 days. Most models were still free at this point, but we were starting to move towards monetization, subscriptions, and pay-as-you-go (PAYG). Then from there, we started to ramp up from about June to August of 2025; we got a lot more growth there, brought in more subscribers and more medium-tier and enterprise-style PAYG clients.
We jumped up a bit there. We started having more like $6,000 to $11,000 days by the end of that period. Then recently, around September, we did a payment system migration, and along with that was when we really started to mature into larger-scale daily revenue.
From that point, we've been in the $9,000 to $20,000 a day revenue category. Yesterday we hit, I think, $22,000 for a single day, which was a peak for us, which is great. Of that revenue, from just the last 90 days, we've got a total revenue of $1.3 million.
It's probably a little bit higher than that based on the way everything's reporting, but right in that realm. Of that, purely from subscriptions and fiat pay-as-you-go, we're looking at around $441,000 from just the consumer side, and then from enterprise, business-to-business enterprise clients that we've onboarded, right around $378,000.
So cash for the last 90 days, just USD, is in that $800,000 range. Obviously we're very happy with that; growth is moving really well. Sorry, I thought I heard something; let me just keep going. Then, a metric that we came up with and are actually very proud of at this point: our revenue currently accounts for roughly 30 to 40% of our daily outgoing expenses, which, depending on your perspective on the market, can seem either good or bad, but generally speaking, for a new business like ours with the growth that we're experiencing, it's a pretty incredible thing.
When you compare it to a provider like OpenAI, for example, their average yearly income is about $20 billion, but their yearly outgoing is about $1.4 trillion. So about 1.5% of their expenses are covered by revenue. The fact that we're able to cover 30 to 40% at just a year old is something we're very proud of.
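As a rough sanity check on those figures, here's a hypothetical back-of-envelope calculation. The OpenAI numbers are the ones quoted on the show, not audited figures, and the Chutes inputs are simply the midpoints of the ranges mentioned above:

```python
# Back-of-envelope check of the coverage ratios quoted above.
# All inputs are as quoted on the show, not audited numbers.
openai_revenue = 20e9      # ~$20B yearly income, as quoted
openai_spend = 1.4e12      # ~$1.4T yearly outgoing, as quoted
openai_coverage = openai_revenue / openai_spend
print(f"{openai_coverage:.1%}")            # ~1.4%, the ~1.5% cited

chutes_daily_revenue = 15_000   # midpoint of the $9k-$20k/day range
coverage = 0.35                 # midpoint of the 30-40% coverage range
implied_daily_emissions = chutes_daily_revenue / coverage
print(f"${implied_daily_emissions:,.0f}")  # ~$43k/day of miner emissions
```

So at a 30-40% coverage ratio, the quoted revenue range implies daily miner emissions on the order of a few tens of thousands of dollars, which is consistent with the emission discussion later in the episode.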
Let me just clarify one little thing there. What we mean by that is that our first revenue goal is to fully offset all miner emissions, and that means we're covering like 40% of all miner emissions; whether it's sold or not is irrelevant. We're trying to make sure all of the emissions for miners are fully offset by revenue.
Even if, you know, some of it is held, the goal is all of it. So that's kind of what we mean by that number. Right, sorry for any confusion there, but yeah, exactly, that's what I mean.
But we're trying to just... if you don't mind, let me jump in here, because there's been some color on this. Since the beginning of dTAO, there's actually been some debate around the way in which you should see miner emissions. John, you and your crew have been on the team of "don't burn," even though you're paying them too much, right?
You know, you could have burnt them down to X percentage, so they made nothing above what they made in mining, but you saw it as actually distributing out to your community, and that they wouldn't have to sell that and would just become aligned. That's a bit of a prompt there for you, but how do you see that?
Well, the way I look at it is very much like an investment. If you were to do a Web2 company and give out shares in the company, you wouldn't necessarily expect the employees to immediately go liquidate all of those. Maybe they can, you know, if they're not restricted stock options or something like that; they could just immediately go liquidate them.
But in theory, if we treat miners like part of the company, because they are, I mean, they run the whole product; if we treat miners like they're employees of the company and part of the team, you want to let them make that investment decision themselves, right?
So it's sort of ownership in the company, if you want to look at it that way, right? It's ownership in the subnet. What you choose to do with it is up to you. But the way I look at it: someone who is just casually buying subnet alpha tokens, that's great and wonderful and we appreciate that. But if someone is actually working 8-, 10-, 12-hour days, managing infrastructure, finding deals, fixing problems with disks when they go bad, whatever the case may be; if these people are spending time actually optimizing the platform and helping us with bug reports and doing all that kind of stuff...
You know, they are the best investors we could possibly have, and I would rather give them the extra alpha, even if it's, quote-unquote, a waste, or excess, whatever. They're part of the team, and they should be able to do whatever they want with it; the protocol was designed in a particular way. I just don't see why we would do anything other than reward the participants with the heaviest engagement in that way.
I wanted to say too, other than when TAO was, you know, like three times higher than what it is now, a lot of the excess, or I guess waste you could say, wasn't necessarily a ton of waste. When miners are getting paid out like five times what they're spending, then yeah, you're wasting money there.
I think it's important for subnets on Bittensor to not just focus on the best value for their token, right? If you have to burn a huge percentage of your network, you're not really running much of a network there. You have to put a heavy hand on it, you have to control the direction it's going, and a lot of the emission is kind of pointless to your miners; it's not a reward for them.
So I think it's important that we aren't trying to burn, you know, like 90% or something, but rather we're trying to look for more ways to make our existing emission more efficient. Complex, you're on the Chutes team right now, right?
Yeah. Were you initially? No. So initially I helped with some of the security stuff on GraVal, the GPU validation. I've known John for probably a decade now, and he got me into Bittensor and brought me into this community, and then he told me he was doing a subnet.
I wasn't really too interested at the time, but watching it grow and wanting to become more involved, we kind of worked out what I could work on and what I could bring to it. How many of the people that work on Chutes, John, were first miners, then holders, then community members, and now they participate in building everything inside the ecosystem?
I couldn't venture to guess. I could say for our team that Complex and the two dudes; Florian I sort of Bittensor-pilled. He was not really into Bittensor or crypto or anything; he was just more in the AI space, and then I, you know, derailed his entire life trajectory, I guess, with Bittensor, but in a good way.
And then paper money. Let's see, I don't think you did; did you do any sort of crypto stuff before? I mean, we worked at a company together in the past, so I knew him through real life. But zero crypto experience before.
So we've got a bit of a mix here. But I would say, in general, a lot of the subnets, and even if it's not necessarily subnet ownership and stuff like that, a lot of the direction of subnets comes from miners, for sure.
Because they're in it day in and day out. So we're always adapting from miner feedback personally at Chutes, and I can see that through the ecosystem as well. It's easy to forget, because it seems so long ago, that a lot of the subnet owners on Bittensor today were first miners of the original mechanisms on Bittensor, who are now subnet owners who have miners that are part of their teams, and so forth.
I started on Nous, right? The Nous Research subnet, subnet 6, way, way back in the day, but I wasn't as early as, you know, Rob, who was on when there was only one subnet and stuff like that. So I was a little bit of a late joiner in that sense. Algo, I've completely derailed you as usual.
It's not Novelty Search without derailing. You were saying about 30 to 40%; just to give people context on what that means, when you said miner emission versus revenue here, this is 40% of miner emission. Correct. Right. So currently, in our recent trend, we're hitting around that number, between 30 and 40% of miner emissions daily.
So today it'll be higher, with the largest day in history, right? And what do you know about the capital costs there? So how much are you overpaying miners? You're making 40%, you're paying miners 100%; how much are they selling, 60%?
That's a good question. I don't actually... let me see. We don't track that. But our assumption is basically that that number can be 100% and we shouldn't care. I mean, that's kind of our operating guideline: we should assume that 100% of that is going to expenses.
Now, most of our... well, I don't want to say most; a lot of our hardware is actually just idle compute from data centers. So to them, I mean, there are times, if you look at the entire inventory of nodes, where we're only paying something like 70 cents an hour for an H200.
Surely that is lower than the cost of... Yeah, I mean, even just the cost of electricity to run those things, right? So we try to make it as frictionless as possible, and we're working on standalone VMs that data centers can just PXE-boot into whenever they have idle capacity.
We're going to try to make it so frictionless that you literally do nothing. Your machine goes idle, you spin up a VM; you need it back, you advertise to us that you're done with it and you spin it back down. Then, even if we're paying at that point 10 cents per hour, it's still greater than zero, right?
That's... Sorry, John. No, no, go ahead. What I was going to say is, we could try to really fine-tune exactly how much hardware we have, and we do to a certain extent. We have, you know, caps on the number of instances we allow per model.
We've been steadily deleting underutilized models, consolidating older versions of models into just the newer ones, and we're constantly updating the engines to make them faster, with better configs and stuff like this. So we're continuously trying to drive down the required expense. But if we forced it down to just that required expense, one thing that would happen is people wouldn't want to put their idle compute on, and then if we have, say, a DeepSeek 4 come out, we just won't have the inventory to be able to launch it.
So having some base level of idle compute is actually hugely important for us, so that we can launch these new models when they come out and traffic will shift. If we were constantly burning or fine-tuning or limiting and capping in other ways, we just wouldn't have that flexibility, and these models get released constantly; there's at least a model a week, I mean way more than that, either image models, text-to-speech, or LLMs. There's something new constantly.
The cost there I couldn't say, because half of it is just idle compute that would be getting zero anyway. And how about utilization? Like, if I run an H200 on Chutes, how much of the time will I be serving requests?
If it's an 8x H200 node, it will almost certainly be fully busy 24/7. Almost constant, basically. Wow. And so you're being paid for that. Then at the incentive level, you guys can optimize this, right?
So you can build the incentives for the type of compute that is required, that makes the most amount of money. Obviously you're trying to build a platform here where there's always uptime for things, but hypothetically you could build this so that the incentive goes exactly where it needs to be, to have that compute online to service the most requests that make the most money for Chutes. At this point it's basically just an optimization problem.
Exactly. We actually just rolled out a brand new incentive mechanism maybe a month and a half ago or so; I don't know, time is just an illusion to me at this point. But we rolled out a new mechanism where every single chute has a dynamic incentive.
What we mean by that is, basically, there's only a certain amount of capacity that a particular model can serve before things degrade: your time to first token would be minutes, in which case no one would want to use it, all clients would eventually cancel their requests, there would be timeouts all over the place; it would just be a disaster.
So we always have hard caps on the number of concurrent requests any single node can handle, and that's usually somewhere between 40 and 60, depending on the model. From that, we can track what the utilization of every single instance is at all times. Then we can also see whenever an instance is going beyond that and has to start rejecting requests.
So what we do now is, basically, the incentive per instance is based on what the utilization is at the time. If it goes above 80%, we start adding some incentive to it, so more miners will spin it up; and if it starts rate-limiting, we really add incentive to it, because that means we're not actually serving the capacity that is needed.
The only other thing that we need to add into this formula is exactly what you said, which is to make it a revenue optimization problem. Say we have two competing models, both at 100% capacity. One of them makes us one penny per hour; one of them makes us $100 per hour. We should definitely be giving more incentive to the one that makes us more money.
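A minimal sketch of how such a dynamic, utilization-driven incentive could look. The thresholds, the multiplier sizes, and the revenue term below are illustrative assumptions based only on the mechanism as described here (boost above 80% utilization, boost hard when rate-limiting, eventually weight by revenue), not Chutes' actual formula:

```python
def chute_incentive(utilization: float, rate_limited: bool,
                    revenue_per_hour: float = 0.0,
                    base: float = 1.0) -> float:
    """Toy per-chute incentive weight (illustrative, not Chutes' real weights).

    utilization: fraction of the node's concurrency cap currently in use.
    rate_limited: whether the instance has started rejecting requests.
    revenue_per_hour: hypothetical revenue term for the future extension.
    """
    weight = base
    if utilization > 0.80:   # nearing capacity: attract more instances
        weight *= 1.5
    if rate_limited:         # actively rejecting requests: boost hard
        weight *= 3.0
    # Hypothetical revenue weighting (described above as a future step):
    weight *= 1.0 + revenue_per_hour / 100.0
    return weight

# A saturated, rate-limited, high-revenue chute earns a much larger weight:
print(chute_incentive(0.95, True, revenue_per_hour=100.0))  # 1.0*1.5*3.0*2.0 = 9.0
print(chute_incentive(0.50, False))                         # 1.0 (healthy headroom)
```

The key design property is that the weight only rises when demand is provably unmet, so extra emission flows to exactly the capacity that customers are waiting on.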
We have all the backend means to do that; it's just not something we've done yet. Mostly at this point, Florian has built some amazing tooling: he built a price optimizer for us that will, using actual metrics, give us what the input versus output price should be, to make sure we're in the lead position on OpenRouter for routing preferences, and to make sure that we're actually going to cover whatever the instance cost would be, in terms of dollars per hour per GPU.
So yeah, for us it's all about fixing that problem, and when we see a model that no longer makes sense financially to run, either because it's idle or because it just doesn't have any profit margin, we start reducing the overall capacity on that and try to get people to shift over to other things. It's a shifting landscape, to be sure.
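The core arithmetic behind that kind of price optimizer can be sketched as a cost floor: given a GPU's hourly cost and its measured sustained throughput, the blended per-token price has to clear the hardware cost. This is an assumed simplification, not the actual logic of the tooling described above, which also splits input versus output pricing and targets routing position:

```python
def floor_price_per_mtok(gpu_cost_per_hour: float, tokens_per_second: float) -> float:
    """Minimum blended $ per 1M tokens needed to cover GPU cost.

    Assumes one GPU serving at a sustained, batched throughput; real
    pricing also differentiates input vs output tokens and adds margin.
    """
    tokens_per_hour = tokens_per_second * 3600
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000

# e.g. a hypothetical $2.50/hr H200 sustaining 2,000 tok/s across its batch:
print(floor_price_per_mtok(2.50, 2000))  # ~0.347 $/Mtok break-even
```

Anything priced above that floor is margin; anything below it is the "no longer makes sense financially" case where capacity gets reduced.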
It could be a tricky thing. Well, I was just going to point to the fact that, on the screen here, we have OpenAI not able to cover its expenses; only 1.5% covered, it's losing money like crazy. I'm sure Anthropic is also losing money like crazy. I'm sure basically every inference provider in the world is losing money like crazy, because it's overcapitalized.
Where this converges is to who can optimize the compute and optimize availability and utilization, right? When you get to the full commodification of inference, that's all it is. Can every machine be available for inference at any time?
There's no better way to do it than direct incentive mechanisms, right? If you're OpenAI, you go hire a business development team, they go get some compute from Nvidia, and sure, that gives them their edge because they can shake hands. But eventually, are they optimizing that compute? Are they provisioning it as perfectly as you guys are?
Do you see that the tech that you're building can leapfrog these companies in terms of efficiency? There is one aspect that nobody's thinking about: those centralized companies are just building huge data centers somewhere, like a data center as big as Manhattan somewhere in the US. And in the US, half of the year is winter and half of the day is night, and because they're centralized, they can't run the inference from anywhere else; they've got their data center there.
We've got miners all over the world, and the cheapest excess energy there is comes from renewable sources, you know, solar energy, wind energy. So basically, we're not sure about this, but we could even have the advantage that we get less carbon emission and cheaper energy costs, because miners just go and rent machines wherever it's daylight and summer.
That's kind of it, isn't it? Elon Musk or Mark Zuckerberg built a Manhattan-sized data center; you guys built one that spreads over the entire globe. It's a network, and it will be bigger, because it doesn't have the limitation of needing to be in one state, completely destroying the energy usage of that location, and it's more efficient.
I mean, this is where we're converging towards Bitcoin mining for inference. That's always been the goal for subnets, right? To get towards that commodification. It's harder because the commodities are more complex; they require a lot more overhead, a lot more validation, verification, distribution. But Chutes is one of those subnets, and there are a number now that are approaching that fine line: the direct translation of energy into, in this case, intelligence.
That's the goal. I mean, you should hear some of the stories from the miners, where they found crazy pricing on H200s by the Fukushima power plant because, you know, it's like a forbidden zone and stuff. I don't know. But to your point, if we can get the price of inference down to the actual watt-hour price, that's sort of the goal.
That's basically what Bitcoin has done with ASICs and stuff like this. We're trying to do that. We still have a long way to go. There are other, more fundamental things; hopefully there will be a shift in the actual underlying architecture of these models.
So it's not just, you know, exponential complexity with the context size. Nemotron 3 is a good example, where they have a hybrid Mamba-transformer architecture, so the model is super fast regardless of the context. When we start seeing more things like that, I think the per-watt pricing of inference is going to become more of a reality.
Then you see chips like Cerebras and stuff like this, with their actual ASICs for transformers. I mean, those aren't really that flexible if the architecture were to shift. But yeah, that's definitely what we're doing. That's what the miners are good at, right?
It's not our job to go find the cheapest GPUs on the planet, get that incentive, and actually make a profit. We don't really care where you source your GPUs, how you get them, or what the story is behind them. Provide us the compute and you'll get the incentive. So it's a nice problem that we don't have to solve.
Because I can't even imagine the headache of trying to source, you know, a massive data center. If I wanted to go buy 100,000 GPUs right now, today, and a data center, and have power and everything, it would just be a nightmare, and Bittensor completely solves this. And it's a uniquely human problem too, because look at what's happened to OpenAI with Nvidia, right? They just dropped the deal. You don't have to do that; you don't need to see the leather jacket. So we're collapsing here towards this commodification. At the end of this live we're probably going to talk about, well, here we go, some integrations that are new, which is fantastic. But how about the other side of integration? Do you think it's beneficial for Chutes to have subnets that mine Chutes? Does that make sense, has it happened, will it happen? Does Chutes collapse to using compute as a commodity, and then, say, Celium builds up and mines Chutes? I know that's happening right now.
Do you think that would happen? I think it absolutely could happen. There are sort of two roadblocks, and they're not really roadblocks; I would say they're more technical challenges to overcome, so fun problems to solve. One of them is that we're trying to shift more and more towards exclusively TEEs, which basically requires a bare-metal machine, and the compute subnets currently... I know there was at least one that was sort of in the works, thinking about doing fully bare metal. That's something that could be done, and then TEE would just work, because we would have one subnet that is responsible for getting TEE hardware, and then they could just run the VM. It doesn't matter what they do with that hardware.
You could use it for other purposes. You could serve it to Basilica or whatever. But if we had a fully bare-metal subnet that could just drop in the Chutes VM, or whatever VM, the Targon VM, that could work, to be sure, and that would help. So here's why I think that's interesting. I'm derailing you guys again.
When you look at the compute subnets on Bittensor, the biggest problem they have is that they're basically paying their miners continuously, right? They need to solve a delivery problem, a distribution problem. They're an always-on market, and the demand is not always on. Same problem that you have, right?
So you have continuous incentives, and you need always-on providers. That's why always-on customers, which is what subnets are, make great customers, and that's why pay-per-month is great for Chutes. I've been trying to push the compute subnets to think about being providers to always-on incentive mechanisms, because then they can basically say: we have always-on compute, we can give it to Chutes, who's going to pay us always-on for inference. Those two things mesh well together, and then the subnets that are a level up have to solve the distribution problem for you, and so forth and so on.
That is one of the misconceptions I think people have about Chutes: people just say we're a compute subnet, and to a certain extent we are, to be sure, but we're really much more of a service provider. We build the entire orchestration, the monitoring, the failover, the autoscaling, the billing, the single sign-on integrations; we're very much a service provider more so than just a compute provider.
Having an integration, if there were a bare-metal compute provider, would be perfect for us, because then we can build our service on top of that, and it's still going to be up to the miners. The miners on Chutes currently have two problems to solve: sourcing hardware, and then optimizing their strategy to make sure that they actually meet the demand 24/7 with the most cost-effective hardware they can for the needs at the time.
So, for example, if we have a chute that can run on either 4090s or A100s, sure, you could go use your A100 on that, but that's way more expensive than just using a consumer GPU, because the chute definition says it can run on consumer GPUs; you don't need an A100. So there's a resource identification problem to solve, and then an allocation-of-those-resources problem to solve, and the allocation is what the Chutes miners are meant to be doing: finding the cheapest suitable compute and scaling appropriately.
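The allocation step described above, picking the cheapest GPU type a chute's definition permits, can be sketched like this. The catalog names and hourly prices are purely illustrative, not real Chutes pricing or selector syntax:

```python
# Illustrative $/hr prices per GPU type, not real quotes.
GPU_CATALOG = {
    "4090": 0.35,   # consumer GPU
    "a100": 1.10,
    "h200": 2.50,
}

def cheapest_suitable(allowed_gpus: list[str]) -> str:
    """Return the lowest-cost GPU type the chute definition permits."""
    candidates = [g for g in allowed_gpus if g in GPU_CATALOG]
    if not candidates:
        raise ValueError("no suitable GPU available")
    return min(candidates, key=GPU_CATALOG.get)

# A chute whose definition allows consumer GPUs should land on the 4090,
# rather than burning an A100 on it:
print(cheapest_suitable(["4090", "a100"]))  # 4090
print(cheapest_suitable(["a100", "h200"]))  # a100
```

A miner running something like this across their inventory is doing exactly the job described: meeting the demand with the cheapest hardware the definition allows.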
Well, let's continue. Let's talk traction. Sure. Yeah, just to speak to what John was talking about with making sure the cards are allocated correctly: a lot of that's dictated through node selectors and how they're set up within chute definitions, and there are ways to optimize it from a user perspective as well.
That's important when it comes to people deploying private chutes: there are ways to optimize their node selectors so that they're getting the best price for their chute deployment, while also making sure they're not wasting higher-level compute that isn't necessary. So it's already something that's considered and in play. But to continue with the traction point of view: we've been working with some new providers, and obviously with the deployment of our TEE infrastructure, which paper money can get into the details of how it functions on the back end, a lot of interest has come because of that. So there are a few aggregator services that we started working with; one in particular, RedPill AI, is a security-focused aggregator.
We've engaged with them and they're going to be offering our TEE models, and we're trying to bring Chutes to the broader market. Within Bittensor, it's amazing the way we're able to integrate with all the other subnets, and I'll show some of the most recent ones that we've started integrating with; we have a whole list that we're in partnership with. But another aspect is trying to get out to the broader market that isn't aware, and make them aware of Bittensor and of Chutes, and these aggregators are one of the ways we've been doing that.
Also, on the neat-project side, for anybody who's into gaming, Pax Historia is a client that we've recently started working with. They're really neat. For anybody who's played Risk: it's very much like Risk, but AI-powered. It's a real-time strategy map game where you pick a country that you control, and all of the other countries that surround you are powered by LLMs.
When you make political or militaristic decisions against those nations, the LLM responds to, you know, combat you or whatever. It's a really neat use case for these models that kind of differs from our typical bread and butter, which is the coding and the roleplaying. It's really neat working with them, and they're integrating our models into a certain tier of their product.
So I just... they've been really cool, a great team, nice guys to work with, and I encourage anybody to check that out if you're into that kind of game. And why do they want to work with you guys? Well, we host a nice open-source model. They previously...