This episode reveals how Gradients, a decentralized AI training platform on Bittensor, has engineered a system to outperform industry giants and is now open-sourcing its winning strategies to create a self-improving flywheel for model optimization.
Introduction to Gradients: The AI Intelligence Layer
- Wandering Weights, representing the Gradients team, introduces the platform as the crucial "middle to the end of the stack" for AI model development. Gradients takes models that already understand language and makes them truly intelligent and useful for specific tasks. This process, known as post-training, transforms a generalist model into a specialist capable of following instructions, answering specific questions, or understanding a company's product database.
- The platform offers a simple user interface for creating training jobs for both Large Language Models (LLMs) and diffusion (image) models.
- Users can select any model and dataset from Hugging Face, define the task (e.g., instruction following), and start training with just a few clicks.
- For diffusion models, the platform now includes an auto-captioning feature, simplifying the process of training a model on custom images, such as creating personalized avatars.
The Team and Rapid Development Velocity
- The Gradients team, composed of researchers with publications in top-tier AI conferences like NeurIPS and ICML, emphasizes their hunger and drive over academic accolades. This intensity is reflected in their development pace over the last eight months.
- The team has executed five major releases, writing nearly half a million lines of code across 4,000 commits.
- Wandering Weights credits the miners for over 50% of the success, highlighting the collaborative and high-freedom environment within the Bittensor ecosystem.
- "We are building something from scratch that's ambitious and we have freedom to do so and I think you know with that plus the hunger is just great and we are enjoying it and making progress."
Advancing the State-of-the-Art: DPO and GRPO
- Gradients has integrated two cutting-edge training techniques that push beyond simple instruction-following, allowing for more nuanced model alignment.
- DPO (Direct Preference Optimization): This method teaches a model to prefer a "chosen" answer over a "rejected" one for a given prompt. It is critical for aligning models with human preferences, such as controlling for tone, verbosity, or safety. Gradients is one of only a handful of platforms globally, alongside Databricks, to offer this functionality.
- GRPO (Group Relative Policy Optimization): A more novel technique popularized by DeepSeek, GRPO allows users to define custom reward functions to guide model behavior. This provides immense flexibility, enabling developers to reward a model for specific output formats or for solving complex problems.
- Strategic Implication: Gradients plans to introduce programming containers for reward functions, allowing for complex, programmatic rewards. This opens the door for researchers to experiment with novel alignment techniques and for companies to create highly customized models.
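The two techniques above consume different inputs: DPO trains on preference pairs, while GRPO scores model completions with a user-supplied reward function. A minimal sketch of both shapes, using entirely hypothetical example data and a toy reward (the tag names and scoring are illustrative assumptions, not Gradients' actual API):

```python
# Hypothetical DPO training example: a (prompt, chosen, rejected) triple.
# The model learns to assign higher likelihood to "chosen" than "rejected".
dpo_example = {
    "prompt": "Summarise our refund policy in one sentence.",
    "chosen": "Refunds are available within 30 days of purchase.",
    "rejected": "Well, it's complicated, but basically maybe you can...",
}


def format_reward(completion: str) -> float:
    """Toy GRPO-style reward: score a completion for using a required
    output format, with a small bonus for brevity. The <answer> tags
    and the 500-character budget are illustrative assumptions."""
    score = 0.0
    if "<answer>" in completion and "</answer>" in completion:
        score += 1.0  # reward the requested format
    score += max(0.0, 1.0 - len(completion) / 500)  # brevity bonus
    return score
```

A programmatic reward like `format_reward` is the kind of function the planned reward-function containers would execute, letting developers encode arbitrary behavioral preferences as code.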
Performance Benchmarks: A Decisive Victory
- To validate its core claim of being the best training platform, the Gradients team conducted an extensive experimental study, training over 180 model-dataset pairs and comparing performance against major centralized platforms like Databricks, GCP, and Hugging Face.
- The results were overwhelmingly in favor of Gradients, which produced models with the lowest loss on unseen test data.
- Against its closest competitor, Hugging Face (on tiny models), Gradients won 83% of the time. Against all other platforms, it was a clean sweep.
- This superior performance holds true across all tasks (translation, math, code, reasoning) and model sizes up to 70B parameters.
- Actionable Insight: The speaker makes a bold claim: "If you're a miner on another subnet and you're doing anything that requires training... stop doing that and come and do it on Gradients." This positions Gradients not just as a service, but as a fundamental infrastructure layer for the entire Bittensor ecosystem.
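The comparison metric above is loss on unseen test data, i.e. the mean negative log-likelihood a fine-tuned model assigns to held-out tokens. A self-contained sketch of that comparison (the probabilities are made-up numbers for illustration; this is not Gradients' evaluation harness):

```python
import math


def mean_nll(token_probs: list[float]) -> float:
    """Mean negative log-likelihood over held-out tokens — the
    'test loss' used to rank fine-tuned models (lower is better)."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


# Model A assigns higher probability to the unseen tokens than model B,
# so it achieves the lower loss and wins the head-to-head comparison.
loss_a = mean_nll([0.9, 0.8, 0.7])
loss_b = mean_nll([0.5, 0.4, 0.6])
assert loss_a < loss_b
```

Evaluating on data the model never saw during training is what makes the win rate meaningful: it measures generalization rather than memorization.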
Gradients 5.0: The Pivot to Open Source
- Despite proven performance, the team encountered a critical barrier to enterprise adoption: data privacy. Clients were hesitant to send proprietary data to anonymous miners. This, combined with a philosophical commitment to Bittensor's ethos of open intelligence, led to Gradients 5.0.
- The new model requires miners to submit their training scripts as open-source code repositories rather than just the final trained model.
- This transparency allows customers to see exactly how their data is being handled and builds trust.
- It also prevents a scenario where a single, dominant miner operates a proprietary "black box," which would be antithetical to the goal of decentralized AI.
The Open Source Tournament: A Competitive Flywheel
- The open-source model is structured as a continuous, World Cup-style tournament to incentivize innovation and identify the best training techniques.
- Miners submit their code, and validators run the scripts on a fixed compute budget across a series of tasks (Instruct, DPO, GRPO).
- The tournament proceeds through group stages and knockout rounds, culminating in a "boss round" where the finalist must outperform the previous tournament's winning script.
- This mechanism forces continuous improvement and allows new techniques to be rapidly discovered, shared, and aggregated across the network.
- For Researchers: The open-source scripts are a goldmine, revealing the complex hyperparameter tuning, kernel optimizations, and data handling strategies that define state-of-the-art AutoML.
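The knockout structure described above can be sketched as a single-elimination bracket over submitted training scripts. This is a schematic illustration only; the function names and the assumption that "lower held-out loss wins" are mine, not the subnet's actual validator code:

```python
def knockout(scripts: list, evaluate) -> object:
    """Single-elimination rounds: pair up entrants, run both scripts
    under a fixed compute budget, and advance the winner of each pair.
    `evaluate(a, b)` returns whichever script produced the better model
    (assumption: the one with lower held-out loss)."""
    remaining = list(scripts)
    while len(remaining) > 1:
        remaining = [
            evaluate(a, b)
            for a, b in zip(remaining[::2], remaining[1::2])
        ]
    return remaining[0]


def boss_round(finalist, reigning_champion, evaluate):
    """The tournament finalist must beat the previous winner's script
    to take the crown — otherwise the champion is retained."""
    return evaluate(finalist, reigning_champion)
```

Because every losing and winning script is open source, each tournament round surfaces techniques that the whole network can absorb before the next cycle, which is what makes the flywheel self-improving.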
Breakthrough Result: Outperforming Qwen 3 Instruct
- The episode culminates with a major announcement: Gradients has produced a model that outperforms a leading model from a major AI lab.
- Using their platform and a custom dataset, the team fine-tuned a Qwen 3 base model.
- The resulting model, Gradients Instruct 8B, beats the official Qwen 3 Instruct model on zero-shot benchmarks, particularly in math and instruction following.
- Zero-shot refers to a model's ability to answer a question or perform a task without being given any examples in the prompt, a true test of its generalized knowledge.
- Strategic Implication: This result proves that a decentralized network of competing miners can collectively produce a model superior to one developed by a top-tier, centralized AI company. The team now aims to prove Gradients Instruct 8B is the best 8B parameter model on the planet.
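The zero-shot distinction above is easiest to see side by side. A minimal illustration of the two prompt styles (the arithmetic question is a made-up example, not from the actual benchmark suite):

```python
# Zero-shot: the model receives only the task, with no worked examples.
zero_shot = "Q: What is 17 * 23?\nA:"

# Few-shot: the same task, preceded by in-context demonstrations.
# Zero-shot evaluation removes this crutch, testing knowledge the model
# internalized during training rather than pattern-matching on examples.
few_shot = (
    "Q: What is 2 * 3?\nA: 6\n"
    "Q: What is 4 * 5?\nA: 20\n"
    "Q: What is 17 * 23?\nA:"
)
```

Beating Qwen 3 Instruct under the zero-shot condition is therefore the stronger claim: the Gradients-tuned model cannot lean on in-prompt examples to close the gap.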
Future Vision: Cost-Effectiveness and Ecosystem Integration
- The conversation with host Const explores the future trajectory, focusing on expanding capabilities and deepening integration within Bittensor.
- Video and Beyond: The underlying structure of Gradients can be extended to other modalities like video, object detection, and other "bread and butter" machine learning tasks.
- Cost as a Differentiator: Unlike venture-funded AI labs that can afford to be inefficient, Gradients is built on an economic model that forces cost-effectiveness. The platform's pricing is already significantly lower than competitors like Google Cloud Platform.
- Ecosystem Integration: The long-term plan is to run the tournament's compute workloads on other Bittensor subnets (like Shard), creating a symbiotic relationship where Gradients becomes a major customer of decentralized compute, further strengthening the entire ecosystem.
Conclusion
This episode demonstrates Gradients' evolution into a self-optimizing, open-source ecosystem that verifiably outperforms centralized AI leaders. For investors, this signals a clear path to revenue through superior, cost-effective technology. Researchers can now access and build upon a repository of the world's most advanced open-source AutoML scripts.