This episode delves into Templar's pioneering decentralized AI training on Bittensor, revealing how incentivized, permissionless compute pooling aims to rival centralized AI labs and forge community-owned foundation models.
The Case for Decentralized AI Training
- Distributed opens by framing the core challenge: Frontier AI labs (like OpenAI or DeepMind) dominate foundation model training due to massive, centralized compute resources. He argues that decentralized training, where a dispersed group pools compute and co-owns the resulting model, is the only viable path for open-source AI to compete with and potentially surpass these labs. The vision is to create a "supercomputer" from global contributions, capable of training AGI-scale models and generating significant value, potentially monetized through token-gated access or by servicing frontier labs themselves.
- Distributed asserts, "Cracking decentralized training is the only way that a dispersed decentralized group of individuals can pool compute and co-own a model together and overtake these Frontier Labs."
Challenging Centralized Control and Copyright Concerns
- The conversation highlights the ethical and control issues inherent in centralized AI development, referencing the controversy around OpenAI's Ghibli-style image generation under Sam Altman and the alleged copyright infringement involved. Distributed emphasizes the need for alternative, community-driven model creation as a counterweight to such actions by powerful, centralized entities.
- The ability for the community to build state-of-the-art models is presented as crucial for ensuring fair value distribution, especially when models inevitably learn from vast amounts of existing data, including creative works.
Templar: An Incentivized Marketplace for Foundation Models
- Templar is presented not just as a decentralized training protocol, but as a "collectively run intelligence market" built on Bittensor. It leverages Bittensor's consensus mechanisms to incentivize the creation of state-of-the-art foundation models through dTAO (Dynamic TAO, Bittensor's subnet-token mechanism), turning the process into a marketplace.
- In Templar, miners collaboratively train one global model. The core problem tackled is validating miner contributions effectively in a permissionless environment.
The Mechanics of Templar's Decentralized Training
- The process involves miners receiving a "slice" (subset) of data, assigned pseudo-randomly but reproducibly from the block hash so that no one can know their slice in advance (see the slice-assignment sketch after this list). Miners have a tight window (e.g., 7 blocks, ~84 seconds) to train intensely on their assigned slice and upload the resulting "gradients" – mathematical representations of the learning adjustments the model needs based on that data – to storage (specifically R2 buckets).
- R2 buckets are Cloudflare's S3-compatible cloud object storage, used here for its high bandwidth and free data egress (outbound transfer), which suits moving large gradient files. Validators then assess these gradients. The system demands near-perfect uptime and performance from miners due to the sensitivity of distributed training, with significant penalties (slashing) for failures.
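- As a minimal sketch of the slice assignment described above (function and parameter names here are illustrative assumptions, not Templar's actual API), both miner and validator can derive the same indices from the block hash and the miner's hotkey:

```python
import hashlib
import random

def assign_data_slice(block_hash: str, miner_hotkey: str,
                      dataset_size: int, slice_size: int) -> list[int]:
    # Seed mixes the block hash (unknowable in advance) with the miner's hotkey,
    # so any validator can later recompute exactly which samples this miner
    # was responsible for during the window.
    seed_material = f"{block_hash}:{miner_hotkey}".encode()
    seed = int.from_bytes(hashlib.sha256(seed_material).digest(), "big")
    rng = random.Random(seed)
    return rng.sample(range(dataset_size), slice_size)

# Miner and validator derive the identical slice independently.
indices = assign_data_slice("0xblockhash", "miner-hotkey",
                            dataset_size=1_000_000, slice_size=4096)
```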
The Immense Challenge and Importance of Coordinated Compute
- Constantijn interjects, emphasizing the profound difficulty of this endeavor. He notes that training a model even in a single data center is hard; doing so across distributed, potentially adversarial peers, while implementing direct incentivization during the training run, is unprecedented and exponentially more complex.
- He reiterates the "why": combining global compute resources is fundamental to Bittensor's original vision and essential to prevent AI from becoming a purely centralized enterprise. This coordination problem is positioned as a core reason for Templar's significance.
Validating Miner Contributions: Ensuring Quality and Honesty
- Distributed explains the validation process further. Validators must determine if a miner's submitted gradient genuinely improves the global model more effectively than a baseline effort.
- Initially, a naive approach had validators sample the miner's data slice, measure the loss, apply the miner's gradient, and check whether the loss decreased sufficiently (a minimal sketch of this check follows this list). However, this method proved suboptimal, a point elaborated on later regarding miner exploits.
- The goal is for miners to produce gradients that demonstrably outperform the validator's own potential training on that data slice.
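- A minimal sketch of that naive loss check, assuming PyTorch and a miner gradient delivered as a name-to-tensor dict (names, the SGD-style step, and the threshold are illustrative, not the protocol's real update rule):

```python
import copy
import torch

def naive_validate(model, loss_fn, batch, miner_gradient, lr=1e-4, min_improvement=0.0):
    """Naive check: does applying the miner's claimed gradient reduce the loss
    on a sample drawn from the miner's assigned data slice?"""
    inputs, targets = batch
    model.eval()
    with torch.no_grad():
        loss_before = loss_fn(model(inputs), targets).item()

        # Apply the gradient to a throwaway copy with a plain SGD-style step;
        # the real protocol's optimizer and scaling will differ.
        trial = copy.deepcopy(model)
        for name, param in trial.named_parameters():
            if name in miner_gradient:
                param.add_(miner_gradient[name], alpha=-lr)

        loss_after = loss_fn(trial(inputs), targets).item()

    # Reward only if the improvement clears a baseline threshold.
    return (loss_before - loss_after) >= min_improvement
```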
Lessons Learned: The Brutal Difficulty and Power of Community
- Distributed shares candidly about the extreme difficulty ("hard on steroids") compared to blockchain development, citing the near-zero tolerance for errors in AI training, in contrast to blockchain consensus, which tolerates sizable minorities of faulty participants (roughly a third under BFT, just under half under honest-majority assumptions).
- He recounts a pivotal moment where expressing vulnerability, prompted by Constantijn, transformed Templar from a top-down project into a true "Frontier Community." This shift fostered collective ownership and collaboration among miners, moving from an adversarial "let's hack this guy" mindset to a shared "let's build this thing together" ethos, proving essential for progress.
Miner Dynamics: Exploits, Adversarial Pressure, and Adaptations
- The conversation highlights the constant adversarial pressure from miners, who relentlessly probe for weaknesses. Examples include:
- Gradient Manipulation: Sending gradients containing extreme values or NaNs (Not a Number) to disrupt or capture the validator state (a Christmas Day exploit).
- Bucket Copying: Since miners initially shared read keys for their R2 buckets, some miners simply copied successful gradients from others instead of computing their own.
- Free Riding: Exploiting slow validation times, some miners would achieve a few good scores early in a window and then shut down their machines, avoiding further compute cost.
- This led to stricter validation (a sketch of such screening follows this list), heavy slashing penalties, and continuous protocol refinement. Distributed acknowledges the miners' role: "Miners, miners, miners. These guys are amazing, you know, we live and die by them, but they will also drive you insane."
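- A hedged sketch of the kind of validator-side screening these exploits forced, assuming gradients arrive as name-to-tensor dicts (thresholds and function names are illustrative):

```python
import hashlib
import torch

def is_sane(grad: dict[str, torch.Tensor], max_norm: float = 1e3) -> bool:
    """Reject gradients containing NaN/Inf values or implausibly large norms."""
    for tensor in grad.values():
        if not torch.isfinite(tensor).all():
            return False
        if tensor.norm().item() > max_norm:
            return False
    return True

def gradient_fingerprint(grad: dict[str, torch.Tensor]) -> str:
    """Hash the raw bytes of a gradient so byte-identical copies are easy to spot."""
    h = hashlib.sha256()
    for name in sorted(grad):
        h.update(name.encode())
        h.update(grad[name].detach().cpu().numpy().tobytes())
    return h.hexdigest()

def screen_submissions(submissions: dict[str, dict[str, torch.Tensor]]) -> set[str]:
    """Keep only miners whose gradients are well-formed and not copied from
    another miner's bucket."""
    seen_fingerprints, accepted = set(), set()
    for miner_id, grad in submissions.items():
        if not is_sane(grad):
            continue
        fp = gradient_fingerprint(grad)
        if fp in seen_fingerprints:
            continue  # byte-identical to an earlier upload: likely bucket copying
        seen_fingerprints.add(fp)
        accepted.add(miner_id)
    return accepted
```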
Iterative Development and the Impact of Token Economics (DETA)
- Constantijn emphasizes Bittensor's rapid iteration capability (using Python, Docker Watchtower) allowing teams like Templar to deploy directly to production and learn quickly from real-world attacks.
- He notes the evolution from naive initial designs to the current, battle-hardened system, achieved through ~200 experimental runs.
- He introduces dTAO (Dynamic TAO), explaining how subnet-specific tokens align miner incentives with the network's success.
- As miners become owners/shareholders via the subnet token, their incentive shifts from purely exploiting the system to strengthening it for collective benefit, fostering a collaborative community.
The Road Ahead: Scaling, Optimization, and Open Research Questions
- Distributed states they are "98% there" but face remaining challenges for scaling:
- Miner Divergence: Understanding why some miners deviate significantly from the global model, which could destabilize training as larger models demand more participants (currently mitigated by aggregating only the top-K miners).
- Model Saturation: Determining optimal batch sizes to avoid wasting resources without underutilizing the model's capacity, crucial for scaling efficiently to larger models (e.g., 70B parameters).
- Speed Optimization: Exploring asynchronous training and accumulation, where gradient calculation and synchronization happen in parallel rather than sequentially (see the sketch after this list). This could significantly boost speed, potentially making decentralized training faster than centralized runs built on optimizers like AdamW (a standard adaptive learning-rate algorithm widely used in deep learning).
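- A minimal sketch of the overlap idea, with `compute_gradient`, `exchange_gradients`, and `apply_update` as placeholders for the local backward pass, the slow network synchronization, and the optimizer step (none of this is Templar's actual API):

```python
from concurrent.futures import ThreadPoolExecutor

def train_overlapped(batches, compute_gradient, exchange_gradients, apply_update):
    """Overlap the network exchange of the previous step's gradient with the
    computation of the next one, instead of strictly alternating the two."""
    executor = ThreadPoolExecutor(max_workers=1)
    pending = None  # in-flight exchange of the previous step's gradient

    for batch in batches:
        local_grad = compute_gradient(batch)   # runs while the exchange is still in flight

        if pending is not None:
            apply_update(pending.result())     # block only now, apply a one-step-stale update

        pending = executor.submit(exchange_gradients, local_grad)

    if pending is not None:
        apply_update(pending.result())
    executor.shutdown()
```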
Future Ambitions: Training State-of-the-Art Models and Contributing to Science
- The immediate goal is to train a high-quality 1.2 billion parameter model, proving the system's capability. Success here is seen as the key to unlocking larger models like 70B parameters, as the core mechanics will have been mastered.
- Templar aims to become the premier platform for training the world's largest models and contribute novel findings back to the AI research community, leveraging insights from their unique, incentivized, permissionless environment.
Observing the Live Training Run: Chaos, Filtering, and Permissionless Reality
- A look at the live training run's loss curves reveals two views: a noisy "miner view" showing high variance due to individual miner issues (going offline, errors, slow submissions), and a smoother "validator view" representing the filtered, aggregated progress of the global model.
- Constantijn highlights this chaos as evidence of true permissionless participation – anonymous actors constantly testing the system, unlike permissioned setups.
- The protocol automatically filters out faulty contributions, demonstrating its resilience (a sketch of this filtered aggregation follows this list).
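- A minimal sketch of how that filtered "validator view" could be formed, assuming a screening step like the one sketched earlier and a simple averaged update (illustrative, not the protocol's actual aggregation rule):

```python
import torch

def aggregate_accepted(global_model, submissions, accepted_ids, lr=1e-4):
    """Average only the gradients that passed screening and apply them to the
    global model; offline, late, or malformed miners simply drop out of this
    window's average, which is why the validator-view curve looks smooth."""
    kept = [submissions[m] for m in accepted_ids]
    if not kept:
        return  # nothing usable this window
    with torch.no_grad():
        for name, param in global_model.named_parameters():
            grads = [g[name] for g in kept if name in g]
            if grads:
                param.add_(torch.stack(grads).mean(dim=0), alpha=-lr)
```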
Miner Insights: Competition, Optimization, and the Search for Alpha
- Miners like Noah confirm the increasing competitiveness. The "alpha" (edge) is shifting from finding exploits to optimizing performance – training faster or producing better gradients.
- Noah mentions experimenting with training on more data, though challenges remain in keeping local miner models perfectly synced with the global validator model. This points towards future optimization frontiers beyond just fixing bugs.
Technical Deep Dive: Synchronization, Timestamps, and Infrastructure Choices
- The discussion touches on synchronization challenges caused by network latency, miners submitting gradients late, or deleting them prematurely.
- Templar enforces strict time boundaries (T-min, T-max) using Bittensor's block timestamps, verified against R2/S3 object timestamps.
- S3 (Simple Storage Service), like R2, provides verifiable upload timestamps, acting almost like a blockchain ledger for file uploads. This ensures validators and miners operate on a consistent view of which gradients landed within which time windows (a sketch of this check follows this list).
- Using cloud storage like R2/S3 abstracts away complex peer-to-peer communication issues, allowing focus on the core incentivized training problem, though future iterations might allow miners to compete on bandwidth using different providers.
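- A minimal sketch of the timestamp check using boto3 against R2's S3-compatible endpoint (bucket, key, and window values are illustrative; the real protocol's window logic is more involved):

```python
from datetime import datetime
import boto3

def gradient_within_window(s3_client, bucket: str, key: str,
                           t_min: datetime, t_max: datetime) -> bool:
    """Accept a gradient only if the object store reports it was uploaded
    inside the window derived from Bittensor block timestamps."""
    head = s3_client.head_object(Bucket=bucket, Key=key)
    uploaded_at = head["LastModified"]  # timezone-aware UTC datetime from S3/R2
    return t_min <= uploaded_at <= t_max  # t_min/t_max must also be timezone-aware

# R2 exposes an S3-compatible endpoint, so the same client code covers both services.
client = boto3.client("s3", endpoint_url="https://<account-id>.r2.cloudflarestorage.com")
```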
Revisiting the "Holy Grail": Why Incentivized Decentralization Matters
- Constantijn reiterates why this is the "Holy Grail": it tackles the coordination problem that prevents disparate compute from challenging centralized giants (like CCP-backed DeepSeek or heavily funded OpenAI).
- Unlike other decentralized attempts relying on trusted compute providers or consortia, Templar's incentivization layer transforms it into a "Bitcoin mining style" training environment.
- This unlocks the potential for truly massive scale (trillion+ parameters) and creates a transparent, collectively owned AI development process, potentially even involving frontier labs as future participants (miners).
Broader Context: Venture Interest, Permissioned vs. Permissionless, and Future Security
- Joseph, speaking as a venture investor, notes the significant funding (~$250M+) raised by numerous startups attempting permissioned decentralized training. He contrasts Templar's truly permissionless, incentivized approach as profoundly different and potentially more powerful.
- He raises questions about future incentive mechanisms as the network grows and the bounty for sophisticated attacks (like injecting backdoors into models) increases. This highlights the ongoing game-theoretic and security challenges inherent in open, incentivized systems.
Tooling and Infrastructure: Towards Open Source Foundations
- The conversation touches upon the need for robust, open-source tooling, analogous to Git/GitHub for code, but for managing model weights, gradients, and training states (a "versioning substrate"); a toy sketch of one content-addressed approach follows this list.
- While tools like Weights & Biases (W&B) are convenient for logging, reliance on closed-source or centralized services introduces risks (e.g., rate limiting, capture).
- The ideal future involves integrating with other decentralized infrastructure on Bittensor, like storage subnets (e.g., Hippocampus), to build a fully open-source stack.
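- A toy sketch of what such a content-addressed "versioning substrate" for weights might look like (in-memory store and names are hypothetical; a real system would hash tensors canonically rather than relying on torch.save bytes):

```python
import hashlib
import io
import json
import torch

def commit_checkpoint(model, store: dict, parent: str | None = None, message: str = "") -> str:
    """Snapshot model weights under a content-derived id and record the parent
    commit, so training history forms a verifiable chain, Git-style."""
    buf = io.BytesIO()
    torch.save(model.state_dict(), buf)
    blob = buf.getvalue()
    commit_id = hashlib.sha256(blob).hexdigest()
    store[commit_id] = {
        "weights": blob,
        "meta": json.dumps({"parent": parent, "message": message}),
    }
    return commit_id
```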
Driving Innovation: From Exploits to Optimized Gradient Strategies
- The discussion anticipates the next phase of competition. Once basic exploits are eliminated, miners will innovate on core performance. This could involve optimizing bandwidth, finding faster ways to compute gradients (potentially using advanced techniques like KFAC - Kronecker-Factored Approximate Curvature, a second-order optimization method), or even discovering novel learning algorithms spurred by the incentive mechanism.
- The subnet evolves from a game of finding loopholes to a true market for efficient intelligence creation. Daniel's reported low loss score hints at such optimizations already emerging.
The Power of Open Ownership and Community
- Constantijn concludes by praising Distributed's humility and willingness to cede control, fostering a strong community through open ownership.
- This mirrors Bittensor's own success, where empowering the community (like MOGMachine creating TaoStats) leads to emergent value. Giving away power and embracing the sometimes chaotic nature of permissionless systems is presented as a superpower, enabling collective intelligence and resilience that closed systems lack.
Strategic Conclusion
- Templar's progress showcases incentivized decentralized training's potential to challenge AI monopolies and democratize foundation model creation. Crypto AI investors and researchers must track its scaling milestones, optimization breakthroughs (especially in speed and efficiency), and the evolving game theory of permissionless compute markets for strategic insights and opportunities.