This episode reveals Covenant AI's ambitious strategy to build a sovereign, state-of-the-art foundation model on Bittensor by vertically integrating pre-training, compute, and reinforcement learning through a "holy trinity" of interconnected subnets.
The Prophecy of Bittensor: A Strategic Vision
- Iterations vs. Themes: Samuel introduces a core philosophy: specific iterations (like the original internet or early blockchains) can fail or be corrupted, but the underlying theme (freedom, liberty) ultimately succeeds.
- Bittensor as the Next Iteration: Covenant views Bittensor as the next critical iteration in this theme. Their goal is to use its "chaotic jungle" to produce emergent, powerful AI products that couldn't exist in Web2.
- The Holy Trinity: This vision is manifested through three interconnected subnets: Templar (pre-training), Grail (reinforcement learning), and Basilica (compute).
"We believe that Bittensor is or can be that iteration that pushes things forward and to us subnets are...meant to be tools by which the prophecy of Bittensor is manifested." - Samuel
Templar (SN3): The Frontier of Decentralized Pre-Training
- The Fallacy of Relying on Centralized Models: Samuel warns against depending on models from entities like Meta or Chinese labs, predicting they will eventually "turn the tap off" on open-weight releases (e.g., Llama, DeepSeek).
- Strategic Imperative: If the community loses the ability to pre-train, it becomes dependent on centralized powers, defeating the purpose of decentralization. Templar's mission is to master this difficult skill for the ecosystem.
- Current Status: The 70B model run is active but facing stabilization challenges. The immediate goal is not just increasing size but producing a high-quality, state-of-the-art base model ready for the global stage.
Sparse Loco: A Breakthrough in Distributed Training
- Technical Definition: Sparse Loco is an optimizer for distributed machine learning that achieves over 99% communication compression. It combines top-k sparsification (sending only the largest-magnitude gradient updates) with 2-bit quantization (shrinking the data size of each transmitted update), drastically lowering the communication bottleneck; a minimal sketch of the idea follows this list.
- Overcoming the Bottleneck: Previously, merging model weights across hundreds of nodes was too slow and bandwidth-intensive to be feasible. Sparse Loco removes this barrier, enabling training of massive models across geographically dispersed, non-datacenter-grade hardware.
- Performance: Researcher Amir explains that Sparse Loco outperforms existing state-of-the-art methods such as Google DeepMind's DiLoCo by uniquely combining both compression techniques without sacrificing model performance.
- Strategic Implication: This algorithm is the "keystone" for decentralized AI training. It allows Bittensor to train the largest, most incentivized, and only truly permissionless models in the world, putting the network at the forefront of AI research.
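To make the compression mechanism concrete, here is a minimal sketch in PyTorch, assuming a generic top-k-plus-quantization design. It is not Sparse Loco's actual implementation: the function names, the 1% `k_fraction`, and the sign-plus-shared-scale quantization (a stand-in for the 2-bit scheme described above) are illustrative assumptions.

```python
# Illustrative sketch only -- not Sparse Loco itself. It shows the two
# ingredients named above: top-k sparsification plus coarse quantization.
import torch

def compress_gradient(grad: torch.Tensor, k_fraction: float = 0.01):
    """Keep the k largest-magnitude entries, then quantize them coarsely."""
    flat = grad.flatten()
    k = max(1, int(flat.numel() * k_fraction))  # keep ~1% => ~99% compression
    _, indices = torch.topk(flat.abs(), k)      # select entries by magnitude
    selected = flat[indices]
    # Assumed quantization: one shared scale plus the sign of each kept
    # entry, standing in for the 2-bit scheme mentioned in the text.
    scale = selected.abs().mean()
    signs = torch.sign(selected)
    return indices, signs, scale, grad.shape

def decompress_gradient(indices, signs, scale, shape):
    """Rebuild a dense gradient; every entry not transmitted stays zero."""
    flat = torch.zeros(torch.Size(shape).numel())
    flat[indices] = signs * scale
    return flat.view(shape)

# Each node compresses its local gradient, ships the small payload to
# peers, and peers decompress before accumulating.
grad = torch.randn(1024, 1024)
payload = compress_gradient(grad)
restored = decompress_gradient(*payload)
assert restored.shape == grad.shape
```

Optimizers in this family typically also keep an error-feedback buffer so entries dropped in one round accumulate and are eventually sent, which is part of how model quality survives such aggressive compression.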
Basilica (SN39): The Compute Substrate
- Initial Phase: For the next 2-3 months, Basilica will function primarily as a compute rental network. This phase is for "dogfooding"—stabilizing the core infrastructure and solidifying validation mechanisms by serving their own miners first.
- The Challenge of Compute Economics: Samuel states that simply selling decentralized compute is incredibly difficult, as the unit economics make it hard to generate returns that surpass miner emissions.
- The Long-Term Vision: Value-Added Services: The real goal for Basilica is to build a service layer on top of the raw compute, offering unique, research-driven services that deliver far more value than simple hardware rental.
- Resilience: Evan, a developer on the project, highlights the focus on building an "antifragile" and highly available network.
Grail (SN81): The Post-Training and Intelligence Layer
- The Shift in Compute Budget: The creation of Grail was motivated by the realization (prompted by models like Grok) that post-training now requires a compute budget as large as pre-training, making it a prime candidate for decentralization.
- The "Grill" Algorithm for Verifiable Inference: A key innovation from this work is "Grill," a lightweight and fast algorithm to verify that a miner used a specific model to generate an output.
- Verifiable Inference: This is a method to cryptographically prove that an output came from a specific, untampered model, preventing providers from bait-and-switching to cheaper, less capable models; a minimal sketch of the underlying idea follows this list.
- Strategic Application: This technology, developed for Grail, will become a value-added service on Basilica, solving a massive, unaddressed problem in the AI industry.
- Roadmap: Grail will first be rolled out to test and battle-harden the Grill verification mechanism. Subsequently, it will begin RL fine-tuning, with the ultimate goal of taking a base model from Templar and dramatically improving its intelligence.
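As a rough illustration of what verifiable inference involves, the sketch below shows a naive commit-and-verify flow in Python. It is not the Grill algorithm: the hash-based fingerprint and receipt are illustrative assumptions, and a real scheme must additionally prove that the forward pass itself was executed faithfully rather than merely binding weights to outputs.

```python
# Naive commit-and-verify sketch -- not the Grill protocol. It only shows
# why committing to a model's identity makes bait-and-switch detectable.
import hashlib
import json

def model_fingerprint(weight_blobs: dict) -> str:
    """Deterministic digest of serialized weights (the model commitment)."""
    h = hashlib.sha256()
    for name in sorted(weight_blobs):
        h.update(name.encode())
        h.update(weight_blobs[name])  # raw weight bytes per tensor
    return h.hexdigest()

def make_receipt(fingerprint: str, prompt: str, output: str) -> str:
    """Miner side: bind the output to the committed model and the prompt."""
    record = json.dumps({"model": fingerprint, "prompt": prompt,
                         "output": output}, sort_keys=True)
    return hashlib.sha256(record.encode()).hexdigest()

def verify(fingerprint: str, prompt: str, output: str, receipt: str) -> bool:
    """Validator side: recompute the receipt; a swapped model cannot match."""
    return receipt == make_receipt(fingerprint, prompt, output)

# The miner publishes its fingerprint once, then returns (output, receipt)
# per request; validators check that the binding holds.
fp = model_fingerprint({"layer0.weight": b"\x00\x01\x02"})
receipt = make_receipt(fp, "2+2?", "4")
assert verify(fp, "2+2?", "4", receipt)
```

Note the gap this toy version leaves open: a miner could run a cheap model and still hash the committed weights. Closing that gap cheaply, at inference speed, is precisely the hard problem a scheme like Grill targets.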
Covenant.AI: A Unified Vision
- Synergy Over Silos: The three subnets are not a loss of focus but a broadening of view. They are designed to work together: a model is pre-trained on Templar, powered by Basilica's compute, and then given intelligence on Grail.
- Leapfrogging Centralized Counterparts: The strategy is to escape the "local minima" of copying Web2 business models or simply gaming emissions. Covenant believes deep investment in research and creating new primitives (like Sparse Loco and Grill) is the only way to build sustainable moats and surpass centralized incumbents.
"We need to develop new primitives... We have done literally this with Grill. We plan to do more and this gives you the kind of flavor for the kind of value-added stuff we plan on doing." - Samuel
Conclusion
This episode outlines Covenant AI's vertically integrated strategy to build a world-class foundation model on Bittensor. By combining sovereign pre-training (Templar), specialized compute (Basilica), and advanced reinforcement learning (Grail), they are creating a powerful, synergistic AI development pipeline. Investors and researchers should closely monitor the interplay between these subnets.