This episode reveals the immense engineering challenge of building Firedancer, a high-performance Solana validator client, while trying to hit the moving target of Solana's rapidly evolving protocol.
Firedancer Roadmap and the Conformance Challenge
- Frankendancer: A hybrid client combining original Solana code with Firedancer's optimized components. It is currently live and secures about 10% of the network's stake. Michael notes its performance is comparable to or better than the original client on key metrics.
- Full Firedancer: The complete rewrite of the Solana validator client. It is not yet fully shipped. The primary obstacle has been solving the "conformance problem": the new client must produce results that are bit-for-bit identical to the original's, because any divergence can cause network disagreements and potential halts.
Michael explains the difficulty: "You have this huge code base and we're trying to do a rewrite, but we have to get it to match perfectly." This three-year effort has involved building extensive custom tooling and test vectors to ensure perfect replication, a task that has consumed the majority of the engineering team's resources.
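One common way to check this kind of conformance is differential testing: feed the same test vectors to both implementations and flag every input where the outputs differ by even one byte. The sketch below is illustrative only. The two fee functions are hypothetical stand-ins for the reference and rewritten clients, not code from Agave or Firedancer.

```python
from typing import Callable, Iterable, List


def reference_fee(amount: int) -> bytes:
    # Hypothetical stand-in for the reference client: rounds down.
    return (amount * 5 // 1000).to_bytes(8, "little")


def candidate_fee(amount: int) -> bytes:
    # Hypothetical stand-in for a rewrite with a subtle divergence: rounds up.
    return ((amount * 5 + 999) // 1000).to_bytes(8, "little")


def find_mismatches(
    reference: Callable[[int], bytes],
    candidate: Callable[[int], bytes],
    vectors: Iterable[int],
) -> List[int]:
    """Return the indices of test vectors where the outputs are not byte-identical."""
    return [
        index
        for index, vector in enumerate(vectors)
        if reference(vector) != candidate(vector)
    ]


vectors = [0, 100, 200, 300, 1000]
print(find_mismatches(reference_fee, candidate_fee, vectors))  # [1, 3]
```

The two functions agree on most inputs and disagree only where rounding matters, which is why a large and varied set of test vectors is needed to expose such differences.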
Building for a Moving Target
- Constant Iteration: The Anza team (the core developers of the original Solana client, Agave) ships new features frequently. The Firedancer team must conform to this moving target, sometimes building components that are later deprecated or replaced.
- No Formal Specification: The Solana protocol lacks a detailed formal specification beyond the original whitepaper. The millions of lines of code in the Agave client serve as the de facto specification. This forces the Firedancer team to reverse-engineer undocumented features and behaviors, even down to replicating specific rounding behaviors in the Rust programming language.
- The Alpenglow Wrench: Alpenglow, a fundamental redesign of Solana's consensus mechanism that removes Proof of History, is a prime example of this challenge. The Firedancer team invested months building a highly optimized Proof of History implementation, which will now be discarded. They face a constant trade-off: build for the future and risk delays if that future changes, or build for the present and accept doing redundant work.
Rethinking Performance: TPS vs. Efficient Block Packing
- The Goal is Economic Value: The ultimate goal is not just high TPS, but maximizing the economic value transacted on the network, for which fees are the best proxy. Increasing TPS is the path to increasing fees.
- The Compute Unit (CU) Limit: The network currently has a CU limit (similar to Ethereum's gas limit) to ensure all ecosystem participants, from RPC providers like Coinbase to the Agave client itself, can keep up. This cap prevents Firedancer from immediately deploying its full TPS potential on mainnet.
- Strategic Focus on Block Fullness: While the CU limit is in place, Firedancer focuses on what it can control: packing blocks as fully and quickly as possible. They have developed a scheduler that can fill blocks to 100% capacity in just 130 milliseconds, maximizing the value within the existing constraints.
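The idea of packing a block under a fixed CU cap can be sketched as a simple greedy fill. This is a toy model under stated assumptions, not Firedancer's scheduler: the `Tx` type, the fee-per-CU ordering, and the numbers are all hypothetical, and every transaction is assumed to cost more than zero CUs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Tx:
    fee: int  # fee paid, in arbitrary units (hypothetical)
    cus: int  # compute units requested; assumed to be greater than zero


def pack_block(pending: List[Tx], cu_limit: int) -> Tuple[List[Tx], float]:
    """Greedily fill a block by fee per CU; return the chosen txs and the block fullness."""
    chosen: List[Tx] = []
    used = 0
    for tx in sorted(pending, key=lambda t: t.fee / t.cus, reverse=True):
        if used + tx.cus <= cu_limit:  # skip anything that would exceed the cap
            chosen.append(tx)
            used += tx.cus
    return chosen, used / cu_limit


pending = [Tx(fee=50, cus=40), Tx(fee=90, cus=60), Tx(fee=10, cus=30)]
block, fullness = pack_block(pending, cu_limit=100)
print(sum(tx.fee for tx in block), fullness)  # 140 1.0
```

With the cap fixed, the only levers left are which transactions go in and how quickly the block fills, which is the part of the problem described above.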
The Centrality of the Compute Unit (CU) Limit
- Performance as a Solution: Michael suggests that if the network had extremely high throughput, validators would have less time and incentive to engage in "games" like delaying block production to cherry-pick profitable transactions. High capacity forces validators to process transactions continuously, improving latency and fairness.
- Firedancer's Thesis: "Our whole thesis with Firedancer is that blockchains haven't realized their potential yet." By building a chain that can handle a million TPS, the goal is to unlock new applications and utility, which may inherently solve many of the secondary problems that arise from network congestion and constraints.
Improving Solana: Targeting the Leader Pipeline and Scheduler
- The Scheduler Bottleneck: While cryptographic signature verification is expensive, it is highly parallelizable across multiple CPU cores. The real bottleneck is the scheduler, which must decide the order of transactions in a block. This process is inherently single-threaded, as it must prevent conflicts like double-spending from the same account.
- A Simpler, Faster Approach: Firedancer and, more recently, Agave have moved to a "greedy scheduler." This simpler algorithm processes transactions from highest to lowest value without complex dependency analysis, proving to be much faster and avoiding stalls. This is a key area where targeted engineering, not just massive re-architecting, yields significant wins.
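A greedy pass of this kind can be sketched as follows. This is a minimal illustration, not the actual Firedancer or Agave scheduler: the `PendingTx` type and the rule that any shared account counts as a conflict are simplifying assumptions. Transactions are visited from highest to lowest priority, and each one is scheduled only if none of its accounts is already locked by an earlier pick.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class PendingTx:
    tx_id: str
    priority: int  # higher is more valuable (hypothetical field)
    accounts: FrozenSet[str]  # accounts the transaction touches


def greedy_batch(txs: List[PendingTx]) -> Tuple[List[PendingTx], List[PendingTx]]:
    """Single pass in priority order; conflicting transactions are deferred, not analyzed."""
    locked: set = set()
    batch: List[PendingTx] = []
    deferred: List[PendingTx] = []
    for tx in sorted(txs, key=lambda t: t.priority, reverse=True):
        if locked.isdisjoint(tx.accounts):
            locked |= tx.accounts  # reserve the accounts for this batch
            batch.append(tx)
        else:
            deferred.append(tx)  # retry in a later batch
    return batch, deferred


txs = [
    PendingTx("a", 5, frozenset({"alice", "bob"})),
    PendingTx("b", 9, frozenset({"bob", "carol"})),
    PendingTx("c", 3, frozenset({"dave"})),
]
batch, deferred = greedy_batch(txs)
print([t.tx_id for t in batch], [t.tx_id for t in deferred])  # ['b', 'c'] ['a']
```

The loop itself is sequential, which reflects the single-threaded nature of the ordering decision, but the transactions in a batch touch disjoint accounts and could then be executed in parallel.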
Validator Modding and Private Order Flow
- Supporting a Modding Ecosystem: Firedancer is designed to support a vibrant ecosystem of mods, like those from Jito, in a secure, sandboxed environment. The goal is to be a neutral platform that allows for innovation without compromising the core validator's performance or security.
- Alternative TPUs and Parallels to PFOF: Michael draws a parallel between private TPU (Transaction Processing Unit) ports in Solana and the payment-for-order-flow (PFOF) model in traditional equities. He argues that allowing users to send orders to specialized entities can result in better execution prices, as those entities can price the flow more accurately than a public, adversarial market.
- Building Trust: While regulatory risk differs, he believes a reputation-based system can emerge in crypto, where entities that provide good execution without front-running will attract more order flow, creating a self-policing mechanism.
Driving Adoption: Operator Experience and Strategic Delegation
- Beyond Performance: Since the CU limit caps the immediate performance advantage, Firedancer focuses on operator-friendly features like well-documented configuration files, standardized metrics, and a unique graphical user interface (GUI) for validator monitoring.
- The Delegation Program: To incentivize the switch, Jump has a stake delegation program that provides a small financial boost to validators running Frankendancer. This helps offset the costs of switching hardware and operations, and crucially, it creates a vital feedback loop between the community and the development team.
Client Diversity and Two Types of Liveness Risk
- The "Good" Halt (Preventing Catastrophe): If a critical bug like an infinite mint is found in Agave, Firedancer clients (lacking the same bug) would reject the invalid transactions. This disagreement would halt the network, which is far preferable to a catastrophic loss of funds that could destroy the ecosystem.
- The "Bad" Halt (Minor Disagreements): Conversely, a minor, non-critical bug (e.g., a one-nanosecond timestamp rounding error) could also cause a disagreement and halt the network. In this case, client diversity makes the network less stable than a single-client network that would have ignored the minor error. This is why the team's focus on perfect conformance is paramount.
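Both kinds of halt follow from the same rule, which can be sketched with a toy voting model. The assumptions here are loud ones: finality is taken to require strictly more than two-thirds of stake agreeing on one state hash, and the client names, stake splits, and hashes are hypothetical values chosen for illustration.

```python
from typing import Dict, Optional, Tuple


def finalized_hash(votes: Dict[str, Tuple[int, str]]) -> Optional[str]:
    """votes maps client name -> (stake, state hash). Returns the hash backed by
    more than 2/3 of total stake, or None if no hash qualifies (a halt)."""
    total = sum(stake for stake, _ in votes.values())
    stake_by_hash: Dict[str, int] = {}
    for stake, state_hash in votes.values():
        stake_by_hash[state_hash] = stake_by_hash.get(state_hash, 0) + stake
    for state_hash, stake in stake_by_hash.items():
        if 3 * stake > 2 * total:  # integer form of stake / total > 2/3
            return state_hash
    return None


# Both clients agree: the network finalizes.
print(finalized_hash({"agave": (60, "0xabc"), "firedancer": (40, "0xabc")}))  # 0xabc

# The clients disagree and neither side holds a supermajority: the network halts.
print(finalized_hash({"agave": (60, "0xbad"), "firedancer": (40, "0xabc")}))  # None

# A single-client network finalizes whatever that client computes, bug included.
print(finalized_hash({"agave": (100, "0xbad")}))  # 0xbad
```

The model does not distinguish between a catastrophic bug and a trivial one. Any divergence between clients with enough stake blocks finality, which is why the same mechanism produces both the "good" and the "bad" halt.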
The Future of Solana: Is This the Final Form?
- Constant Evolution: Michael points to major upgrades like Alpenglow as proof that Solana is not static. "That is why Solana does so well is it's an evolving project that is kind of constantly keeping pace with the ecosystem and the needs of the users."
- Pockets of Excellence: While not every component is a "platonic ideal," parts of Solana, like its block propagation protocol Turbine, are exceptionally well-designed and have stood the test of time. The key is to continuously evolve and replace the less-than-ideal parts over time.
Conclusion
Firedancer's development highlights that raw TPS is secondary to the engineering trade-offs required to build for a live, evolving blockchain. For investors and researchers, the key metrics to watch are not just performance benchmarks but the gradual increase of the CU limit and the steady adoption of new clients.