The DCo Podcast
May 26, 2025

Ep 41 — Inside Dune: Decoding the Blockchain's Big Data

Dune CTO & Co-founder Mats Olsen dives deep into how Dune evolved from a niche ICO-era tool into a cornerstone of crypto data analytics, now expanding its mission to provide infrastructure for builders with its new Sim APIs.

Dune's Evolution: Democratizing Data

  • "Our mission, you know, is to make crypto data accessible."
  • "Instead of trying to build like a pure closed source company around an open data set, we wanted to turn it around and build an open company on top of the open data set."
  • Dune started in 2018, born from the founders' struggle to get readable, aggregated data from blockchains, which are fundamentally write-optimized.
  • A pivotal shift to an open, user-generated content model fueled its growth, turning Dune.com into a community hub where thousands contribute dashboards, fostering organic growth without direct financial incentives for creators.
  • The platform supports transparency, allowing users to inspect the underlying SQL queries of any dashboard, a key differentiator from closed-source solutions.

Beyond Dashboards: Dune's Infrastructure Play with Sim APIs

  • "The blockchain is like a write-optimized database... if you want total users of Hyperliquid over time or of Uniswap over time, you can't do a query like that to the blockchain."
  • "We've actually taken this write-optimized database [EVM], we've forked it, and we've created a read-optimized database... a fork or a version of the EVM that is actually highly scalable, parallelizable, and answers reads."
  • Dune launched Sim APIs to address builders' needs for real-time data, moving beyond analytics into infrastructure. This helps dApps display essential information like token balances and transaction histories, data that nodes do not serve directly in aggregated form.
  • The technology behind Sim, from the smlXL acquisition, involves a read-optimized fork of the EVM, enabling fast, scalable data extraction for application backends.
  • This initiative targets major UX pain points like slow-loading crypto apps and clunky multi-chain navigation, aiming to provide performant, unified data access.
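
The "total users over time" question quoted above can be made concrete with a small sketch. It assumes transactions have already been extracted and decoded off-chain into rows; the table name `txs`, its columns, and the sample addresses are all hypothetical, and SQLite merely stands in for whatever read-optimized store an indexer would really use.

```python
import sqlite3

# Hypothetical decoded transactions: (day, sender, protocol).
RAW_TXS = [
    ("2025-05-01", "0xaaa", "uniswap"),
    ("2025-05-01", "0xbbb", "uniswap"),
    ("2025-05-02", "0xaaa", "uniswap"),
    ("2025-05-02", "0xccc", "uniswap"),
    ("2025-05-03", "0xddd", "hyperliquid"),
]


def cumulative_users(rows, protocol):
    """Return [(day, distinct users seen up to and including that day)].

    Only days on which at least one new user appeared are listed.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE txs (day TEXT, sender TEXT, protocol TEXT)")
    # The index is what makes this store read-friendly for per-protocol scans.
    db.execute("CREATE INDEX idx_protocol_day ON txs (protocol, day)")
    db.executemany("INSERT INTO txs VALUES (?, ?, ?)", rows)
    query = """
        WITH first_seen AS (
            SELECT sender, MIN(day) AS day
            FROM txs
            WHERE protocol = ?
            GROUP BY sender
        )
        SELECT d.day,
               (SELECT COUNT(*) FROM first_seen f WHERE f.day <= d.day) AS total_users
        FROM (SELECT DISTINCT day FROM first_seen) d
        ORDER BY d.day
    """
    result = db.execute(query, (protocol,)).fetchall()
    db.close()
    return result


USERS_OVER_TIME = cumulative_users(RAW_TXS, "uniswap")
```

A node can answer "what is in this block" cheaply, but a query like this one needs every historical transaction for the protocol grouped by sender, which is why the aggregation happens in a separate indexed store.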

Navigating the Data Deluge & The AI Horizon

  • "In the future, I think the language of data will end up being English and not SQL. The majority of us will interact with databases all the time using English and not SQL."
  • "The best teams have a data strategy almost or use data in their strategy either as a growth lever or as a way to let's say validate what they're doing."
  • Integrating new chains, especially non-EVM ones like Solana, presents significant engineering challenges due to data volume and different programming models requiring new data schemas.
  • Dune believes natural language querying (text-to-SQL) is the future, though its current AI query feature is a work in progress. Major breakthroughs are expected from foundational AI model providers.
  • Dune focuses on builders and institutions needing product usage data, rather than traders seeking price data. They’ve deliberately avoided launching a token, prioritizing solving product challenges.

Key Takeaways:

  • Dune is doubling down on its mission to make crypto data accessible, not just for analysts but for the very applications shaping Web3. Their strategic expansion into infrastructure via Sim APIs signals a commitment to improving crypto UX at a foundational level.
  • Data Accessibility is Foundational: Blockchains are poor databases for reading aggregated data; off-chain indexing and processing, like Dune provides, are critical for analytics and good application UX.
  • Builders Drive Adoption: Dune's focus remains on empowering builders, providing tools that evolve from community-driven dashboards to real-time data APIs crucial for dApp development.
  • Pragmatism Over Hype: Dune prioritizes solving concrete product and infrastructure problems with a traditional business model, sidestepping tokenization to focus on core utility.

Podcast Link: https://www.youtube.com/watch?v=XQDYsVUTYjM

This episode unpacks Dune's journey from a community-driven analytics platform to an essential infrastructure provider, revealing how accessible, decoded blockchain data is becoming foundational for both builders and sophisticated investors in the crypto space.

The Genesis and Evolution of Dune

  • Mats Olsen, CTO and co-founder of Dune, recounts the platform's origins in 2018, born from personal frustrations with accessing on-chain data while investing in ICOs (Initial Coin Offerings, a fundraising mechanism for new cryptocurrency projects). He and his colleague, who later co-founded Nansen, recognized the blockchain as a "write-optimized" database, making aggregated data reads difficult. This insight led to Dune's creation, initially focused on indexing and decoding blockchain data so that developers get typed columns instead of raw bytes.
  • The pivotal shift occurred in 2019 when Dune embraced an open model. Mats explains, "Instead of trying to build like a pure closed source company around an open data set, we wanted to turn it around and build an open company on top of the open data set." This fostered the user-generated content platform known today, where a global community creates dashboards on diverse crypto topics. Dune has since grown to about 60 people, evolving its product from a single-player tool to a "multiplayer" environment where users collaboratively advance the space.
  • Actionable Insight: The evolution from a simple data tool to a community-driven platform highlights the power of open data models. For AI researchers, this open approach can provide diverse, labeled datasets crucial for training models on blockchain activity, while investors can see it as a sign of a resilient, community-backed ecosystem.
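
To illustrate what "decoding" raw chain data into typed columns means, here is a minimal sketch for one well-known case, the standard ERC-20 `Transfer` event. It is not Dune's pipeline; the function name, the output column names, and the sample log are assumptions made for illustration. The topic constant is the keccak256 hash of `Transfer(address,address,uint256)`.

```python
# keccak256("Transfer(address,address,uint256)"), the standard ERC-20 event topic.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def decode_transfer(log):
    """Decode a raw ERC-20 Transfer log into typed columns, or return None."""
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        return None  # not a standard ERC-20 Transfer log
    return {
        "contract": log["address"].lower(),
        # Indexed addresses are left-padded to 32 bytes; keep the last 20 bytes.
        "from": "0x" + topics[1][-40:].lower(),
        "to": "0x" + topics[2][-40:].lower(),
        # The non-indexed uint256 amount is the big-endian data payload.
        "value": int(log["data"], 16),
    }


# Made-up log for illustration only.
SAMPLE_LOG = {
    "address": "0x1111111111111111111111111111111111111111",
    "topics": [
        TRANSFER_TOPIC,
        "0x" + "00" * 12 + "aa" * 20,
        "0x" + "00" * 12 + "bb" * 20,
    ],
    "data": "0x" + format(1500, "064x"),
}

DECODED = decode_transfer(SAMPLE_LOG)
```

Each decoded dictionary corresponds to one row with named, typed fields, which is the form an analyst can actually aggregate over.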

Fostering Community and Ensuring Data Quality

  • Dune's model relies on organic community contributions rather than direct financial incentives. Users build reputations and followings by sharing valuable dashboards. Mats Olsen emphasizes that while ensuring 100% data correctness is challenging on any platform, Dune's transparency offers an advantage. "On Dune you yourself or someone you know or someone on Twitter can actually go and check the source," he states, highlighting that underlying queries and datasets are open for inspection.
  • While Dune hasn't heavily invested in built-in collaborative features like community notes, it relies on external social graphs (X, Farcaster, Telegram) for self-moderation among its "wizards" (skilled data analysts). Mats also sees a future role for AI in verifying data and dashboard integrity.
  • Strategic Implication: The verifiability of data sources is paramount for AI applications. Crypto AI researchers should prioritize platforms like Dune where data lineage and query logic are transparent, allowing for more robust model training and validation. Investors should note that community-vetted data can offer a higher degree of reliability.

Expanding Beyond Analytics: The Introduction of Sim APIs

  • Dune's focus has always been on "making crypto data accessible," initially for analytics. However, this mission has expanded. Mats Olsen reveals Dune's recent move into infrastructure with Sim APIs, a product suite designed to help developers build applications. He explains that many basic app features, like displaying token balances or user activity feeds, require complex off-chain indexing because blockchains like Ethereum (a decentralized, open-source blockchain with smart contract functionality) are not optimized for such reads.
  • Sim APIs leverage Dune's expertise in data processing to provide real-time infrastructure, enabling wallets, DeFi applications (Decentralized Finance: financial applications built on blockchain technology), and other crypto products to offer better user experiences with readily available data.
  • Actionable Insight: The development of robust data APIs like Sim is critical for the next generation of crypto applications, including those incorporating AI. AI researchers can leverage such APIs for real-time data feeds into predictive models or on-chain agents, while investors should see this as enabling infrastructure for more sophisticated and user-friendly dApps.
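
Why a feature as basic as "show my token balances" needs off-chain indexing can be sketched as follows. The example folds a stream of already-decoded transfer records into per-wallet balances. The record layout and names are hypothetical, and the approach assumes tokens that emit a standard transfer event for every balance change, which does not hold for every token (rebasing tokens, for example).

```python
from collections import defaultdict

# Transfers from or to the zero address conventionally represent mints and burns.
ZERO = "0x0000000000000000000000000000000000000000"


def balances_from_transfers(transfers):
    """Fold decoded transfer records into {(wallet, token): balance}."""
    balances = defaultdict(int)
    for t in transfers:
        if t["from"] != ZERO:  # a mint has no real sender to debit
            balances[(t["from"], t["token"])] -= t["value"]
        if t["to"] != ZERO:  # a burn has no real recipient to credit
            balances[(t["to"], t["token"])] += t["value"]
    # Drop empty positions so the result only lists tokens a wallet holds.
    return {key: amount for key, amount in balances.items() if amount != 0}


# Made-up transfer history for illustration only.
TRANSFERS = [
    {"token": "USDC", "from": ZERO, "to": "0xalice", "value": 100},
    {"token": "USDC", "from": "0xalice", "to": "0xbob", "value": 40},
    {"token": "DAI", "from": ZERO, "to": "0xbob", "value": 7},
    {"token": "DAI", "from": "0xbob", "to": ZERO, "value": 7},
]

BALANCES = balances_from_transfers(TRANSFERS)
```

Discovering every token a wallet holds requires having seen the wallet's full transfer history, so an application either maintains an index like this continuously or calls a service that does.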

Tackling Crypto's User Experience (UX) Challenges

  • Mats Olsen expresses his impatience with slow crypto applications, arguing that poor UX often stems from data retrieval inefficiencies rather than only from protocol-level complexity of the kind EIPs (Ethereum Improvement Proposals) address. He believes fundamental engineering can solve many of these issues. "I want to really go to war against these slow pages," he asserts, underscoring the motivation behind Sim APIs.
  • The APIs aim to deliver high performance and cross-chain data aggregation, allowing users to see all their data in one view without manually switching networks. This addresses a common frustration point in the current multi-chain crypto landscape.
  • Strategic Implication: Enhanced UX driven by faster data access can significantly boost adoption for crypto applications. For AI-driven tools, this means quicker data ingestion and analysis, leading to more responsive insights. Investors should look for projects prioritizing performant data infrastructure as a key differentiator.
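
The "one view without switching networks" idea can be sketched as a fan-out over chains followed by a merge. This is only an illustration under stated assumptions: `fetch_balances` and the in-memory `FAKE_CHAIN_DATA` are hypothetical stand-ins for whatever per-chain indexer or API an application would call, not part of any real product interface.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-chain holdings; a real app would query an indexer or API.
FAKE_CHAIN_DATA = {
    "ethereum": {"0xabc": {"USDC": 120, "WETH": 2}},
    "base": {"0xabc": {"USDC": 30}},
    "arbitrum": {},
}


def fetch_balances(chain, wallet):
    """Stand-in for a per-chain lookup (assumed interface)."""
    return FAKE_CHAIN_DATA[chain].get(wallet, {})


def unified_view(wallet, chains):
    """Query every chain concurrently and merge the results into one table."""
    with ThreadPoolExecutor(max_workers=len(chains)) as pool:
        # pool.map preserves input order, so results line up with `chains`.
        per_chain = list(pool.map(lambda c: fetch_balances(c, wallet), chains))
    rows = []
    for chain, balances in zip(chains, per_chain):
        for token, amount in balances.items():
            rows.append({"chain": chain, "token": token, "amount": amount})
    return sorted(rows, key=lambda r: (r["token"], r["chain"]))


VIEW = unified_view("0xabc", ["ethereum", "base", "arbitrum"])
```

Issuing the lookups concurrently keeps the page's latency close to that of the slowest single chain rather than the sum of all of them.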

Dune's Strategic Direction and the smlXL Acquisition

  • Dune's mission to make crypto data accessible extends to empowering builders to onboard more users, leading them into the infrastructure domain. A significant step in this direction was the acquisition of smlXL in late 2023. Mats Olsen hints at a "very groundbreaking product" emerging from this acquisition, further solidifying Dune's commitment to providing foundational tools for the ecosystem.
  • Actionable Insight: Acquisitions like smlXL signal Dune's strategic intent to deepen its infrastructure offerings. Crypto AI investors should monitor how these new capabilities might lower barriers for developing data-intensive AI applications on-chain, potentially unlocking new investment opportunities in AI-native crypto projects.

The Evolving Landscape of On-Chain Data Consumption

  • Mats Olsen observes a divergence in data consumption: retail users often gravitate towards price-focused platforms like DEX Screener for quick alpha, while institutions, increasingly active as builders and investors, are more interested in "product data," meaning the actual usage metrics of on-chain applications. Dune has consistently catered to builders, a group now increasingly populated by institutional players.
  • He notes, "I think it's fair to say that the interest in onchain data is just higher than ever from builders, from retail, from institutions." This underscores a bullish sentiment for the crypto data sector.
  • Strategic Implication: The growing institutional interest in deep on-chain product metrics creates a demand for sophisticated analytics, where AI can play a significant role. Researchers can develop AI models to extract nuanced insights from this product data, while investors can use these insights for more informed capital allocation.

Navigating the Data Provider Ecosystem and Sustainability

  • Addressing the sustainability of token-incentivized data providers like The Graph (a decentralized indexing protocol for querying networks like Ethereum and IPFS), Mats Olsen acknowledges the complexity of innovating on both product and economic models simultaneously. Dune, he explains, has chosen a path "counter to the trajectory of the ecosystem" by focusing solely on solving the product problem and employing a traditional business model, avoiding tokenization.
  • "We're just trying to solve the like product problem in front of us and pay for the infrastructure on the other side," Mats states, highlighting Dune's pragmatic approach. He salutes those innovating on token models but finds the product challenge itself sufficiently demanding.
  • Actionable Insight: Investors evaluating data infrastructure projects should consider the long-term viability of their business models. Dune's traditional approach offers a different risk profile compared to token-dependent projects, which may appeal to those seeking more conventional investment structures in the crypto space. AI researchers benefit from stable, reliable data sources, regardless of the provider's economic model.

The Engineering Hurdles of Multi-Chain Data Integration

  • Integrating non-EVM (Ethereum Virtual Machine) chains like Solana (a high-performance blockchain supporting smart contracts and decentralized applications) presents significant engineering challenges. Mats Olsen details Dune's experience with Solana in 2021, citing its unprecedented data scale (requiring systems to handle ~100x more data than Ethereum) and its different account and programming model (the SVM, or Solana Virtual Machine).
  • This necessitated scaling Dune's entire stack, from RPC (Remote Procedure Call) nodes, the servers that let applications query blockchain data and submit transactions, to ingestion, decoding, and query engines. Furthermore, the SVM's unique data structures required redefining datasets to extract meaningful value, a challenge common to most non-EVM chains that innovate on their programming models.
  • Strategic Implication: The complexity of cross-chain data integration underscores the value of platforms that can abstract this away. For AI, consistent and comparable data across diverse chains is crucial for holistic market analysis. Investors should recognize the technical moat built by companies successfully tackling these multi-chain challenges.

Incorporating Off-Chain Data: A Measured Approach

  • Dune's primary focus remains on-chain data, though it has opened up API-based data ingestion, allowing for some off-chain datasets like Reservoir's NFT metadata. While they incorporate aggregated token price data from centralized exchanges, granular CEX data (e.g., order book depth, specific exchange prices at precise times) is not a priority.
  • Mats Olsen attributes this to Dune's builder-centric, rather than trader-centric, user base. "A lot of the time we've shied away from these use cases that are very like trading focused or arbitrage focused," he explains.
  • Actionable Insight: While Dune's current focus is on-chain, the potential integration of more diverse off-chain data could significantly enrich datasets for AI analysis, particularly for models requiring broader market context. Researchers should monitor any shifts in this strategy.

The Competitive Arena for Crypto Data

  • Mats Olsen views the competitive landscape as multifaceted. For user-generated dashboards, direct competitors are few (he mentions Flipside). In selling data to businesses, Dune sees players like Goldsky and Alchemy. Its developer offerings (Sim APIs) face a different set of competitors. The presence of many companies in crypto data, he believes, validates the importance and value of the space.
  • Strategic Implication: A competitive data landscape drives innovation and can lead to better tools and richer datasets. AI investors and researchers benefit from this, as it fosters the development of more sophisticated data sources and analytical capabilities.

The Importance of a Data Strategy for Crypto Projects

  • The quality and accessibility of a project's data often reflect its overall strategy. Mats Olsen observes that "the best teams have a data strategy almost or use data in their strategy either as a growth lever or as a way to let's say validate what they're doing." This starts at the smart contract level, with well-structured function calls and appropriate event emissions.
  • He contrasts this with projects where data considerations are an afterthought, citing the old MakerDAO contracts as an example where internal variable names were obscure and event emissions non-standard, making data analysis difficult initially. Dune often assists teams by connecting them with "Dune Wizards" or helping directly with dashboard creation.
  • Actionable Insight: For AI researchers, projects with a clear data strategy and well-instrumented smart contracts will provide higher-quality, more reliable data for model training. Investors should view a proactive data strategy as a positive signal of a project's transparency and operational maturity.

AI's Role in Shaping the Future of Data Interaction

  • While Dune offers a natural language query feature, Mats Olsen acknowledges that the sheer volume of datasets makes it challenging for AI to perfectly traverse them yet. However, he is optimistic about the long-term impact of AI on data analysis: "I think the language of data will end up being English and not SQL."
  • Currently, many in the Dune community use AI tools like Dune AI, Claude, or ChatGPT to help craft SQL queries. Mats anticipates that major advancements in Text-to-SQL (technology that converts natural language questions into SQL queries) will likely come from foundational AI model teams like OpenAI, Google, and Anthropic, which Dune will then apply.
  • Strategic Implication: The advancement of Text-to-SQL capabilities will democratize data access, allowing less technical Crypto AI researchers and investors to perform complex on-chain analyses. This trend could significantly accelerate the application of AI in understanding blockchain ecosystems.

Reflections on Crypto's Journey and Mats Olsen's M&A Philosophy

  • Mats Olsen expresses surprise at crypto's current level of mainstream discussion, including White House talks and stablecoin legislation, a stark contrast to the "underground cypherpunk hacker" feel of the early days. He believes the industry has progressed further than he initially envisioned.
  • Dune's acquisition of smlXL was enabled by its Series B funding (~$70 million). The process involved identifying teams building novel technology aligned with Dune's strategic direction: becoming more real-time, becoming more developer-focused, and accelerating market entry. "Will this team, will this technology, will this product pull our company in the direction we want to go?" was the primary diligence question. Integration is ongoing, focusing on leveraging smlXL's tech and fostering a unified "one team one dream" culture. Future acquisitions will be opportunistic.
  • Actionable Insight: Dune's M&A strategy, focused on acquiring technology to accelerate its roadmap towards real-time developer tools, signals a commitment to building robust infrastructure. This is beneficial for AI applications that require timely and comprehensive data.

Upcoming Innovations: Sim Indexing Product

  • Following the launch of Sim APIs, Dune is set to release a Sim indexing product by the end of June, built upon the acquired smlXL technology. Mats describes it as a "real-time data pipeline builder" that essentially creates a read-optimized fork of the EVM.
  • "You as a developer, you define essentially pieces of smart contract code that goes into the EVM execution and pulls out the data that you want simply by like emitting an event," he explains. This allows for rapid creation of database tables and API endpoints, scaling across multiple chains with high backfill speeds, enabling developers to go from "zero to full application backend... in minutes."
  • Strategic Implication: This new indexing product could drastically reduce the complexity and time required to build data backends for dApps, including those with AI components. Crypto AI researchers and developers should watch for its release, as it may offer powerful new ways to access and process real-time on-chain data for AI model training and inference.
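
The idea of developer-defined code that runs alongside execution and "pulls out the data" by emitting events can be pictured with a toy model. The sketch below is purely conceptual and is not Sim's interface, which the episode does not detail: `ToyIndexer`, its methods, and the transaction layout are all invented for illustration, and plain Python callbacks stand in for the smart contract code Mats describes.

```python
from collections import defaultdict


class ToyIndexer:
    """Replays transactions and lets user hooks emit rows into named tables."""

    def __init__(self):
        self.hooks = []
        self.tables = defaultdict(list)

    def on_call(self, contract, function):
        """Register a hook that fires whenever `function` is called on `contract`."""
        def register(hook):
            self.hooks.append((contract, function, hook))
            return hook
        return register

    def emit(self, table, **row):
        """Append one row to a table, the analogue of emitting an event."""
        self.tables[table].append(row)

    def replay(self, transactions):
        """Run every matching hook against each transaction, in order."""
        for tx in transactions:
            for contract, function, hook in self.hooks:
                if tx["to"] == contract and tx["function"] == function:
                    hook(self, tx)


indexer = ToyIndexer()


@indexer.on_call(contract="0xpool", function="swap")
def record_swap(ix, tx):
    # The hook decides which fields become columns of the "swaps" table.
    ix.emit("swaps", block=tx["block"], trader=tx["from"],
            amount_in=tx["args"]["amount_in"])


# Made-up transaction history for illustration only.
TXS = [
    {"block": 1, "from": "0xalice", "to": "0xpool", "function": "swap",
     "args": {"amount_in": 100}},
    {"block": 1, "from": "0xbob", "to": "0xother", "function": "swap",
     "args": {"amount_in": 5}},
    {"block": 2, "from": "0xbob", "to": "0xpool", "function": "swap",
     "args": {"amount_in": 40}},
]

indexer.replay(TXS)
SWAPS = indexer.tables["swaps"]
```

Replaying historical transactions through the same hooks is the toy equivalent of a backfill: the table definition and the extraction logic live in one place, so adding a new table means adding one more hook.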

Balancing Pragmatism and Decentralization in Application Development

  • Mats Olsen positions Dune's offerings, like Sim, as a pragmatic approach to building crypto applications. He argues that while full decentralization is an ideal, it often comes at the cost of user experience, particularly data availability and response times. "Essentially we are the pragmatic approach where we don't think you can decentralize the foundational like data building blocks in a way that still delivers great UX to users."
  • He notes that successful crypto apps often find a balance, using decentralized smart contracts but relying on more centralized, managed infrastructure for data delivery to ensure performance. While Sim is proprietary, the underlying blockchain data remains public and verifiable.
  • Actionable Insight: For Crypto AI investors and researchers, understanding this trade-off is crucial. While decentralization is a core tenet, pragmatic infrastructure choices can enable more powerful and responsive AI-driven applications in the near term. The key is that the underlying data remains open and verifiable.

Conclusion

Dune's evolution from an analytics tool to an infrastructure provider underscores the critical need for accessible, decoded on-chain data. Crypto AI investors and researchers must monitor these advancements, as they directly impact the quality of inputs for AI models and the development of data-intensive decentralized applications.
