This episode unpacks ReadyAI's mission to structure the world's data for AI, detailing how transforming raw information—from social media to podcasts—into actionable insights drives value for enterprises and the Bittensor ecosystem.
ReadyAI's Mission and Roadmap
- David Fields, representing ReadyAI, outlines the company's core mission: structuring global data to make it universally accessible for AI applications.
- He introduces ReadyAI's toolkit, starting with the "jobs interface" available on their site. This tool allows users (enterprise or individual) to structure data, generate metadata tags, perform sentiment analysis, and apply custom tagging to both private and public datasets (like those on Hugging Face).
- David emphasizes that structured data is fundamental for optimizing AI model performance.
- Looking ahead, ReadyAI is piloting an end-to-end product for businesses, focusing on real-time social signal aggregation.
- This upcoming offering aims to provide enterprises with structured data outputs, enabling them to track sentiment shifts around their brand and competitors.
- David notes this is currently being tested with about a dozen enterprises and will roll out soon.
Data Sources and Structuring Process
- ReadyAI processes diverse data types.
- Initially focused on structuring large volumes of varied podcast data, the subnet now handles organic queries submitted through the jobs interface, primarily involving Hugging Face datasets.
- A key collaboration involves Subnet 13 (Data Universe), which aggregates raw social media data (mainly Twitter/X and Reddit).
- ReadyAI then processes this data, adding structure through sentiment analysis and relevant metadata tagging.
- David explains this structured output helps organize data for use in RAG systems or for fine-tuning AI models; a sketch of such a record follows this list.
- He clarifies ReadyAI's role as the processing layer that derives insights from the raw data aggregated by partners like Subnet 13.
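A minimal sketch of what one such structured record could look like, written in Python; the field names and values are hypothetical, not ReadyAI's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class StructuredPost:
    """Hypothetical structured record for one raw social post."""
    source: str                  # e.g. "twitter" or "reddit"
    text: str                    # original raw content
    sentiment: str               # e.g. "positive" / "neutral" / "negative"
    tags: list[str] = field(default_factory=list)  # topic / entity metadata tags

# Raw post pulled from an upstream aggregator (e.g. Subnet 13), followed by
# the structured version a processing layer might emit for RAG or fine-tuning.
raw = "Loving the new subnet tooling, onboarding took five minutes."
record = StructuredPost(
    source="twitter",
    text=raw,
    sentiment="positive",
    tags=["bittensor", "developer-experience", "onboarding"],
)
print(record)
```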
The Case for Structured Data: Accuracy, Cost, and RAG
- Accuracy: Large amounts of unstructured data in a context window can reduce the accuracy of the information retrieved, as the model struggles to pinpoint relevant details.
- Cost: AI model interactions are priced per token (units of text, roughly word fragments). Repeatedly querying large unstructured documents therefore incurs high token costs. David estimates that vectorizing the information and serving it from a vector database through a RAG implementation can cut query costs by 100x to 1,000x.
- RAG (Retrieval-Augmented Generation): This technique involves retrieving relevant information from an external knowledge base (like a vector database containing structured data) before generating a response. David explains that using RAG with structured, vectorized data significantly lowers query costs and improves accuracy compared to relying solely on large context windows.
- Vectorization converts data into numerical representations (vectors) that capture semantic meaning, allowing efficient similarity searches in a vector database.
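As a rough illustration of this retrieval pattern, the sketch below embeds a handful of documents once, stores them in a vector index, and pulls back only the top matches for each query instead of resending a whole corpus. The library and model choices (sentence-transformers, FAISS, all-MiniLM-L6-v2) are illustrative assumptions, not anything stated in the episode:

```python
# pip install sentence-transformers faiss-cpu
import faiss
from sentence_transformers import SentenceTransformer

docs = [
    "Subnet 33 structures raw social data with sentiment and metadata tags.",
    "Structured, vectorized data lets RAG retrieve only the relevant chunks.",
    "Token costs grow with the amount of text sent in each model call.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")      # small example embedding model
doc_vecs = model.encode(docs, normalize_embeddings=True)

index = faiss.IndexFlatIP(doc_vecs.shape[1])         # inner product = cosine on normalized vectors
index.add(doc_vecs)

query_vec = model.encode(["Why does RAG reduce query cost?"], normalize_embeddings=True)
scores, ids = index.search(query_vec, 2)             # fetch only the top-2 chunks
context = "\n".join(docs[i] for i in ids[0])
print(context)  # a few hundred tokens of context instead of the full corpus
```

If a source corpus runs to hundreds of thousands of tokens but each retrieved context is only a thousand or two, the per-query token bill shrinks by roughly two to three orders of magnitude, which is the 100x-to-1,000x range David cites.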
Enhancing AI Agents with Structured Data
- Structured data is crucial for the reliability of AI agents (autonomous programs performing tasks). David uses the example of Twitter-based AI agents like AIXBT, noting they sometimes provide incorrect information because they process a raw, untagged data firehose.
- Structuring this data—tagging relevance, source trustworthiness, etc.—prevents "garbage in, garbage out."
- He explains that structuring helps agents efficiently access the specific information they need, reducing token usage and improving the likelihood of correct outcomes; a small filtering sketch follows this list.
- This is vital as agents performing multi-step tasks (like research or making purchases) require high accuracy at each step to avoid overall failure.
- David mentions that emerging standards like the Model Context Protocol (MCP) from Anthropic (the maker of Claude) aim to create consistent ways of feeding information into agents, underscoring the need for organized data inputs.
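A toy sketch of the filtering idea: keep only relevant, trusted records before they reach an agent, so it reasons over less but better data. Field names and thresholds are invented for illustration:

```python
# Hypothetical tagged records; field names are illustrative only.
posts = [
    {"text": "Protocol X ships its v2 upgrade", "relevance": 0.92, "source_trust": "high"},
    {"text": "Random spam thread",              "relevance": 0.11, "source_trust": "low"},
    {"text": "Rumor with no source",            "relevance": 0.70, "source_trust": "low"},
]

def agent_context(records, min_relevance=0.5, trusted=("high",)):
    """Keep only relevant, trustworthy records for the agent's context window."""
    kept = [r for r in records
            if r["relevance"] >= min_relevance and r["source_trust"] in trusted]
    return "\n".join(r["text"] for r in kept)

print(agent_context(posts))  # only the high-trust, relevant post reaches the agent
```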
Evolving Data Formats: The Importance of Markdown
- David notes that ReadyAI is exploring different structured data formats beyond basic tagging.
- He specifically highlights Markdown, a lightweight markup language with plain-text formatting syntax, as increasingly important.
- Research shows that feeding Markdown-formatted information into RAG systems significantly boosts accuracy compared to unstructured text; a small conversion sketch follows this list.
- ReadyAI aims to support an evolving set of structured data standards as AI models develop.
- David emphasizes the goal is to continuously adapt the subnet to provide the most effective data outputs required by new AI techniques and standards.
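One way such a Markdown output could look, sketched with an invented record format:

```python
def to_markdown(record: dict) -> str:
    """Render a structured record as a Markdown chunk for a RAG index (illustrative format)."""
    tags = ", ".join(record["tags"])
    return (
        f"## {record['title']}\n\n"
        f"- **Source:** {record['source']}\n"
        f"- **Sentiment:** {record['sentiment']}\n"
        f"- **Tags:** {tags}\n\n"
        f"{record['text']}\n"
    )

chunk = to_markdown({
    "title": "Subnet 33 community update",
    "source": "reddit",
    "sentiment": "positive",
    "tags": ["bittensor", "readyai"],
    "text": "Miners report easier onboarding with the new toolkit.",
})
print(chunk)
```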
Subnet 33 Evolution: Incentive Mechanisms and Expansion
- Expanding the types of data ReadyAI's subnet (Subnet 33 on Bittensor) can process requires careful evolution of its incentive mechanism.
- Bittensor is a decentralized network whose tokenomics incentivize the creation and operation of specialized AI services, organized into subnets.
- David explains the core challenge is ensuring the mechanism accurately rewards miners for producing high-quality structured output across diverse data types.
- ReadyAI has stabilized its mechanism after initial challenges and is now focused on generalizing how organic data is ingested and evaluated.
- Future plans involve incorporating a time dimension into the incentive mechanism, evaluating not just the quality but also the speed at which miners structure data (see the toy scoring sketch after this list).
- The goal remains aligning miner scores directly with the quality and utility of their output.
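To make the time dimension concrete, here is a toy scoring function in which equal-quality responses earn less as latency grows; it is only an illustration of the idea, not Subnet 33's actual incentive mechanism:

```python
def miner_score(quality: float, latency_s: float, half_life_s: float = 30.0) -> float:
    """Toy score: quality in [0, 1], discounted by response time.

    Illustration only, not Subnet 33's real mechanism: the time factor halves
    every `half_life_s` seconds, so faster miners earn more for equal quality.
    """
    time_factor = 0.5 ** (latency_s / half_life_s)
    return quality * time_factor

print(miner_score(quality=0.9, latency_s=10))   # fast, high quality
print(miner_score(quality=0.9, latency_s=120))  # same quality, much slower -> lower score
```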
Incentive Mechanism Stability and Miner Ecosystem Growth
- David confirms that ReadyAI's incentive mechanism has stabilized after overcoming early issues like "tag stuffing," where miners added irrelevant tags to game the scoring system (a sketch of one possible countermeasure follows this list).
- He stresses the importance of the miner community and ReadyAI's efforts to grow it.
- They recently released a "miner optimization toolkit" with a Docker image (a standardized unit of software packaging) downloadable from Docker Hub.
- This toolkit simplifies the onboarding process, allowing miners to run on various hardware (from Raspberry Pi to Mac M-series silicon) with one click.
- While onboarding is easier, David cautions that competition remains fierce, with top miners significantly outperforming standard models like GPT-4o.
- The aim is to attract top talent by lowering entry barriers.
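As one illustration of how tag stuffing could be discouraged, a validator might check each submitted tag's semantic similarity to the source document and reward only the relevant fraction. The sketch below uses toy embedding vectors and a made-up threshold rather than a real encoder or ReadyAI's actual scoring:

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def relevant_tag_ratio(doc_vec, tag_vecs, threshold=0.3):
    """Fraction of submitted tags that are semantically close to the document.

    Folding this ratio into a score means padding a submission with irrelevant
    tags lowers, rather than raises, the reward.
    """
    if not tag_vecs:
        return 0.0
    hits = sum(cosine(doc_vec, t) >= threshold for t in tag_vecs)
    return hits / len(tag_vecs)

# Toy embeddings standing in for vectors from a real sentence encoder.
doc = np.array([0.9, 0.1, 0.0])
good_tag, stuffed_tag = np.array([0.8, 0.2, 0.1]), np.array([0.0, 0.1, 0.99])
print(relevant_tag_ratio(doc, [good_tag, stuffed_tag]))  # 0.5: half the tags look stuffed
```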
Enterprise Adoption: Privacy, Use Cases, and Monetization Strategy
- ReadyAI is actively engaging with enterprises, leveraging the team's background (Disney, Google/AdSense acquisition) and a dedicated sales director.
- David acknowledges enterprise concerns around data privacy and security.
- Strategically, ReadyAI is currently focusing on structuring publicly accessible data, like social media signals and Common Crawl web data, avoiding sensitive customer data flowing through the public subnet for now.
- For future handling of private data, they are exploring Trusted Execution Environments (TEEs), secure areas within a processor that keep code and data confidential and tamper-proof, as well as privacy-preserving tokenization techniques in which data is anonymized before reaching miners (see the sketch after this list).
- The primary enterprise use case currently revolves around deriving insights (like sentiment analysis, competitor tracking, customer risk signals) from public social media data.
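A highly simplified sketch of the anonymization idea: replace identifying strings with opaque tokens before data leaves the enterprise, keep the mapping private, and restore identities after structuring. Real deployments would need far broader PII detection (and possibly TEEs); this example handles only email addresses:

```python
import re

def pseudonymize(text: str):
    """Swap simple PII (emails, as a stand-in) for opaque tokens; keep the mapping private."""
    mapping = {}

    def swap(match):
        token = f"<PII_{len(mapping)}>"
        mapping[token] = match.group(0)
        return token

    redacted = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", swap, text)
    return redacted, mapping

redacted, mapping = pseudonymize("Contact jane.doe@example.com about the churn-risk report.")
print(redacted)   # Contact <PII_0> about the churn-risk report.
print(mapping)    # kept internally to re-identify results after structuring
```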
Business Model Insights: Comparisons to Scale AI and Product Strategy
- David draws parallels between ReadyAI's business model and Scale AI, a company known for data annotation using human labelers.
- He argues that enterprises are already comfortable sharing anonymized data with distributed human workforces (like Scale AI's annotators), suggesting the model of distributing data to decentralized miners on Bittensor isn't an insurmountable privacy hurdle for many use cases.
- ReadyAI aims to emulate Scale AI by not only providing the structured data pipeline but also working directly with companies (like their Common Crawl project) to build custom AI models and systems leveraging that data.
- Additionally, they plan to offer "wrapper" products—seamless end-user applications built on top of the subnet, where the underlying Bittensor infrastructure might be invisible to the user.
Partnership Deep Dive: Common Crawl Collaboration
- The partnership with Common Crawl, a non-profit maintaining an open repository of web crawl data, serves as a key case study.
- ReadyAI processed Common Crawl's public data (from their listserv, Discord, and the web crawl itself) using Subnet 33.
- This structured data was used to build a highly accurate AI model/agent for Common Crawl's community of AI researchers.
- This agent provides insights based on Common Crawl's data and is continuously updated in real-time as new information becomes available.
- The model is accessible on Common Crawl's site, Discord, and Slack, demonstrating a practical application of ReadyAI's structured data pipeline.
Target Sectors and Showcasing Capabilities
- While engaging various enterprises, David emphasizes that ReadyAI is also focused on showcasing the power of structured data.
- The Common Crawl project is one example.
- Another upcoming project involves collaborating with a prominent crypto Twitter creator to build a sophisticated AI agent (similar in concept to AIXBT but aiming for higher quality).
- This agent will integrate diverse data sources, including social media and on-chain data, structured by ReadyAI's subnet.
- David states the goal is to demonstrate tangible value and accelerate adoption by showcasing what's possible with high-quality structured data inputs.
- He indicates this creator-focused agent will be announced in the coming weeks.
Navigating the dTAO Launch: Market Volatility and Strategy
- David shares his perspective on the launch of dTAO (Dynamic TAO), Bittensor's major 2.0 upgrade introducing dynamic token allocation across subnets.
- He praises the technical execution by the core team (Opentensor Foundation).
- ReadyAI anticipated high volatility in the initial 30-60 days due to the fair launch nature of subnet tokens and extremely low liquidity.
- Their strategy was to remain "heads down," focusing on building and delaying major public announcements until the market stabilized (~45 days post-launch).
- This was due to the initial downward price pressure caused by the root proportion mechanism in a low-liquidity environment.
- With the market now more stable and liquidity improved, ReadyAI is becoming more active publicly.
- David expresses confidence in the subnet's trajectory moving forward.
Analyzing dTAO Dynamics: Sum Prices and Tokenomics
- David discusses the "sum price," the combined price of all subnet (alpha) tokens denominated in TAO (Bittensor's native token).
- He notes the sum price recently exceeded 1, hitting roughly 2.04 at the time of recording, suggesting the market values subnet outputs (the "commodities") beyond just their share of TAO emissions.
- This indicates confidence in individual subnets' execution and value proposition.
- He explains the initial downward pressure stemmed from the root vs. alpha proportion.
- At launch, 100% of validator rewards flowed through the "root network" (based on overall TAO stake), and these rewards (paid in subnet tokens) were often immediately sold back into TAO, suppressing subnet token prices.
- Over 60 days, this shifted gradually towards the "alpha proportion" (based on stake delegated directly to the subnet's validators using the subnet's own token).
- David mentions his subnet was around 40% alpha / 60% root at the time of recording, moving towards 50/50.
- He cautions that while a sum price above 1 is positive, the low-liquidity environment means volatility will likely continue.
- He also clarifies that dTAO's design doesn't inherently pull the sum price back to 1; it's a free-market dynamic.
- High APRs (Annual Percentage Rates) on alpha tokens are possible but highly volatile, as they depend heavily on the amount of alpha staked.
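A back-of-the-envelope illustration of these dynamics, using made-up numbers rather than the actual dTAO emission formulas: split a hypothetical daily emission between the root and alpha proportions David mentions, then note how alpha APR falls as more alpha is staked:

```python
# Simplified, illustrative arithmetic only; not the real dTAO emission formulas.
daily_validator_emission = 100.0   # hypothetical subnet-token emission to validators/stakers per day

alpha_share, root_share = 0.40, 0.60                 # split David cites for his subnet
print(daily_validator_emission * alpha_share)        # portion distributed in proportion to alpha stake
print(daily_validator_emission * root_share)         # portion distributed via root-network TAO stake

# Why alpha APR is volatile: the same emissions spread over more (or less) staked alpha.
for total_alpha_staked in (10_000, 50_000, 250_000):
    apr = daily_validator_emission * alpha_share * 365 / total_alpha_staked
    print(f"alpha staked={total_alpha_staked:>7} -> APR ~ {apr:.1%}")
```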
Operational Impact of dTAO
- David states dTAO hasn't fundamentally changed ReadyAI's day-to-day operations much, aside from managing team expectations regarding initial token price volatility.
- Their focus remains on long-term building, and they have no immediate plans to sell their token holdings.
- The main operational consideration was navigating the early low-liquidity, high-volatility period.
Long-Term Vision: Enterprise AI Readiness and Data's Role
- Looking 5-10 years out, ReadyAI aims to make all the world's data AI-ready.
- This involves creating open-source datasets to benefit the broader AI community and, crucially, helping enterprises navigate their transition to becoming AI-ready.
- David observes that most large companies are "not even in the first inning" of AI adoption.
- He highlights enterprise challenges: data silos across different systems and business units, concerns over data privacy (especially with models from companies like OpenAI perceived as not respecting data ownership), and the difficulty of integrating legacy systems.
- The ultimate goal for enterprises, which ReadyAI wants to facilitate, is creating seamless systems where authorized employees can access cross-company insights instantly.
- David believes AI holds the promise to make large organizations vastly more agile. "We think data... may be the most important piece of... what makes AI so powerful," he asserts.
Strategic Advice for Subnets: Horizontal vs. Vertical Integration
- David reflects on a shift in the Bittensor ecosystem post-dTAO.
- While subnets were initially designed primarily as commodity producers for validators, dTAO emphasizes the role of subnet owners (often the largest validators on their own subnet) in monetizing these commodities directly, driven by alpha token staking.
- He suggests that even for horizontally-focused subnets like ReadyAI (providing infrastructure/data usable by many), there's a need to also build vertical products.
- While the structured data pipeline is a horizontal offering, ReadyAI is also building specific applications (like the Common Crawl agent, the upcoming creator agent, enterprise wrappers) on top of it.
- He advises other subnets to consider that while millions of businesses need foundational tools (like vector databases), the larger unserved market lies in providing tailored solutions and end-user products built upon the subnet's core commodity.
Conclusion
ReadyAI's strategy underscores structured data's pivotal role in boosting AI model accuracy and cost-effectiveness, particularly for enterprises navigating early AI adoption. Crypto AI investors and researchers should closely monitor the development of structured data pipelines like ReadyAI's and the evolving tokenomic incentives within Bittensor's dTAO framework.