Machine Learning Street Talk
November 3, 2025

Why Humans Are Still Powering AI [Sponsored]

This episode dives into AI's "dirty secret" with the co-founder and CEO of Prolific, a human data infrastructure company. It reveals that behind the automated curtain, AI is fundamentally powered by a messy, complex, and invaluable layer of human intelligence.

AI's Human-Powered Engine

  • "There's a dirty secret in Silicon Valley... there is an absolutely huge importance on human data and human expertise, and that is completely glossed over."
  • "Fundamentally, artificial intelligence is founded in human intelligence... the human data element is often the least spoken about, maybe the least glamorous."
  • AI's foundation isn't just compute and algorithms; it's a "messy layer" of humans who label data, provide the feedback used in reinforcement learning from human feedback (RLHF), and evaluate model performance.
  • Unlike AI, true human expertise involves a deep, abstract understanding that can spot when rules are being broken, a nuance that models struggle to replicate without high-quality human input.

Building the Human Intelligence Marketplace

  • "The reality is that humans are messy... the value that we add is the deep verification and vetting of these participants."
  • Prolific addresses the challenge of matching expertise to tasks by building deep profiles on its participants, moving beyond simple demographics to understand nuanced skills.
  • To ensure data quality, the platform uses a three-part system: identity verification, researcher feedback loops for ranking, and network analysis to detect and filter out users trying to game the system.
  • The model avoids treating participants as a commoditized supply chain. Instead, it fosters long-term relationships through clear communication and incentives that go beyond the purely financial, with the aim of producing higher-quality data.

Augmenting Work, Not Replacing It

  • "These models are extremely powerful teachers and coaches, and junior engineers much more rapidly become as competent as senior engineers with this co-pilot training."
  • The future of work involves AI-human collaboration. Prolific’s platform is designed as an augmentation tool, connecting models with currently practicing professionals (e.g., active doctors evaluating a medical chatbot) rather than creating a class of full-time professional annotators.
  • AI is making the economic pie bigger. By giving more people access to powerful tools, it "whets their appetite" for creation. They eventually hit a wall where they need to hire human experts, which increases demand for specialized skills.

Key Takeaways

  • The conversation paints a future where human expertise becomes more, not less, valuable. As AI automates rote tasks, the demand for nuanced, verified human intelligence to train, evaluate, and collaborate with these systems will explode.
  • Human data is the critical asset. The most valuable—and least glamorous—layer of the AI stack is human intelligence. Its scale, importance, and economic value will only grow.
  • The future is human-in-the-loop. The next phase of AI development will be defined by agent-human interaction, where automated systems can call upon verified human experts on demand for review and guidance.
  • Expertise will be licensed. The economic model is shifting toward a future where human expertise can be licensed, allowing individuals to earn passive income for contributing their knowledge to improve AI, much like Spotify pays artists for their music.

This episode reveals the critical, often-overlooked human layer powering AI, exploring how verifiable human expertise is becoming the most valuable commodity in an increasingly automated world.

The Hidden Human Layer in AI

  • Speaker Insight: The CEO emphasizes that the entire AI stack, from data provision to model performance assessment, is inextricably linked to human input.
  • Strategic Implication: For investors, this highlights a critical vulnerability and opportunity in the AI value chain. Projects and platforms that can reliably source, verify, and orchestrate high-quality human data, like Prolific, represent a foundational infrastructure play.

Orchestrating Human Expertise at Scale

  • The conversation explores the immense challenge of matching specific problems with the right human expertise. The host draws a parallel to his previous company, which focused on human-led code reviews, highlighting the universal business problem of connecting specialized knowledge to specific tasks. The CEO of Prolific explains that managing this process is far more complex than a simple API call.
  • The core value of Prolific lies in its deep verification and vetting of participants, building detailed profiles to route the most appropriate tasks to the right individuals.
  • He stresses that treating participants as a commoditized supply chain to be aggregated at the lowest cost is a flawed model. Instead, Prolific focuses on creating a "win-win-win" dynamic between the data collector, the platform, and the participant.
  • "Ultimately we believe that the highest data quality is produced by people who are properly incentivized, understand the impact of their work, and are going beyond just the financial incentives."

Building a Trustworthy Human Data Network

  • Verification and Onboarding: Every participant is ID-verified to confirm their identity and location.
  • Feedback and Ranking: A continuous feedback loop from researchers on data quality is used to rank participants, ensuring that those who consistently provide the best data are prioritized for future tasks.
  • Network Analysis: The platform analyzes the participant pool as a network to identify and filter out pockets of bad actors or those attempting to game the system.
  • Incentive Design: The CEO references the Prisoner's Dilemma, a classic concept in game theory where two individuals acting in their own self-interest do not produce the optimal outcome. He explains that Prolific structures interactions as long-term relationships with multiple touchpoints and high communication, which incentivizes cooperation and honesty over the "single-shot" incentive to cheat.
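The incentive argument above can be made concrete with a toy repeated Prisoner's Dilemma. This is only an illustrative sketch: the payoff values, the strategy names, and the play() helper are assumptions chosen for the example, not anything described in the episode or used by Prolific.

```python
# Toy repeated Prisoner's Dilemma.
# Payoffs are assumed, textbook-style values: (my_move, their_move) -> my payoff.
# "C" = cooperate (do honest work), "D" = defect (cheat).
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Cheat every round, regardless of history."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play the game for a number of rounds and return both total scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each side only sees the other side's past moves.
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b


if __name__ == "__main__":
    print("single shot, cheater vs cooperator:", play(always_defect, tit_for_tat, 1))
    print("10 rounds, cheater vs reciprocator:", play(always_defect, tit_for_tat, 10))
    print("10 rounds, two reciprocators:      ", play(tit_for_tat, tit_for_tat, 10))
```

With these assumed payoffs, cheating wins a single interaction, but over ten rounds against a partner who reciprocates, the cheater ends up with fewer points than two parties who keep cooperating. That is the intuition behind structuring work as a long-term relationship rather than a one-off exchange.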

Beyond the Mechanical Turk: Fostering Human Connection

  • The discussion references the Mechanical Turk, a famous 18th-century chess-playing machine that was secretly operated by a human. The analogy is used to describe the desire to abstract away the human element behind a clean API. While Prolific aims to provide this simplicity, its core philosophy is to avoid completely obscuring the humans involved.
  • Prolific actively works to "get out of the way as a middleman" by building direct communication channels between researchers and participants.
  • This peer-to-peer feedback and messaging fosters empathy and trust, helping participants understand the impact of their work and ensuring researchers appreciate the value of the data being provided.
  • Actionable Insight: For decentralized AI researchers, this model is critical. Building systems that foster direct, transparent, and long-term relationships with data contributors can create more robust and trustworthy datasets, a key challenge in decentralized networks.

Solving the Specification Problem in Data Collection

  • The host raises a common issue with freelance platforms like Upwork: the "specification problem," where the effort required to define a task with sufficient detail outweighs the benefit of outsourcing it. Prolific addresses this by focusing on both the quality of the audience and the quality of the task design.
  • The platform assists researchers in specifying their needs with high granularity (e.g., distinguishing between different specialties within biology).
  • Many projects on Prolific are long-running, with initial phases dedicated to training participants on the specific context and requirements, ensuring they understand what constitutes high-quality data for that particular task. This longitudinal approach builds a shared understanding that improves data quality over time.

Augmenting Work, Not Replacing It

  • Prolific's vision for the future of work is one of augmentation, not replacement. The platform is optimized to tap into a broad audience of active, real-world users rather than creating a class of professional annotators who may become disconnected from their original fields.
  • For example, when evaluating a medical chatbot, Prolific seeks out currently practicing healthcare workers, not former ones, to ensure the feedback reflects real-world context.
  • The platform is also exploring how to train general participants to become "high taste evaluators," equipping them with the skills to overcome common biases in preference evaluation, a valuable skill for training frontier models.

Scaling a Marketplace of Intelligence

  • Like all marketplaces, Prolific faced the classic "chicken and egg" problem of balancing supply (participants) and demand (researchers). The CEO compares their growth strategy to Uber's city-by-city expansion.
  • Each segment of expertise (e.g., US general population, UK software developers) is treated like a "city" that needs to be scaled from an initial "atomic network" to a liquid market capable of providing data within hours.
  • The matching algorithm, described as the company's "secret sauce," is compared to "two towers" algorithms used by platforms like TikTok. These algorithms excel at matching user context with content context. In Prolific's case, it matches the task context with the human context to surface the optimal participants for any given project.
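To illustrate the general two-tower idea mentioned above, here is a minimal sketch. It is not Prolific's algorithm: the tags, roles, weights, and function names are all hypothetical, and a production system would learn both encoders from data rather than hand-set them. The point is only the shape of the technique: one encoder for the task, a separate encoder for the person, and a similarity score in a shared space.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def encode(items, weights):
    """Sum the per-item weight vectors and normalize the result."""
    vec = [0.0, 0.0]
    for item in items:
        w = weights.get(item, [0.0, 0.0])
        vec = [v + x for v, x in zip(vec, w)]
    return normalize(vec)


# Hypothetical hand-set "towers". The shared space has two assumed
# dimensions: [medical, software].
TASK_WEIGHTS = {
    "clinical": [1.0, 0.0],
    "code_review": [0.0, 1.0],
    "chatbot_eval": [0.2, 0.2],
}
PARTICIPANT_WEIGHTS = {
    "practicing_doctor": [1.0, 0.0],
    "nurse": [0.8, 0.0],
    "software_engineer": [0.0, 1.0],
}


def task_tower(task):
    """Encode the task context from its tags."""
    return encode(task["tags"], TASK_WEIGHTS)


def participant_tower(profile):
    """Encode the human context from the participant's roles."""
    return encode(profile["roles"], PARTICIPANT_WEIGHTS)


def rank_participants(task, profiles):
    """Return (participant_id, score) pairs, best match first."""
    t = task_tower(task)
    scored = []
    for pid, profile in profiles.items():
        p = participant_tower(profile)
        score = sum(a * b for a, b in zip(t, p))  # dot product in the shared space
        scored.append((pid, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    task = {"tags": ["clinical", "chatbot_eval"]}
    profiles = {
        "dev_b": {"roles": ["software_engineer"]},
        "dr_a": {"roles": ["practicing_doctor"]},
    }
    for pid, score in rank_participants(task, profiles):
        print(pid, round(score, 3))
```

Because the two encoders are independent, participant embeddings can be computed ahead of time and compared against each new task cheaply, which is one reason this pattern is popular for large-scale matching and recommendation.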

The Geopolitical Risk of Centralized AI

  • The conversation shifts to the significant risk of AI infrastructure and frontier models becoming centralized and controlled by a handful of US tech companies. The CEO expresses concern that while the talent in these labs is global, the economic value ultimately flows back to the platform owners.
  • He issues a "wakeup call" for the UK and Europe to play a more significant role in the AI lifecycle, from owning training infrastructure and data centers to fostering more local model development.
  • Crypto AI Perspective: This resonates deeply with the core crypto ethos of decentralization. The centralization of AI platforms presents a systemic risk that decentralized compute, data, and model ownership protocols aim to solve. Investors should watch for projects that offer viable, decentralized alternatives to this concentrated power structure.

AI as a Catalyst for Human Expertise

  • Contrary to the narrative that AI will eliminate jobs, the CEO argues it will increase the demand for specialized human expertise. He refutes the idea that junior software engineers will become obsolete, suggesting instead that AI co-pilots will act as powerful teachers, accelerating their path to senior-level competence.
  • He believes there is an "elastic demand" for creations like software, and AI tools will simply make it easier for more people to build, ultimately increasing the overall market.
  • The future will likely involve a "human-in-the-loop" system where AI agents perform tasks but route critical steps to human experts for verification—a "Prolific type plug-in" for AI.
  • "I think the pie has gotten bigger because so many people are getting their appetites whetted... and then they hit a brick wall because they realize they don't actually understand it deeply enough. So they need to bring the experts in."

The Future of Data: Licensing Human Intelligence

  • The episode concludes with a forward-looking vision for a new economic model for human data. As AI companies license content from publishers like The New York Times and Reddit, a similar model could emerge for licensing human expertise.
  • This would create an ongoing, passive incentive for individuals who provide data to improve AI models, analogous to how Spotify pays artists royalties.
  • This could lead to the creation of a "digital twin" of one's expertise that can be licensed out, or a system where users are compensated for contributing to the improvement of large, centralized models.
  • Relevance for Crypto AI: This concept aligns perfectly with Web3 principles of data sovereignty and user-owned economies. Crypto-based systems could provide the transparent, automated infrastructure needed to manage these micro-payments and licensing agreements at scale, creating a true marketplace of intelligence.

Conclusion

This discussion underscores that as AI models advance, the demand for verified, high-quality human data will not shrink but intensify. For investors and researchers, the key takeaway is that the infrastructure for sourcing, verifying, and compensating human intelligence represents a critical and undervalued layer of the AI stack with immense growth potential.
