a16z
November 28, 2025

How OpenAI Builds for 800 Million Weekly Users: Model Specialization and Fine-Tuning

Sherman Wu, who leads engineering for OpenAI's developer platform, unpacks the strategy behind serving both a massive consumer app and a foundational API. The conversation explores OpenAI's evolution from a "one model to rule them all" philosophy to embracing a future of specialized models fine-tuned on proprietary data.

The Dual Mandate: API vs. First-Party App

  • "We want ChatGPT as a first-party app... We also want the API. At the end of the day, it comes back to the mission of OpenAI, which is to create AI and then to distribute the benefits as broadly as possible."

OpenAI operates a dual-pronged strategy: ChatGPT, a vertical app with 800 million weekly active users, and a horizontal API that extends its reach even further. This structure is a direct reflection of its mission to maximize AI's distribution. While this creates natural tension with developers who fear competing with OpenAI’s first-party products, the market's explosive growth has so far smoothed over most conflicts.

The Great Unbundling: From One God Model to Many Specialists

  • "Even within OpenAI, the thinking was that there would be one model that rules them all. That has definitely completely changed. It's becoming increasingly clear that there will be room for a bunch of specialized models."

The initial industry assumption of a single, all-powerful AGI has been replaced by the reality of a proliferating ecosystem of specialized models. Users now seamlessly switch between models for different tasks within a single workflow—using one for high-level planning and another for fast, in-flow coding. This specialization increases model "stickiness," as developers build their products around the unique capabilities of a specific model, making it technically difficult to swap out.

The Data Unlock: Reinforcement Learning and Fine-Tuning

  • "Companies just have giant treasure troves of data that they are sitting on... The big unlock that has happened recently is with reinforcement fine-tuning... which allows you to leverage your data way more."

Fine-tuning is evolving from a tool for simple stylistic adjustments into a path to state-of-the-art performance. The introduction of reinforcement fine-tuning (RFT) allows companies to leverage their proprietary data to create highly specialized models. OpenAI is even piloting programs that offer discounted inference for customers willing to share data, creating a powerful feedback loop that benefits both parties.

Key Takeaways:

The conversation reveals a pragmatic and evolving strategy at the heart of the AI revolution, where grand theories are constantly tested against market realities.

  • The "One Model" Thesis Is Dead. The future belongs to a portfolio of specialized models. This creates distinct opportunities for both foundational labs and companies that can leverage proprietary data to build best-in-class models for niche applications.
  • Data Is the Ultimate Differentiator. Reinforcement learning fine-tuning elevates proprietary data from a simple input for RAG systems to the core ingredient for building a defensible, state-of-the-art product.
  • Agents Will Specialize. The agent ecosystem is bifurcating into two primary types: open-ended, creative agents for knowledge work and deterministic, procedural agents designed for enterprise automation where reliability and adherence to standard operating procedures are critical.

For further insights and detailed discussions, watch the full video: Link

This episode reveals OpenAI's strategy for scaling to 800 million weekly users, detailing the shift from a single "god model" to a portfolio of specialized, fine-tuned models and the economic tensions between its API platform and first-party apps.

Introduction: Sherman Wu on Leading OpenAI's Developer Platform

  • Sherman Wu, who leads the engineering team for OpenAI's developer platform, provides context on his background and current role. His team is primarily responsible for the API but also handles specialized deployments, such as a local model running at Los Alamos National Laboratory.
  • Sherman's experience spans from machine learning for asset pricing at Opendoor to newsfeed ranking at Quora, giving him a unique perspective on both the technical and business challenges of deploying AI.
  • He highlights the stark cultural and operational differences between a traditional tech company like Opendoor, which operates on thin margins, and a research-driven organization like OpenAI.

The Horizontal vs. Vertical Tension: Balancing API and First-Party Apps

  • The conversation opens by addressing the inherent tension in OpenAI's strategy of offering both a horizontal API platform for developers and vertical first-party applications like ChatGPT. Sherman Wu explains that this dual approach is rooted in OpenAI's mission to distribute AI's benefits as broadly as possible.
  • ChatGPT serves as a powerful distribution channel, reaching an astonishing 800 million weekly active users—roughly 10% of the global population.
  • The API platform extends this reach even further, enabling an ecosystem of applications that, at times, had a larger end-user base than ChatGPT itself.
  • Sherman acknowledges that while this creates tension, particularly when API customers see ChatGPT launch competing features, the company's rapid growth has so far mitigated major conflicts. He notes, "Growth solves so many different things."

The Myth of Model Interchangeability and Anti-Disintermediation

  • The discussion challenges the early industry assumption that AI models would be interchangeable commodities. Experience shows that models are difficult to abstract away, creating a "stickiness" that prevents developers and users from easily switching providers.
  • Both consumers and developers form a relationship with a specific model's behavior and performance, making it hard to swap out. This is evident when users notice personality shifts between model versions like GPT-4 and GPT-4o.
  • Developers building on the API create a technical dependency, as their entire application harness—including tools and prompts—is optimized for a specific model. This creates a form of "anti-disintermediation," where the value of the underlying model remains directly exposed to the end application.

The Proliferation of Specialized Models

  • The podcast explores the significant shift away from the belief in a single, all-powerful "god model" toward an ecosystem of specialized models. This trend is reshaping how developers build and how investors should view the market.
  • Sherman confirms this evolution in thinking, stating, "Even within OpenAI, the thinking was that there would be like one model that rules them all... it's like definitely completely changed."
  • The reality is a proliferation of models tailored for specific use cases, such as different models for coding, planning, or fast, in-flow generation. This suggests the market is not a winner-take-all scenario but a diverse landscape.
  • Strategic Implication: For investors, this signals a move away from betting on a single AGI winner. The opportunity now lies in companies that can leverage specialized models or build infrastructure to support a multi-model ecosystem.

Fine-Tuning and the Value of Proprietary Data

  • A key driver of model specialization is the ability to fine-tune models with proprietary data. Sherman details how OpenAI's fine-tuning API has evolved to meet this demand, empowering companies to leverage their unique data assets.
  • The initial supervised fine-tuning API was limited, primarily useful for adjusting a model's tone or style.
  • The major advance is reinforcement fine-tuning (RFT), a technique that allows developers to use their data to significantly improve a model's core capabilities on a specific task, potentially achieving state-of-the-art performance.
  • OpenAI is piloting programs where customers can receive discounted inference or free training in exchange for sharing their fine-tuning data, highlighting the immense value of high-quality, domain-specific datasets.
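
To make this concrete, here is a simplified sketch of preparing training data in the chat-format JSONL that OpenAI's supervised fine-tuning endpoint expects. The claims-triage scenario and every example record are invented for illustration:

```python
import json

# Hypothetical training examples; a real dataset would contain hundreds
# or thousands of records drawn from a company's proprietary data.
examples = [
    {"messages": [
        {"role": "system", "content": "You are a concise claims-triage assistant."},
        {"role": "user", "content": "Water damage, burst pipe, claim under $5k."},
        {"role": "assistant", "content": "Route: fast-track property. Priority: standard."},
    ]},
    {"messages": [
        {"role": "system", "content": "You are a concise claims-triage assistant."},
        {"role": "user", "content": "Total-loss vehicle fire, injuries reported."},
        {"role": "assistant", "content": "Route: major-loss auto. Priority: urgent."},
    ]},
]

def to_jsonl(records):
    """Serialize training records as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)

jsonl = to_jsonl(examples)
# Sanity check: every line must parse back to a dict with a "messages" list.
for line in jsonl.splitlines():
    assert isinstance(json.loads(line)["messages"], list)
```

In practice this file would be uploaded through the Files API and referenced when creating a fine-tuning job; reinforcement fine-tuning additionally requires a grader that scores model outputs, which is where proprietary evaluation criteria come in.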

From Prompt Engineering to Context Engineering

  • The conversation debunks the early notion that prompt engineering would become obsolete as models grew more intelligent. Instead, the discipline has evolved into what can be described as "context engineering."
  • The focus is no longer just on crafting the perfect text prompt but on providing the model with the right tools, data, and context to reason effectively.
  • This includes techniques like Retrieval-Augmented Generation (RAG), where models pull in external information to answer queries. The challenge now is making the retrieval process itself more intelligent, rather than relying on simple similarity searches.
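
To illustrate the retrieval step, here is a deliberately naive sketch of the "simple similarity search" baseline the discussion says needs to become more intelligent. It uses bag-of-words vectors in place of learned embeddings; all documents and names are illustrative:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; real RAG systems use learned dense vectors."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    """Rank documents by similarity to the query and return the top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "Reinforcement fine-tuning improves task-specific model performance.",
    "Usage-based pricing aligns cost with compute consumption.",
    "Retrieval-augmented generation pulls external context into prompts.",
]
context = retrieve("How does retrieval-augmented generation work?", docs)
# The retrieved passage is injected into the model's context window.
prompt = f"Answer using this context:\n{context[0]}\n\nQuestion: ..."
```

"Context engineering" is about replacing the naive `retrieve` above with smarter pipelines: query rewriting, reranking, and tool selection, so the model reasons over the right information rather than the nearest keyword match.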

The Economics of AI: Usage-Based Pricing and Open Source

  • Sherman discusses the financial models underpinning the AI economy, emphasizing the dominance of usage-based pricing for APIs and the strategic role of open-source models.
  • Usage-based pricing is seen as the most logical model because it aligns costs directly with compute consumption. Sherman notes it's a "one-way ratchet"—once adopted, it's unlikely companies will revert to simpler subscription models for services like APIs.
  • OpenAI's release of open-source models is not seen as a cannibalization risk. Instead, it grows the entire AI ecosystem, which ultimately benefits OpenAI. Self-hosting inference for large models remains a significant technical challenge, creating a natural moat for their API services.
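
The metering logic behind usage-based pricing is simple: cost scales linearly with tokens consumed. A minimal sketch, using hypothetical per-million-token rates (real prices vary by model and change frequently):

```python
# Hypothetical prices in USD per 1M tokens; "example-model" is a placeholder.
PRICES = {"example-model": {"input": 2.50, "output": 10.00}}

def usage_cost(model, input_tokens, output_tokens):
    """Usage-based billing: cost is a linear function of tokens consumed."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A month with 4M input tokens and 1M output tokens at the assumed rates:
monthly = usage_cost("example-model", 4_000_000, 1_000_000)  # 10.0 + 10.0 = 20.0
```

Because each marginal request has a real compute cost, this model aligns revenue with expense in a way a flat subscription cannot, which is the sense in which it is a "one-way ratchet."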

Verticalization and Infrastructure for Different Modalities

  • The discussion touches on the operational complexity of developing models for different modalities, such as text (GPT series) and video (Sora), within the same organization.
  • Sherman confirms it's an "anti-pattern" that is difficult to execute successfully.
  • OpenAI manages this by running the teams separately, with distinct roadmaps and largely independent infrastructure and inference stacks. This separation allows each team to specialize and optimize for the unique demands of their modality.

The Evolution of Agent Building: From AGI to Practical Workflows

  • The final section analyzes the evolving understanding of AI agents. The initial vision of a single, autonomous AGI-like agent is giving way to more practical, structured approaches for automating specific workflows.
  • OpenAI's Agent Builder, which uses a node-based, deterministic framework, was designed to address real-world business needs where processes are highly procedural.
  • Many industries, like customer support or regulated finance, rely on strict Standard Operating Procedures (SOPs). For these use cases, a constrained, predictable agent is more valuable than a fully autonomous one that might deviate from policy.
  • Strategic Implication: The future of enterprise AI agents may be less about open-ended AGI and more about building reliable, verifiable systems that automate structured, repeatable tasks. This creates opportunities for tools and platforms that enable this deterministic approach.
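
A node-based, deterministic workflow of the kind described can be sketched as a small graph of pure functions plus routing rules, so the same input always follows the same path through the SOP. This is an illustrative toy, not OpenAI's actual Agent Builder API:

```python
# Each node is a function that transforms state; each edge is a routing rule.

def classify(state):
    state["category"] = "refund" if "refund" in state["ticket"].lower() else "general"
    return state

def route(state):
    # Deterministic routing: the same category always takes the same edge.
    return "refund_node" if state["category"] == "refund" else "general_node"

def refund_node(state):
    state["reply"] = "Per SOP 4.2, refunds are processed within 5 business days."
    return state

def general_node(state):
    state["reply"] = "Thanks for reaching out; an agent will follow up."
    return state

# The graph maps each node name to (node function, next-node resolver).
GRAPH = {
    "classify": (classify, route),
    "refund_node": (refund_node, lambda s: None),
    "general_node": (general_node, lambda s: None),
}

def run(state, start="classify"):
    """Walk the graph until a node resolves to no successor."""
    node = start
    while node is not None:
        fn, next_of = GRAPH[node]
        state = fn(state)
        node = next_of(state)
    return state

result = run({"ticket": "I want a refund for my order."})
```

An LLM can still power individual nodes (classification, drafting the reply), but the control flow stays fixed and auditable, which is exactly the property regulated workflows require.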

Conclusion

This discussion highlights a critical market shift from a monolithic "one model" vision to a diverse ecosystem of specialized, fine-tuned models. Investors and researchers must now focus on the value of proprietary data for fine-tuning and the emerging infrastructure for deploying specialized, deterministic agents to build competitive moats.
