This episode dives deep into the Model Context Protocol (MCP), exploring how Anthropic's open standard aims to become a universal connector for AI applications, and what that means for interoperability, development workflows, and the future architecture of integrated AI systems, with takeaways for investors and researchers.
Introducing Model Context Protocol (MCP)
- David and Justin from Anthropic introduce the Model Context Protocol (MCP), a standard designed to extend the functionality of AI applications, not just the models themselves, by integrating them with an ecosystem of external capabilities (akin to plugins).
- Justin clarifies the core goal: "extending AI applications is really what this is about." He emphasizes the focus on the application layer, a common point of misunderstanding.
- David offers an analogy: MCP acts like a "USB-C port of AI applications," aiming to be a universal connector enabling two-way interaction between an AI application (the client) and various external services or data sources (the servers).
The Origin Story: Developer Frustration Sparks Innovation
- MCP wasn't born from a top-down strategic mandate at Anthropic but rather from David's personal frustration as a developer tool engineer. While working on internal tooling in mid-2024, he found himself constantly copying data between different development environments (like Claude Desktop and his IDE) because they couldn't easily share context or capabilities.
- David, drawing on his developer tooling background, recognized this as a classic "M*N problem" – multiple applications needing multiple integrations. Inspired partly by his concurrent work related to LSP (Language Server Protocol), he conceived MCP as a protocol-based solution.
- LSP (Language Server Protocol): A protocol initially developed by Microsoft that standardizes communication between code editors/IDEs and language analysis tools (language servers), solving the M*N problem for language support across different editors.
- David pitched the idea to Justin, who became deeply involved. They spent roughly a month and a half building the initial protocol and proof-of-concept integrations (Justin focusing on Claude Desktop, David on an IDE integration), working largely as a two-person team initially.
Early Signs and the Zed Editor Connection
- For those seeking "alpha," David reveals the very first public MCP implementation appeared in the Zed editor repository about a month and a half before the official November 25th announcement. He wrote this integration himself for the open-source Zed project.
- Zed: A high-performance, collaborative code editor known for its low latency.
- Alessio notes Anthropic previously previewed a fast editing model integration with Zed. David pitches Zed as a "low latency super smooth experience editor with a decent enough AI integration."
Design Philosophy: Learning from LSP and Focusing on Presentation
- Justin explains the heavy inspiration from LSP, particularly its success in solving the M*N problem for IDEs and language support by creating a common communication standard. MCP applies this principle to AI applications and their extensions.
- While adopting JSON-RPC (a lightweight remote procedure call protocol using JSON) and bidirectionality from LSP, MCP diverges significantly in its specific primitives.
- JSON-RPC: A simple protocol for making requests and receiving responses between systems, often used for APIs and inter-process communication.
- A key principle borrowed and adapted from LSP is being "presentation focused." This means defining MCP primitives based on how a feature should manifest in the user interface, rather than just its underlying semantics. This allows application developers flexibility in crafting user experiences.
- David adds they consciously studied LSP's criticisms and limitations (such as its non-standard take on JSON-RPC) to inform MCP's design, aiming to innovate where necessary but stay "boring" elsewhere (i.e., use established patterns like standard JSON-RPC) so effort could go into the core primitives.
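To make that "boring" layer concrete, here is what a single MCP exchange looks like at the JSON-RPC 2.0 level, rendered as Python dicts for readability. The `get_weather` tool and its arguments are invented for illustration; the method and field names follow the MCP specification.

```python
# A representative MCP exchange at the JSON-RPC 2.0 level, shown as
# Python dicts. The "get_weather" tool is hypothetical.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",  # the MCP method for invoking a server tool
    "params": {"name": "get_weather", "arguments": {"city": "Berlin"}},
}

response = {
    "jsonrpc": "2.0",
    "id": 1,  # matches the request id, per JSON-RPC
    "result": {"content": [{"type": "text", "text": "18°C, partly cloudy"}]},
}
```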
Core MCP Primitives: Beyond Simple Tool Calling
- Justin outlines the initial core primitives, designed from the perspective of an application developer needing different ways to integrate external capabilities (a minimal server sketch follows this list):
- Tools: Functions meant to be invoked by the model at its discretion (akin to function calling).
- Resources: Chunks of data or context (identified by a URI) that can be added to the model's context window. Crucially, resource inclusion can be model-driven or application/user-controlled (e.g., via a UI element like a paperclip icon), offering more flexibility than tools alone. This could support use cases like RAG (Retrieval-Augmented Generation) where an application builds an index over exposed resources.
- URI (Uniform Resource Identifier): A string of characters used to identify a resource, like a web address (URL) or a specific file path.
- Prompts: Pre-defined text or messages intended for user-initiated insertion or substitution (like slash commands or autocompletions in an editor).
- David highlights that the first Zed integration was actually a prompt implementation, allowing users to pull external data (like a Sentry backtrace) into the context window proactively, demonstrating the utility beyond model-driven tool use. He expresses a desire for more servers to expose prompts (e.g., usage examples for their tools) and resources (e.g., document sets for RAG).
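As promised above, a minimal sketch of all three primitives in one server, using the FastMCP helper from the official Python SDK. The server name, tool, resource URI, and prompt are all invented for the example.

```python
# Minimal sketch of the three core primitives (tool, resource, prompt)
# using the official Python SDK's FastMCP helper. Names are illustrative.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.tool()  # Tool: invoked by the model at its discretion
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

@mcp.resource("docs://readme")  # Resource: context the app or user attaches
def readme() -> str:
    """Expose the project README as attachable context."""
    return open("README.md").read()

@mcp.prompt()  # Prompt: user-initiated template, e.g. a slash command
def review(code: str) -> str:
    """Build a code-review prompt around a snippet."""
    return f"Please review this code for bugs and style:\n\n{code}"

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport
```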
Tool vs. Resource: Clarifying the Distinction
- Addressing the common tendency to use tools for everything (like database queries), Justin reiterates the design intent: Tools are model-initiated actions. Resources are more flexible context/data sources, potentially user-selected or application-managed (e.g., exposing database schemas as resources for the user or an agentic application to reference).
- Justin acknowledges the practical limitation that many clients don't fully support resources yet, but envisions a future where resources handle scenarios like listing entities (e.g., database tables, documents) and allowing selection or automated lookup. Resources, identified by URIs, can also act as general-purpose transformers (e.g., interpreting a dropped URI). A sketch contrasting the two primitives follows this list.
- David mentions the Zed example where default prompts were exposed as resources, populating the editor's prompt library dynamically – a specific application of resources agreed upon by client and server. He also notes how existing UI patterns like attachment menus naturally map to the resource concept.
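Continuing the hypothetical database example, the distinction can be read directly in code: schemas become selectable resources, while the action the model takes stays a tool. The schema data and query handling below are stand-ins, not a real database layer.

```python
# Resources vs. tools for the database example (illustrative only).
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("db-demo")

# Hypothetical schemas; a real server would introspect the database.
SCHEMAS = {"users": "CREATE TABLE users (id INTEGER, email TEXT)"}

@mcp.resource("schema://{table}")  # user- or application-selected context
def table_schema(table: str) -> str:
    """Expose one table's schema so it can be attached to the context."""
    return SCHEMAS.get(table, f"unknown table: {table}")

@mcp.tool()  # model-initiated action
def run_query(sql: str) -> str:
    """Execute a (pretend) read-only query chosen by the model."""
    return f"would run: {sql}"  # stand-in for a real database call
```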
MCP vs. OpenAPI: Complementary Standards
- Justin argues that OpenAPI specifications, while valuable for traditional API development, are often too granular for LLMs and lack AI-specific concepts. MCP's primitives (tools, resources, prompts) are purpose-built for richer AI application interactions.
- OpenAPI Specification: A standard format for describing RESTful APIs, allowing both humans and computers to understand the capabilities of a service without access to source code or documentation.
- David emphasizes MCP's deliberate design for statefulness, anticipating that AI interactions will become increasingly stateful, especially with multi-modal inputs (video, audio). This contrasts with the often stateless nature of REST APIs described by OpenAPI. He states, "...we do really believe that AI applications and AI-like interactions will become inherently more stateful..."
- Both speakers stress that MCP and OpenAPI are complementary, not adversarial. Bridges already exist in the community to translate between the two formats. The choice depends on the use case: MCP for rich AI application integration, OpenAPI for standard API specification and interaction where appropriate.
Building MCP Servers: Simplicity and AI Assistance
- David encourages developers to start simple: pick a language, use the SDK, build a basic tool for a personally relevant task, and connect it via standard I/O to an application. He emphasizes the ease of getting started (often under 30 minutes) and the "magic" of seeing the model interact with custom capabilities quickly: "Just start as simple as possible and just go build a server in like half an hour..." (a matching client sketch follows this list).
- Justin advocates using AI (like Claude) to write the MCP servers. By providing the relevant SDK code and a description of the desired functionality, LLMs can generate functional server code remarkably well, significantly accelerating development. He notes getting started often requires only 100-200 lines of code.
- Alessio mentions a hackathon project where an agent was built to automatically generate MCP servers from API specs, highlighting the potential for AI-driven server creation.
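For the client side of that half-hour loop, here is a hedged sketch of launching a local server over standard I/O with the Python SDK. It assumes the earlier demo server is saved as `server.py` and exposes the `add` tool.

```python
# Minimal stdio client sketch (Python SDK). Assumes the demo server
# from earlier is saved as server.py and exposes an "add" tool.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

params = StdioServerParameters(command="python", args=["server.py"])

async def main() -> None:
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()          # MCP handshake
            tools = await session.list_tools()  # discover what the server offers
            print("tools:", [t.name for t in tools.tools])
            result = await session.call_tool("add", {"a": 1, "b": 2})
            print("result:", result.content)

asyncio.run(main())
```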
The Future of Servers: API Wrappers and Novel Experiences
- Justin anticipates a mix of server types: many will be adapters or wrappers around existing APIs (potentially using different MCP primitives for richer interaction), but there's also significant opportunity for servers offering novel capabilities not tied to existing APIs.
- He cites examples like the "memory" server (allowing LLMs to retain information across sessions) and the "sequential thinking" server (enhancing reasoning) as early instances of servers providing intrinsic capabilities rather than just external API access. These often originated from internal hackathons.
- David agrees that while API wrappers are valuable, the ecosystem is still in the early stages of exploring genuinely new MCP-native experiences, such as a hypothetical server summarizing personalized content (e.g., subreddit threads). He notes a "chicken and egg problem" where richer server capabilities depend on broader client support for primitives like sampling.
Composability, Recursion, and the Agent Question
- David explains MCP's support for composability:
- Bidirectionality: Servers can make requests back to the client (e.g., asking the client's LLM to perform a sub-task like summarization), enabling model-independent server logic; this sampling pattern is sketched after this list.
- Server-as-Client: An MCP server can itself act as an MCP client, consuming other MCP servers. This allows for recursive structures and potentially complex Directed Acyclic Graphs (DAGs) of interacting servers.
- This recursive capability naturally leads to the question: Is a server-that-is-also-a-client an agent?
- Justin sees a clear relationship but notes the definition is fuzzy. MCP could be a way to represent agents, a communication layer for agents, or remain focused on application extension. He cautions against over-scoping the protocol ("God Box" problem).
- David believes agents can be built this way, distinguishing simple proxies from true agents which might involve more complex internal logic (e.g., using sampling loops or invoking tools internally via client requests).
- Strategic Implication: The potential for composable MCP servers creating complex, potentially autonomous workflows is highly relevant for researchers exploring decentralized agent architectures and investors looking at platforms enabling such systems.
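A sketch of the bidirectional pattern David describes: a server-side tool asks the client's model for a completion through the spec's sampling request (`sampling/createMessage`). This assumes the Python SDK's `Context` object exposes the session's `create_message` call; the `summarize` tool itself is invented.

```python
# Server -> client sampling sketch. Assumes the Python SDK's Context
# exposes create_message (the spec's sampling/createMessage request).
from mcp.server.fastmcp import Context, FastMCP
from mcp.types import SamplingMessage, TextContent

mcp = FastMCP("summarizer")

@mcp.tool()
async def summarize(text: str, ctx: Context) -> str:
    """Ask the client's model to summarize, keeping this server model-agnostic."""
    result = await ctx.session.create_message(
        messages=[
            SamplingMessage(
                role="user",
                content=TextContent(type="text", text=f"Summarize:\n\n{text}"),
            )
        ],
        max_tokens=300,
    )
    return result.content.text if result.content.type == "text" else ""
```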
Scalability, Tool Limits, and User Control
- Regarding how many tools/servers a model can handle (the "wide" vs. "deep" question), Justin suggests it depends on the model, the quality of tool descriptions, and the potential for confusion between them. Claude might handle hundreds of tools, and ideally the LLM itself would manage the complexity, but practical limits remain.
- Potential solutions for managing large numbers of tools include client-side filtering, using smaller LLMs for pre-filtering, or proxy MCP servers that manage subsets of tools (a toy filtering sketch follows this list).
- David notes overlap in descriptions (e.g., GitHub vs. GitLab servers) increases confusion risk.
- A core MCP principle, Justin emphasizes, is that the client application (and by extension, the user) retains ultimate control. Even "model-controlled" tools are invoked via the client. Clients can filter, transform, or enrich information from servers, allowing for differentiated user experiences and maintaining user agency.
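One toy version of that client-side filtering, with an invented keyword-overlap heuristic; a real client might instead ask a small model to pick the relevant subset.

```python
# Toy client-side pre-filter: before handing tools to the model, rank
# them by overlap with the user's current task and keep the top few.
def filter_tools(tools, task: str, limit: int = 20):
    words = set(task.lower().split())

    def score(tool) -> int:
        text = f"{tool.name} {tool.description or ''}".lower()
        return sum(w in text for w in words)

    return sorted(tools, key=score, reverse=True)[:limit]
```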
Ecosystem: Standardization, Trust, and Registries
- David pushes back against the idea of a single "canonical" MCP server for common services (like GitHub), favoring a "bazaar" approach where the best implementations emerge organically from the ecosystem. He notes company-provided servers (like a hypothetical Cloudflare one) will likely be de facto canonical for their specific products.
- Justin raises the critical issue of trust and vetting servers ("How do you determine which MCP servers are... good and safe ones to use?"). This is a classic supply chain security problem common to package registries (like npm, PyPI). Anthropic is exploring solutions (like reputation systems or vetting) but has no definitive answers yet.
- Alessio points out the brittleness of trust systems (a compromised trusted package does more damage) and suggests observing MCP traffic with tools like mcp-inspector as a security measure.
- Actionable Insight: For investors and researchers, the lack of robust, standardized trust mechanisms in the nascent MCP ecosystem presents both risk (supply chain attacks) and opportunity (solutions for secure discovery and vetting).
Future Roadmap: Transport, Auth, and Scopes
- Streamable HTTP Transport: Justin explains that the move to a new transport layer balances the desire for statefulness (seen as key for future AI) with operational simplicity. The original SSE-based persistent connection model was hard to scale. The new approach allows gradual adoption, from simple stateless HTTP POSTs to streaming and resumable sessions, offering flexibility (the stateless end is sketched after this list).
- SSE (Server-Sent Events): A standard allowing a server to push data to a client over a single, long-lived HTTP connection.
- Authorization: David confirms authorization (primarily OAuth 2.1-based user-to-server auth) is part of the next protocol revision draft, crucial for secure remote server interaction. This avoids embedding API keys directly. Local standard I/O transport presents challenges, potentially solvable by using HTTP locally (a point of internal debate).
- OAuth 2.1: An authorization framework enabling applications to obtain limited access to user accounts on an HTTP service.
- Scopes: Responding to Alessio's question about granular permissions (e.g., accessing only certain emails), David acknowledges the potential need but emphasizes the protocol design principle of adding features only when clear, unsolved use cases are demonstrated, to avoid over-design. Prototyping via extensibility is preferred before standardization. Justin adds that finer granularity might sometimes be better handled by having smaller, more focused servers rather than complex scope logic within one large server.
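The stateless end of that transport spectrum, sketched as a single JSON-RPC POST with `httpx`. The URL is hypothetical; the dual-type Accept header follows the streamable HTTP draft, and a real session would begin with an `initialize` exchange rather than jumping straight to `tools/list`.

```python
# Stateless end of the streamable HTTP spectrum: one JSON-RPC POST.
# URL is hypothetical; a real session starts with "initialize".
import httpx

resp = httpx.post(
    "https://example.com/mcp",
    json={"jsonrpc": "2.0", "id": 1, "method": "tools/list"},
    # The draft transport has clients accept plain JSON or an SSE stream:
    headers={"Accept": "application/json, text/event-stream"},
)
print(resp.json())
```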
Community, Governance, and Getting Involved
- David and Justin stress their commitment to MCP being an open, community-driven project, not just "Anthropic's protocol." They highlight existing contributions and commit access granted to external individuals and companies (Pydantic, Microsoft, JetBrains, Spring AI).
- Getting involved primarily happens via the GitHub specification repository discussions. However, they emphasize a meritocratic approach: contributions involving actual code, practical examples, and SDK work carry more weight than opinions alone, given the high volume of discussion. "If you have done work... you have a good chance that it gets in. If you're just there to have an opinion... you're very likely just being ignored..."
- Regarding formal governance (like moving to a foundation), they express caution. While openness is paramount, they worry that premature formalization and associated processes could slow down development in the fast-moving AI field, preferring a balance between broad participation and agile iteration for now. Alessio draws parallels to React working groups and GraphQL's move to a foundation.
Wishlist: What's Next for the Ecosystem?
- David's top wish: More clients implementing sampling (allowing servers to request model completions from the client) and servers leveraging it for richer, model-independent experiences (like his desired EVE Online or Reddit summarizer).
- Justin echoes the need for broader client support across all primitives. Personally, he'd love MCP integrations for game development engines like Godot, enabling AI-assisted game creation or playtesting ("Claude plays Pokemon").
Conclusion
MCP aims to standardize AI interactions beyond basic APIs, emphasizing richer primitives and composability crucial for next-generation applications. Investors and researchers should track its ecosystem growth, client adoption of advanced features (like resources and sampling), and evolving security/governance models to anticipate shifts in AI architecture and identify opportunities in interoperable AI systems.