a16z
May 2, 2025

What Is an AI Agent?

This a16z podcast episode dives into the ambiguity surrounding "AI agents," exploring definitions, current capabilities versus hype, architectural considerations, market dynamics, and the path toward integrating AI as a foundational technology.

Defining the Elusive "Agent"

  • "I kind of think agent is just a word for AI applications, right? Anything that uses AI kind of can be an agent."
  • "The cleanest definition I've seen of an agent is just something that does complex planning and something that interacts with outside systems. The problem with that definition is all LLMs now do both of those things..."
  • The term "AI agent" lacks a universally agreed-upon definition, spanning from simple LLM wrappers with chat interfaces to hypothetical, near-AGI systems capable of long-term persistence and independent learning.
  • Many view "agent" as an overloaded marketing term rather than a distinct technical category, often used interchangeably with "AI application."
  • Definitions focusing on capabilities like complex planning or external system interaction become blurry as core LLM functionalities increasingly encompass these traits.

Capabilities: Hype vs. Reality

  • "Another definition we heard from Anthropic recently was this idea that an agent is an LLM running in a loop with tool use."
  • "What was so interesting about the Karpathy talk is he basically... related it to autonomous vehicles and said AI agents are a real problem but it's... like a 10-year problem... most of what we're seeing... is like the weekend demo version of this problem."
  • A common technical view describes agents as LLMs operating in a loop, using tools, feeding outputs back as inputs, and making decisions, including when to stop; a minimal sketch of this pattern follows this list.
  • Despite the hype, current agent implementations are often closer to demos than robust, autonomous systems capable of complex, long-horizon tasks—a challenge potentially requiring a decade of work, akin to autonomous vehicles.
  • The line blurs even with advanced chatbots using web search and chain-of-thought, as their "agentic" behavior depends heavily on the complexity of the user's prompt.
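
To make the loop-with-tools definition concrete, here is a minimal sketch of the pattern. call_llm, the message format, and the web_search tool are illustrative stand-ins, not any particular vendor's API:

    def call_llm(history):
        """Stub standing in for a real chat-completion API call.
        Pretends the model first requests a search, then answers."""
        if not any(m["role"] == "tool" for m in history):
            return {"type": "tool", "tool_name": "web_search", "tool_input": "agent definitions"}
        return {"type": "final", "content": "An agent is an LLM in a loop with tools."}

    def web_search(query: str) -> str:
        """Hypothetical tool: return search results as text."""
        return f"results for {query!r}"

    TOOLS = {"web_search": web_search}

    def run_agent(task: str, max_steps: int = 10) -> str:
        history = [{"role": "user", "content": task}]
        for _ in range(max_steps):                # bounded loop, not open-ended autonomy
            reply = call_llm(history)             # model sees the full history each turn
            if reply["type"] == "final":          # the model decides when to stop
                return reply["content"]
            result = TOOLS[reply["tool_name"]](reply["tool_input"])  # run the requested tool
            history.append({"role": "tool", "content": result})      # feed output back in
        return "gave up after max_steps"

Everything "agentic" here is the loop plus the model's freedom to pick tools and to stop; swap the stubs for real APIs and this is roughly the weekend-demo version Karpathy describes.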

Architecture and Market Dynamics

  • "I actually feel it's like a multi-step LLM chain with a dynamic decision tree."
  • "Traditionally for infra a rule of thumb... is that if the service is used by a human it's a per-seat pricing and if it's a service that's used by other machines it's a usage-based pricing. And I actually don't know where to put agent here."
  • Architecturally, agents often resemble standard SaaS applications: lightweight logic coordinating external LLM calls, external state management (databases), and tool integrations. Handling LLM non-determinism in control flow remains a challenge.
  • A significant marketing angle positions agents as human replacements, justifying high, value-based pricing (e.g., comparing a $30k/year agent to a $50k/year employee). However, this narrative clashes with the reality that AI primarily augments human workers and slows new hiring rather than causing direct, large-scale replacement.
  • Pricing models are currently in flux, caught between value-based pitches and the reality that underlying costs are decreasing, potentially converging towards usage-based or per-seat models depending on whether the agent serves humans or machines. Clear ROI, as seen with coding tools, allows pricing to decouple from marginal costs.

Key Takeaways:

  • The discussion highlights the fuzziness of the "AI agent" label, urging a focus on concrete capabilities and underlying architecture (like LLMs in loops with tools) rather than potentially misleading terminology. While AI boosts productivity, the narrative of imminent human job replacement seems overblown; augmentation is the more common pattern.
  • Define by Function, Not Hype: The term "agent" is ambiguous; focus on specific functionalities like LLMs in loops, tool use, and planning capabilities rather than the label itself.
  • Augmentation Over Replacement: Current AI, including "agents," primarily enhances human productivity and potentially slows hiring growth, rather than directly replacing most human roles, which involve creativity and complex decision-making.
  • Towards "Normal Technology": The ultimate goal is for AI capabilities to become seamlessly integrated, like electricity or the internet, moving beyond the "agent" buzzword towards powerful, normalized tools.

For further insights and detailed discussions, watch the full podcast: Link

This episode tackles the ambiguity surrounding AI agents, dissecting the hype versus the technical reality and exploring the strategic implications for development, pricing, and market adoption.

What Is an AI Agent? Defining the Spectrum

  • The discussion kicks off by acknowledging the significant disagreement surrounding the term "AI agent," both technically and in marketing narratives. Guido notes a spectrum of definitions exists.
  • At the simplest end, an agent might just be a sophisticated prompt interacting with a knowledge base, or even just a trained Large Language Model (LLM) – a complex neural network trained on vast text data to understand and generate human-like language – presented through a chat interface; a minimal sketch of this simple end follows this list.
  • Conversely, some define a "true" agent as something nearing Artificial General Intelligence (AGI) – AI with human-like cognitive abilities across diverse tasks. This definition implies persistence, learning, independent problem-solving, and a knowledge base, capabilities Guido suggests are not yet functional. Matt adds a philosophical layer, questioning if true AGI is even achievable.
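
At the simple end of that spectrum, the whole "agent" fits in a few lines: a canned prompt over a small knowledge base, fronted by a model call. All names and data here are illustrative:

    KNOWLEDGE_BASE = {
        "refunds": "Refunds are processed within 5 business days.",
        "shipping": "Standard shipping takes 3-7 days.",
    }

    def llm(prompt: str) -> str:
        """Stub for any text-completion API call."""
        return f"(model reply based on: {prompt[:40]}...)"

    def answer(question: str) -> str:
        # Naive retrieval: include any entry whose key appears in the question.
        context = " ".join(v for k, v in KNOWLEDGE_BASE.items() if k in question.lower())
        prompt = f"Answer using only this context: {context}\nQuestion: {question}"
        return llm(prompt)

There is no planning and no tool use, yet products like this are routinely marketed as agents.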

Agents as AI Applications: Karpathy's View and Market Hype

  • Matt presents a contrarian view, suggesting "agent" is often just a new term for AI applications, particularly those involving complex planning and interaction with external systems (like APIs or the internet).
  • He references a talk by Andrej Karpathy comparing AI agents to autonomous vehicles – a significant, decade-long challenge. Matt argues, "most of what we're seeing in the market now is... the weekend demo version of this problem," contrasting deep research goals with current market hype often seen in marketing materials.
  • This highlights a key challenge for investors: distinguishing genuine long-term agent development from simpler AI applications rebranded for marketing appeal. The blurry line stems from modern LLMs increasingly incorporating planning and external data consumption.

Dissecting Agentic Behavior: Loops, Tools, and Decisions

  • The conversation explores elements contributing to "agentic" behavior. Guido mentions Anthropic's definition: an LLM running in a loop, using tools. This implies a dynamic process where the LLM uses its output to decide the next step, unlike a single static prompt.
  • Matt questions if this definition makes every advanced chatbot an agent, especially those using web search (a tool) and chain-of-thought reasoning. He argues that defining a system by its unstructured input is difficult, and that focusing on the "LLM in a loop with a tool" description is more productive.
  • Yoko emphasizes reasoning and decision-making as core elements. "I actually feel like it's like a multi-step LLM chain with a dynamic decision tree," she proposes, distinguishing agents from simple function calls like text translation. This points towards evaluating agents based on their autonomy in processing tasks; a sketch of such a chain follows this list.
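
A minimal sketch of that "multi-step LLM chain with a dynamic decision tree," under the assumption that an LLM call routes each step; classify and the handlers are hypothetical stand-ins:

    def classify(ticket: str) -> str:
        """Stub for an LLM call that picks a route; a real system would
        prompt a model to choose one of the labels."""
        return "refund" if "refund" in ticket.lower() else "other"

    def handle_refund(ticket: str) -> str:
        return "issued refund"

    def handle_other(ticket: str) -> str:
        return "escalated to a human"

    def process(ticket: str) -> str:
        route = classify(ticket)          # step 1: the model decides the branch
        if route == "refund":
            return handle_refund(ticket)  # step 2a: automated path
        return handle_other(ticket)       # step 2b: fallback keeps a human in the loop

Unlike a single fixed call such as "translate this text," the control flow itself depends on a model's judgment at each step.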

The Marketing and Pricing Angle: Replacing Humans?

  • Guido raises the significant marketing influence on the "agent" narrative, noting startups leverage the term to justify higher prices by framing the software as a direct human replacement.
  • This value-based pricing compares the agent's cost (e.g., $30k/year) to a human worker's salary (e.g., $50k/year). While this framing is initially compelling for buyers, Guido contrasts it with the economic principle that prices tend towards the marginal cost of production – the cost of producing one additional unit.
  • For AI, especially software-based agents using LLM Application Programming Interfaces (APIs) – protocols allowing different software components to communicate – the marginal cost is very low and decreasing, suggesting current high pricing models may not be sustainable long-term. Investors should scrutinize pricing models based on human replacement claims. A back-of-the-envelope illustration follows this list.
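
A back-of-the-envelope illustration of that margin gap. Only the $30k/$50k comparison comes from the episode; every other number is invented for the example:

    agent_price = 30_000               # value-based: anchored to a $50k salary
    tasks_per_year = 50_000
    tokens_per_task = 20_000
    dollars_per_million_tokens = 5.00  # illustrative LLM API rate

    marginal_cost = tasks_per_year * tokens_per_task / 1_000_000 * dollars_per_million_tokens
    print(f"annual API cost: ${marginal_cost:,.0f}")                  # $5,000
    print(f"gross margin:    {1 - marginal_cost / agent_price:.0%}")  # 83%

If competition pushes prices toward that marginal cost, the salary-anchored $30k does not hold.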

Human Replacement vs. Augmentation: The Reality Check

  • Matt strongly challenges the idea of AI agents leading to widespread, direct human job replacement, arguing most jobs involve fundamental creativity and decision-making that AI currently lacks.
  • Yoko acknowledges partial replacement, citing voice agents handling customer service tasks and noting slowed hiring in some areas, but not mass layoffs. The consensus leans towards augmentation: "In most cases... two humans will get replaced... by one human that's more productive with AI," Matt suggests; alternatively, companies expand output with the same staff.
  • The discussion implies that the "agent" framing taps into anxieties and hopes about human replacement, but the practical reality is more nuanced, focusing on productivity gains. Investors should assess AI's impact based on augmentation potential rather than pure replacement narratives.

Agents as Functions: A Technical Perspective

  • Yoko proposes viewing agents, particularly those handling back-end processes like interacting with a Customer Relationship Management (CRM) system, as complex, multi-step functions involving LLMs for decision-making.
  • Guido agrees that from an external perspective, an agent performing a task and a traditional API call can be indistinguishable if the internal mechanism isn't known.
  • However, Matt points out a key difference: AI models underlying these "functions" are inherently more shareable and reusable (via platforms like Hugging Face or fine-tuning) than traditional code functions, creating a different development and infrastructure dynamic. This suggests new opportunities in AI-specific development tools and infrastructure. An example of this reuse follows this list.
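
As a concrete illustration of that shareability, a pretrained model can be pulled from Hugging Face and reused as a "function" in two lines. This assumes the transformers library (and a backend such as PyTorch) is installed; the default model is whatever the library ships for the task:

    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")  # downloads a shared, reusable model
    print(classifier("Most agents are weekend demos"))
    # e.g. [{'label': 'NEGATIVE', 'score': ...}]

Traditional application code rarely transfers this cleanly between teams and problems.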

The "Mechanical Turk" Analogy and Human Input

  • The discussion touches on systems where humans act as components, referencing the historical Mechanical Turk (an automaton that secretly housed a human operator) and a modern supermarket example where humans initially performed tasks later advertised as AI-driven.
  • Yoko's supermarket example involved humans labeling data in real-time behind the scenes for a supposed computer vision system.
  • Matt uses this to reinforce his point: even seemingly simple tasks often require human judgment and creative problem-solving, which automation struggles to replicate fully. This highlights the persistent need for human oversight and intervention, even with advanced AI.

Pricing Dynamics Revisited: Value, Cost, and Verticalization

  • The conversation returns to pricing, with Yoko outlining traditional infrastructure pricing: per-seat for human-used services, usage-based for machine-to-machine services. AI agents blur this line; a rough comparison follows this list.
  • Matt argues most AI companies currently lack a clear understanding of the value they generate, leading to cost-plus pricing initially. As use cases solidify (like coding assistants), value-based pricing becomes more viable, decoupling price from underlying LLM costs.
  • Yoko uses Pokémon Go storage pricing – vastly exceeding actual data storage costs – as an analogy. The value isn't the storage (commodity) but the application-layer capability within a closed ecosystem (monopoly, brand, user experience). This suggests successful AI agent pricing may depend heavily on building indispensable solutions within specific verticals, creating defensible value beyond the raw AI capabilities.
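
The per-seat versus usage-based tension reduces to simple arithmetic; the numbers below are invented for illustration:

    per_seat_monthly = 40.00   # human-facing SaaS: price per seat
    price_per_task = 0.05      # machine-facing: usage-based price

    crossover = per_seat_monthly / price_per_task
    print(f"crossover: {crossover:.0f} tasks per seat per month")  # 800

An agent may be bought like a seat but consumed like a machine service, which is exactly why Yoko doesn't know where to put agents on this axis.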

System Architecture: SaaS Parallels and Challenges

  • Guido proposes that architecturally, building an AI agent system closely resembles building typical SaaS applications today.
  • Key components are often externalized: LLMs run on specialized Graphics Processing Unit (GPU) infrastructure (processors optimized for parallel computation essential for AI training/inference), and state management relies on external databases.
  • The core agent logic (retrieving context, assembling prompts, invoking tools) is relatively lightweight, allowing many agents to run on standard servers. Matt agrees but highlights a key unsolved challenge: handling the non-determinism (unpredictability) of LLM outputs within the program's control flow, which might drive future architectural shifts. One common mitigation is sketched below.
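
One common mitigation for that non-determinism is to demand structured output, validate it, and retry on failure. A sketch, not a complete solution; call_llm is again a stand-in stub:

    import json

    def call_llm(prompt: str) -> str:
        """Stub: a real call might return malformed JSON some fraction of the time."""
        return '{"action": "lookup", "argument": "order 123"}'

    def structured_step(task: str, retries: int = 3) -> dict:
        prompt = task + '\nReply with JSON: {"action": ..., "argument": ...}'
        for _ in range(retries):
            raw = call_llm(prompt)
            try:
                parsed = json.loads(raw)      # reject non-JSON replies
                if "action" in parsed:        # minimal schema check
                    return parsed
            except json.JSONDecodeError:
                continue                      # malformed output: retry
        raise RuntimeError("model never produced valid output")  # surface failure to the caller

Validation plus retries contains the unpredictability at each step, but it does not eliminate it, which is why Matt expects architectural change here.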

Specialization, Data Moats, and Access Hurdles

  • Yoko predicts specialists building on or fine-tuning foundational models will be winners, citing her experience with image generation models excelling in specific styles (like manga) but needing human artists to push beyond dominant aesthetics ("out-of-distribution art").
  • Guido identifies data access as a major hurdle. Agents are often blocked by technical integration difficulties or deliberate "data silos" and "walled gardens" (like accessing iPhone photos via API). Consumer companies may resist automated access to maintain user engagement and advertising opportunities.
  • The potential for browser-native agents to overcome some barriers is discussed, but countered by the likelihood of websites deploying sophisticated anti-agent CAPTCHAs. This creates an ongoing cat-and-mouse game impacting agent feasibility. Investors should track developments in data access protocols and agent-website interactions.

Future Outlook: Obstacles and Opportunities

  • The speakers identify critical missing pieces for widespread agent adoption: robust security, authentication, access control for agents acting on a user's behalf, and clear data retention policies.
  • Guido presents the bull case: agents seamlessly accessing all of a user's fragmented data (across drives, apps, etc.) to perform complex tasks, dramatically boosting productivity.
  • Yoko bets on multimodality – AI models trained on diverse data types like images, clicks, and actions beyond text – to unlock new agent capabilities, especially for visual or interaction-heavy tasks like web navigation or device control.
  • Matt hopes the term "agent" eventually fades as AI becomes "normal technology," integrated seamlessly like electricity or the internet, focusing discourse on powerful, practical applications rather than the loaded term itself.

Conclusion

The discussion reveals "AI agent" as a currently ambiguous term, blurring lines between advanced AI applications and hyped marketing. Investors and researchers should focus on concrete capabilities—LLMs in loops with tools, specialized fine-tuning, architectural patterns, and navigating data access challenges—rather than the label itself. Progress hinges on solving security, data integration, and potentially multimodality challenges.
