a16z
October 23, 2025

Marc Andreessen & Amjad Masad on “Good Enough” AI, AGI, and the End of Coding

In a wide-ranging conversation, a16z cofounder Marc Andreessen and Replit founder Amjad Masad explore the seismic shifts in software development driven by AI. They dissect the rise of autonomous AI agents, debate whether we're on a true path to AGI, and question if "good enough" AI is trapping us in a local maximum.

The End of Coding as We Know It

  • “Instead of typing syntax, you're actually typing thoughts, which is what we ultimately want. And the machine writes the code.”
  • The 75-year-old dream of programming in plain English, first envisioned by computer scientist Grace Hopper, is finally here. AI is abstracting away the “accidental complexity” of coding—the syntax, the setup, the package managers—allowing anyone to build software by simply describing their idea.
  • From Idea to App in Minutes: On platforms like Replit, a user can type “I want to sell crepes online,” and an AI agent handles the rest. It selects the optimal tech stack, provisions the database, integrates payment systems like Stripe, and deploys the application, turning a paragraph of English into a functional business.
  • History Repeats Itself: The resistance from some programmers to this new wave echoes historical patterns. Assembly programmers once scoffed at higher-level languages, and "vanilla JS" purists scorned modern frameworks. Each layer of abstraction democratizes creation, and this is the ultimate layer.

The Rise of the Autonomous AI Agent

  • “Agent 2... ran for 20 minutes. Agent 3, 200 minutes... It's like setting up a marathon or a relay race. As long as each step is done properly, you could do an infinite number of steps.”
  • The real breakthrough isn’t just language models, but autonomous agents that can execute complex, long-horizon tasks. Their ability to maintain coherence has grown exponentially, from handling 2-minute tasks to running for over 200 minutes without going off the rails.
  • The Verifier-in-the-Loop: The key innovation is a verification loop. One agent writes code for 20 minutes, then another agent spins up a browser to test it. If it finds a bug, that bug report becomes the prompt for a new agent to start a corrected trajectory. This "relay race" model allows for nearly endless, self-correcting work.
  • John Carmack on Stimulants: Watching these agents work is like observing the world's best programmer on a stimulant. It’s incredibly fast but also includes pauses for reflection, web searches, and reasoning, mimicking a hyper-productive human developer.

The AGI Paradox: Magic Meets a Local Maximum

  • “What if we're in a local maximum trap where, because it's good enough for so much economically productive work, it relieves the pressure in the system to create the generalized answer?”
  • AI is advancing at a blistering pace in concrete domains with verifiable "true or false" answers, like coding, math, and bioscience. However, it's struggling in "softer," more nuanced fields like law and healthcare, where correctness is subjective. This split points to a fascinating paradox.
  • The "Good Enough" Trap: We may be optimizing for economically valuable, narrow AI at the expense of true Artificial General Intelligence (AGI). The immense utility of current models creates a "local maximum" that disincentivizes the harder, foundational research needed for a generalized breakthrough.
  • Have You Met People?: AI is often criticized for its lack of "transfer learning"—the ability to apply knowledge from one domain to another. Yet, this is a deeply human flaw. As Andreessen notes, brilliant physicists often have shockingly naive political views. Our standard for AGI may be an idealized fantasy that no human can meet.

Key Takeaways:

  • AI is simultaneously the most amazing technology ever and a source of deep disappointment for those chasing AGI. It’s automating complex cognitive labor at an astonishing rate, but this very success in narrow domains may be steering us away from the ultimate prize.
  • English is the New Programming Language. The era of wrestling with boilerplate code is ending. AI agents are empowering anyone to build software with natural language, transforming creative ideas directly into functional products.
  • Autonomous Agents Are Already Here. Coherent, multi-hour task execution is now possible thanks to verification loops and multi-agent systems. Expect agents to soon handle entire development workflows with minimal human oversight.
  • We’re Trapped in a "Good Enough" AGI Loop. The explosion of value in verifiable domains like coding is creating a powerful economic incentive to perfect narrow AI. This risks trapping us in a local maximum, delaying the quest for true, generalized intelligence.

For further insights and detailed discussions, watch the full podcast: Link

This episode reveals the critical tension between economically powerful "good enough" AI, which is rapidly mastering verifiable domains like coding, and the uncertain, long-term pursuit of Artificial General Intelligence (AGI).

The Modern Coding Experience: From Idea to Application in Minutes

  • Amjad Masad, CEO of Replit, outlines the platform's AI-driven experience, which aims to eliminate the "accidental complexity" of software development. For a novice or experienced programmer, the process begins not with code, but with a simple English prompt describing an idea, such as "I want to sell crepes online."
  • The Replit AI agent interprets the natural language request, classifies the project type, and selects the optimal technology stack (e.g., Python for a data app, JavaScript for a web app).
  • Users can interact entirely in their native language; Amjad notes that the AI performs well in most mainstream languages, such as Japanese, not just English.
  • This fulfills a long-held vision in computing. As Amjad explains, "I read this quote from Grace Hopper... 'I want to get to a world where people are programming in English.'... I think we're at a moment where it's the next step. Instead of typing syntax, you're actually typing thoughts."

Historical Resistance to Abstraction

  • Marc Andreessen provides historical context, noting that resistance to higher-level abstractions is a recurring theme in programming. He recalls how early programmers writing raw machine code (zeros and ones) looked down on those using assembly language, itself a very low-level language that an assembler translates directly into machine code.
  • This pattern repeated with each new layer of abstraction, from assembly to higher-level languages like BASIC and C.
  • Amjad shares his own experience as part of the "JavaScript revolution" at Facebook, where they faced criticism for building tools like ReactJS instead of using "vanilla JavaScript." He observes that the same programmers who built careers on that wave are now often critical of the new AI-driven approach.

How AI Agents Build Software

  • Once a user provides a prompt, the Replit agent takes over as the primary programmer. It presents a plan of action, detailing the steps it will take, such as setting up a database, integrating payment systems like Stripe, and building the application.
  • The agent then executes this plan autonomously, a process that can take 20-40 minutes.
  • A key innovation is the agent's ability to test its own work. It spins up a browser, interacts with the application to find bugs, and iterates on the code to fix them.
  • Once complete, the user can publish the application to the cloud with a few clicks, a process that previously required extensive manual setup of servers, databases, and deployment pipelines on platforms like AWS. (A rough sketch of this end-to-end flow appears below.)
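
To make the flow concrete, here is a minimal, self-contained sketch of the prompt-to-deployment pipeline as described in the conversation. The `ask_llm` stub, the step names, the `App` class, and the placeholder URL are assumptions for illustration only, not Replit's actual implementation or APIs.

```python
# Illustrative sketch of the prompt-to-deployment flow described above.
# The model call is stubbed out; every name here is hypothetical.

from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class App:
    prompt: str
    stack: str = ""
    steps: list[str] = field(default_factory=list)
    deployed_url: str | None = None


def ask_llm(question: str) -> str:
    """Stand-in for a real model call."""
    return f"<model answer to: {question}>"


def build_app(prompt: str) -> App:
    app = App(prompt=prompt)

    # 1. Classify the idea and pick a stack (e.g. Python for a data app, JS for a web app).
    app.stack = ask_llm(f"Pick a tech stack for: {prompt}")

    # 2. Draft a plan of action: database, payments, UI, deployment.
    #    (Hard-coded here; the real agent proposes these steps itself.)
    app.steps = ["provision database", "integrate payments (e.g. Stripe)", "build storefront UI"]

    # 3. Execute each step autonomously (the 20-40 minute run), testing along the way.
    for step in app.steps:
        ask_llm(f"Implement '{step}' using stack {app.stack}")

    # 4. Publish with a few clicks instead of hand-configuring servers and pipelines.
    app.deployed_url = "https://example.invalid/crepes"  # placeholder
    return app


if __name__ == "__main__":
    print(build_app("I want to sell crepes online"))
```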

The Evolution of AI Agents and Long-Horizon Reasoning

  • The conversation shifts to the core technical challenge for AI agents: maintaining coherence over long, complex tasks. Early agents would "spin out" or get confused after only a few minutes.
  • Long-Horizon Reasoning: This refers to an AI's ability to follow a complex, multi-step logical process over an extended period without losing track of its goal.
  • Amjad states that a key breakthrough has been extending this capability. While agents could only maintain coherence for a few minutes in 2023, Replit's Agent 2 could run for 20 minutes, and the current Agent 3 can run for over 200 minutes.
  • This improvement is driven by both more powerful foundation models and clever engineering, such as compressing the agent's “memory” or context window to maintain focus (see the sketch below).
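
One common way to implement that kind of memory compression is to keep recent turns verbatim and fold older ones into a summary once a token budget is exceeded. The sketch below illustrates the idea; the token heuristic, budget split, and `summarize` callable are assumptions, not a description of Replit's internals.

```python
# Illustrative context-window compression for a long-running agent: when the
# transcript exceeds a token budget, older turns are collapsed into a summary
# so the agent keeps its goal and recent work in focus.

def rough_token_count(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token.
    return len(text) // 4


def compress_context(messages: list[str], budget: int, summarize) -> list[str]:
    """Keep recent turns verbatim; collapse older ones into a running summary."""
    if sum(rough_token_count(m) for m in messages) <= budget:
        return messages

    # Keep the most recent turns that fit in half the budget...
    kept, used = [], 0
    for msg in reversed(messages):
        cost = rough_token_count(msg)
        if used + cost > budget // 2:
            break
        kept.append(msg)
        used += cost
    kept.reverse()

    # ...and summarize everything older than that.
    older = messages[: len(messages) - len(kept)]
    summary = summarize("\n".join(older))
    return [f"[summary of earlier work] {summary}"] + kept
```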

The Breakthrough: Reinforcement Learning and Verification

  • Amjad attributes the leap in reasoning capabilities to Reinforcement Learning (RL), a training technique where an AI model is rewarded for successful outcomes.
  • In the context of coding, an LLM is placed in a programming environment and tasked with solving a bug. It generates many possible solutions ("trajectories"), and the one that successfully passes a test receives a reward, reinforcing that reasoning path.
  • Marc Andreessen clarifies that for RL to be effective, the problem must have a "defined and verifiable answer." This is why AI is progressing fastest in domains with concrete, testable outcomes.
  • The Verification Loop: Amjad highlights a critical innovation: using a multi-agent system where one agent writes code for 20 minutes, and another agent acts as a verifier, testing the work. If a bug is found, it becomes the prompt for a new agent to continue the task. This "relay race" approach allows agents to work for hours without losing coherence (sketched below).
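
A minimal sketch of that relay, assuming simple `builder` and `verifier` callables; the leg length, the cap on legs, and the loop structure are illustrative assumptions, not Replit's actual architecture.

```python
# Illustrative "relay race" loop: a builder agent works for a bounded leg, a
# verifier agent tests the result, and any bug report seeds the next leg's prompt.

import time


def relay_build(task: str, builder, verifier,
                leg_seconds: int = 20 * 60, max_legs: int = 10) -> str:
    """Alternate builder/verifier legs until the verifier finds no bugs."""
    prompt, workspace = task, ""

    for _ in range(max_legs):
        deadline = time.time() + leg_seconds

        # One agent writes code until its leg of the relay ends.
        workspace = builder(prompt, workspace, deadline)

        # A second agent drives the app (e.g. in a browser) hunting for bugs.
        bug_report = verifier(workspace)
        if not bug_report:
            return workspace  # clean verification: the relay is done

        # The bug report becomes the prompt for a fresh trajectory.
        prompt = f"Fix the following issues:\n{bug_report}"

    return workspace
```

The binary outcome the verifier produces (bugs found or not) is exactly the kind of concrete, checkable signal that makes reinforcement learning effective in coding: trajectories that end in a clean pass are the ones that get rewarded.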

AI's Progress in Verifiable vs. "Soft" Domains

  • The discussion emphasizes that AI's rapid advancement is concentrated in "hard" domains where correctness can be objectively measured.
  • Verifiable Domains: These include mathematics, physics, chemistry, and coding. In coding, the SWE-bench benchmark, which tests an AI's ability to solve real-world software engineering tasks from GitHub, has seen performance jump from ~5% to over 82% in the last year.
  • "Soft" Domains: Progress is slower in areas like law, healthcare, and creative writing, where answers are more subjective and correctness is harder to verify algorithmically. Amjad notes, "The more concrete the problem... that is the key variable, not the difficulty of the problem."
  • Strategic Implication: For investors and researchers, this indicates that the most immediate and predictable returns from AI will come from applications in these verifiable, "hard science" domains.

The Paradox: Immense Progress, Lingering Disappointment

  • Marc Andreessen captures a central tension in the AI field: "This is the most amazing technology ever... and yet we're still like really disappointed... like it's not moving fast enough." This paradox stems from the enormous expectations placed on AI, particularly the goal of achieving AGI.
  • The conversation touches on the "bitter lesson," an essay by AI researcher Richard Sutton arguing that scalable methods leveraging computation (like RL) will ultimately outperform those relying on human-engineered knowledge. However, recent interviews suggest even Sutton has doubts about whether current methods are on the right path.
  • A major concern is the "fossil fuel argument," articulated by figures like Ilya Sutskever, that models are running out of high-quality human-generated training data from the internet.

The AGI Debate: Are We Trapped in a "Good Enough" Local Maximum?

  • The discussion questions the very definition of AGI and whether it's a realistic near-term goal. Marc points out that transfer learning—the ability to apply knowledge from one domain to another—is rare even in humans, suggesting the bar for AGI may be set unrealistically high.
  • Amjad proposes the concept of "functional AGI," where models are trained on data from every economically useful activity, automating vast sectors of the economy without achieving true, generalized intelligence.
  • This leads to the "worse is better" trap: the current generation of AI is so economically valuable that it creates a local maximum. The immense investment flowing into optimizing today's "good enough" models may divert resources and attention from the fundamental research needed for a true AGI breakthrough.
  • Amjad expresses a bearish view on a near-term AGI breakthrough, stating, "Because what we built is so useful and economically valuable... good enough is the enemy."

Conclusion: The Two-Track Future of AI

  • The episode highlights a dual reality for AI: rapid, economically transformative progress in verifiable domains like coding, contrasted with a more uncertain path toward generalized intelligence. Investors and researchers must navigate this landscape by focusing on near-term applications in "hard" sciences while closely monitoring the fundamental, albeit slower, research into generalized learning and reasoning.
