Trillion Agents
August 15, 2025

State of AI in the Enterprise

Fresh from the AI4 conference in Las Vegas, this episode unpacks the stark divide between the buttoned-down world of enterprise AI and the Wild West of open-source development, exploring where the real innovation—and opportunity—lies.

Enterprise AI's Monopolistic Play

  • "It's enterprise-focused fully... a lot of it is a SAS business provider selling to large enterprise and vendor lock-in type business models."
  • "It does seem like enterprise may struggle to roll out agents themselves, so they need a consultant to come in... build an agent... and help us fire a bunch of people."
  • The enterprise AI scene is dominated by traditional IT consultancies and SaaS providers who have simply rebranded as "AI agent startups." Their model isn't open innovation; it's about securing high-value contracts, promising to reduce headcount, and achieving vendor lock-in.
  • Many of these firms are pitching "large activity models"—essentially bespoke agentic services layered over a company's private data to create digital twins of human workflows. This approach often starts with a human consultant manually analyzing a process, with a vague promise to "automate this more over time."

The Coding Agent Blind Spot

  • "I kept asking everyone, 'How well do coding agents use your library?'... the majority basically didn't know."
  • "All existing code generation benchmarks focus on self-contained use cases... very few focus on library-specific generation."
  • Dev tool companies are largely flying blind, with almost no data on how effectively AI coding agents use their libraries. This is a massive gap, as agents are quickly becoming the primary entry point for developers.
  • Current benchmarks are broken. They test agents on self-contained problems, not real-world scenarios that involve importing and utilizing existing library stacks. A new experiment, Stackbench, aims to fix this by testing library-specific implementation.
  • Agentic search (used by Claude Code) is slower but better at avoiding deprecated functions, while semantic search is faster but less reliable. Optimizing docs and codebases to be more "AI-readable" is an emerging competitive advantage; a minimal sketch of a deprecated-call check follows this list.
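A minimal sketch of what such a library-specific check could look like, assuming a hypothetical library `acmelib` and a hand-written deprecation list (neither comes from the episode); a real benchmark would derive the list from the library's changelog or deprecation warnings:

```python
import ast

# Hypothetical deprecation list for an imaginary library "acmelib".
DEPRECATED = {"acmelib.connect_v1", "acmelib.fetch_all"}

def _dotted_name(node: ast.expr) -> str:
    # Rebuild "module.attr" chains; anything else comes back empty.
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
    return ".".join(reversed(parts))

def deprecated_calls(source: str) -> list[str]:
    """Return the deprecated functions an agent-generated snippet still calls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and _dotted_name(node.func) in DEPRECATED:
            hits.append(_dotted_name(node.func))
    return hits

# Example: flag agent output that still uses the old entry point.
agent_output = "import acmelib\nclient = acmelib.connect_v1('key')\n"
print(deprecated_calls(agent_output))  # ['acmelib.connect_v1']
```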

The Human-AI Relationship

  • "Sam Altman did a tweet about it... he basically said we kind of underestimated how much people would be attached... Now people are starting to talk about GPT psychosis."
  • Users are forming real emotional attachments to specific LLM versions, leading to a phenomenon dubbed "GPT psychosis" when a favored model is deprecated. This highlights the deepening human-AI bond and its implications for product development.
  • Tool discovery remains a fundamental weakness for agents. An agent’s inability to find and use the right tool for a task (e.g., solving a Rubik's cube using an online tool) is a major bottleneck on the path to AGI.

Key Takeaways:

  • The podcast reveals a clear divergence in the AI landscape. While enterprises pursue closed, ROI-driven automation, a significant opportunity lies in the developer-first ecosystem.
  • Beware of "AI" Consultants: Many enterprise-focused "agent startups" are just traditional IT consultancies in disguise, selling high-cost, human-led services with a thin veneer of AI.
  • Benchmark What Matters: The real value in coding agents isn’t just solving abstract problems; it’s how well they integrate with existing libraries. Companies that measure and optimize for this will win the next wave of developer adoption.
  • Tooling is the Final Frontier: The key hurdle to superintelligence isn't just model capability; it's an agent's ability to discover and skillfully use an infinite library of external tools to solve problems.


This episode reveals a critical disconnect between the closed, vendor-driven world of enterprise AI and the emerging open, multi-agent economy, highlighting immediate threats and opportunities for the decentralized ecosystem.

Impressions from the AI4 Enterprise Conference

  • Robinson reports from the AI4 conference in Las Vegas, describing it as a massive, well-attended event squarely focused on the enterprise market.
  • He notes that while the conference features numerous tracks, much of the content is high-level and introductory, such as overviews of Google AI or basic agentic workflows.
  • The dominant audience consists of corporate IT professionals, and the prevailing business model is traditional Software-as-a-Service (SaaS) aimed at large enterprises, with little to no discussion of open networking or decentralized systems.

The Enterprise AI Model: Vendor Lock-In vs. Open Networks

  • The conversation pivots to the business models being promoted at the conference, which heavily favor vendor lock-in and closed ecosystems.
  • Robinson observes that enterprise clients are being sold on monopolizing their client relationships rather than participating in a multi-agent system—an ecosystem where multiple independent AI agents from different providers can interact and transact.
  • This approach is framed as a potential threat to the open economy, as large enterprises may opt for single-provider, consultant-led rollouts of AI, creating walled gardens that limit opportunities for more open, decentralized platforms.
  • Robinson expresses concern that this model could stifle innovation: "That may create less opportunity for businesses that are focused on higher volumes of users, more open markets."

Monopolization vs. The Rise of Bespoke Solutions

  • The speakers analyze the historical success of the SaaS model, which thrived because building bespoke software like ERPs or CRMs was prohibitively expensive for most companies.
  • However, they argue that the rise of powerful AI code generation tools fundamentally changes this dynamic.
  • Now, companies have the capability to build their own custom solutions in-house, posing a direct threat to the traditional monopolistic SaaS model.
  • Despite this, the conference was dominated by partners offering development frameworks and consultancy services—the classic IT services approach of selling a framework and then charging for its implementation.

Consultancies Rebranding as AI Startups

  • A key insight for investors is the trend of traditional consultancies rebranding themselves as "vertical agent startups" to attract higher valuations.
  • One speaker recounts an experience with a "marketing agent" company that, upon inquiry, was revealed to be a standard consultancy assigning a human to analyze marketing processes, with the promise of future automation.
  • This highlights a critical need for due diligence to distinguish genuine AI product companies from service-based businesses masquerading as scalable tech startups.
  • The speaker notes the bait-and-switch: "I was like, this is just consultancy. Like where's the agent here? But I think they said something like the plan is to automate this more over time."

The "Godfathers of AI" on the Future

  • The discussion briefly touches on Geoffrey Hinton's recent talk, where he expressed a cynical outlook, criticized Anthropic for taking Middle East funding, and commented on figures like Elon Musk and Mark Zuckerberg.
  • The speakers observe a trend among AI pioneers like Hinton and Yoshua Bengio leaning towards socialist viewpoints, contrasting them with Yann LeCun's more pro-AI stance and Rich Sutton's unwavering focus on scaling compute via reinforcement learning.
  • This provides context on the ideological divides shaping the AI landscape, suggesting investors should separate academic research contributions from economic or political commentary.

A New Frontier: Benchmarking Coding Agents with Stackbench

  • The conversation shifts to a new product experiment called Stackbench, designed to address a major gap in the market: understanding how well AI coding agents use third-party developer libraries.
  • The speaker explains that most companies don't know if agents use their libraries correctly, often citing issues like agents using outdated versions or deprecated functions.
  • Stackbench automates the process of testing an agent's ability to implement use cases from a library's documentation, providing a benchmark for performance (a rough sketch of such a harness appears after this list).
  • Agentic Search: This is a method where an AI agent actively uses tools to navigate and read documentation to find information, as opposed to semantic search, which relies on finding text with similar meaning.
  • The speaker notes that Claude Code uses agentic search, which reduces the use of deprecated functions but is significantly slower.
  • The project aims to compare different agents and search methods to help developer tool companies improve their libraries for AI-driven interactions, a critical factor as agents become the primary entry point for developers.
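As a rough illustration of the idea (not the actual Stackbench implementation, whose internals aren't covered in the episode), a harness can pair each documented use case with a test, ask the agent under test for an implementation, and run the test in a scratch directory. The use case, test, and canned agent output below are all made up, and running it requires pytest; a real harness would call Claude Code or another agent and pull use cases from real documentation:

```python
import subprocess
import tempfile
from pathlib import Path

# Hypothetical use case: a task taken from a library's docs plus a pytest
# file that checks the agent's implementation.
USE_CASES = [
    {
        "prompt": "Read a CSV file and return the mean of its 'price' column.",
        "test": (
            "from solution import mean_price\n"
            "def test_mean_price(tmp_path):\n"
            "    csv = tmp_path / 'prices.csv'\n"
            "    csv.write_text('price\\n1\\n3\\n')\n"
            "    assert mean_price(csv) == 2\n"
        ),
    },
]

def run_agent(prompt: str) -> str:
    """Stand-in for the coding agent under test; a real harness would invoke
    Claude Code or another agent and return the source it generates."""
    return (
        "import csv\n"
        "def mean_price(path):\n"
        "    rows = list(csv.DictReader(open(path)))\n"
        "    return sum(float(r['price']) for r in rows) / len(rows)\n"
    )

def score(use_case: dict) -> bool:
    """Ask the agent to implement one documented use case, then run its test."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "solution.py").write_text(run_agent(use_case["prompt"]))
        Path(tmp, "test_case.py").write_text(use_case["test"])
        result = subprocess.run(["pytest", "test_case.py"], cwd=tmp, capture_output=True)
    return result.returncode == 0

if __name__ == "__main__":
    passed = sum(score(case) for case in USE_CASES)
    print(f"{passed}/{len(USE_CASES)} documented use cases implemented correctly")
```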

The Critical Flaw in Today's Agents: Tool Selection

  • The speakers identify a fundamental limitation in current AI agents: poor tool selection.
  • Using the ChatGPT agent as an example, they describe it as slow and rudimentary.
  • The agent struggles to discover and select the appropriate tool from the vast number available online to solve a specific problem.
  • This capability gap, moving from solving self-contained problems to effectively leveraging an infinite ecosystem of external tools, is presented as a key milestone on the path to Artificial Superintelligence (ASI); a toy sketch of tool selection against a registry follows this list.
  • One speaker highlights this limitation with a clear example: "It can solve like advanced physics problems but it can't solve a Rubik's cube. Even though there are Rubik's cube solving tools on the internet, it just doesn't know where to like discover these tools."
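A toy illustration of what tool discovery involves (purely hypothetical, not how the ChatGPT agent works): even with a tiny, hand-curated registry and a crude keyword-overlap selector, the agent's problem is deciding which tool matches the task. Scaling that decision to the open internet's ever-changing catalogue of tools is the gap the speakers describe.

```python
# Toy tool registry: name -> natural-language description.
TOOLS = {
    "rubiks_solver": "Returns a move sequence that solves a given Rubik's cube state.",
    "symbolic_math": "Solves equations and simplifies symbolic math expressions.",
    "unit_converter": "Converts quantities between physical units.",
}

def select_tool(task: str) -> str:
    """Pick the tool whose description shares the most words with the task."""
    task_words = set(task.lower().split())

    def overlap(item: tuple[str, str]) -> int:
        _, description = item
        return len(task_words & set(description.lower().replace(".", "").split()))

    name, _ = max(TOOLS.items(), key=overlap)
    return name

print(select_tool("solve this scrambled rubik's cube"))  # -> rubiks_solver
```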

"GPT Psychosis": The Human Attachment to AI Models

  • The discussion explores the phenomenon of user attachment to specific AI models, dubbed "GPT psychosis."
  • This was highlighted by the user backlash when OpenAI deprecated older versions of GPT-4.
  • Sam Altman acknowledged that the company underestimated the personal connection users form with AI assistants, which have distinct "personalities."
  • This trend signals the growing importance of the human-AI relationship, a factor investors should monitor as it will likely shape future product design and user retention strategies.

The Future of Communication: Brain-Computer Interfaces

  • The episode concludes with a speculative look at the future of user interfaces.
  • While browser-based interaction will persist, the speakers predict a shift towards more direct interfaces, culminating in brain-computer interfaces (BCIs).
  • They posit that just as handwriting has been replaced by typing, verbal communication may eventually be superseded by direct thought transfer, fundamentally altering how humans interact with each other and with AI.

Conclusion

This episode contrasts the closed, consultant-driven enterprise AI world with the needs of an open, decentralized ecosystem. For investors and researchers, the key takeaway is the urgent need to develop benchmarks and tools that validate genuine AI capabilities, especially as traditional firms rebrand to capture AI valuations without delivering true innovation.
