Epoch AI
November 26, 2025

Epoch AI webinar: Inside the Frontier Data Centers Hub

Epoch AI unveils its Frontier Data Centers Hub, an open-source project tracking the colossal infrastructure powering the AI revolution. Led by Yafa Edelman and Ben Cottier, the team provides a transparent look at the methodology and insights behind mapping these multi-billion dollar megaprojects.

Mapping the AI Megastructures

  • "Our frontier's data hub is tracking these frontier data centers... the very, very largest data centers that are being constructed right now. Many of them cost $10 billion or more."
  • "Understand tracking these data centers lets us understand the distribution of compute among companies and nations as well as the trajectory of AI."
  • Epoch’s free, open-source hub tracks "frontier" data centers to monitor global AI investment and compute distribution. The scale is immense, with the largest projects rivaling the Manhattan Project in cost.
  • Meta’s Hyperion campus in Louisiana will cover an area one-fifth the size of Manhattan. By 2027, Microsoft's Fairwater facility is projected to reach 3 gigawatts, equivalent to the power of 5 million H100 GPUs.
  • The project is expanding globally to China, the Middle East, and Europe to provide a comprehensive view of AI geopolitics beyond the US.

Satellite Spies & Permit Sleuths

  • "We developed a model where we can take some visual feature like the diameter of these fans and actually predict quite accurately how much cooling capacity... is being taken out of the building."
  • "We use a ChatGPT agent, actually, this is quite helpful as a tool to search quite deeply for permit databases and find permits which mention the address of the site."
  • Epoch’s methodology combines satellite imagery, public permit filings, and local news to identify and analyze data centers. A ChatGPT agent assists in scouring permit databases for key details like site addresses.
  • The team estimates a facility's power capacity by analyzing its cooling equipment. Their model, which measures the diameter of cooling tower fans from satellite images, proved accurate to within 2% of stated capacity in at least one verified case.

The Gigawatt Power Play

  • "The worry that people have about data centers not being able to get enough power is really not going to pan out because data centers can absolutely afford to pay more... GPUs are still going to be a much larger cost than power."
  • Grid capacity is not the hard bottleneck many assume. AI companies view power as a secondary expense compared to GPUs and will pay significant premiums to get electricity faster, effectively solving supply issues with cash.
  • Companies use creative "bridge" solutions, like renting portable gas turbines (as xAI did for Colossus), to power up facilities while waiting for permanent grid connections. Some even explore repurposing jet engines for power.

Key Takeaways:

  • Epoch AI’s hub provides a crucial, transparent window into the physical arms race of the AI era, revealing a landscape of unprecedented scale and speed.
  • AI's Physical Footprint is Astronomical: Individual AI data centers are now multi-billion dollar megaprojects, with construction timelines accelerating to as little as one year for a gigawatt-scale facility.
  • Power is a Solvable Problem, Not a Hard Cap: AI firms will pay whatever it takes to secure electricity, making power costs a secondary concern to the price of GPUs. The real constraint is getting chips, not watts.
  • Open-Source Intelligence Unveils All: By combining satellite imagery, public permits, and news reports, the physical expansion of the AI industry can be tracked in near real-time, providing unprecedented transparency.

For further insights and detailed discussions, watch the full podcast: Link

This episode reveals how tracking the physical construction of massive, multi-billion dollar data centers provides a ground-truth signal for the trajectory of AI development, moving beyond corporate announcements to real-world capital deployment.

Introduction to the Frontier Data Centers Hub

Yafa Edelman, Head of Data at Epoch, introduces the Frontier Data Centers Hub, a project designed to track the largest AI data centers currently under construction. These are unprecedented infrastructure projects, with many costing over $10 billion. Yafa emphasizes their importance as a key indicator of AI investment, national compute distribution, and the ability of companies to maintain historical scaling rates. The largest project tracked is projected to cost $100 billion by its 2028 launch, rivaling the scale of historical R&D efforts like the Manhattan Project.

  • Strategic Insight: Monitoring these physical builds offers a tangible, verifiable metric for AI progress, distinct from software-based benchmarks or company press releases. Investors can use this data to gauge the real-world commitment and operational capacity of major AI players.

The Importance of Open and Transparent Tracking

Yafa highlights that the Epoch database is intentionally free and open, a commitment to transparency that allows the public and researchers to scrutinize their methodology. This contrasts with other proprietary data center trackers. A key strategic goal is to expand coverage globally beyond the US to include China, the Middle East, and Europe. This expansion will provide a more comprehensive view of the global AI landscape and geopolitical competition over compute resources.

  • Yafa's Perspective: "Ours being free and open means that the public can actually understand... how we do this. We're committed to transparency."

Navigating the Data Hub: A Walkthrough

Ben Cottier, a researcher at Epoch, provides a live demonstration of the data hub's "Satellite Explorer." This interactive map pinpoints the locations of major US data centers like OpenAI's Stargate. Users can click on a site, such as the Google Omaha data center, to view high-resolution satellite imagery with detailed annotations.

  • Key Features Demonstrated:
    • Annotated Imagery: Buildings, grid power substations, backup power generators, and cooling equipment are clearly outlined.
    • Construction Timeline: Users can scroll back through historical satellite images to observe the construction progress of facilities over time.
    • Quantitative Data: A graph view visualizes the growth of operational power capacity, compute capacity, and capital cost for each site over time.

Methodology Deep Dive: From Satellite Imagery to Power Estimates

Ben explains Epoch's methodology for estimating a data center's power capacity, a critical metric for its computational potential. The process relies heavily on analyzing cooling equipment visible in satellite images. By measuring visual features like the diameter of cooling tower fans, they can accurately predict the heat dissipation capacity.

  • Core Methodology:
    • A model predicts cooling capacity based on fan diameter. For example, a 5-meter fan corresponds to roughly 8 megawatts of cooling power.
    • Megawatt (MW): A unit of power equal to one million watts. An 8 MW cooling unit rejects heat at a rate comparable to the combined electricity draw of several thousand average US homes.
    • The total number of fans is aggregated, then adjusted for redundancy and overhead, to estimate the power dedicated to GPUs or TPUs (a minimal sketch of this calculation follows this list).
  • Validation and Accuracy: Ben notes that for one site, their model's estimate was within 2% of the power capacity stated in a public permit document. While not always this precise, they are confident their estimates are generally within a +/- 50% accuracy range.
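
A minimal sketch of this style of estimate, assuming cooling capacity scales with a fan's swept area and calibrating to the 5-meter / 8 MW figure above; the fan count, redundancy, and overhead values below are illustrative placeholders, not Epoch's actual parameters:

```python
import math

# Calibrate cooling capacity per unit fan area so that a 5 m fan
# corresponds to roughly 8 MW, per the figures quoted above.
MW_PER_M2 = 8.0 / (math.pi * (5.0 / 2) ** 2)  # ~0.41 MW per square meter

def fan_cooling_mw(diameter_m: float) -> float:
    """Estimated heat rejection of one cooling fan, in megawatts,
    assuming capacity scales with the fan's swept area."""
    return MW_PER_M2 * math.pi * (diameter_m / 2) ** 2

def estimated_it_power_mw(n_fans: int, diameter_m: float,
                          redundancy: float = 0.25,
                          overhead: float = 0.15) -> float:
    """Aggregate all fans, then discount for redundant (spare) cooling
    units and for facility power not delivered to GPUs/TPUs.
    Both discount factors are illustrative placeholders."""
    gross_mw = n_fans * fan_cooling_mw(diameter_m)
    return gross_mw * (1 - redundancy) * (1 - overhead)

# Example: a building with 120 five-meter fans -> roughly 612 MW of IT power.
print(f"~{estimated_it_power_mw(120, 5.0):.0f} MW")
```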

Key Research Insights: The Unprecedented Scale of AI Infrastructure

The data reveals the immense scale and rapid pace of current data center construction.

  • Meta's Hyperion (Louisiana): The largest campus found, set to begin operations in 2028, covers a land area equivalent to one-fifth of Manhattan.
  • Rapid Buildouts: Several gigawatt-scale data centers are scheduled to come online in 2026. xAI's Colossus 2 is on a particularly aggressive timeline, projected to go from construction to 1 gigawatt of operational power in just one year.
    • Gigawatt (GW): A unit of power equal to one billion watts, or 1,000 megawatts. A gigawatt-scale data center represents a massive concentration of energy and compute.
  • Future Growth: Projections show continued exponential growth, with facilities like Microsoft's Fairwater (Wisconsin) expected to reach 3 gigawatts by 2027, housing an estimated 5 million H100-equivalent GPUs.
    • H100-equivalent GPUs: A standardized metric for compute power, benchmarked against NVIDIA's H100, a leading GPU for AI training. A rough power-to-GPU conversion is sketched after this list.
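
To make the H100-equivalent conversion concrete, here is a back-of-the-envelope sketch assuming roughly 600 W of all-in facility power per H100-equivalent (GPU plus cooling and other overhead); this value is chosen only to be consistent with the 3 GW / 5 million figure above, not a parameter from the hub:

```python
# Assumed all-in facility power per H100-equivalent GPU, in watts.
# Chosen to match the 3 GW ~ 5 million H100 figure cited above.
WATTS_PER_H100_EQUIV = 600

def h100_equivalents(facility_power_gw: float) -> float:
    """Convert facility power (GW) into an H100-equivalent GPU count."""
    return facility_power_gw * 1e9 / WATTS_PER_H100_EQUIV

# Example: a 3 GW facility -> ~5.0 million H100-equivalents.
print(f"{h100_equivalents(3.0) / 1e6:.1f} million H100-equivalents")
```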

Future Roadmap: Expanding Coverage and Metrics

Yafa outlines the future direction for the hub, focusing on expanding data coverage and depth.

  • Coverage Goal: Increase tracking from the current estimated 15% of AI compute to over 90% by including more large US data centers and expanding internationally.
  • New Metrics: Future updates will aim to track:
    • Construction Workforce: Using permit data and satellite analysis of parking lots to estimate the number of construction workers (an illustrative sketch follows this list).
    • Network Connectivity: Identifying which data centers are networked together to form larger compute clusters, providing insight into the maximum potential scale of AI training jobs.
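
Purely as an illustration of how a parking-lot-based workforce estimate could work, here is a minimal sketch; the occupancy and shift-coverage parameters are hypothetical placeholders, not Epoch's method:

```python
def estimated_workers(cars_counted: int,
                      workers_per_car: float = 1.2,
                      shift_coverage: float = 0.8) -> int:
    """Back out a construction headcount from a satellite count of
    parked cars.
    workers_per_car: average occupancy per vehicle (carpools, crew vans).
    shift_coverage: fraction of the workforce on site when the image
    was captured. Both values are hypothetical placeholders."""
    return round(cars_counted * workers_per_car / shift_coverage)

# Example: 1,500 cars visible in imagery -> ~2,250 workers.
print(estimated_workers(1500))
```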

Q&A Session: Power, Siting, and Decentralization

Power Generation and Grid Capacity

  • Yafa observes that while many data centers plan for eventual grid connection, on-site power generation (like portable gas turbines) is often used as a "bridge" solution. She expresses confidence that grid operators can keep up, arguing that AI companies can afford to pay a premium for faster power delivery, as energy costs are still dwarfed by the cost of GPUs.

Site Identification and Data Sourcing

  • Ben explains that sites are identified through a combination of local news reports, permit databases (searched with a GPT-powered agent), and manual searches of satellite imagery. They plan to explore more automated satellite-based discovery methods in the future.

Challenges in Tracking Chinese Data Centers

  • Yafa anticipates that tracking data centers in China will be more challenging due to less reliable permit data and more "noise" from unsubstantiated claims. However, initial efforts have already identified a half-gigawatt facility. The core satellite analysis methodology is expected to work effectively once sites are located.

Tracking Decentralized Compute and Network Connectivity

  • Ben states that information on connectivity often comes from company announcements, such as Microsoft's claim that its Fairwater data centers are connected. Epoch plans to verify these claims where possible and uses a "confidence level" indicator in the hub to distinguish confirmed facts from rumors.

Final Insights on AI Model Training and Data Center Utilization

Yafa provides a crucial reality check on how these massive data centers are used. While a single facility could theoretically be used to train one giant AI model, she believes this rarely happens. Currently, even the largest models are trained on just a fraction of a single data center's capacity. This suggests there is significant room for scaling models even with existing infrastructure, making claims about inter-data center distributed training more of a future possibility than a current necessity.

Conclusion

This analysis underscores that tracking physical AI infrastructure offers a powerful, non-traditional signal of corporate strategy and technological progress. For investors and researchers, the Epoch hub provides a tool to monitor real-world capital expenditure and construction velocity, offering a tangible proxy for AI ambitions that moves beyond market announcements.
