What the 2026 Stanford AI Index Report Tells Us About the State of AI

Every year, Stanford’s Institute for Human-Centered AI (HAI) publishes the AI Index Report, widely considered the most comprehensive, independent snapshot of where artificial intelligence actually stands. Now in its ninth edition, the report is produced by an interdisciplinary steering committee of academics and industry experts, and it pulls together data on everything from model performance and compute infrastructure to economic impact, education, policy, and public opinion. Rather than hype or fear-monger, it aims to give researchers, policymakers, journalists, and the general public a rigorously sourced, unbiased picture of a field that moves faster than almost anyone can track.

The 2026 edition spans nine chapters: Research and Development, Technical Performance, Responsible AI, Economy, Science, Medicine, Education, Policy and Governance, and Public Opinion. Below are the highlights that stood out most.

AI development is scaling, but getting harder to see inside

Industry now produces the overwhelming majority of notable AI models, and 2025 was no exception, with over 90% of notable models coming from industry rather than academia. But the price of that dominance is transparency. Training code, parameter counts, dataset sizes, and training duration have quietly stopped being disclosed for several of the most resource-intensive systems, including releases from OpenAI, Anthropic, and Google. Reported parameter counts have hovered near a trillion for three years, even as independently estimated training compute keeps climbing.

Geographically, the research and development race looks less like a single race and more like two different ones. China leads in publication volume, citations, and patent grants, with its count among the world’s 100 most-cited AI papers growing from 33 in 2021 to 41 in 2024, while the U.S. leads in notable model development, producing 50 notable models in 2025 versus China’s 30.

Underneath all of this sits a massive physical buildout. Global AI compute capacity has grown 3.3x per year since 2022, reaching the equivalent of 17.1 million H100 GPUs, with Nvidia supplying over 60% of it. The U.S. now hosts more than 5,400 data centers, over ten times as many as any other country, and a single Taiwanese company, TSMC, fabricates nearly every leading AI chip, making the entire global AI hardware supply chain dependent on one foundry. That buildout has a real environmental cost too: AI data center power capacity reached 29.6 GW (comparable to New York State at peak demand), and training runs such as Grok 4’s are now estimated to have emitted more than 72,000 tons of CO₂-equivalent.
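To get a feel for what 3.3x annual growth means, here is a minimal Python sketch that works backward from the 17.1 million H100-equivalent figure. The function name and the choice of one, two, and three years of compounding are illustrative assumptions, not figures from the report.

```python
def capacity_years_ago(current: float, annual_factor: float, years: float) -> float:
    """Back out an earlier capacity, assuming constant compound growth."""
    return current / annual_factor ** years


CURRENT_H100_EQUIVALENTS = 17.1e6  # figure quoted above
ANNUAL_GROWTH_FACTOR = 3.3         # "3.3x per year", as quoted above

# Assumption: growth was steady, so each step back divides capacity by 3.3.
for years_back in (1, 2, 3):
    implied = capacity_years_ago(CURRENT_H100_EQUIVALENTS, ANNUAL_GROWTH_FACTOR, years_back)
    print(f"{years_back} year(s) earlier: ~{implied / 1e6:.2f} million H100-equivalents")
```

Under that steady-growth assumption, capacity three years earlier would have been under half a million H100-equivalents, which shows how much of the buildout is very recent.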

Capability is surging, and the U.S.-China gap has essentially closed

Frontier models are improving faster than the benchmarks meant to measure them. Models gained 30 percentage points in a single year on Humanity’s Last Exam, a test built to be hard for AI and favorable to human experts, and evaluations intended to stay challenging for years are now getting saturated in months. Top model performance is also converging: as of March 2026, Anthropic, xAI, Google, OpenAI, Alibaba, and DeepSeek are all clustered within 25 Elo points of each other on the Arena Leaderboard, shifting the competition toward cost and reliability rather than raw capability.
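How small is a 25-point Elo gap? A rough sketch, assuming Arena scores behave like classic Elo ratings on the usual 400-point logistic scale (an assumption; the leaderboard’s exact rating model is not described here):

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected head-to-head score for the higher-rated side (classic Elo formula)."""
    return 1.0 / (1.0 + 10 ** (-rating_gap / 400))


# Gap quoted above between the clustered frontier models.
gap = 25
print(f"Expected win rate at a {gap}-point gap: {elo_expected_score(gap):.1%}")
```

Under that assumption, the higher-rated model would be preferred in only about 54% of head-to-head comparisons, barely better than a coin flip.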

That convergence shows up starkly in the U.S.-China comparison. The report states plainly that the U.S.-China AI model performance gap has effectively closed. U.S. and Chinese models have traded the top spot multiple times since early 2025, with DeepSeek-R1 briefly matching the leading U.S. model in February 2025. As of March 2026, the top U.S. model leads by just 2.7%, a gap that fluctuated over the past year but stayed in the single digits throughout. Interestingly, the gap between top closed models and top open models moved in the opposite direction, widening back to 3.3% in 2025 after nearly closing in 2024.

Elsewhere, capability gains look “jagged”: Gemini Deep Think won gold at the International Mathematical Olympiad (IMO) in 2025, yet the best model on ClockBench read analog clocks correctly only 50.1% of the time, compared with 90.1% for humans. AI agents improved sharply on structured computer-use tasks (from roughly 12% to 66.3% accuracy on OSWorld), but robots still fail nearly 9 in 10 real household tasks, and autonomous vehicles, while now operating at real scale in the U.S. and China, remain limited to favorable conditions with remote human backup.

Benchmarks keep climbing, but trust is not keeping pace

The Responsible AI chapter is a reminder that “smarter” and “more trustworthy” aren’t the same thing. Documented AI incidents rose sharply, from 233 in 2024 to 362 in 2025. In a striking new benchmark testing whether models can distinguish knowledge from mere belief, hallucination rates across 26 top models ranged from 22% to a startling 94%. When a false claim was framed as something the user believed rather than something a third party believed, GPT-4o’s accuracy reportedly dropped from 98.2% to 64.4%, and DeepSeek R1’s fell from over 90% to just 14.4%.

Transparency is also backsliding. After improving for two straight years, the average score on the Foundation Model Transparency Index dropped from 58 to 40 in 2025, with persistent gaps in disclosure around training data, compute, and post-deployment impact. And while safety scores look solid under normal testing conditions, they weaken considerably once models face deliberate jailbreak attempts, a gap that shows just how much daylight remains between “safe in the lab” and “safe in the wild.”
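The raw numbers in this section are easier to compare as relative changes. A small Python sketch, using only the incident counts and transparency scores quoted above (the helper name is illustrative):

```python
def relative_change(old: float, new: float) -> float:
    """Percent change from old to new; negative values mean a decline."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100


incidents = relative_change(233, 362)   # documented AI incidents, 2024 -> 2025
transparency = relative_change(58, 40)  # average Foundation Model Transparency Index score

print(f"Incidents: {incidents:+.1f}%")
print(f"Transparency score: {transparency:+.1f}%")
```

That works out to roughly a 55% rise in documented incidents alongside a roughly 31% relative drop in the average transparency score.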

The economic story: explosive growth, uneven returns

If there’s one word for AI’s economic footprint in 2025, it’s acceleration. Global corporate AI investment more than doubled, with private investment growing 127.5% and generative AI alone surging over 200% to capture nearly half of all private AI funding. The U.S. continues to dominate here, investing 23 times more than China in absolute private dollars, though private figures likely understate China’s true spending, much of which flows through government guidance funds.

Consumers are benefiting too: the estimated consumer surplus from generative AI in the U.S. hit $172 billion, up 54% from the year before, with most tools still free. Organizational adoption also jumped, with 88% of surveyed organizations now using AI in some form, though AI agents specifically remain in early days, with single-digit adoption across most business functions.

The labor market picture is more complicated. Employment for software developers aged 22 to 25 has fallen nearly 20% since 2024, and a third of surveyed employers expect workforce reductions in the year ahead, concentrated in service operations, supply chain, and software engineering. Productivity gains are real but uneven: studies point to roughly 14-15% gains in customer support, 26% in software development, and a striking 73% in marketing output, with smaller gains in work that requires deeper reasoning. Notably, the report also flags emerging concern that heavy reliance on AI tools may come with a long-term learning cost, potentially slowing skill development in the workforce over time.

Adoption is going global, just not evenly

Generative AI hit 53% global adoption in just three years, faster than the PC or the internet ever did. Adoption correlates strongly with income, but some surprising leaders have emerged: Singapore (61%) and the UAE (54%) outpace what their GDP would predict. Oddly, despite leading in AI investment and model development, the U.S. ranks just 24th globally in adoption, at 28.3%.

AI is starting to do science, with real limits

A new chapter this year tracks AI’s role inside scientific research itself. AI-related publications in the natural sciences grew 26% in 2025, now making up 5.8-8.8% of research output depending on the field, up from under 1% in 2010. Frontier models already beat human chemists on average on chemistry benchmarks, and 2025 saw genuine milestones: the first astronomy foundation model (AION-1), the first end-to-end AI weather forecasting pipeline, and even the first fully AI-generated paper accepted at a peer-reviewed workshop.

But autonomy has clear ceilings. On tasks requiring agents to replicate published research or run real scientific workflows end-to-end, performance drops sharply. The best AI agents score roughly half of what PhD experts achieve on complex research tasks, and frontier models manage only about 17% accuracy on real-world bioinformatics analysis. AI is accelerating science, but it isn’t yet doing science unsupervised.

Medicine: fast clinical adoption, thinner evidence base

In healthcare, adoption is outpacing rigorous validation. AI-generated clinical notes are now widely used, with physicians reporting up to 83% less time spent on documentation. The FDA authorized 258 AI medical devices in 2025, but the vast majority reached market through pathways that don’t require new clinical trials; only 2.4% of those with clinical studies relied on randomized trial data. On the research side, smaller, more efficient biology models are starting to beat larger ones, and multi-agent diagnostic systems have shown striking gains, with one system scoring 85.5% on complex published cases versus 20% for unaided physicians. Meanwhile, AI-generated summaries now top 84-92% of health-related Google searches, making AI a de facto first stop for health information.

Education: adoption is outrunning policy

Four out of five U.S. high school and college students now use AI for schoolwork, but school policy hasn’t caught up. Only half of middle and high schools have any AI policy, and just 6% of teachers say those policies are actually clear. Meanwhile, U.S. computer science enrollment fell 11% between 2024 and 2025, even as AI-specific graduate programs kept growing. More than 90% of countries now teach computer science in schools, but formal AI education is newer: China and the UAE both mandated it starting in the 2025-26 school year.

Policy: sovereignty is the new watchword, spending stays modest

Governments are racing to write AI strategies. Over half of newly adopted national AI strategies in 2024 came from emerging economies, but the resulting infrastructure is wildly uneven. Europe and Central Asia grew their state-backed AI supercomputing clusters from 3 to 44 between 2018 and 2025, while South Asia, Latin America, and the Middle East/North Africa each have fewer than 10. In the U.S., AI’s political profile has surged, with congressional witness appearances on AI growing twentyfold since 2017, but public investment remains small next to private capital: about $20.4 billion in U.S. AI-related federal contracts and grants over 2013-2024, versus $285.9 billion in private investment in 2025 alone.
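The public-versus-private gap can be made concrete with a little arithmetic. This Python sketch uses only the two dollar figures above; treating 2013-2024 as twelve inclusive years and spreading federal spending evenly across them are simplifying assumptions, not claims from the report.

```python
FEDERAL_TOTAL_BN = 20.4   # U.S. AI-related federal contracts and grants, 2013-2024
PRIVATE_2025_BN = 285.9   # private investment in 2025 alone

# Assumption: the 2013-2024 window is inclusive, i.e. 12 years.
FEDERAL_YEARS = 2024 - 2013 + 1

# Assumption: an even spread across years, used only for a rough per-year average.
federal_per_year_bn = FEDERAL_TOTAL_BN / FEDERAL_YEARS

ratio_to_total = PRIVATE_2025_BN / FEDERAL_TOTAL_BN
ratio_to_yearly = PRIVATE_2025_BN / federal_per_year_bn

print(f"Average federal spend per year: ~${federal_per_year_bn:.1f}B")
print(f"2025 private vs. 12-year federal total: ~{ratio_to_total:.0f}x")
print(f"2025 private vs. average federal year: ~{ratio_to_yearly:.0f}x")
```

On those assumptions, one year of private investment is about 14 times the entire twelve-year federal total, and roughly 168 times an average federal year.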

Public opinion: more optimism, more anxiety, and a widening trust gap

Global optimism about AI ticked up (from 55% to 59% saying benefits outweigh drawbacks), but so did nervousness, now at 52%. The biggest divide isn’t geographic; it’s between experts and the public. Among experts, 73% expect AI to improve how people do their jobs, versus just 23% of the public, with similarly large gaps on the economy and medical care. Nearly two-thirds of Americans expect AI to reduce jobs over the next 20 years. And trust in government to regulate AI responsibly varies enormously: the U.S. reports the lowest trust in its own government of any country surveyed (31%), while globally, people trust the EU to regulate AI more than either the U.S. or China.

The bigger picture

Read together, the 2026 Index paints a field defined by contradictions: capability climbing while transparency falls, U.S. and Chinese models trading the top spot while investment stays lopsided toward the U.S., and adoption spreading globally while trust and safety struggle to keep pace with the underlying technology. It’s less a story of AI “arriving” and more one of AI settling in, unevenly, rapidly, and in ways that are still very much being negotiated.

This summary draws on all nine chapters of the 2026 AI Index Report from Stanford HAI. For the full data and methodology, see the complete report.