AI Pulse: How Big Tech Impacts AI Bot Traffic

Rob Lester

Nov 03, 2025

Written by

Rob Lester

Rob Lester is a Data Scientist at Akamai. He is researching the application security environment and Layer 7 traffic at large, including the role of AI products.

Additional commentary by Emily Lyons

Welcome back to AI Pulse, the blog series in which we share data insights on the state of AI bots. In this fourth post in our series, we are diving into AI bot traffic across the tech giants: Google, Amazon, Meta, Microsoft, and others.

If you missed any of the previous posts, check out AI Pulse: OpenAI’s Wild Bot Behavior After GPT-5, AI Pulse: AI Bot Mitigation Is Increasing Everywhere, and AI Pulse: What Circadian Rhythms Reveal About AI Bot Behavior.

Tech giants account for less AI bot traffic than you’d expect

Since we started sharing AI bot insights, you may have noticed a potentially unexpected distribution of traffic among sources. The AI bot landscape is dominated by AI-native players like OpenAI and Anthropic, not the tech giants you might expect.

Companies such as Google and Meta, with their massive search and data infrastructure, account for far less AI bot traffic than you might assume (Figure 1).

Fig. 1: Total AI bot traffic by source

Why? Because they don’t need to send more. These companies have been crawling and cataloging the internet for years. Their data pipelines and indexers already span most of what’s available online, allowing them to repurpose existing infrastructure rather than deploy new AI-specific crawlers.

What we see instead are supplemental bots that are smaller, targeted AI fetchers from their AI research divisions, designed to fill in gaps where their existing datasets fall short.

No fetch required

What’s even more notable (and perhaps the key point) is that we see little to no activity from these companies’ user-driven AI fetchers against our customers. AI fetchers grab specific web pages in real time to fulfill specific user requests on AI assistants, for example, to answer a question or summarize a webpage via an AI chatbot.

The reason that we don’t see this type of AI bot activity across big tech is that they already have most of the data they need.

Take Google, for example: When an AI Overview appears, it doesn’t need to send an AI bot to click through websites in real time. The content is already indexed and stored within Google’s own infrastructure, ready to be served instantly without reaching beyond its own ecosystem.

A trade-off at the cost of visibility

This creates a real dilemma for businesses that want to stay visible in search results but don’t want their content used for AI outputs. The two goals conflict.

The only foolproof way to prevent your data from being used is to keep it completely out of reach, which is a trade-off that comes at the cost of visibility.
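To see the trade-off in practice, here is a minimal Python sketch that evaluates a hypothetical robots.txt with the standard library's urllib.robotparser. The file contents, the example URL, and the helper name allowed_agents are assumptions made for illustration; GPTBot and Googlebot are the crawler tokens published by OpenAI and Google. Keep in mind that robots.txt is advisory, so this shows only what the rules say, not what any bot actually does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI training crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt, agents, url):
    """Return {agent: True/False} for whether the rules permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# example.com is a placeholder domain, not a real customer site.
result = allowed_agents(
    ROBOTS_TXT,
    ["GPTBot", "Googlebot"],
    "https://example.com/article",
)
print(result)
```

Under these rules the AI-specific crawler is blocked while the search crawler is still allowed to index the page. That is the path this post describes: content can reach a company's infrastructure through its existing index rather than through a new AI bot.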

Source distribution by AI bot type 

When it comes to AI training crawlers, the tech giants are not invisible. Their supplemental AI research crawlers do show up, even though that presence is modest relative to their larger scraping footprints.

At Akamai, we define AI training crawlers as AI bots that automatically scan and collect large amounts of data from websites to be used as training data for large language models (LLMs). Building competitive models still depends on access to massive, diverse datasets, and that means lots of crawling.
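As a rough illustration of how traffic might be bucketed by bot type and source, the Python sketch below matches user-agent strings against a handful of publicly documented crawler tokens and computes each source's share within one bot type. This is not Akamai's classification method: the token table is a small hand-picked sample, the function names classify and share_by_source are hypothetical, and the log lines are made-up, simplified user-agent strings rather than measured data.

```python
from collections import Counter

# Small illustrative sample of published crawler tokens -> (source, bot type).
# Assumption: a real classifier would use a far larger, maintained list and
# would not trust the user-agent header alone, since it is easy to spoof.
KNOWN_BOTS = {
    "gptbot": ("OpenAI", "training crawler"),
    "chatgpt-user": ("OpenAI", "fetcher"),
    "oai-searchbot": ("OpenAI", "search crawler"),
    "claudebot": ("Anthropic", "training crawler"),
    "meta-externalagent": ("Meta", "training crawler"),
}


def classify(user_agent):
    """Return (source, bot_type) for a known token, or None if unrecognized."""
    ua = user_agent.lower()
    for token, label in KNOWN_BOTS.items():
        if token in ua:
            return label
    return None


def share_by_source(user_agents, bot_type):
    """Fraction of requests of the given bot type attributed to each source."""
    counts = Counter()
    for ua in user_agents:
        label = classify(ua)
        if label is not None and label[1] == bot_type:
            counts[label[0]] += 1
    total = sum(counts.values())
    return {source: n / total for source, n in counts.items()} if total else {}


# Made-up, simplified user-agent strings for demonstration only.
sample_log = [
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "Mozilla/5.0 (compatible; ChatGPT-User/1.0)",
    "Mozilla/5.0 (X11; Linux x86_64) Firefox/130.0",
]

shares = share_by_source(sample_log, "training crawler")
print(shares)
```

Counting within a single bot type, as the figures in this post do, keeps a source that is large in one category from masking its absence in another.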

Companies like Google and Meta may not dominate user-driven AI bot traffic, but they’re quietly maintaining and expanding the vast data pipelines that power their models behind the scenes (Figure 2).

Fig. 2: AI training crawler traffic by source

OpenAI’s domination

That said, we have to tip our hats to OpenAI. Across AI fetchers and AI search crawlers, they’ve absolutely dominated from a user-acquisition standpoint among AI-first companies, which is a clear reflection of how effectively they’ve built direct engagement and visibility (Figure 3).

Fig. 3: AI fetcher traffic by source (left) and AI search crawler traffic by source (right)

Your content may still be at risk

For any business with a digital footprint, the trend of AI models training on previously collected data instead of new crawls highlights how not every AI bot is new or even visible. 

The tech giants already have what they need, and much of their “AI activity” happens inside closed systems built on years of indexed data. Your content may still be influencing AI outputs, from summaries to product suggestions, even when no new crawlers are touching your site.

The so what

This raises the critical question: If traditional search crawlers are now powering AI features, should they be classified as AI bots, too?

  • For publishing, that means rethinking how to stay discoverable across these giants and monetize content that’s increasingly being summarized rather than clicked.

  • For ecommerce, it’s about safeguarding product data and pricing that could be pulled into AI models without consent. 

  • Across other industries, it’s about striking the right balance between visibility, protection, and control in a world where AI is quietly learning from and acting on your data.

Stay tuned

Stay tuned for the next AI Pulse. To learn more about how to manage your AI bot traffic, contact an expert.
