Should Publishers Let AI Bots Crawl Their Sites?

May 27, 2026



Key Points

  • Microsoft's VP of publisher product publicly advised publishers to stop blocking AI crawlers and optimize content for AI discovery, but a publisher already in Microsoft's licensing program pushed back with a more nuanced position.
  • The "block or not" debate misses the real strategic question: which crawlers get access, on what terms, and what do you get in return.
  • Microsoft's Publisher Content Marketplace pays publishers when their content informs an AI response, but the program currently has only eight publishers and aims to eventually cover the entire open web.
  • Blocking everything first gives publishers negotiating leverage. Opening everything up gives AI companies free inventory.
  • Whatever you decide on crawlers, the traffic you still own needs to work harder. That's where monetization strategy matters most.

What Happened

According to AdExchanger's coverage of the Programmatic AI event in Las Vegas, Nikhil Kolar, VP of publisher product at Microsoft AI, told publishers they should open their sites to AI crawlers. His argument: if your content isn't legible to AI agents, your business isn't discoverable. Four out of five websites currently block AI bots, per Kolar. That means most publishers are effectively invisible to AI-driven recommendations and discovery.

Kolar also pointed to Microsoft's Publisher Content Marketplace, announced in February, as a path toward fair compensation. The model pays publishers when their content informs an AI inference. Microsoft handles the licensing agreements and runs the compute on Azure, which means Microsoft makes money on the cloud side regardless. Currently, eight publishers have signed on.

Jonathan Roberts, Chief Innovation Officer at People Inc. and one of those eight publishers, offered a different read. People Inc. blocks 30,000 to 35,000 crawlers per day, granting access to only 38. Roberts framed blocking as a control mechanism, not a wall: you block first, then grant permission selectively, and negotiate from a position of strength.

His actual disagreement with Kolar was narrower than it appeared. Kolar's advice about opening up access applies more to retail and merchant sites that want AI chatbots recommending their products. For content publishers with valuable IP, the calculus is different.

Why This Matters for Publishers

The underlying tension here isn't really about blocking. It's about who captures value from your content in the AI era.

Kolar shared a telling data point from 430 meetings Microsoft AI held with publishers last year: the most common sentiment was a feeling of powerlessness. Publishers watched social, mobile, and search reshape their traffic without being able to shape those shifts. AI feels like the same pattern repeating.

Free access feeds the models that answer user questions directly, cutting out the click-through to your site. Block everything, and you lose discoverability entirely. Neither extreme is a complete strategy.

The distinction Kolar drew between "training" and "grounding" is worth understanding. Training pulls from the broad archive of published content to build foundational model knowledge. Grounding pulls from current, trusted sources in real time using Model Context Protocol (MCP) connections. Microsoft's marketplace focuses on grounding, which means participating publishers get paid each time their content is used in a live inference, not just when it contributes to a model's training data. That's a meaningfully different economic relationship.

What Publishers Should Do

There's no universal right answer here, but there is a decision framework worth applying.

Start by understanding what you're blocking and why. A blanket robots.txt block on all AI crawlers is easy to implement and gives you a clean baseline. It's a starting position, not a permanent strategy.
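As a rough illustration of that baseline, here is a minimal Python sketch that writes a blanket AI-crawler block and checks it with the standard library's robots.txt parser. The crawler tokens and the example.com URL are assumptions for illustration, not a list from the article, so confirm each token against the operator's current documentation before using it.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent tokens for AI crawlers (not from the article).
# Check each operator's current documentation before relying on them.
AI_CRAWLER_TOKENS = ["GPTBot", "ClaudeBot", "CCBot", "PerplexityBot"]


def build_baseline_robots(ai_tokens):
    """Return robots.txt text that disallows the given tokens and allows everyone else."""
    lines = [f"User-agent: {token}" for token in ai_tokens]
    lines.append("Disallow: /")
    lines.append("")  # blank line ends the AI-crawler group
    lines.extend(["User-agent: *", "Allow: /"])
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, user_agent, url="https://example.com/article"):
    """Ask the standard-library parser whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots_txt = build_baseline_robots(AI_CRAWLER_TOKENS)
print(robots_txt)
print(is_allowed(robots_txt, "GPTBot"))     # matched by the AI-crawler group
print(is_allowed(robots_txt, "Googlebot"))  # falls through to the catch-all group
```

Keep in mind that robots.txt is advisory. Well-behaved crawlers honor it, but it does not technically prevent a scraper from fetching pages, so enforcement against bad actors has to happen at the server or CDN level.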

Publishers should evaluate crawlers by category:

  • Search indexers: essential. Blocking Googlebot or Bingbot hurts your organic traffic. Don't do it.
  • Licensing-eligible AI crawlers: these are crawlers connected to platforms that have or are building publisher compensation programs. Worth evaluating case by case.
  • Unauthorized scrapers: block these. They offer nothing and take everything.
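The three categories above can be sketched as a default-deny lookup, in the spirit of "block first, then permit selectively." This is a hedged illustration only: `classify_crawler`, `Decision`, and the `examplelicensedbot` token are hypothetical names, and the function assumes it is called only for traffic already identified as automated.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"    # search indexers
    REVIEW = "review"  # licensing-eligible AI crawlers, evaluated case by case
    BLOCK = "block"    # everything else


# Search indexers named in the article.
SEARCH_INDEXERS = {"googlebot", "bingbot"}

# Hypothetical placeholder; replace with crawlers tied to real compensation programs.
LICENSING_CANDIDATES = {"examplelicensedbot"}


def classify_crawler(user_agent: str) -> Decision:
    """Map a bot's user-agent string to a category decision, denying by default."""
    ua = user_agent.lower()
    if any(token in ua for token in SEARCH_INDEXERS):
        return Decision.ALLOW
    if any(token in ua for token in LICENSING_CANDIDATES):
        return Decision.REVIEW
    return Decision.BLOCK


for agent in [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "ExampleLicensedBot/1.0",
    "RandomScraper/0.1",
]:
    print(agent, "->", classify_crawler(agent).value)
```

User-agent strings are easy to spoof, so a production setup would likely pair a check like this with stronger verification, such as reverse DNS lookups or the IP ranges that crawler operators publish.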

The leverage question matters, too. Roberts' point about negotiating from a blocking position applies most to publishers with content that AI companies actively want. If you're a premium publisher with original reporting, vertical expertise, or proprietary data, you have something worth licensing. Block first, negotiate second.

Smaller publishers without that bargaining power face a different reality. Roberts acknowledged the leverage dynamic weakens considerably at that scale. You can still control access, but a licensing deal may not materialize. In that case, the strategic priority shifts: maximize revenue from the traffic you do have.

One move benefits nearly every publisher regardless of where they land on the blocking debate: optimizing content structure for AI legibility. Clear structure, authoritative sourcing, and direct answers to specific questions all increase the likelihood that AI tools surface your content as a source rather than just absorbing it silently.

Our Take

The "block or don't block" framing is a distraction. The real question is whether you're managing your content's AI exposure deliberately or just letting it happen to you.

Kolar put it directly: publishers have more power than they think. But that power only counts if you use it. Passive openness is not a strategy. Neither is reflexive blocking.

Whatever your crawling policy, the traffic you still own and control deserves serious monetization infrastructure. AI is reshaping discovery, but publishers who invest in squeezing maximum revenue per session from their existing audience are building something that doesn't depend on what any platform decides next.

We work with publishers across gaming, education, news, and entertainment to do exactly that. If you want to see how exposed your content is to AI crawlers, our AI Crawler Protection Grader is a good place to start, and our AI crawler resource center lays out the full decision framework for publishers navigating this.

You've built the audience. Don't leave revenue on the table while you figure out the rest.
