Publishers Are Organizing Against AI Scraping

April 30, 2026

Publishers Are Organizing Against AI Scraping. Here's What That Means for You.

Key Points

  • UK publishers including the BBC, Financial Times, and The Guardian have formed SPUR (Standards for Publisher Usage Rights) to establish shared technical standards and licensing frameworks for AI access to content.
  • The robots.txt honor system is no longer enough. Publishers need technical and collective solutions that make unauthorized crawling cost-prohibitive.
  • AI agents, bots that respond to user requests in real time, are the next frontier of this fight, and they're harder to block than traditional crawlers.
  • The window to shape how AI companies access publisher content is open now. Once the rules harden, changing them will be significantly harder.
  • Whatever content protection strategy you adopt, the traffic you do have still needs to work as hard as possible.

For years, publishers fought AI scraping individually, with whatever tools they had available. That meant a patchwork of robots.txt rules, paywalls, and the occasional lawsuit. The AI companies, for their part, mostly kept crawling.

That dynamic is starting to shift. Fast Company reports that a coalition of UK publishers, including the BBC, the Financial Times, and The Guardian, has formed SPUR: Standards for Publisher Usage Rights. The goal is to establish shared technical standards and licensing frameworks that push AI developers toward legitimate, compensated access to quality journalism.

This matters beyond the UK. If SPUR gains traction, it could set the template for how publishers worldwide negotiate with AI companies.

What SPUR Is Actually Trying to Do

SPUR isn't a lobbying group and it isn't a lawsuit. It's an attempt to shift the leverage equation through collective technical action.

Publishers currently have three main options: pursue a licensing deal (only available to the largest players), sue (expensive and slow), or build technical defenses. SPUR is focused on the third option, at scale. The idea is that if enough publishers make crawling difficult and costly, AI companies have more incentive to negotiate.

Cloudflare is an important ally here. It introduced Pay Per Crawl, a mechanism that charges bots for content access. TollBit has also been documenting the scope of the problem, highlighting how "headless browsers" (automated browsers with no visible window that load pages the way a human visitor's browser would, and so slip past standard crawler detection) are being used at industrial scale.
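
To make the "charge bots for access" idea concrete, here is a minimal sketch of the decision an edge gate might make per request. It is not Cloudflare's implementation or API: the `BOT_POLICY` table, the `decide_access` function, and the `has_paid` flag are all hypothetical names invented for illustration. The only real pieces are the HTTP status codes, including 402 Payment Required.

```python
from http import HTTPStatus

# Hypothetical policy table: user-agent token -> action.
# Illustrative only; this is not any vendor's real configuration format.
BOT_POLICY = {
    "GPTBot": "charge",
    "CCBot": "block",
}


def decide_access(user_agent: str, has_paid: bool = False) -> HTTPStatus:
    """Return the HTTP status a gate might send for this request."""
    ua = user_agent.lower()
    for token, action in BOT_POLICY.items():
        if token.lower() in ua:
            if action == "block":
                return HTTPStatus.FORBIDDEN
            if action == "charge" and not has_paid:
                return HTTPStatus.PAYMENT_REQUIRED
            return HTTPStatus.OK
    # Agents not listed in the policy fall through and are served normally.
    return HTTPStatus.OK
```

Note the weakness this sketch shares with any user-agent check: it trusts what the client says about itself. A headless browser sending an ordinary browser user agent falls straight through to the last line, which is why network-level detection matters.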

The core insight from SPUR's formation is simple: individual publishers have almost no leverage. A coordinated global coalition backed by infrastructure-level partners does.

The Robots.txt Problem

Publishers have been relying on the robots exclusion protocol for decades. It was never designed for this fight.

Robots.txt is an honor system. It works when the parties on the other end respect it. Many AI crawlers don't. Some ignore it outright. Others use headless browsers that sidestep it entirely. The gap between what robots.txt was designed to do and what publishers actually need it to do has never been wider.
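
A short sketch shows why this is an honor system. The check below uses Python's standard `urllib.robotparser` module, which is what a well-behaved crawler would run before fetching a page. The robots.txt content, the example URL, and the `is_allowed` helper are assumptions for illustration; `GPTBot` is used as an example of a self-declared AI crawler user agent.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt: block one AI crawler by name, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Answer the question a compliant crawler asks itself before fetching."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

The check runs entirely on the crawler's side. Nothing on the publisher's server enforces the answer, so a bot that skips the check, or announces itself under a different user agent, gets the page anyway.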

The table below shows where the main defensive options sit today:

Defense Method | How It Works | Limitation
robots.txt | Asks crawlers not to fetch specified content | Honor system; easily ignored or bypassed
Paywalls | Restricts content access to authenticated users | Reduces public reach; doesn't stop all crawlers
Cloudflare Pay Per Crawl | Charges bots for access at the network level | Still relatively new; adoption across AI companies is uneven
Licensing agreements | Direct commercial deals with AI companies | Only accessible to large publishers
Litigation | Legal action against unauthorized scraping | Expensive and slow; outcomes uncertain

No single method covers the full attack surface. The SPUR approach tries to combine them into a more coherent, industry-wide framework.

The Agent Problem Is Different

Crawlers scraping content for training data are one thing. AI agents are a different and more complicated problem.

AI services access publisher content for three reasons: training data, search indexing, and responding to user requests in real time. That last category is where agents live. Unlike mass scrapers, agents act more like proxies for individual users, which is why they've historically been exempted from blocking.

The problem is that agents don't behave like humans in the ways that matter to publishers. They don't load ads. They don't generate page views. They consume content and return synthesized answers, removing the audience touchpoint entirely.

There's no clean consensus yet on how to handle this. Regulation is being discussed. Lawsuits are ongoing. SPUR is trying to move faster than either. Once behavioral norms around agents harden, changing them will be extremely difficult.

What Publishers Should Do Right Now

The SPUR coalition is an encouraging sign, but it'll take time to build real leverage. In the meantime, publishers need to act on their own situation.

A few practical steps worth taking:

  • Audit your crawler exposure: Review which bots are accessing your content, how frequently, and whether your robots.txt rules are actually being honored. Tools like TollBit's reports and Cloudflare's analytics can surface what you might be missing.
  • Consider Cloudflare's tools: Pay Per Crawl and Cloudflare's AI bot blocking features give publishers a technical layer that doesn't rely on AI companies voluntarily respecting rules.
  • Track your referral traffic: If AI platforms are generating citations or referrals, you need to know. This is baseline data for any licensing or compensation conversation.
  • Watch SPUR's progress: If the coalition gets traction, publishers who've already audited their exposure and built basic defenses will be better positioned to participate in whatever framework emerges.
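
The first and third steps above can start with nothing more than your web server's access logs. The sketch below assumes logs in the common "combined" format and counts two things: hits from self-declared AI crawlers and visits referred by AI platforms. The token and domain lists are illustrative assumptions, not a complete or authoritative registry, and the `audit` function is a hypothetical helper.

```python
import re
from collections import Counter

# Illustrative, non-exhaustive user-agent tokens (assumption: verify current
# names against each vendor's own documentation).
AI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "PerplexityBot", "Bytespider")

# Illustrative referrer domains for AI platforms (also an assumption).
AI_REFERRER_DOMAINS = ("chatgpt.com", "perplexity.ai")

# Combined Log Format: ip ident user [time] "request" status bytes "referrer" "agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def audit(lines):
    """Count self-declared AI bot hits and AI-platform referrals."""
    bot_hits = Counter()
    ai_referrals = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match["agent"].lower()
        referrer = match["referrer"].lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in agent:
                bot_hits[token] += 1
                break
        for domain in AI_REFERRER_DOMAINS:
            if domain in referrer:
                ai_referrals[domain] += 1
                break
    return bot_hits, ai_referrals
```

Calling `audit(open("access.log"))` on a log file returns the two counters. This only catches bots that identify themselves honestly, so treat the bot counts as a floor rather than a total.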

The decision about whether to block, license, or optimize for AI citation is a business call that depends on your specific situation. What's not optional is understanding what's actually happening with your content.

Use our AI Crawler Protection Grader to see where your defenses stand, and the AI Crawler Resource Center for a deeper look at your options.

The Revenue Question Nobody Is Answering Loudly Enough

Here's the part that gets lost in the scraping conversation: even if SPUR succeeds, even if Cloudflare's Pay Per Crawl becomes standard, even if licensing frameworks emerge, publishers will still be operating in a world where AI is replacing some portion of direct traffic.

That's not a reason to panic. It's a reason to make sure every session that lands on your site is working as hard as possible.

If traffic is flatter or more volatile than it was two years ago, your revenue per session (RPS) needs to go up to compensate. That means better yield optimization, smarter demand configurations, and ad setups that aren't leaving money on the table.
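
The arithmetic behind that is worth seeing once. The sketch below assumes RPS means revenue per session and that revenue is simply sessions multiplied by RPS; `required_rps_uplift` is a hypothetical helper, and the 20% figure in the usage note is an illustrative input, not a measured traffic loss.

```python
def required_rps_uplift(session_change: float) -> float:
    """Fractional RPS increase needed to keep total revenue flat.

    Assumes revenue = sessions * RPS. If sessions scale by
    (1 + session_change), RPS must scale by 1 / (1 + session_change).
    """
    if session_change <= -1.0:
        raise ValueError("sessions cannot fall by 100% or more")
    return 1.0 / (1.0 + session_change) - 1.0
```

For example, `required_rps_uplift(-0.20)` evaluates to 0.25 (up to floating-point rounding): a 20% drop in sessions needs a 25% increase in RPS to break even, not 20%. The gap widens as traffic losses grow.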

We work with publishers across gaming, sports, news, and entertainment to do exactly that. The AI traffic conversation is real and worth taking seriously. So is the question of what you're doing with the audience you have right now. Both matter.