Publishers Are Organizing Against AI Scraping
April 30, 2026
Publishers Are Organizing Against AI Scraping. Here's What That Means for You.
Key Points
- UK publishers including the BBC, Financial Times, and The Guardian have formed SPUR (Standards for Publisher Usage Rights) to establish shared technical standards and licensing frameworks for AI access to content.
- The robots.txt honor system is no longer enough. Publishers need technical and collective solutions that make unauthorized crawling cost-prohibitive.
- AI agents, bots that respond to user requests in real time, are the next frontier of this fight, and they're harder to block than traditional crawlers.
- The window to shape how AI companies access publisher content is open now. Once the rules harden, changing them will be significantly harder.
- Whatever content protection strategy you adopt, the traffic you do have still needs to work as hard as possible.
For years, publishers fought AI scraping individually, with whatever tools they had available. That meant a patchwork of robots.txt rules, paywalls, and the occasional lawsuit. The AI companies, for their part, mostly kept crawling.
That dynamic is starting to shift. Fast Company reports that a coalition of UK publishers, including the BBC, the Financial Times, and The Guardian, has formed SPUR: Standards for Publisher Usage Rights. The goal is to establish shared technical standards and licensing frameworks that push AI developers toward legitimate, compensated access to quality journalism.
This matters beyond the UK. If SPUR gains traction, it could set the template for how publishers worldwide negotiate with AI companies.
What SPUR Is Actually Trying to Do
SPUR isn't a lobbying group and it isn't a lawsuit. It's an attempt to shift the leverage equation through collective technical action.
Publishers currently have three main options: pursue a licensing deal (only available to the largest players), sue (expensive and slow), or build technical defenses. SPUR is focused on the third option, at scale. The idea is that if enough publishers make crawling difficult and costly, AI companies have more incentive to negotiate.
Cloudflare is an important ally here. It introduced Pay Per Crawl, a mechanism that charges bots for content access. TollBit has also been documenting the scope of the problem, highlighting how "headless browsers," browsers that run without a visible window and are driven by scripts so that bots can mimic human visitors and bypass standard crawler detection, are being used at industrial scale.
The core insight from SPUR's formation is simple: individual publishers have almost no leverage. A coordinated coalition backed by infrastructure-level partners does.
The Robots.txt Problem
Publishers have been relying on the robots exclusion protocol for decades. It was never designed for this fight.
Robots.txt is an honor system. It works when the parties on the other end respect it. Many AI crawlers don't. Some ignore it outright. Others use headless browsers that sidestep it entirely. The gap between what robots.txt was designed to do and what publishers actually need it to do has never been wider.
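To make the honor-system point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper, and the example URLs are all hypothetical. The check runs on the crawler's side, so a crawler that never runs it, or ignores the answer, is not stopped by anything in the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask one AI crawler to stay out entirely,
# and ask every other bot to stay out of /private/ only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return what robots.txt *asks* of this user agent for this URL.

    This is advisory only: nothing here blocks a request. A crawler
    has to choose to run this check and respect the result.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling `is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/news/story")` reports that the crawler is asked to stay away, but a bot that announces a different user agent, or skips the check, gets through anyway.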
The table below shows where the main defensive options sit today:
| Defense Method | How It Works | Limitation |
|---|---|---|
| robots.txt | Tells crawlers which paths they are asked not to crawl | Honor system; easily ignored or bypassed |
| Paywalls | Restricts content access to authenticated users | Reduces public reach; doesn't stop all crawlers |
| Cloudflare Pay Per Crawl | Charges bots for access at the network level | Still relatively new; adoption across AI companies is uneven |
| Licensing agreements | Direct commercial deals with AI companies | Only accessible to large publishers |
| Litigation | Legal action against unauthorized scraping | Expensive and slow; outcomes uncertain |
No single method covers the full attack surface. The SPUR approach tries to combine them into a more coherent, industry-wide framework.
The Agent Problem Is Different
Crawlers scraping content for training data are one thing. AI agents are a different and more complicated problem.
AI services access publisher content for three reasons: to gather training data, to build search indexes, and to respond to user requests in real time. That last category is where agents live. Unlike mass scrapers, agents act more like proxies for individual users, which is why they have historically been exempted from blocking.
The problem is that agents don't behave like humans in the ways that matter to publishers. They don't load ads. They don't generate page views. They consume content and return synthesized answers, removing the audience touchpoint entirely.
There's no clean consensus yet on how to handle this. Regulation is being discussed. Lawsuits are ongoing. SPUR is trying to move faster than either, because once behavioral norms around agents harden, changing them will be extremely difficult.
What Publishers Should Do Right Now
The SPUR coalition is an encouraging sign, but it'll take time to build real leverage. In the meantime, publishers need to act on their own situation.
A few practical steps worth taking:
- Audit your crawler exposure: Review which bots are accessing your content, how frequently, and whether your robots.txt rules are actually being honored. Tools like TollBit's reports and Cloudflare's analytics can surface what you might be missing.
- Consider Cloudflare's tools: Pay Per Crawl and Cloudflare's AI bot blocking features give publishers a technical layer that doesn't rely on AI companies voluntarily respecting rules.
- Track your referral traffic: If AI platforms are generating citations or referrals, you need to know. This is baseline data for any licensing or compensation conversation.
- Watch SPUR's progress: If the coalition gets traction, publishers who've already audited their exposure and built basic defenses will be better positioned to participate in whatever framework emerges.
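As a starting point for the audit step above, the following sketch counts requests from self-identified AI crawlers in a web server access log and flags requests to paths that robots.txt asks them to avoid. It rests on several assumptions: the log is in Combined Log Format, the short token list is illustrative rather than exhaustive, and `audit_crawlers` is a hypothetical helper name. It only sees bots that announce themselves, so headless browsers using ordinary browser user agents will not appear.

```python
import re
from collections import Counter
from urllib.robotparser import RobotFileParser

# Illustrative, not exhaustive: user-agent tokens some AI crawlers announce.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "PerplexityBot")

# Assumes Combined Log Format, where the user agent is the last quoted field.
LOG_LINE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"$'
)


def audit_crawlers(log_lines, robots_txt):
    """Count hits per AI crawler token, plus hits on disallowed paths."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    hits, violations = Counter(), Counter()
    for line in log_lines:
        match = LOG_LINE.search(line.strip())
        if match is None:
            continue  # skip lines that are not in the assumed format
        agent = match.group("ua").lower()
        for token in AI_CRAWLER_TOKENS:
            if token.lower() in agent:
                hits[token] += 1
                # A hit on a path robots.txt disallows means the rule was ignored.
                if not parser.can_fetch(token, match.group("path")):
                    violations[token] += 1
                break
    return hits, violations
```

Passing an open log file and the text of your robots.txt returns two counters: total hits per crawler, and hits that went against your rules. A non-empty second counter is direct evidence that the honor system is not holding for that bot.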
The decision about whether to block, license, or optimize for AI citation is a business call that depends on your specific situation. What's not optional is understanding what's actually happening with your content.
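For the referral-tracking step, one simple way to get that understanding is to bucket incoming referrer URLs by AI platform. The sketch below assumes you can export referrer URLs from your analytics or server logs. The domain list is an illustrative assumption, not a complete inventory, and `classify_referrer` and `ai_referral_counts` are hypothetical helper names.

```python
from collections import Counter
from urllib.parse import urlparse

# Illustrative assumption: referrer domains of a few AI assistants.
AI_REFERRER_DOMAINS = ("chatgpt.com", "perplexity.ai", "gemini.google.com")


def classify_referrer(referrer: str) -> str:
    """Return the matching AI domain for a referrer URL, or "other"."""
    host = (urlparse(referrer).hostname or "").lower()
    for domain in AI_REFERRER_DOMAINS:
        # Match the domain itself and any subdomain, such as "www.".
        if host == domain or host.endswith("." + domain):
            return domain
    return "other"


def ai_referral_counts(referrers):
    """Tally referrer URLs by AI platform."""
    return Counter(classify_referrer(r) for r in referrers)
```

Tracked over time, these counts give you a baseline figure to bring to any licensing or compensation conversation.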
Use our AI Crawler Protection Grader to see where your defenses stand, and the AI Crawler Resource Center for a deeper look at your options.
The Revenue Question Nobody Is Answering Loudly Enough
Here's the part that gets lost in the scraping conversation: even if SPUR succeeds, even if Cloudflare's Pay Per Crawl becomes standard, even if licensing frameworks emerge, publishers will still be operating in a world where AI answers replace some portion of the visits that used to reach their sites.
That's not a reason to panic. It's a reason to make sure every session that lands on your site is working as hard as possible.
If traffic is flatter or more volatile than it was two years ago, your revenue per session (RPS) needs to go up to compensate. That means better yield optimization, smarter demand configurations, and ad setups that aren't leaving money on the table.
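The arithmetic behind that trade-off can be sketched in a few lines. This assumes RPS means revenue per session, computed as total ad revenue divided by total sessions. The function names are hypothetical and any figures you plug in are your own.

```python
def revenue_per_session(revenue: float, sessions: int) -> float:
    """RPS, assumed here to be total ad revenue divided by total sessions."""
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    return revenue / sessions


def required_rps_uplift(session_drop: float) -> float:
    """Fractional RPS increase needed to keep revenue flat.

    session_drop is the fractional loss in sessions (0.20 for a 20% drop).
    Revenue = RPS * sessions, so holding revenue constant requires
    new_rps / old_rps = 1 / (1 - session_drop).
    """
    if not 0 <= session_drop < 1:
        raise ValueError("session_drop must be in the range [0, 1)")
    return 1 / (1 - session_drop) - 1
```

Under these assumptions, a 20% drop in sessions calls for roughly a 25% increase in RPS just to stay level, since 1 / 0.8 = 1.25. The required uplift grows faster than the traffic loss, which is why yield matters more as sessions decline.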
We work with publishers across gaming, sports, news, and entertainment to do exactly that. The AI traffic conversation is real and worth taking seriously. So is the question of what you're doing with the audience you have right now. Both matter.
