Categories
Selected Articles

Nations will try again on plan to confront world’s ‘spiraling’ plastic pollution mess


An AI data trap catches Perplexity impersonating Google

A fly lies in a Venus flytrap.

  • Cloudflare set a trap, and Perplexity crawled right into it.
  • Perplexity impersonates Google’s Chrome browser to gain unauthorized access to data, Cloudflare finds.
  • Cloudflare CEO Matthew Prince compares Perplexity to “North Korean hackers.”

If you want to succeed in AI, a good hack would be to impersonate Google. You just can’t get caught.

This is what just happened to Perplexity, a startup that competes with ChatGPT, Google’s Gemini, and other generative AI services.

Quality data is crucial for success in AI, but tech companies don’t want to pay for it, so they crawl the web and scrape information for free, often without permission. This has sparked a backlash from some content creators and others interested in preserving the incentives that built the web.

Cloudflare and its CEO, Matthew Prince, have stormed into this battle with new features that help websites block unwanted AI bot crawlers. Cloudflare is an infrastructure, security, and software company that helps run about 20% of the internet. It thrives when the web does well, hence its interest in helping sites get paid for content.

Some Cloudflare customers recently complained to the company that Perplexity was evading these blocks and continued to scrape and collect data without permission.

So, Cloudflare set a digital trap and caught this startup red-handed, according to a Monday blog post describing the escapade.

“Some supposedly ‘reputable’ AI companies act more like North Korean hackers,” Prince wrote on X on Monday. “Time to name, shame, and hard block them.”

Perplexity didn’t respond to a request for comment. 

The bait: Honeytrap domains and locked doors

Cloudflare created entirely new, unpublished websites and configured them with robots.txt files that explicitly blocked all crawlers — including Perplexity’s declared bots, PerplexityBot and Perplexity-User. These test sites had no public links, search engine entries, or metadata that would normally make them discoverable.
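A minimal sketch of what such a robots.txt block looks like and how a compliant crawler would read it. The file contents, the example URL, and the `is_allowed` helper are assumptions made for illustration; only the two bot names come from Cloudflare’s account, and this is not Cloudflare’s actual test configuration.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: names Perplexity's declared bots and also
# blocks every other crawler. The exact file Cloudflare used is not public here.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/unpublished-page.html"  # hypothetical URL
    for agent in ("PerplexityBot", "Perplexity-User", "SomeOtherBot"):
        print(agent, is_allowed(agent, page))
```

A crawler that honors the file would run a check like this before every request and skip any URL for which it returns False. Nothing in robots.txt enforces that; compliance is voluntary.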

Yet, when Cloudflare queried Perplexity’s AI with questions about these specific sites, the startup’s service responded with detailed information that could only have come from those restricted pages. The conclusion? Perplexity had accessed the content despite being clearly told not to.

The cloak: How Perplexity masked its crawl

Perplexity initially crawled these sites using its official user-agent string, complying with standard protocols. However, Cloudflare said it discovered that once blocked, Perplexity resorted to stealth tactics.

Cloudflare found that Perplexity began deploying undeclared crawlers disguised as normal web browsers and sending requests from unknown or rotated IP addresses and unofficial autonomous system numbers (ASNs), which are crucial identifiers that help route internet traffic efficiently.

When its official crawlers were blocked, Perplexity also used a generic web browser designed to impersonate Google’s Chrome browser on Apple Mac computers. (Business Insider asked Google whether it has told Perplexity to stop impersonating Chrome. Google did not respond.)
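To see why this kind of disguise defeats simple blocking, consider a rough sketch of a filter that relies on the user-agent header and a list of declared source networks. Everything here is hypothetical: the network range is a reserved documentation block, the user-agent strings are generic examples, and `classify_request` is an invented helper, not Cloudflare’s detection logic.

```python
import ipaddress

# Hypothetical values for illustration only.
DECLARED_BOT_TOKENS = ("PerplexityBot", "Perplexity-User")
DECLARED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]  # RFC 5737 documentation range


def classify_request(user_agent: str, source_ip: str) -> str:
    """Label a request using only its User-Agent header and source address."""
    ip = ipaddress.ip_address(source_ip)
    from_declared_network = any(ip in net for net in DECLARED_NETWORKS)
    ua = user_agent.lower()
    claims_bot = any(token.lower() in ua for token in DECLARED_BOT_TOKENS)

    if claims_bot and from_declared_network:
        return "declared-bot"          # easy to block or allow by policy
    if claims_bot:
        return "unverified-bot-claim"  # bot name sent from an unexpected network
    return "looks-like-browser"        # no bot token: passes a header-only filter


if __name__ == "__main__":
    bot_ua = "Mozilla/5.0 (compatible; PerplexityBot/1.0)"
    chrome_ua = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                 "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36")
    print(classify_request(bot_ua, "192.0.2.10"))
    print(classify_request(chrome_ua, "203.0.113.7"))
```

A request that drops the bot name and arrives from an unlisted address falls into the last bucket, alongside ordinary human visitors. Telling the two apart takes signals beyond the header, which is why Cloudflare says it added new detection techniques.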

According to Cloudflare, Perplexity has been making millions of such “stealth” requests daily across tens of thousands of web domains.

This behavior not only violates web standards but also betrays the fundamental trust that underpins the functioning of the open web, Cloudflare explained.

The comparison: How OpenAI gets it right

To emphasize what good bot behavior looks like, Cloudflare compared Perplexity’s conduct to that of OpenAI’s crawlers, which scrape data for developing ChatGPT and giant AI models such as the upcoming GPT-5.

When OpenAI’s bots encountered a robots.txt file or a similar block, they simply backed off. No circumvention. No masking. No backdoor crawling, according to Cloudflare tests.

The fallout: De-verification and blocking

As a result of these findings, Cloudflare has de-listed Perplexity as a verified bot and rolled out new detection and blocking techniques across its network.

Cloudflare’s takedown serves as a cautionary tale in the AI arms race. While the web shifts toward stronger control over data access and usage, actors who flout these evolving norms may find themselves not just blocked, but publicly called out.

In an era where AI systems are hungry for training data, Cloudflare’s sting operation is a signal to startups and established players alike: Respect the rules of the web, or risk being exposed.

Reach out to me via email at abarr@businessinsider.com.

Read the original article on Business Insider

Sydney Sweeney’s gun range skills go viral after American Eagle ad uproar: ‘Wifey material’

The “Euphoria” star’s 2019 video — which noted she was “training for a new project” — has attracted plenty of new comments of support from mostly male fans.

Trade deadline addition Austin Slater exits Yankees game early with hamstring injury

The Yankees may not have a roster decision to make after all about who goes when Aaron Judge comes back.

Clip of Sydney Sweeney’s expert shooting range run goes viral after Republican registration revealed, American Eagle ad fiasco

Sweeney expertly takes out multiple dummy targets in just a few seconds, not so much as flinching while the force of each gunshot rippled up her arms.

Cubs Predicted to Dump Ex-Yankees Role Player Before End of Season

The Chicago Cubs are trying to reclaim the National League Central and made moves at the deadline to help them do that. More moves could be coming.

Carlos Correa inadvertently pushed teammate Griffin Jax to request trade during Twins wild deadline

The Twins traded nearly half of their MLB roster at last week’s trade deadline, but if it weren’t for the Carlos Correa deal, pitcher Griffin Jax likely would have stayed put.

Oil baron CEO claims he was subject of ‘sham’ swinger rumors so rival could steal his job: lawsuit

Duginski had several “revealing” conversations at the Glenmoor Country Club in Cherry Hills Village, where he learned that Ciotti had allegedly spread rumors that he and his wife were swingers,

Pirates Reportedly Begin Initial Paul Skenes Contract Extension Talks

The Pittsburgh Pirates have reportedly started contract extension talks with Paul Skenes as the season goes on, but they know it will take some work.

Don’t whine about federal budget cuts, lefties — put your money where your mouths are

President Donald Trump has given political liberals a chance to take a stand for NPR, PBS and all the causes they claim to care about.