Is AI Quietly Taking Your Stuff While You’re Not Looking?


So here’s the scoop: Cloudflare, the internet’s security guard, says the AI company Perplexity has been tiptoeing into websites that clearly hung up a “No Trespassing” sign. And by “No Trespassing,” we mean robots.txt, the internet’s polite way of telling bots, “Hey buddy, don’t touch this.” According to Cloudflare, Perplexity didn’t just ignore the sign; they allegedly showed up in disguise pretending to be Google Chrome, switching IP addresses like a spy swapping license plates, and sneaking in through back doors to grab content.
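If you have never looked at one, a robots.txt file is just a short text file of rules, and a polite crawler checks it before fetching anything. Here is a minimal sketch of that check using Python's standard library. The rules, the example.com URLs, and the "PerplexityBot" name are illustrative assumptions for this example, not taken from Cloudflare's report or any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site (illustrative only).
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the robots.txt rules permit this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules from text, no network call
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A crawler that identifies itself honestly is told to stay out entirely.
    print(is_allowed("PerplexityBot", "https://example.com/blog/post"))
    # Any other bot may read the blog but not the /private/ area.
    print(is_allowed("SomeOtherBot", "https://example.com/blog/post"))
    print(is_allowed("SomeOtherBot", "https://example.com/private/wiki"))
    # A browser-style user agent does not match the bot-specific rule.
    chrome_like = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36"
    print(is_allowed(chrome_like, "https://example.com/blog/post"))
```

Notice what the sketch does not do: nothing here forces a crawler to run this check or to tell the truth about who it is. The rules only work when the bot announces its real name and chooses to obey, which is why a crawler disguised as a regular browser is such a big deal.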

Cloudflare even set up test websites with unpublished pages and strict no‑crawl rules, just to see what would happen. Surprise! According to Cloudflare, Perplexity’s systems still managed to find and summarize what was on those secret pages. Cloudflare wasn’t impressed. They yanked Perplexity off their verified bot list and rolled out new anti‑sneak tools. Perplexity’s response? Basically: “Wasn’t us, you’re overreacting.”

Maybe you’re thinking, “Okay… and why do I care?” Here’s why: if AI companies can secretly grab data from sites that say “no,” your blog posts, business content, research, product prices, maybe even that company wiki you forgot was public, could end up training someone else’s AI. Without asking you. Without credit. Without permission. That matters because AI is being woven into almost everything we use, from search engines to hiring tools to customer service bots. If those tools are trained on stolen, biased, or incomplete data, it can mess with the accuracy and fairness of the very systems you rely on.

Perplexity’s game here is to gobble up as much fresh, high‑quality data as possible so their AI can outthink the competition. And their competition is no small fry: ChatGPT, Gemini, and Copilot are all fighting to be the AI you talk to every day instead of Google search. In this arms race, whoever gets the most and best data wins. And if that means pushing the limits of what’s “allowed,” well… you can see why this isn’t just nerd drama.

Why should you care whether you’re a CEO, a manager, a solo freelancer, or just scrolling while waiting for your coffee? Because if your work, your company’s work, or even your personal info can be scooped up without your say‑so, you lose control over how it’s used. That could mean competitors get smarter at your expense, customers see AI‑summarized versions of your stuff without ever visiting your site, or whole industries shift because AI can instantly regurgitate what used to take weeks of human effort.

Here’s what you should be asking yourself: Who’s using my content right now, and do I even know? If AI can learn from my data without my permission, what’s stopping it from learning the wrong lessons? What’s my plan if the internet becomes more AI‑summarized blurbs than original sources?

This isn’t just about Perplexity vs Cloudflare. It’s about who gets to control the future of information and whether the stuff you create, post, or share will be yours to control or just fuel for someone else’s machine.

How do you feel about AI crawling the web without asking first? Drop your thoughts. If you’ve ever had your content used in a way you didn’t expect, we want to hear your story.

- Matt Masinga


*Disclaimer: The content in this newsletter is for informational purposes only. We do not provide medical, legal, investment, or professional advice. While we do our best to ensure accuracy, some details may evolve over time or be based on third-party sources. Always do your own research and consult professionals before making decisions based on what you read here.