Internet security firm Cloudflare has announced a novel solution for its users keen on stifling the activity of artificial intelligence (AI) bots and data scrapers on their websites to “preserve a safe internet.”
The solution, described as an “easy button” designed to prevent bot access to web content, comes at a time when creators are accusing AI companies of failing to seek their express permission before using proprietary data to train their AI models.
Cloudflare said the new feature will be available to all tiers of users, including those on the free plan. To activate the feature, users simply need to navigate to the Security section on their Cloudflare dashboard and switch the toggle on.
“We hear clearly that customers don’t want AI bots visiting their websites, and especially those that do so dishonestly,” read the announcement. “To help, we’ve added a brand new one-click to block all AI bots.”
In 2023, Cloudflare debuted a feature for customers to block AI bots that do play by the rules, such as those seeking consent before using licensed data to train their models. Cloudflare noted in its post that even though those bots behave well, an overwhelming majority of users still opted to block them from accessing their websites.
Cloudflare’s data reveals that, rather than play by the rules, a growing number of bots still attempt to bypass guardrails designed to keep them out. The bots typically crawl websites under a false user agent to mislead security measures, but Cloudflare says a year’s worth of monitoring has yielded significant insights.
“Sadly, we’ve observed bot operators attempt to appear as though they are a real browser by using a spoofed user agent,” read the post. “We’ve monitored this activity over time, and we’re proud to say that our global machine learning model has always recognized this activity as a bot, even when operators lie about their user agent.”
The firm says it is able to identify AI bots masquerading as real web browsers through several methods, including a bot scoring system. Furthermore, Cloudflare says attempts to crawl websites at scale leave several glaring fingerprints that are easily identifiable, given that the firm’s network sees 57 million requests per second.
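To see why a spoofed user agent defeats simple filtering, consider a minimal sketch of a check that relies only on what a client declares about itself. This is not Cloudflare's method; the function name, the token list (limited to the two crawlers named in this article) and the sample header strings are assumptions made for illustration.

```python
# Illustrative only: a naive filter that trusts the User-Agent header.
# The token list is hypothetical and covers just the crawlers named in the article.
AI_BOT_TOKENS = ("bytespider", "gptbot")


def declared_ai_bot(user_agent: str) -> bool:
    """Return True if the User-Agent header declares a listed AI crawler."""
    ua = user_agent.lower()
    return any(token in ua for token in AI_BOT_TOKENS)


# A crawler that identifies itself honestly (sample string, assumed format).
honest = "Mozilla/5.0 (compatible; GPTBot/1.0)"

# The same crawler pretending to be a desktop browser (sample string).
spoofed = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
)

print(declared_ai_bot(honest))   # True: the declared crawler is caught
print(declared_ai_bot(spoofed))  # False: the spoofed header slips through
```

Because the second request passes a check like this one, detection has to lean on signals other than the declared user agent, which is the role Cloudflare attributes to its bot scoring and machine learning model.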
Mainstream bots get all the attention
Cloudflare disclosed that users largely block bot activity from mainstream AI developers, while lesser-known companies operate in the shadows. The analysis indicated that Bytespider, the bot from ByteDance, led other bots in activity, crawling a staggering 40.40% of tracked websites.
Bytespider outpaced OpenAI’s GPTBot and other crawlers from Meta (NASDAQ: META), Anthropic and Google (NASDAQ: GOOGL) by a country mile. Cloudflare, for its part, has pledged to crack down on attempts to bypass guardrails using a combination of emerging technologies.
“We fear that some AI companies intent on circumventing rules to access content will persistently adapt to evade bot detection,” said Cloudflare. “We will continue to keep watch and add more bot blocks to our AI scrapers and crawlers rule and evolve our machine learning models.”
For artificial intelligence (AI) to work within the law and thrive in the face of growing challenges, it needs to integrate an enterprise blockchain system that ensures data input quality and ownership, allowing it to keep data safe while also guaranteeing the immutability of data. Check out CoinGeek’s coverage on this emerging tech to learn more about why enterprise blockchain will be the backbone of AI.