The crawling of news websites by AI-specific bots has become a contentious issue. Data journalist Ben Welsh analyzed a large sample of news websites and found that a significant share are blocking these bots: just over a quarter of the sites in his sample block Applebot-Extended, about 53 percent block OpenAI's crawler, and nearly 43 percent block Google's AI-specific bot, Google-Extended. Welsh notes that these numbers have risen steadily since he began tracking them, pointing to a growing trend of news websites shutting out AI crawlers.

News publishers appear divided over whether to block AI bots. Some organizations have struck licensing deals with bot owners, accepting payment in exchange for letting the crawlers access their sites. The approaches taken by major publishers such as The New York Times, Condé Nast, and BuzzFeed suggest that business considerations drive these decisions: Condé Nast, for example, used to block OpenAI's web crawlers but unblocked them after announcing a partnership with the company, while BuzzFeed blocks Applebot-Extended and other AI bots unless their owners have entered into a partnership agreement. The pattern underscores how central licensing partnerships have become in digital publishing.

Managing AI bots has become a challenge for news websites: new crawlers debut frequently and existing ones change, and because robots.txt must be edited by hand to block each specific bot, many publishers struggle to keep their block lists current. That difficulty has fueled services like Dark Visitors, which automatically update clients' robots.txt files; publishers, particularly those concerned about copyright, make up a significant portion of its client base. The complexity involved underscores how important it has become for news sites to understand and control which crawlers access their content.
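To make the mechanics concrete, here is a minimal sketch of the kind of robots.txt entries involved and how a crawler is expected to interpret them, using Python's standard urllib.robotparser. The user-agent tokens (GPTBot, Google-Extended, Applebot-Extended) are the publicly documented names of these crawlers, but the specific rules, the example URL, and the file contents are illustrative assumptions, not any particular publisher's configuration.

from urllib import robotparser

# Hypothetical robots.txt of the kind a publisher might serve to opt out of
# AI training crawlers while still allowing ordinary search indexing.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())

# Check whether each crawler is permitted to fetch an article URL.
# Googlebot (ordinary search) falls through to the "*" rule and stays allowed.
for bot in ("GPTBot", "Google-Extended", "Applebot-Extended", "Googlebot"):
    allowed = parser.can_fetch(bot, "https://example.com/news/story.html")
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")

Because each new bot needs its own User-agent entry, a block list like this has to grow every time a crawler debuts or is renamed, which is exactly the maintenance burden that services like Dark Visitors aim to automate.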

Decisions about which AI bots to block often rest with media executives; at some major media companies, the CEO is directly involved in choosing which crawlers to allow. Several outlets have said explicitly that they block AI scraping tools because they have no commercial agreement with the bots' owners. Vox Media, for instance, blocks Applebot-Extended across all its properties as a precautionary measure. This hands-on involvement from the top underscores how consequential bot-blocking decisions have become for protecting publisher content and intellectual property.

The standoff between news websites and AI bots reflects the evolving landscape of digital publishing. The strategic calculations, the practical difficulty of managing crawlers, and the direct role of media executives all highlight the complex interplay between technology and media in the AI age. As news websites continue to grapple with AI bots, proactive decision-making and effective bot-management strategies will be crucial to safeguarding the integrity of online content.
