Now that we know AI bots will ignore robots.txt and churn residential IP addresses to scrape websites, does anyone know of a method to block them that doesn’t entail handing over your website to Cloudflare?
I run !news_summary@lemmy.dbzer0.com, and bypassing Cloudflare, paywalls, anti-bot filters, etc. is way easier than anyone thinks.
There is no escape from web scrapers. The best you can do is poison your images and obfuscate the page source.
In that case I’m interested in tools to automate doing that.
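For the "obfuscate the page source" part, here is a minimal sketch of one possible approach, not a tool anyone in this thread has named: encode the visible text as numeric HTML entities before serving it. The function name `obfuscate_text` is made up for illustration. Browsers render the result normally, but a scraper that greps the raw source for plain words won't find them. Any scraper that runs a real HTML parser will decode it instantly, so treat this as a speed bump at best.

```python
import html


def obfuscate_text(text: str) -> str:
    """Encode every character as a decimal HTML entity (hypothetical helper)."""
    return "".join(f"&#{ord(ch)};" for ch in text)


if __name__ == "__main__":
    encoded = obfuscate_text("Hello")
    print(encoded)
    # Round-trip check: a standards-compliant decoder recovers the original text,
    # which is also why this only slows down naive, regex-based scrapers.
    assert html.unescape(encoded) == "Hello"
```

You would run something like this over the text nodes in a build step or template filter, leaving the tags themselves alone so the markup stays valid.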