Based on this graph, and this graph alone, guess at what time I completely blocked OpenAI crawlers (image, jlai.lu)
hylobates@jlai.lu to Selfhosted@lemmy.world · English · 1 month ago · 76 comments
AHemlocksLie@lemmy.zip · 1 month ago
Pretty sure I’ve repeatedly heard about the crawlers completely ignoring robots.txt, so does Cloudflare really do that much?
Sv443@sh.itjust.works · 1 month ago
Like a lock on a door, it stops the vast majority but can’t do shit about the actual professional bad guys
tomjuggler@lemmy.world · 1 month ago
Yes, Cloudflare blocks agents completely if they ignore its restrictions. The key is scale: Cloudflare has a bird’s-eye view of traffic patterns across millions of sites and can do statistical analysis to determine who is a bot. I hate the necessity, but it works.
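For anyone wondering why robots.txt alone doesn't settle it: the file only tells a crawler what it is asked not to fetch, and the crawler decides whether to honor that. Here is a rough sketch of that check using Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper, and the example.org URL are made up for illustration, and `GPTBot` is assumed to be the user agent token in question; none of this is the original poster's actual setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask GPTBot to stay out, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return what a *well-behaved* crawler would conclude from robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.org/post/1"  # placeholder URL
    print("GPTBot allowed:", is_allowed("GPTBot", url))
    print("Generic browser allowed:", is_allowed("Mozilla/5.0", url))
```

Nothing here runs on the server side. A crawler that never performs this check, or lies about its user agent, gets through anyway, which is why people end up enforcing the block at a proxy or firewall instead.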