Subject: Re: bad bot behavior
From: not (at) *nospam* telling.you.invalid (Computer Nerd Kev)
Newsgroups: comp.misc
Date: 18 Mar 2025, 23:19:22
Organization: Ausics - https://newsgroups.ausics.net
Message-ID: <67d9f16a@news.ausics.net>
References: 1 2
User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (Linux/2.4.31 (i586))

D Finnigan <dog_cow@macgui.com> wrote:
> On 3/18/25 10:17 AM, Ben Collver wrote:
>> Please stop externalizing your costs directly into my face
>> ==========================================================
>> March 17, 2025 on Drew DeVault's blog
>>
>> Over the past few months, instead of working on our priorities at
>> SourceHut, I have spent anywhere from 20-100% of my time in any given
>> week mitigating hyper-aggressive LLM crawlers at scale.
>
> This is happening at my little web site, and if you have a web site,
> it's happening to you too. Don't be a victim.

Meh, my little Web site runs so light that even when Amazon's bot
got stuck in a recursive loop grabbing the same dynamic page tens of
times a second from different IPs, the server load was near nil as
usual. The main problem that caused was access logs of hundreds of
megabytes per day. Amazon is still scraping the hell out of
everything I put online (even a mirror that's tens of GBs), and
other bots squeeze into the logs too, maybe even a few humans view
things sometimes? I don't care, they're welcome to it, and they
helped me find the bug in the Apache configuration which allowed
that recursive loop (though I still don't get why bots started
forming such URLs in the first place).
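
For anyone wanting to spot that kind of loop in their own logs, here is a
minimal sketch, assuming the server writes Apache's Common or Combined Log
Format. It counts hits and distinct client IPs per requested path, so a page
being fetched over and over from many addresses floats to the top. The
function names and the log file name are made up for illustration; this is
not the configuration or tooling used on the site described above.

```python
import re
from collections import defaultdict

# Assumed layout (Common Log Format prefix):
#   host ident user [time] "METHOD path PROTO" status size
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:\S+) (\S+)[^"]*" \d{3} \S+'
)

def summarize(lines):
    """Return {path: (hit_count, distinct_ip_count)} for parseable lines."""
    hits = defaultdict(int)
    ips = defaultdict(set)
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip anything that doesn't look like a CLF entry
        ip, path = m.groups()
        hits[path] += 1
        ips[path].add(ip)
    return {p: (hits[p], len(ips[p])) for p in hits}

def busiest(summary, top=10):
    """Paths sorted by hit count, highest first."""
    return sorted(summary.items(), key=lambda kv: kv[1][0], reverse=True)[:top]

# Hypothetical usage ("access.log" is a placeholder file name):
#   with open("access.log", errors="replace") as f:
#       for path, (count, n_ips) in busiest(summarize(f)):
#           print(count, n_ips, path)
```

A path with a very high hit count spread over many distinct IPs is the
pattern described above; a high count from a single IP points at something
else.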

-- 
__          __
#_ < |\| |< _#