Re: bad bot behavior

Subject : Re: bad bot behavior
From : anthk (at) *nospam* openbsd.home (anthk)
Newsgroups : comp.misc
Date : 12 May 2025, 07:24:45
Organization : A noiseless patient Spider
Message-ID : <slrn101ue1g.198p.anthk@openbsd.home.localhost>
References : 1 2 3
User-Agent : slrn/1.0.3 (OpenBSD)
On 2025-03-18, Toaster <toaster@dne3.net> wrote:
> On Tue, 18 Mar 2025 12:00:07 -0500
> D Finnigan <dog_cow@macgui.com> wrote:
>
>> On 3/18/25 10:17 AM, Ben Collver wrote:
>>> Please stop externalizing your costs directly into my face
>>> ==========================================================
>>> March 17, 2025 on Drew DeVault's blog
>>>
>>> Over the past few months, instead of working on our priorities at
>>> SourceHut, I have spent anywhere from 20-100% of my time in any
>>> given week mitigating hyper-aggressive LLM crawlers at scale.
>>
>> This is happening at my little web site, and if you have a web site,
>> it's happening to you too. Don't be a victim.
>>
>> Actually, I've been wondering where they're storing all this data;
>> and how much duplicate data is stored from separate parties all
>> scraping the web simultaneously, but independently.
>
> But what can be done to mitigate this issue? Crawlers and bots ruin the
> internet.
>

Gzip bombs + fake links = profit. Remember that gzip-compressed web
pages are standard HTTP (Content-Encoding: gzip); even lynx can handle
.gz files natively.
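
A minimal sketch of the gzip part, assuming Python. It only builds a
highly compressible payload in memory; the function name
make_gzip_payload and the sizes are hypothetical, and serving the result
with a Content-Encoding: gzip header from a trap URL is left to whatever
web server is in use.

```python
import gzip
import io


def make_gzip_payload(uncompressed_size: int, chunk_size: int = 1 << 20) -> bytes:
    """Return gzip data that expands to `uncompressed_size` zero bytes.

    Hypothetical helper for illustration only. Long runs of zeros
    compress extremely well, so the result is tiny compared with what a
    client has to inflate.
    """
    buf = io.BytesIO()
    zeros = bytes(chunk_size)  # one reusable block of zero bytes
    with gzip.GzipFile(fileobj=buf, mode="wb", compresslevel=9, mtime=0) as gz:
        remaining = uncompressed_size
        while remaining > 0:
            n = min(remaining, chunk_size)
            gz.write(zeros[:n])
            remaining -= n
    return buf.getvalue()


if __name__ == "__main__":
    size = 10 * 1024 * 1024  # 10 MiB once decompressed (arbitrary choice)
    payload = make_gzip_payload(size)
    print(f"{len(payload)} compressed bytes expand to {size} bytes")
```

The payload is written in chunks so that the uncompressed data never has
to sit in memory all at once on the server side.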

Also, MegaHAL/Hailo under Perl. Feed it nonsense, then fill a
robots.txt-disallowed directory, one that no visible link points to,
with Markov-chain-generated nonsense and gzip bombs.
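
A rough sketch of the same idea in Python, as a stand-in for
MegaHAL/Hailo rather than their actual API. The /trap/ directory, the
robots.txt content and the function names are all assumptions made for
illustration. Well-behaved crawlers that honour robots.txt never see the
generated text; only the ones that ignore it do.

```python
import random
from collections import defaultdict
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: compliant crawlers are told to skip /trap/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""


def allowed_for_polite_bots(path: str) -> bool:
    """Check what a robots.txt-respecting crawler would do with `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch("*", path)


def build_chain(text: str, order: int = 1) -> dict:
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return dict(chain)


def babble(chain: dict, length: int, rng: random.Random) -> str:
    """Random-walk the chain and return `length` words of nonsense."""
    states = sorted(chain)
    order = len(states[0])
    out = list(rng.choice(states))
    while len(out) < length:
        followers = chain.get(tuple(out[-order:]))
        if followers:
            out.append(rng.choice(followers))
        else:
            # Dead end: jump to a random known state and keep going.
            out.extend(rng.choice(states))
    return " ".join(out[:length])


if __name__ == "__main__":
    corpus = "crawlers and bots ruin the internet and bots ruin the web"
    chain = build_chain(corpus)
    print(allowed_for_polite_bots("/trap/page1.html"))
    print(babble(chain, 30, random.Random()))
```

Any text corpus works as input; a larger order gives output that looks
more like the source, a smaller one gives purer nonsense.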



Date      Subject                         #  Author
18 Mar 25 * bad bot behavior             18  Ben Collver
18 Mar 25 `* Re: bad bot behavior        17  D Finnigan
18 Mar 25  +- Re: bad bot behavior        1  Computer Nerd Kev
18 Mar 25  `* Re: bad bot behavior       15  Toaster
19 Mar 25   +* Re: bad bot behavior      13  Ian
19 Mar 25   i+* Re: bad bot behavior      2  Rich
23 Mar 25   ii`- Re: bad bot behavior     1  candycanearter07
20 Mar 25   i+* Re: bad bot behavior      5  Lawrence D'Oliveiro
20 Mar 25   ii`* Re: bad bot behavior     4  Ian
21 Mar 25   ii +- Re: bad bot behavior    1  Toaster
21 Mar 25   ii `* Re: bad bot behavior    2  Lawrence D'Oliveiro
21 Mar 25   ii  `- Re: bad bot behavior   1  Ian
23 Mar 25   i+* Re: bad bot behavior      4  candycanearter07
26 Mar 25   ii`* Re: bad bot behavior     3  D Finnigan
26 Mar 25   ii +- Re: bad bot behavior    1  candycanearter07
26 Mar 25   ii `- Re: bad bot behavior    1  Computer Nerd Kev
12 May 25   i`- Re: bad bot behavior      1  anthk
12 May 25   `- Re: bad bot behavior       1  anthk
