Subject : Re: "Red" And The DoD Language Competition
From : randy (at) *nospam* rrsoftware.com (Randy Brukardt)
Newsgroups : comp.lang.ada
Date : 14 Sep 2024, 08:27:22
Organisation : A noiseless patient Spider
Message-ID : <vc3acd$1b4vk$1@dont-email.me>
References : 1 2 3 4 5
User-Agent : Microsoft Outlook Express 6.00.2900.5931
"Lawrence D'Oliveiro" <
ldo@nz.invalid> wrote in message
news:vc091i$ljiq$2@dont-email.me...
> On Thu, 12 Sep 2024 19:16:50 -0700, Paul Rubin wrote:
>
>> I run into sites all the time that block the wget user agent, but that I
>> can retrieve with curl.
>
> And I run into sites all the time that block the default wget user agent,
> but that I can retrieve with wget.
You're confused. The attackers aren't using Wget, but they are *claiming* to
be Wget. As you point out, real Wget users tend to claim to be other things.
So blocking the Wget user agent is more likely to block the attackers than
real users. (As you state, real users know how to get around the blocks, so
the inconvenience for them is minor. The attackers usually don't change their
attacks often, since there are plenty of sites that don't protect themselves
at all. So the blocks are more effective against attackers.)
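To make the user-agent point concrete, here is a minimal sketch in Python of the kind of check being discussed. It assumes the default Wget user agent has the form "Wget/<version>"; the function name claims_default_wget is made up for illustration, and this is not code from any actual server.

```python
def claims_default_wget(user_agent: str) -> bool:
    """Return True if the User-Agent header looks like Wget's default.

    Assumption: the default value starts with "Wget/". A client that
    overrides its user agent (as real users tend to do) will not match.
    """
    return user_agent.strip().lower().startswith("wget/")
```

A request that merely claims to be Wget is caught by this check, while a real Wget user who has set a different user agent passes straight through, which is the asymmetry described above.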
And anyone who thinks that ad revenue is important is probably blocking all
grabbers, and probably throttling everything else so that grabbing multiple
pages is very slow (at human reading speeds). (At least 90% of the browser
hits I see are obviously fake, and if I cared enough I would block all of
them; it would just take a bit of programming to check whether the behavior
is similar to that of a live human. But I only block when something is
causing performance problems, and generally by IP.)
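A rough sketch in Python of throttling by IP so that grabbing many pages is held to something like human reading speed. The class name, the limits (3 pages per 60 seconds), and the sliding-window approach are all assumptions for illustration, not what any particular site does.

```python
import time
from collections import defaultdict, deque


class ReadingSpeedThrottle:
    """Allow at most max_pages page requests per IP in any window_seconds span."""

    def __init__(self, max_pages=3, window_seconds=60.0, clock=time.monotonic):
        self.max_pages = max_pages
        self.window = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self.hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip: str) -> bool:
        now = self.clock()
        recent = self.hits[ip]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_pages:
            return False  # too fast for a human reader: refuse or delay
        recent.append(now)
        return True
```

A server would call allow() with the client address for each page request and refuse or delay the request when it returns False. Checking whether behavior resembles a live human would need more than this; the sketch only limits the request rate.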
Randy.