Subject: Re: terminal only for two weeks
From: mds (at) *nospam* bogus.nodomain.nowhere (Mike Spencer)
Newsgroups: comp.misc
Date: 26 Nov 2024, 22:57:53
Organization: Bridgewater Institute for Advanced Study - Blacksmith Shop
Message-ID: <87jzcp4pzy.fsf@enoch.nodomain.nowhere>
User-Agent: Gnus v5.7/Emacs 20.7
D <nospam@example.net> writes:
> On Tue, 26 Nov 2024, yeti wrote:
>
>> <https://www.brow.sh/>
>
> Ah yes... I've seen this before! I did drop it due to its dependency on
> FF, but the concept is similar. My idea was to aggressively filter a web
> page before passing it on to elinks or similar.
>
> Perhaps rewriting it a bit in order to avoid the looooooong list of menu
> options or links that always come up at the top of the page, before the
> content of the page shows after a couple of page downs (this happens for
> instance if I go to wikipedia).
>
> Instead parsing it, and adding those links at the bottom, removing
> javascript, and perhaps passing on only the text. Well, those are only
> ideas. Maybe I'll try, maybe I won't. Time will tell! =)
I've done this for a few individual sites that I visit frequently.
+ A link to that site resides on my browser's "home" page.
+ That home page is a file in ~/html/ on localhost.
+ The link is actually to a target-specific cgi-bin Perl script on
localhost where Apache is running, restricted to requests from
localhost.
+ The script takes the URL sent from the home page, rewrites it for
the routable net, sends it to the target using wget and reads all
of the returned data into a variable.
+ Using Perl's regular expressions, stuff identified (at time of
  writing the script) as unwanted is elided -- js, style, svg,
  noscript etc. URLs self-referencing the target are rewritten to
  be sent back through the cgi-bin script.
+ Other tweaks peculiar to the specific target...
+ Result is handed back to the browser preceded by minimal HTTP
  headers. (A rough sketch along those lines appears below.)
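
A minimal sketch of such a script might look like the following. The
target host (www.example.org), the script name (/cgi-bin/filter.pl)
and the "path" query parameter are assumed placeholders, not the real
setup, and the regexes only stand in for the target-specific ones:

    #!/usr/bin/perl
    # Minimal sketch of a target-specific filter script of this kind.
    # Target host, script name and "path" parameter are placeholders.
    use strict;
    use warnings;

    my $target = 'www.example.org';        # assumed target site
    my $self   = '/cgi-bin/filter.pl';     # assumed name of this script

    # Take the path sent from the link on the local "home" page,
    # e.g. <a href="/cgi-bin/filter.pl?path=/wiki/Main_Page">...</a>
    my ($path) = ($ENV{QUERY_STRING} || '') =~ /path=([^&;]*)/;
    $path = '/' unless defined $path && length $path;
    $path =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;   # undo URL-encoding
    $path =~ s/[^\w\/.\-~?=&%]//g;                # crude sanitising

    # Fetch the page from the routable net with wget and slurp all of
    # the returned data into one variable.
    my $url  = "https://$target$path";
    my $page = do {
        local $/;                                 # slurp mode
        open my $fh, '-|', 'wget', '-q', '-O', '-', $url
            or die "cannot run wget: $!";
        <$fh>;
    };

    # Elide the stuff identified as unwanted: js, style, svg,
    # noscript, inline event handlers.
    $page =~ s{<script\b.*?</script>}{}gis;
    $page =~ s{<style\b.*?</style>}{}gis;
    $page =~ s{<svg\b.*?</svg>}{}gis;
    $page =~ s{<noscript\b.*?</noscript>}{}gis;
    $page =~ s{\son\w+="[^"]*"}{}gi;

    # Rewrite URLs self-referencing the target so they come back
    # through this script: site-relative links first, then absolute
    # links pointing at the target.
    $page =~ s{(href=")(/[^"]*)"}{$1$self?path=$2"}gi;
    $page =~ s{(href=")https?://\Q$target\E(/[^"]*)"}{$1$self?path=$2"}gi;

    # Hand the result back to the browser behind minimal HTTP headers.
    print "Content-Type: text/html; charset=utf-8\r\n\r\n";
    print $page;

The link on the local home page then just points at the script,
something like <a href="/cgi-bin/filter.pl?path=/">example.org,
filtered</a> (names again assumed). Keeping the target host hard-coded
in the script is what makes the target-specific tweaks easy to bolt on.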
So far, works like a charm. There's always the potential that a target
host will change their format significantly. That has happened a couple
of times, requiring fetching an unadorned copy of the target's page,
some tedious reading/parsing, and edits to the script.
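
One conceivable way to get that unadorned copy, staying with the
assumptions in the sketch above, is a pass-through switch in the script
itself, dropped in right after the wget fetch and before any of the
substitutions (the "raw" parameter is hypothetical):

    # Hypothetical debugging switch: append "&raw=1" to the query
    # string to skip the filtering and hand the fetched page back
    # untouched, for reading/diffing against the current regexes.
    if (($ENV{QUERY_STRING} || '') =~ /\braw=1\b/) {
        print "Content-Type: text/html; charset=utf-8\r\n\r\n", $page;
        exit;
    }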
This obviously doesn't work for those sites that initially send a
dummy all-js page to verify that you have js enabled and send you a
condescending reproof if you don't. Other server-side dominance games
are a potential challenge or a stone wall.
Writing a generalized version, capable of dealing with pages from
random/arbitrary sites, is a notion perhaps worth pursuing but clearly
more of a challenge than site-specific scripts. RSN, round TUIT etc.
--
Mike Spencer                                  Nova Scotia, Canada