Re: Archive Any And All Text Usenet

Subject : Re: Archive Any And All Text Usenet
From : ross.a.finlayson (at) *nospam* gmail.com (Ross Finlayson)
Newsgroups : news.admin.peering news.software.nntp
Date : 13. Mar 2024, 21:43:33
Message-ID : <qqydnQe1R5x_nG_4nZ2dnZfqnPSdnZ2d@giganews.com>
References : 1 2 3
User-Agent : Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0
On 03/12/2024 07:57 PM, immibis wrote:
> On 11/03/24 05:12, David Chmelik wrote:
>> On Sat, 9 Mar 2024 10:01:52 -0800, Ross Finlayson wrote:
>>>
>>> Hello.  I'd like to start with saying thanks to Usenet administrators
>>> and originators.  Usenet has a lot of perceived value as a cultural
>>> artifact, and also a great experiment in free speech, association,
>>> and press.
>>>
>>> Here I'm mostly interested in text Usenet, not binaries, as text
>>> Usenet is a great artifact and experiment in speech, association,
>>> and press.
>>>
>>> When I saw this example that may have a lot of old Usenet, it sort
>>> of aligned with an idea that started as an idea of vanity press,
>>> about an archive of a group.  Now though, I wonder how to define an
>>> "archive any and all text usenet", AAATU, filesystem convention,
>>> as a sort of "Library Filesystem Format", LFF.
>>> [...]
>>
>> Sounds good; I'm interested in a full archive of the text newsgroups
>> I use (1300+) but don't know whether free Usenet servers even go back
>> to when I started (1996, though I tried the Internet in a museum
>> before Eternal September).  I'm aware I could use commercial ones
>> that may, but don't know which, nor the cost/space.  Is Google Groups
>> the only one going back to 1981?  I hope other servers managed to
>> save that before Google disconnected from peers, or some might turn
>> up back to 1979.
>>
>> Accessing some old binary ones would be nice also, but these days
>> people use commercial servers for those, which probably didn't save
>> even back to the '90s... an archive of those (even though I'm
>> uninterested in most, other than a few relating to the history of
>> science and some types of art/graphics & music) would presumably be
>> too large except for data centres.
>
> Giganews on Reddit published the number: 20 gigabits per second of new
> data.  This is approximately one new server full of hard drives every
> few days.  If your servers are some of those dedicated to holding as
> many hard drives as possible, then one a week.
Right....  Once upon a time a major retail website ran a study: 99% of
the traffic was JPEG, and 50+% of the CPU went to compression and
encryption.  These days encryption and compression are usually a very
significant load on web servers, which are also often designed to
simply consume huge amounts of RAM.
It doesn't really have to be that way here, though, because Usenet's
Internet Messages are "static assets" of a sort once they arrive:
there are very many of them, but each text post is small, on the
order of 4 KiB of header plus body.
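As a rough worked example of that sizing, here is a small sketch in
Python; the 4 KiB per post is the figure above, while the
posts-per-day counts are purely illustrative assumptions.

# Rough sizing sketch: assumes ~4 KiB per text post (header + body).
# The posts-per-day figures are illustrative, not measurements.
KIB = 1024
AVG_POST_BYTES = 4 * KIB

def daily_bytes(posts_per_day: int) -> int:
    """Uncompressed size of one group-day of text posts, in bytes."""
    return posts_per_day * AVG_POST_BYTES

for posts in (100, 1_000, 10_000):
    mib = daily_bytes(posts) / KIB / KIB
    print(f"{posts:>6} posts/day ~ {mib:7.1f} MiB/day uncompressed")

On those assumptions, even a busy text group is only a few tens of
MiB a day before compression, which is what makes the "static asset"
view workable.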
So one way to look at the facilities of the system is DB FS MQ WS:
database, filesystem, message-queue, web-services, with regards to
nodes on hosts, in connection-oriented architectures and
message-passing systems, arranged in distributed topologies over
mostly point-to-point protocols.
Then nodes have CPU, RAM, and network and storage I/O; those are the
resources, "space" and "time".
Our model of Usenet operation is "INN", or innd, and the related
tools, protocols, and conventions: for example cleanfeed, NoCem,
Cancel or what was Cancelmoose, or otherwise the control and junk
groups, and site policies covering rejection and retention.  "INN" is
the core and its surrounds: there's an ecosystem of INN and of
derivative and innovative projects.
NNTP, IMAP, POP3, and SMTP have a very high affinity: they are all
text-based, connection-oriented protocols for the exchange of
Internet Messages, with the same protocol layers underneath, DEFLATE
compression, SASL authentication, and TLS encryption.
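For instance, here is a minimal sketch of fetching one article over
NNTP within TLS using Python's standard-library nntplib (deprecated
and removed in recent Python releases, so treat it as a sketch); the
host name and Message-ID are placeholders, and a real server may
require authentication first.

# Sketch: fetch one article by Message-ID over NNTPS (port 563).
# The host and Message-ID below are placeholders, not real values.
import nntplib

HOST = "news.example.org"          # hypothetical server
MESSAGE_ID = "<example@invalid>"   # hypothetical Message-ID

with nntplib.NNTP_SSL(HOST) as srv:        # implicit TLS
    # srv.login("user", "password")        # AUTHINFO, if required
    resp, info = srv.article(MESSAGE_ID)   # info.lines: raw article lines
    raw = b"\r\n".join(info.lines)
    print(resp, len(raw), "bytes")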
So, the idea of LFF is basically that Usenet posts, or Internet
Messages, are each distinct and unique; they cross between groups,
and into email, yet live mostly within and among groups, and, via
their References headers, thread together into threads.
So, the idea of LFF is just that the filesystem is ubiquitous for
hierarchical storage, the tools are commonplace and very well
understood, and the limits of modern (meaning, since at least 20
years ago) filesystems are reasonably well understood, with respect
to the identifiers of groups and posts, their character sets and
encodings, and the headers and bodies of the messages at rest,
organized by group and date.
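As a sketch of what such a group+date layout might look like on disk,
under assumptions of my own (the archive root, the day-level
granularity, and the Message-ID sanitization are illustrative, not a
fixed LFF definition):

# Sketch: map (group, date, Message-ID) to a path such as
#   archive/news.software.nntp/2024/03/13/<sanitized-message-id>.eml
import re
from datetime import date
from pathlib import Path

def message_path(root: Path, group: str, day: date, message_id: str) -> Path:
    # Message-IDs can contain characters unfriendly to filesystems;
    # keep a conservative subset and drop the angle brackets.
    safe_id = re.sub(r"[^A-Za-z0-9._@$-]", "_", message_id.strip("<>"))
    return root / group / f"{day:%Y/%m/%d}" / f"{safe_id}.eml"

print(message_path(Path("archive"), "news.software.nntp", date(2024, 3, 13),
                   "<qqydnQe1R5x_nG_4nZ2dnZfqnPSdnZ2d@giganews.com>"))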
Then the idea seems to be to gather, to forage, the posts into a
directory structure; then, once they are found and settled, to add a
few informative headers marking their archival as a sort of terminus
of the delivery; then to call that the archive for the group+date,
zip it up for posterity, and put it in a hierarchical filesystem or
object store, all toward the declared purpose here of "archive any
and all text usenet", of course with respect to the observance or
honoring of such directives as X-No-Archive and Cancel or Supersedes,
or otherwise what "is" or "isn't", what was.
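A minimal sketch of that step, again under assumptions of my own (the
X-Archived-At header name, the X-No-Archive check, and the
one-zip-per-group+date naming are illustrative; Cancel and Supersedes
handling would need the control feed and isn't shown):

# Sketch: add an archival header to each stored .eml, skip posts that
# asked not to be archived, and zip one group+date directory.
import zipfile
from email import policy
from email.parser import BytesParser
from pathlib import Path

def archive_group_day(day_dir: Path, group: str, day: str) -> Path:
    out = day_dir.parent / f"{group}-{day}.zip"
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for eml in sorted(day_dir.glob("*.eml")):
            with eml.open("rb") as fp:
                msg = BytesParser(policy=policy.default).parse(fp)
            # Honor the X-No-Archive convention.
            if (msg.get("X-No-Archive") or "").strip().lower() == "yes":
                continue
            msg["X-Archived-At"] = f"{group}/{day}"  # informative terminus header
            zf.writestr(eml.name, msg.as_bytes())
    return out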
So I'm really happy when I think about it: Usenet, and things like
INN and the ecosystem, and the originators of these parts of the
ecosystem, and then the administrators and the innovators, and the
_belles lettres_ of text Usenet, with the bells and whistles of
binary or larger objects only secondary, since they are not what this
is for; this is "text LFF".  (Binaries, as Internet Messages, have
quite variously structured contents, bodies, references, external
references, and body parts, and are not relevant here.)
So, especially if someone rather emeritus among the originators reads
this, your opinion and estimates are highly valued, with regard to
what you want to see for "AAATU: belles lettres", and basically for
making it so that the protocols of URLs, URIs, and URNs for Usenet
posts even yield Dublin Core records and DOIs, keyed by Message-ID.
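As one sketch of a per-post metadata record, here is an assumed
mapping of standard headers onto a few Dublin Core elements, with a
news: URL form for the identifier; the element choices are mine, not
a published profile.

# Sketch: derive Dublin Core-style metadata for one post from its headers.
from email.message import EmailMessage

def dublin_core(msg: EmailMessage, group: str) -> dict:
    mid = (msg.get("Message-ID") or "").strip().strip("<>")
    return {
        "dc:title":      msg.get("Subject", ""),
        "dc:creator":    msg.get("From", ""),
        "dc:date":       msg.get("Date", ""),
        "dc:identifier": f"news:{mid}",   # news: URL / natural URN-like key
        "dc:relation":   msg.get("References", ""),
        "dc:source":     group,
    }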
It's figured then that if posts are just data and LFF is ubiquitous,
the ecosystem can help develop the Museum Experience: an archives, a
search, tours, exhibits, browsing, and the carrels, a living museum
of Usenet, its authors, their posts, this culture.
