Re: Archive Any And All Text Usenet

Subject: Re: Archive Any And All Text Usenet
From: ross.a.finlayson (at) *nospam* gmail.com (Ross Finlayson)
Newsgroups: news.admin.peering news.software.nntp
Date: 10 Mar 2024, 18:42:51
Message-ID: <MtCcnRqpcPiGbHD4nZ2dnZfqn_idnZ2d@giganews.com>
References: 1
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0
On 03/09/2024 10:01 AM, Ross Finlayson wrote:
> Hello.  I'd like to start with saying thanks
> to Usenet administrators and originators,
> Usenet has a lot of perceived value as a cultural
> artifact, and also a great experiment in free
> speech, association, and press.
>
> ...
>
> The idea is that each "message", "post", has an ID,
> then as far as that's good, that each group
> in the hierarchy has a name, and that, each
> message has a date.  Then, the idea is to
> make an LFF, that makes a folder for a group,
> for a date, each its messages.
>
> a.b.c/YYYY/MMDD/HHMM/
>
>
> ...
>
> If you can point me to similar interests or efforts
> with regards to digital preservation, I'd be
> interested your comments or details here.
>
>
>
Hello, I've studied this for a while.  Over on
sci.math, I've been tapping away on a thread
called "Meta:  a usenet server just for sci.math".
There the context and the surrounds are sort of
detailed: the specs, the usual program models,
and models of the data.
What I hope to figure out is this "LFF", or
"Library Filesystem Format", convention, which
results in "a sort of complete collection of
a group's dates' posts, that is under 2GB
and fits on all file-systems if it's
less than a few levels deep from the root of the
volume".
So, the idea is specifically how to pack away
posts, not so much how to access them at the
runtime, though it's also then quite directly
about how to implement Usenet protocols.
The sort of idea is, like, "either Windows or
Linux, FAT/NTFS or ext2/3/..., character sets
and encodings in the names of the groups and
the message ID's and file-names in the file-systems,
partitioned by group and date, all the groups' date's
posts".
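
As a minimal sketch of the file-name side of this, assuming percent-encoding is an acceptable convention (the post does not fix one), a message ID can be mapped to a name that avoids the characters FAT/NTFS reject, such as `<`, `>`, `:`, `"`, `/`, `\`, `|`, `?` and `*`. The function names `portable_name` and `original_id` are hypothetical.

```python
from urllib.parse import quote, unquote


def portable_name(message_id: str) -> str:
    """Percent-encode a message ID into a file name usable on FAT/NTFS and ext2/3/4.

    Assumption: percent-encoding of the UTF-8 bytes is the chosen convention.
    Letters, digits, "_", ".", "-", "~" and "@" are kept; everything else
    becomes %XX.
    """
    return quote(message_id, safe="@")


def original_id(name: str) -> str:
    """Recover the message ID from a name produced by portable_name()."""
    return unquote(name)


if __name__ == "__main__":
    mid = "<MtCcnRqpcPiGbHD4nZ2dnZfqn_idnZ2d@giganews.com>"
    print(portable_name(mid))
```

This does not address two other portability concerns: FAT and NTFS are normally case-insensitive while message IDs are not, and long IDs grow when encoded, so a length limit on file names may still apply.
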
One idea is that "a directory can't have more than
32k sub-directories, and should have quite a few less;
a directory might store up to 4 billion files,
and should store far fewer; and the directory depth
should be less than 7 deep".
Then the idea, after the a.b.c/YYYY/MMDD/HHMM,
is to store message IDs by taking an MD5 hash
of the message ID, splitting that into four,
then putting message IDs under H1/H2/H3/H4/MessageId/,
and then deciding whether to have a directory or a file
for the message ID.  The usual idea is a file,
because it's just the contents of the actual Internet
Message, but there's an idea that it's various
files, or a directory for them.
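
The layout just described can be sketched as follows. This is only an illustration under stated assumptions: the 32 hex digits of the MD5 are split into four equal 8-digit pieces (the post only says "splitting that into four"), the leaf is the message ID itself, and `lff_path` is a hypothetical name.

```python
import hashlib
from datetime import datetime
from pathlib import PurePosixPath


def lff_path(group: str, when: datetime, message_id: str) -> PurePosixPath:
    """Build a.b.c/YYYY/MMDD/HHMM/H1/H2/H3/H4/MessageId for one post."""
    # MD5 of the message ID, as 32 hexadecimal characters.
    digest = hashlib.md5(message_id.encode("utf-8")).hexdigest()
    # Assumption: four equal pieces of 8 hex characters each.
    h1, h2, h3, h4 = (digest[i:i + 8] for i in range(0, 32, 8))
    # Assumption: message_id contains no "/" (otherwise it needs escaping first).
    return PurePosixPath(
        group,
        f"{when:%Y}", f"{when:%m%d}", f"{when:%H%M}",
        h1, h2, h3, h4,
        message_id,
    )


if __name__ == "__main__":
    print(lff_path("news.software.nntp",
                   datetime(2024, 3, 10, 18, 42),
                   "<MtCcnRqpcPiGbHD4nZ2dnZfqn_idnZ2d@giganews.com>"))
```

Whether the leaf is a file or a directory is left open here, as it is in the text; the path is the same either way.
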
Then the issue seems to be that this gets at least 8 deep,
vis-a-vis the aim that it doesn't have too many sub-directories
or too many files or not-in-range characters, while
it still partitions each group's dates' posts and
stores each group's dates' posts.
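
The depth count can be checked with a few lines. This is a sketch only: the two limits are the figures quoted earlier in this post, and the 8-hex-digit width of each hash level is an assumption (a 32-digit MD5 split evenly into four).

```python
from pathlib import PurePosixPath

MAX_SUBDIRS = 32_000  # the "32k sub-directories" ceiling quoted in the post
MAX_DEPTH = 7         # the "less than 7 deep" guideline quoted in the post

layout = PurePosixPath("a.b.c/YYYY/MMDD/HHMM/H1/H2/H3/H4/MessageId")

# Directory levels above the message, when the message is stored as a file.
depth = len(layout.parts) - 1

# Assumption: each of H1..H4 is 8 hex digits, so this many names are possible.
names_per_hash_level = 16 ** 8

print(depth, depth < MAX_DEPTH)            # 8 False
print(names_per_hash_level > MAX_SUBDIRS)  # True
```

So the layout as written exceeds the depth guideline by at least two levels. The name space of a hash level is far larger than 32k, but the number of entries actually created under one HHMM folder is bounded by the number of posts to that group in that minute.
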
Portable filesystem conventions seem the easiest way
to encourage fungible data this way, then whether
or however it's a tape-archive or zip file,
they can all just get unpacked together and result
in a directory with groups' dates' posts all together,
then make, for example, a maildir representation of
that, like with symlinks or whatever works on the
destination.
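
As a rough sketch of that last step, assuming the archives are already unpacked under one root and that every regular file under it is one message, the tree can be exposed as a maildir by linking each file into `new/`. The function name `link_into_maildir` and the `NNNNNNNN.lff` entry names are hypothetical (real maildir deliveries usually build unique names from time, process ID and host), and the maildir is assumed to be fresh and outside the unpacked tree.

```python
import os
import shutil
from pathlib import Path


def link_into_maildir(lff_root: Path, maildir: Path) -> int:
    """Expose every message file under lff_root as an entry in maildir/new."""
    # A maildir is a directory with tmp/, new/ and cur/ sub-directories.
    for sub in ("tmp", "new", "cur"):
        (maildir / sub).mkdir(parents=True, exist_ok=True)

    count = 0
    for dirpath, dirnames, filenames in os.walk(lff_root):
        dirnames.sort()  # visit groups and dates in a stable order
        for name in sorted(filenames):
            source = (Path(dirpath) / name).resolve()
            # Hypothetical unique name for the maildir entry.
            target = maildir / "new" / f"{count:08d}.lff"
            try:
                os.symlink(source, target)
            except (OSError, NotImplementedError):
                # "Whatever works on the destination": copy if symlinks fail.
                shutil.copy2(source, target)
            count += 1
    return count
```

Calling `link_into_maildir(Path("unpacked"), Path("Maildir"))` returns the number of messages linked; the unpacked tree itself is left untouched.
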
So anyways mostly the context behind this is
in "Meta:  a usenet server just for sci.math"
over on sci.math, I think about it a lot because
I really think Usenet is a special thing.
"AAAATU:  Archive Any And All Text Usenet"
