On 5/1/2025 4:13 AM, David Brown wrote:
> On 01/05/2025 01:56, Lawrence D'Oliveiro wrote:
>> On Wed, 30 Apr 2025 12:38:53 -0000 (UTC), Muttley wrote:
>>>
>>> Its certainly not a scheme I'd use, but I've also seen Makefile and
>>> makefile in the same package build directory in the past.
>>
>> The GNU “make” command, specified without a filename, looks for
>> “GNUmakefile”, then “Makefile”, then “makefile”. The man page
>> <https://manpages.debian.org/make(1)> says:
>>
>>     We recommend Makefile because it appears prominently near the
>>     beginning of a directory listing, right near other important files
>>     such as README.
>>
>> But is this still true for most people? I think the default sort
>> settings these days no longer put all-caps names at the top.
>
> I can't speak for "most people", but since my project directories
> rarely have more than about a dozen files and directories (like "src"
> and "build") in the top directory, it could be called zzzz and still
> be near the top!
Wandering in a bit late, but I can note how this works for my project (or rather, its makeshift-OS part):
Nominal filename format: UTF-8.
IIRC, my experimental (Unix style) filesystem could use one of several encodings:
ASCII
UTF-8
CP1252 (Latin-1 with the C1 control codes replaced by printable characters)
The merit of 1252 here is that it can take fewer bytes, and statistically it is the single-byte encoding most likely to cover any non-ASCII characters encountered (most are Latin-1). So the scheme was: use 1252 if everything fits into its character range, and use UTF-8 if it doesn't. It is possible to rely on disambiguation by never using 1252 when the bytes could be confused for UTF-8. Most of the time, 1252 (if any non-ASCII chars are used) results in sequences that are invalid as UTF-8, so no ambiguity results (if not valid UTF-8, assume 1252). The pure-ASCII case can be ignored, as ASCII is encoded identically in both schemes.
Part of the rationale is that the directory entries in this case were fixed-size (like FAT, albeit with longer names), and a compact encoding could make the difference between using a single directory entry and needing a more complex LFN-style scheme. In this case the default name field is 48 bytes, and it is rare for a filename not to fit into 48 bytes.
No other codepages were supported here (so, anything not Latin-1 or similar will need to use UTF-8 regardless).
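The "if not valid UTF-8, assume 1252" rule above amounts to a UTF-8 well-formedness check on the stored bytes. A minimal sketch (the function and enum names are invented for illustration, not from the actual project):

```c
#include <stddef.h>

/* Returns 1 if s[0..len) is well-formed UTF-8 (plain ASCII included). */
static int is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t n;                              /* continuation byte count */
        if (c < 0x80)      { i++; continue; }  /* ASCII                   */
        else if (c < 0xC2) return 0;           /* stray cont. / overlong  */
        else if (c < 0xE0) n = 1;              /* 2-byte sequence         */
        else if (c < 0xF0) n = 2;              /* 3-byte sequence         */
        else if (c < 0xF5) n = 3;              /* 4-byte sequence         */
        else               return 0;           /* beyond U+10FFFF         */
        if (i + n >= len) return 0;            /* truncated sequence      */
        for (size_t k = 1; k <= n; k++)
            if ((s[i + k] & 0xC0) != 0x80) return 0;
        /* reject overlong 3/4-byte forms and UTF-16 surrogates */
        if (c == 0xE0 && s[i + 1] < 0xA0) return 0;
        if (c == 0xED && s[i + 1] > 0x9F) return 0;
        if (c == 0xF0 && s[i + 1] < 0x90) return 0;
        if (c == 0xF4 && s[i + 1] > 0x8F) return 0;
        i += n + 1;
    }
    return 1;
}

/* Encoding assumed for a stored name under the rule above. */
enum name_enc { ENC_UTF8, ENC_CP1252 };

static enum name_enc classify_name(const unsigned char *s, size_t len)
{
    return is_valid_utf8(s, len) ? ENC_UTF8 : ENC_CP1252;
}
```

Pure-ASCII names classify as UTF-8 here, which is harmless since the bytes are identical under both encodings.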
Another semi-filesystem is in use with similar rules, except with 32 byte filenames.
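Such a fixed-size entry might look roughly like the following; every field name and the overall layout here are invented for illustration, with only the 48-byte inline name field reflecting the description above:

```c
#include <stdint.h>

/* Hypothetical fixed-size directory entry in the spirit described:
 * everything inline, FAT-like, with a 48-byte name field.  A name that
 * does not fit in 48 bytes would need an LFN-style scheme instead. */
struct dirent_fixed {
    uint64_t first_block;   /* FAT-style chain start (invented field) */
    uint32_t size;          /* file size in bytes                     */
    uint16_t mode;          /* type/permission bits                   */
    uint16_t name_enc;      /* ASCII / UTF-8 / CP1252 marker          */
    char     name[48];      /* NUL-padded name in the marked encoding */
};
```

With this layout each entry is a tidy 64 bytes, so entries pack evenly into sectors.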
FAT32, as noted, is:
8.3 names, CP1252, with bits to encode an upper- or lower-case base and extension;
LFNs, with up to 255 characters of UCS-2.
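The case bits mentioned are presumably the usual reserved-byte flags introduced by Windows NT (and honored by Linux's vfat driver): bit 3 lowercases the 8-character base, bit 4 the 3-character extension. A sketch of applying them when rendering an 8.3 name (helper name invented):

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define CASE_LOWER_BASE 0x08  /* reserved-byte bit 3: lowercase the base      */
#define CASE_LOWER_EXT  0x10  /* reserved-byte bit 4: lowercase the extension */

/* Render an on-disk 8.3 name (space-padded base[8] and ext[3]) into
 * "name.ext" form, honoring the case bits.  out must hold 13 bytes. */
static void fat_render_name(const char base[8], const char ext[3],
                            unsigned char nt_res, char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < 8 && base[i] != ' '; i++)
        out[n++] = (nt_res & CASE_LOWER_BASE)
                 ? (char)tolower((unsigned char)base[i]) : base[i];
    if (ext[0] != ' ') {
        out[n++] = '.';
        for (size_t i = 0; i < 3 && ext[i] != ' '; i++)
            out[n++] = (nt_res & CASE_LOWER_EXT)
                     ? (char)tolower((unsigned char)ext[i]) : ext[i];
    }
    out[n] = '\0';
}
```

This only covers all-upper or all-lower base/extension; genuinely mixed-case names still need an LFN entry.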
...
At higher levels, APIs generally assume normalization to UTF-8.
Though, with a few non-standard tweaks: 0080..009F are assumed to be the printable characters from 1252, and not the C1 control characters;
In console settings, the Arabic alphabet was replaced with 2-digit hex numbers (00..FF), as:
I felt a need for dense 2-digit hex numbers in the console;
I ideally needed a spot low in the mapping;
The Arabic characters don't render legibly in 8x8 pixel character cells (*1).
*1: I might reconsider if someone can make a case that this alphabet could in fact be represented in a recognizable form in 8x8 pixel character cells.
This mostly doesn't apply outside the console. For application use, the standard character assignments would be assumed.
As for collating:
Nominal order is raw unsigned bytes (based on the UTF-8 encoding);
This will put uppercase before lowercase.
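The nominal collation above needs no special machinery in C: `strcmp` is specified to compare elements as `unsigned char`, so it already gives exactly the raw-byte order of the UTF-8 encoding, with uppercase ASCII (0x41..0x5A) sorting ahead of lowercase (0x61..0x7A):

```c
#include <string.h>

/* Raw unsigned-byte ordering over the UTF-8 bytes of two names.
 * strcmp() compares as unsigned char per the C standard, so this
 * wrapper is exactly the collation described above. */
static int name_cmp(const char *a, const char *b)
{
    return strcmp(a, b);
}
```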
I debated whether, and in what style, to normalize UTF-8 strings.
Full Unicode normalization was too complicated;
Fully non-normalized encoding could also pose issues.
In this context, if it takes much over a few hundred lines of code and around 1K of tables, it was too expensive.
Normalization rules ended up being a compromise:
Only the Latin and Extended Latin combining characters are handled.
Or, roughly, Latin-1 and Latin-2.
Pretty much everything else is passed through as-is.
Precomposed characters are first decomposed, and then any base+combining pairs are recomposed. Filenames exist in the composed form, as this uses fewer bytes.
So, for example, the filesystem layer does not normalize emoji; it has no reason to know what an emoji is.
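The compromise can be sketched as a small composition table; the three entries below are purely illustrative (the real tables would cover roughly the Latin-1 and Latin-2 accented letters), and the names are invented:

```c
#include <stddef.h>
#include <stdint.h>

/* Tiny illustrative composition table.  Pairs not in the table pass
 * through unchanged, so the layer never needs to know about emoji,
 * Hangul, or anything else outside the Latin ranges. */
struct compose_entry { uint32_t base, comb, composed; };

static const struct compose_entry compose_tab[] = {
    { 'e', 0x0301, 0x00E9 },  /* e + COMBINING ACUTE ACCENT -> e-acute */
    { 'a', 0x0300, 0x00E0 },  /* a + COMBINING GRAVE ACCENT -> a-grave */
    { 'n', 0x0303, 0x00F1 },  /* n + COMBINING TILDE        -> n-tilde */
};

/* Return the composed codepoint for base+comb, or 0 if the pair is
 * unknown (the caller then emits the pair as-is). */
static uint32_t try_compose(uint32_t base, uint32_t comb)
{
    for (size_t i = 0; i < sizeof compose_tab / sizeof compose_tab[0]; i++)
        if (compose_tab[i].base == base && compose_tab[i].comb == comb)
            return compose_tab[i].composed;
    return 0;
}
```

Keeping the table this flat and small is what holds the cost under the "few hundred lines plus ~1K of tables" budget.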
There was some debate over representing non-BMP characters as UTF-8-coded surrogate pairs or as direct 4-byte UTF-8 sequences; offhand, I don't remember for certain which I chose. I think it may have been the latter, due to fewer bytes, whereas I would usually have preferred UTF-8-coded surrogate pairs in other contexts. I do vaguely remember dealing with this issue in my normalization code, though.
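The byte-count tradeoff being weighed there, assuming "UTF-8-coded surrogate pairs" means the CESU-8-style form (each UTF-16 surrogate half coded as a 3-byte UTF-8 sequence):

```c
#include <stdint.h>

/* Bytes needed to encode cp as a direct UTF-8 sequence. */
static int utf8_len(uint32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* Bytes needed if non-BMP codepoints instead go through a UTF-16
 * surrogate pair, each half coded as 3-byte UTF-8 (CESU-8 style). */
static int surrogate_utf8_len(uint32_t cp)
{
    return (cp < 0x10000) ? utf8_len(cp) : 6;  /* two 3-byte halves */
}
```

So a non-BMP character costs 4 bytes directly versus 6 via surrogates, which is the "fewer bytes" argument for the direct form.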
Though, in this case, the UTF-8 normalization was dealt with in the VFS level rather than in the FS drivers.
There was also a concern, as I recall, that if a filename stored in the filesystem were normalized differently from the VFS's normalization, the file could become effectively inaccessible. IIRC, there was no good solution to this possibility.
The most likely partial answer, though, is that any filename-normalization rules should preferably be kept frozen once defined.
...