Re: Rationale for aligning data on even bytes in a Unix shell file?

Liste des GroupesRevenir à cl c 
Sujet : Re: Rationale for aligning data on even bytes in a Unix shell file?
De : cr88192 (at) *nospam* gmail.com (BGB)
Groupes : comp.lang.c
Date : 07. May 2025, 19:26:03
Autres entêtes
Organisation : A noiseless patient Spider
Message-ID : <vvg8tf$1647n$1@dont-email.me>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
User-Agent : Mozilla Thunderbird
On 5/7/2025 8:45 AM, Scott Lurndal wrote:
BGB <cr88192@gmail.com> writes:
On 5/6/2025 6:19 PM, Janis Papanagnou wrote:
On 06.05.2025 20:01, BGB wrote:
[...]
>
The partial rationale here being that the directory entries in this case
were fixed size (like FAT, albeit with longer names), and this could
potentially make the difference between using a single directory entry
or needing a more complex LFN style scheme. Though, in this case, the
default name length is 48, and it is rare for a filename to not fit into
48 bytes.
>
You mean rare in your application areas?
>
This appears to me like a very conservative size. While I'd agree
that it's probably a sensible value for own files with explicitly
chosen file names a lot of files that are downloaded regularly do
have longer file names. A quick check of my "Documents" directory
(that contains both, downloaded files and own files) shows a ratio
of 1563:629, i.e. roughly about 30% files of "document" type with
lengths > 48 (there's no files with a file name length > 128).
>
I recall someone here recently spoke about chosen lengths of 255
(or some such)for file names, which seems to be plenty, OTOH.
>
>
>
Running quick/dirty stats of everything on my "K:" drive, roughly 2
million files of various assorted types.
>
Stats (file names less than N bytes):
   16: 66.40%
   24: 87.85%
   32: 95.38%
   48: 99.31%
 Are you considering the entire path, or just the final component?
 
Here, only the final:
   "whatever.txt"
Along with each directory (on lines where the final component is the directory).
The directory names are not counted for each line though, as this would over-count the directory names. So, effectively, it is only counting the final name component of each line.
The list of drive contents in this case generated in WSL via the "find" command.
These component names are what the filesystem actually stores.
Combined path length can be somewhat longer.
   A traditional limit is 260 chars though.
There wasn't a well defined path-length limit in my projects, though informally typically somewhere between 256 and 768 bytes.
It is rare to see a path over 256, but, "if doing it properly" a consistent length limit of, say, 512 or 768 would make sense. Going any bigger is likely needlessly overkill.

Full paths will exceed the 48 bytes frequently, this, for example
is 142 bytes.
 /work/music/Blues/Howlin' Wolf/The Chess Box (1963 - 1973) (disc 3)/20 - The Red Rooster (London Sessions w false start and dialog) (1970).wav
We don't usually store full paths in a filesystem, as each directory exists as its own entity.
So, say, if broken down:
  5 chars ("work")
  6 chars ("music")
  6 chars
  13 chars
  37 chars
  75 chars.
Longest name counted here would be 75.
Can note that in my experimental filesystem, directory contents were stored in a structure resembling an AVL tree.
It is not all that uncommon for directories to get larger than a limit where linear search is inefficient.
Hashing works, but adds a constant space cost and a given hash table size only makes sense over a given size range (otherwise, it merely functions as a linear divisor relative to the item count).
Though, if one assumes 99% of directories have fewer than 1024 files and the target search length is 16 or less, could just use a 64 entry hash table and call it done. But, a 64 entry hash table merely wastes space for small directories.
Though, for a 64B dirent, a 64-entry hash would eat 4 dirents, so hashed directories could still fit in a single 1K disk block if fewer than 11 files (also semi common).
B-Trees are arguably the more efficient option for large directories, but B-Trees are also more complicated and would negatively effect the efficiency of small directories (if also used for small directories).
So, effectively I had rejected both the EXTn and NTFS approaches here.
Well, and also rejected EXTn's thing of effectively using variable length directory entries.
Modified AVL trees seemed like the most efficient option for the sizes of directories considered.
Small N (16 or less): No significant overhead compared with linear search.
Medium N (16 to 1024): Comparable efficiency to what B-Trees would provide.
Over 1024: B-Trees would overtake AVL trees for efficiency, but very few directories have more than 1024 items (and one still has an average-case search length of 10 blocks in this case).
The main modification to AVL trees here was that I had loosened the balancing constraints to +/- 2, as the looser constraint significantly reduces the number of node rotations needed when inserting items, while not significantly increasing average search length (or tree depth).
Though, in a theoretical worst case, a +/- 2 tree could have 2x the depth of a +/- 1 tree. This would likely only happen if items were inserted in a rather contrived ordering.
In the design, I did make the "possibly controversial" design decision that the tree-walking only works with 48 byte names.
For longer names, the "short name" is a modified form:
   First 40 bytes: The first part of the name;
   Last 8 bytes: A hash value of the full name.
Lookup happens with this shorter name, with the final dirent holding the rest of the name (which is spread across multiple dirents, sort of like in VFAT). This modified name is never exposed outside of the FS driver itself though.
Though, may be indirectly exposed in the way that calls like readdir would no longer report longer names in sorted order if the difference does not exist within the first 40 characters (the hash does not respect collation ordering).
Gemerally, the filesystem assumes an overall name-length limit of 256. Just opts for smaller storage when the name if 48 or less (which is true most of the time).
Though, FWIW, my FAT32/VFAT driver had also ignored the SFN when LFN's were present, and rather than using the Windows style handling for LFN's, essentially merely turns the SFN into a Base32 hash value and only uses the LFN as the name. Driver would still use the SFN if the name could be exactly represented in 8.3 form (thus, LFN entries being unnecessary).
Seemingly, modern Windows doesn't notice/care that the 8.3 name is merely a hash (and, in contexts where an 8.3 name does appear, it appears to be synthetic).
For now though, project still mostly uses FAT32 though...

Date Sujet#  Auteur
26 Apr 25 * Rationale for aligning data on even bytes in a Unix shell file?147Janis Papanagnou
26 Apr 25 +* Re: Rationale for aligning data on even bytes in a Unix shell file?2Keith Thompson
27 Apr 25 i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
27 Apr 25 +* Re: Rationale for aligning data on even bytes in a Unix shell file?2Kaz Kylheku
27 Apr 25 i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
27 Apr 25 +* Re: Rationale for aligning data on even bytes in a Unix shell file?140Bonita Montero
27 Apr 25 i+* Re: Rationale for aligning data on even bytes in a Unix shell file?127Janis Papanagnou
28 Apr 25 ii`* Re: Rationale for aligning data on even bytes in a Unix shell file?126Bonita Montero
28 Apr 25 ii +* Re: Rationale for aligning data on even bytes in a Unix shell file?124vallor
28 Apr 25 ii i+* Re: Rationale for aligning data on even bytes in a Unix shell file?122Bonita Montero
28 Apr 25 ii ii`* Re: Rationale for aligning data on even bytes in a Unix shell file?121vallor
28 Apr 25 ii ii `* Re: Rationale for aligning data on even bytes in a Unix shell file?120Bonita Montero
28 Apr 25 ii ii  `* Re: Rationale for aligning data on even bytes in a Unix shell file?119Janis Papanagnou
28 Apr 25 ii ii   `* Re: Rationale for aligning data on even bytes in a Unix shell file?118Bonita Montero
28 Apr 25 ii ii    +* Re: Rationale for aligning data on even bytes in a Unix shell file?54Janis Papanagnou
28 Apr 25 ii ii    i`* Re: Rationale for aligning data on even bytes in a Unix shell file?53Bonita Montero
28 Apr 25 ii ii    i +* Re: Rationale for aligning data on even bytes in a Unix shell file?44Bonita Montero
28 Apr 25 ii ii    i i`* Re: Rationale for aligning data on even bytes in a Unix shell file?43Bonita Montero
28 Apr 25 ii ii    i i `* Re: Rationale for aligning data on even bytes in a Unix shell file?42Richard Harnden
28 Apr 25 ii ii    i i  `* Re: Rationale for aligning data on even bytes in a Unix shell file?41Bonita Montero
29 Apr 25 ii ii    i i   +* Re: Rationale for aligning data on even bytes in a Unix shell file?36Richard Heathfield
29 Apr 25 ii ii    i i   i+* Re: Rationale for aligning data on even bytes in a Unix shell file?33Bonita Montero
29 Apr 25 ii ii    i i   ii+* Re: Rationale for aligning data on even bytes in a Unix shell file?31Richard Heathfield
6 May 25 ii ii    i i   iii`* Re: Rationale for aligning data on even bytes in a Unix shell file?30Bonita Montero
7 May 25 ii ii    i i   iii `* Re: Rationale for aligning data on even bytes in a Unix shell file?29BGB
7 May 25 ii ii    i i   iii  +* Re: Rationale for aligning data on even bytes in a Unix shell file?18Janis Papanagnou
7 May 25 ii ii    i i   iii  i+* Re: Rationale for aligning data on even bytes in a Unix shell file?13Michael S
8 May 25 ii ii    i i   iii  ii+* Re: Rationale for aligning data on even bytes in a Unix shell file?11BGB
8 May 25 ii ii    i i   iii  iii`* Re: Rationale for aligning data on even bytes in a Unix shell file?10Janis Papanagnou
8 May 25 ii ii    i i   iii  iii `* Re: Rationale for aligning data on even bytes in a Unix shell file?9BGB
8 May 25 ii ii    i i   iii  iii  `* Re: Rationale for aligning data on even bytes in a Unix shell file?8Keith Thompson
9 May 25 ii ii    i i   iii  iii   `* Re: Rationale for aligning data on even bytes in a Unix shell file?7BGB
9 May 25 ii ii    i i   iii  iii    +* Re: Rationale for aligning data on even bytes in a Unix shell file?2Keith Thompson
9 May 25 ii ii    i i   iii  iii    i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
9 May 25 ii ii    i i   iii  iii    `* Re: Rationale for aligning data on even bytes in a Unix shell file?4Lawrence D'Oliveiro
9 May 25 ii ii    i i   iii  iii     `* Re: Rationale for aligning data on even bytes in a Unix shell file?3BGB
15 May 25 ii ii    i i   iii  iii      `* Re: Rationale for aligning data on even bytes in a Unix shell file?2Lawrence D'Oliveiro
15 May 25 ii ii    i i   iii  iii       `- Re: Rationale for aligning data on even bytes in a Unix shell file?1BGB
9 May 25 ii ii    i i   iii  ii`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Lawrence D'Oliveiro
7 May 25 ii ii    i i   iii  i`* Re: Rationale for aligning data on even bytes in a Unix shell file?4BGB
7 May 25 ii ii    i i   iii  i +* Re: Rationale for aligning data on even bytes in a Unix shell file?2David Brown
8 May 25 ii ii    i i   iii  i i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
8 May 25 ii ii    i i   iii  i `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Lawrence D'Oliveiro
8 May 25 ii ii    i i   iii  +* Re: Rationale for aligning data on even bytes in a Unix shell file?5Lawrence D'Oliveiro
8 May 25 ii ii    i i   iii  i`* Re: Rationale for aligning data on even bytes in a Unix shell file?4BGB
9 May 25 ii ii    i i   iii  i `* Re: Rationale for aligning data on even bytes in a Unix shell file?3Lawrence D'Oliveiro
9 May 25 ii ii    i i   iii  i  +- Re: Rationale for aligning data on even bytes in a Unix shell file?1BGB
9 May 25 ii ii    i i   iii  i  `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
9 May 25 ii ii    i i   iii  `* Re: Rationale for aligning data on even bytes in a Unix shell file?5Bonita Montero
9 May 25 ii ii    i i   iii   +* Re: Rationale for aligning data on even bytes in a Unix shell file?3BGB
9 May 25 ii ii    i i   iii   i`* Re: Rationale for aligning data on even bytes in a Unix shell file?2Keith Thompson
9 May 25 ii ii    i i   iii   i `- Re: Rationale for aligning data on even bytes in a Unix shell file?1BGB
14 May 25 ii ii    i i   iii   `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Lawrence D'Oliveiro
29 Apr 25 ii ii    i i   ii`- Locales [was: Re: Rationale for aligning data on even bytes in a Unix shell file?]1Alexis
29 Apr 25 ii ii    i i   i`* Re: Rationale for aligning data on even bytes in a Unix shell file?2David Brown
29 Apr 25 ii ii    i i   i `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Richard Heathfield
29 Apr 25 ii ii    i i   +- Re: Rationale for aligning data on even bytes in a Unix shell file?1James Kuyper
29 Apr 25 ii ii    i i   +* Re: Rationale for aligning data on even bytes in a Unix shell file?2Bonita Montero
5 May 25 ii ii    i i   i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Tim Rentsch
29 Apr 25 ii ii    i i   `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Michael S
28 Apr 25 ii ii    i `* Re: Rationale for aligning data on even bytes in a Unix shell file?8Janis Papanagnou
28 Apr 25 ii ii    i  +- Re: Rationale for aligning data on even bytes in a Unix shell file?1Kaz Kylheku
28 Apr 25 ii ii    i  `* Re: Rationale for aligning data on even bytes in a Unix shell file?6Bonita Montero
29 Apr 25 ii ii    i   `* Re: Rationale for aligning data on even bytes in a Unix shell file?5Janis Papanagnou
29 Apr 25 ii ii    i    `* Re: Rationale for aligning data on even bytes in a Unix shell file?4David Brown
29 Apr 25 ii ii    i     `* Re: Rationale for aligning data on even bytes in a Unix shell file?3Muttley
29 Apr 25 ii ii    i      +- Re: Rationale for aligning data on even bytes in a Unix shell file?1David Brown
30 Apr 25 ii ii    i      `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Lawrence D'Oliveiro
28 Apr 25 ii ii    +* Re: Rationale for aligning data on even bytes in a Unix shell file?54Muttley
28 Apr 25 ii ii    i`* Re: Rationale for aligning data on even bytes in a Unix shell file?53Bonita Montero
28 Apr 25 ii ii    i +* Re: Rationale for aligning data on even bytes in a Unix shell file?41Bonita Montero
28 Apr 25 ii ii    i i`* Re: Rationale for aligning data on even bytes in a Unix shell file?40Bonita Montero
28 Apr 25 ii ii    i i +* Re: Rationale for aligning data on even bytes in a Unix shell file?37Michael S
28 Apr 25 ii ii    i i i+* Re: Rationale for aligning data on even bytes in a Unix shell file?11Kaz Kylheku
28 Apr 25 ii ii    i i ii`* Re: Rationale for aligning data on even bytes in a Unix shell file?10Michael S
28 Apr 25 ii ii    i i ii +- Re: Rationale for aligning data on even bytes in a Unix shell file?1Kaz Kylheku
29 Apr 25 ii ii    i i ii `* Re: Rationale for aligning data on even bytes in a Unix shell file?8Lawrence D'Oliveiro
29 Apr 25 ii ii    i i ii  `* Re: Rationale for aligning data on even bytes in a Unix shell file?7Janis Papanagnou
29 Apr 25 ii ii    i i ii   `* Re: Rationale for aligning data on even bytes in a Unix shell file?6Michael S
30 Apr 25 ii ii    i i ii    `* Re: Rationale for aligning data on even bytes in a Unix shell file?5Lawrence D'Oliveiro
30 Apr 25 ii ii    i i ii     `* Re: Rationale for aligning data on even bytes in a Unix shell file?4Janis Papanagnou
30 Apr 25 ii ii    i i ii      `* Re: Rationale for aligning data on even bytes in a Unix shell file?3Janis Papanagnou
1 May 25 ii ii    i i ii       `* Re: Rationale for aligning data on even bytes in a Unix shell file?2Lew Pitcher
2 May 25 ii ii    i i ii        `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
28 Apr 25 ii ii    i i i+- Re: Rationale for aligning data on even bytes in a Unix shell file?1Michael S
29 Apr 25 ii ii    i i i`* Re: Rationale for aligning data on even bytes in a Unix shell file?24Muttley
30 Apr 25 ii ii    i i i `* Re: Rationale for aligning data on even bytes in a Unix shell file?23Lawrence D'Oliveiro
30 Apr 25 ii ii    i i i  `* Re: Rationale for aligning data on even bytes in a Unix shell file?22Muttley
30 Apr 25 ii ii    i i i   +* Re: Rationale for aligning data on even bytes in a Unix shell file?20David Brown
30 Apr 25 ii ii    i i i   i+* Re: Rationale for aligning data on even bytes in a Unix shell file?18Muttley
30 Apr 25 ii ii    i i i   ii+* Re: Rationale for aligning data on even bytes in a Unix shell file?4Janis Papanagnou
30 Apr 25 ii ii    i i i   iii+* Re: Rationale for aligning data on even bytes in a Unix shell file?2David Brown
30 Apr 25 ii ii    i i i   iiii`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
30 Apr 25 ii ii    i i i   iii`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Muttley
30 Apr 25 ii ii    i i i   ii`* Re: Rationale for aligning data on even bytes in a Unix shell file?13David Brown
30 Apr 25 ii ii    i i i   ii +- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
30 Apr 25 ii ii    i i i   ii `* Re: Rationale for aligning data on even bytes in a Unix shell file?11Muttley
1 May 25 ii ii    i i i   ii  `* Re: Rationale for aligning data on even bytes in a Unix shell file?10Lawrence D'Oliveiro
1 May 25 ii ii    i i i   ii   +- Re: Rationale for aligning data on even bytes in a Unix shell file?1vallor
1 May 25 ii ii    i i i   ii   +* Re: Rationale for aligning data on even bytes in a Unix shell file?7David Brown
6 May 25 ii ii    i i i   ii   i`* Re: Rationale for aligning data on even bytes in a Unix shell file?6BGB
2 May 25 ii ii    i i i   ii   `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
30 Apr 25 ii ii    i i i   i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
30 Apr 25 ii ii    i i i   `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Lawrence D'Oliveiro
28 Apr 25 ii ii    i i `* Re: Rationale for aligning data on even bytes in a Unix shell file?2Bonita Montero
28 Apr 25 ii ii    i `* Re: Rationale for aligning data on even bytes in a Unix shell file?11Muttley
29 Apr 25 ii ii    `* Re: Rationale for aligning data on even bytes in a Unix shell file?9Lawrence D'Oliveiro
28 Apr 25 ii i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1vallor
28 Apr 25 ii `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
27 Apr 25 i+* Re: Rationale for aligning data on even bytes in a Unix shell file?2Kaz Kylheku
28 Apr 25 i+- Re: Rationale for aligning data on even bytes in a Unix shell file?1Kenny McCormack
28 Apr 25 i+- Re: Rationale for aligning data on even bytes in a Unix shell file?1Bonita Montero
28 Apr 25 i`* Re: Rationale for aligning data on even bytes in a Unix shell file?8Lawrence D'Oliveiro
9 May 25 `* Re: Rationale for aligning data on even bytes in a Unix shell file?2Keith Thompson

Haut de la page

Les messages affichés proviennent d'usenet.

NewsPortal