Re: Rationale for aligning data on even bytes in a Unix shell file?

Liste des GroupesRevenir à cl c  
Sujet : Re: Rationale for aligning data on even bytes in a Unix shell file?
De : cr88192 (at) *nospam* gmail.com (BGB)
Groupes : comp.lang.c
Date : 08. May 2025, 04:21:23
Autres entêtes
Organisation : A noiseless patient Spider
Message-ID : <vvh895$1ha26$1@dont-email.me>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
User-Agent : Mozilla Thunderbird
On 5/7/2025 1:57 PM, Scott Lurndal wrote:
BGB <cr88192@gmail.com> writes:
On 5/7/2025 8:45 AM, Scott Lurndal wrote:
BGB <cr88192@gmail.com> writes:
On 5/6/2025 6:19 PM, Janis Papanagnou wrote:
On 06.05.2025 20:01, BGB wrote:
[...]
 
These component names are what the filesystem actually stores.
>
>
Combined path length can be somewhat longer.
   A traditional limit is 260 chars though.
>
There wasn't a well defined path-length limit in my projects, though
informally typically somewhere between 256 and 768 bytes.
>
It is rare to see a path over 256, but, "if doing it properly" a
consistent length limit of, say, 512 or 768 would make sense. Going any
bigger is likely needlessly overkill.
>
>
Full paths will exceed the 48 bytes frequently, this, for example
is 142 bytes.
>
/work/music/Blues/Howlin' Wolf/The Chess Box (1963 - 1973) (disc 3)/20 - The Red Rooster (London Sessions w false start and dialog) (1970).wav
>
We don't usually store full paths in a filesystem, as each directory
exists as its own entity.
>
So, say, if broken down:
  5 chars ("work")
  6 chars ("music")
  6 chars
  13 chars
  37 chars
  75 chars.
>
Longest name counted here would be 75.
 It doesn't matter from the application perspective, an
application needs to be written to support the largest
path length an implementation allows (e.g. POSIX PATH_MAX).
 Very few applications actually walk directories and thus
don't particularly notice per-directory limits.
 
As noted, I was talking about filesystem file-name lengths though, not overall path lengths.
The filesystem driver does do directory walking, so this is what I was writing about here.

POSIX has NAME_MAX as the maximum size that an implementation
supports (POSIX also defines the minimum value that an
implementation must support, which is 14 for NAME_MAX).
 14 bytes was the maximum filename length for the original unix
filesystems (later known as S5).
 $ getconf -a  | grep NAME_MAX
NAME_MAX                           255
_POSIX_NAME_MAX                    255
 
256 is a typical limit for this. I was also using 256.

$ getconf -a  | grep PATH_MAX
PATH_MAX                           4096
_POSIX_PATH_MAX                    4096
 
IMO though, 4096 is a little overkill though.
The traditional Windows path limit, 260 is a little small.
As noted, I would be more inclined towards 512 or 768 here, as it is "plenty long" without being so long as to become overly wasteful or problematic.

 
>
It is not all that uncommon for directories to get larger than a limit
where linear search is inefficient.
 Large directories are not generally considered a good
idea for multiple reasons.
 
More stats (from the gathered drive dumps).
Files in a directory:
   Under   10: 83.10
   Under   16: 88.74
   Under   50: 96.65
   Under  100: 98.81
   Under  500: 99.91
   Under 1000: 99.96
   Under 2000: 99.99
So, the vast majority of directories have a relatively small number of files.
Most of the directories are either in the domain where:
   Linear search is fastest
     But the added cost of an AVL tree walk does not matter.
   Are in the domain where AVL is the fastest option.
The main area where B-Tree reaches break-even being less than 0.04%, and where B-Tree's advantages are "actually relevant" being less than 0.01%.
And, if the minimal B-Tree takes more than a single disk block, or is much more expensive than a simple linear search over 10 or so items, it has already lost...
At best, one would end up with most of the directories consisting of a single B-Tree leaf block.
A typical B-Tree has a few drawbacks here:
   Needs to duplicate keys between leaf and node blocks;
   Leaf blocks are, on average, only around 50% full.
   Splitting a leaf means at least 3 blocks;
   ...
The argued merit is that a smaller number of nodes need to be checked, say, log16(n) rather than log2(n), ... But, this effect only really starts to come into play once 'n' gets bigger.
Similarly, most of the directories are effectively in a "zone of suckage" as far as B-Tree efficiency goes, but in this case can't push the K factor outside of a range to where this is no longer the case.
Where, for small n and small K, say:
   K < n < 0.5*K*K
Is an area where B-Trees actually kinda suck.
   They mostly start to pay off when n >= K*K.
Almost implicitly though, people assume n > K*K.
   Well, but also you want to keep logk(n) small,
     which is usual argument against AVL trees and similar.
       Say, log2(n) > log16(n) or similar.
Well, and not n<=K, where it is effectively just a flat array...
I guess, a similar "zone of suckage" exists for AVL trees, but in this case it is a tree with 2 items, and we can mostly ignore it (and it is functionally equivalent to a 2-element linked list). So, I guess, it is mostly a case of we quickly blow past this first limit, but n is small enough that we don't eat it in the cost of the 'log2(n)' tree walking part (but, log2(1024) = 10, which is crossing back into "starting to suck again"; where log16(1024) = 2.5, which is still good; but doesn't matter as much if it is 0.04% of the directories...).
One could argue for maybe having multiple types of directory (like ext2 and friends, but this adds complexity).
...
Statistical average file size seems to be somewhere around 256K ( ~ total_space_used / total_files_on_drive).
Though, I suspect more likely it is a distribution of actually most files being under 100K, but then a significant minority of files being considerably larger.
Like, with video files:
There are comparably few of them, all things said, but they are often ~ 100-200 MB each (so can eat lots of HDD space being small in number).
Can note, the FS design I had, ended up using a block allocation strategy like.
Uncompressed file with < 2^32 disk blocks:
   16 entries: direct block numbers (16K with 1K blocks);
   8 entries: indirect block (2MB with 1K blocks);
   4 entries: double indirect (256MB at 1K)
   2 entries: triple indirect (128GB)
   1 entry: quad indirect (16TB)
   1 entry: N/A
     (would be 5x, but bigger than volume size, so is N/A)
Uncompressed file with more than 2^32 blocks (64-bit block numbers):
   8 blocks: Direct block numbers (8K with 1K blocks).
   4 blocks: Indirect block numbers (512K with 1K blocks);
   1 entry: double indirect (16MB)
   1 entry: triple indirect (2GB)
   1 entry: quad indirect (256GB)
   1 entry: 5x indirect (32TB)
Though, theoretical max file size goes up to 4PB here with a 4K block size. There is another indirection table that also allows for larger maximum file size, but would require the use of a 512 byte inode.
Compressed files use a similar structure, with 64-bit block numbers, although the final leaf block size is 16x larger (So, 16K for 1K blocks, or 64K for 4K blocks). This is partly based on the mechanism used for file compression (it compresses each logical file block, and stores it in 1 to 16 disk blocks).
Can note that the general allocation of disk blocks working in a superficially similar way to the Linux EXTn (ext2/ext3/ext4) filesystems.
Different tradeoffs exist, but generally block sizes between 1K and 4K seem to make the most sense. Larger block sizes could be possible, but need end-packing to not waste a lot of space.
Small files on FAT32 volumes are often a bad case as FAT32 tends to default towards fairly large cluster sizes, which end up wasting a significant amount of space (when a statistical majority of the files don't even fill a single disk cluster).
Though, when using NTFS, I often end up enabling file compression for most of the drive, as in many contexts, enabling file compression actually makes stuff faster.
Though, from what I gather, NTFS file compression works file-at-a-time, rather than my experimental FS's compression working in terms of blocks.
...

Date Sujet#  Auteur
26 Apr 25 * Rationale for aligning data on even bytes in a Unix shell file?147Janis Papanagnou
26 Apr 25 +* Re: Rationale for aligning data on even bytes in a Unix shell file?2Keith Thompson
27 Apr 25 i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
27 Apr 25 +* Re: Rationale for aligning data on even bytes in a Unix shell file?2Kaz Kylheku
27 Apr 25 i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
27 Apr 25 +* Re: Rationale for aligning data on even bytes in a Unix shell file?140Bonita Montero
27 Apr 25 i+* Re: Rationale for aligning data on even bytes in a Unix shell file?127Janis Papanagnou
28 Apr 25 ii`* Re: Rationale for aligning data on even bytes in a Unix shell file?126Bonita Montero
28 Apr 25 ii +* Re: Rationale for aligning data on even bytes in a Unix shell file?124vallor
28 Apr 25 ii i+* Re: Rationale for aligning data on even bytes in a Unix shell file?122Bonita Montero
28 Apr 25 ii ii`* Re: Rationale for aligning data on even bytes in a Unix shell file?121vallor
28 Apr 25 ii ii `* Re: Rationale for aligning data on even bytes in a Unix shell file?120Bonita Montero
28 Apr 25 ii ii  `* Re: Rationale for aligning data on even bytes in a Unix shell file?119Janis Papanagnou
28 Apr 25 ii ii   `* Re: Rationale for aligning data on even bytes in a Unix shell file?118Bonita Montero
28 Apr 25 ii ii    +* Re: Rationale for aligning data on even bytes in a Unix shell file?54Janis Papanagnou
28 Apr 25 ii ii    i`* Re: Rationale for aligning data on even bytes in a Unix shell file?53Bonita Montero
28 Apr 25 ii ii    i +* Re: Rationale for aligning data on even bytes in a Unix shell file?44Bonita Montero
28 Apr 25 ii ii    i i`* Re: Rationale for aligning data on even bytes in a Unix shell file?43Bonita Montero
28 Apr 25 ii ii    i i `* Re: Rationale for aligning data on even bytes in a Unix shell file?42Richard Harnden
28 Apr 25 ii ii    i i  `* Re: Rationale for aligning data on even bytes in a Unix shell file?41Bonita Montero
29 Apr 25 ii ii    i i   +* Re: Rationale for aligning data on even bytes in a Unix shell file?36Richard Heathfield
29 Apr 25 ii ii    i i   i+* Re: Rationale for aligning data on even bytes in a Unix shell file?33Bonita Montero
29 Apr 25 ii ii    i i   ii+* Re: Rationale for aligning data on even bytes in a Unix shell file?31Richard Heathfield
6 May 25 ii ii    i i   iii`* Re: Rationale for aligning data on even bytes in a Unix shell file?30Bonita Montero
7 May 25 ii ii    i i   iii `* Re: Rationale for aligning data on even bytes in a Unix shell file?29BGB
7 May 25 ii ii    i i   iii  +* Re: Rationale for aligning data on even bytes in a Unix shell file?18Janis Papanagnou
7 May 25 ii ii    i i   iii  i+* Re: Rationale for aligning data on even bytes in a Unix shell file?13Michael S
8 May 25 ii ii    i i   iii  ii+* Re: Rationale for aligning data on even bytes in a Unix shell file?11BGB
8 May 25 ii ii    i i   iii  iii`* Re: Rationale for aligning data on even bytes in a Unix shell file?10Janis Papanagnou
8 May 25 ii ii    i i   iii  iii `* Re: Rationale for aligning data on even bytes in a Unix shell file?9BGB
8 May 25 ii ii    i i   iii  iii  `* Re: Rationale for aligning data on even bytes in a Unix shell file?8Keith Thompson
9 May 25 ii ii    i i   iii  iii   `* Re: Rationale for aligning data on even bytes in a Unix shell file?7BGB
9 May 25 ii ii    i i   iii  iii    +* Re: Rationale for aligning data on even bytes in a Unix shell file?2Keith Thompson
9 May 25 ii ii    i i   iii  iii    i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
9 May 25 ii ii    i i   iii  iii    `* Re: Rationale for aligning data on even bytes in a Unix shell file?4Lawrence D'Oliveiro
9 May 25 ii ii    i i   iii  iii     `* Re: Rationale for aligning data on even bytes in a Unix shell file?3BGB
15 May 25 ii ii    i i   iii  iii      `* Re: Rationale for aligning data on even bytes in a Unix shell file?2Lawrence D'Oliveiro
15 May 25 ii ii    i i   iii  iii       `- Re: Rationale for aligning data on even bytes in a Unix shell file?1BGB
9 May 25 ii ii    i i   iii  ii`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Lawrence D'Oliveiro
7 May 25 ii ii    i i   iii  i`* Re: Rationale for aligning data on even bytes in a Unix shell file?4BGB
7 May 25 ii ii    i i   iii  i +* Re: Rationale for aligning data on even bytes in a Unix shell file?2David Brown
8 May 25 ii ii    i i   iii  i i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
8 May 25 ii ii    i i   iii  i `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Lawrence D'Oliveiro
8 May 25 ii ii    i i   iii  +* Re: Rationale for aligning data on even bytes in a Unix shell file?5Lawrence D'Oliveiro
8 May 25 ii ii    i i   iii  i`* Re: Rationale for aligning data on even bytes in a Unix shell file?4BGB
9 May 25 ii ii    i i   iii  i `* Re: Rationale for aligning data on even bytes in a Unix shell file?3Lawrence D'Oliveiro
9 May 25 ii ii    i i   iii  i  +- Re: Rationale for aligning data on even bytes in a Unix shell file?1BGB
9 May 25 ii ii    i i   iii  i  `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
9 May 25 ii ii    i i   iii  `* Re: Rationale for aligning data on even bytes in a Unix shell file?5Bonita Montero
9 May 25 ii ii    i i   iii   +* Re: Rationale for aligning data on even bytes in a Unix shell file?3BGB
9 May 25 ii ii    i i   iii   i`* Re: Rationale for aligning data on even bytes in a Unix shell file?2Keith Thompson
9 May 25 ii ii    i i   iii   i `- Re: Rationale for aligning data on even bytes in a Unix shell file?1BGB
14 May 25 ii ii    i i   iii   `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Lawrence D'Oliveiro
29 Apr 25 ii ii    i i   ii`- Locales [was: Re: Rationale for aligning data on even bytes in a Unix shell file?]1Alexis
29 Apr 25 ii ii    i i   i`* Re: Rationale for aligning data on even bytes in a Unix shell file?2David Brown
29 Apr 25 ii ii    i i   i `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Richard Heathfield
29 Apr 25 ii ii    i i   +- Re: Rationale for aligning data on even bytes in a Unix shell file?1James Kuyper
29 Apr 25 ii ii    i i   +* Re: Rationale for aligning data on even bytes in a Unix shell file?2Bonita Montero
5 May 25 ii ii    i i   i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Tim Rentsch
29 Apr 25 ii ii    i i   `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Michael S
28 Apr 25 ii ii    i `* Re: Rationale for aligning data on even bytes in a Unix shell file?8Janis Papanagnou
28 Apr 25 ii ii    i  +- Re: Rationale for aligning data on even bytes in a Unix shell file?1Kaz Kylheku
28 Apr 25 ii ii    i  `* Re: Rationale for aligning data on even bytes in a Unix shell file?6Bonita Montero
29 Apr 25 ii ii    i   `* Re: Rationale for aligning data on even bytes in a Unix shell file?5Janis Papanagnou
29 Apr 25 ii ii    i    `* Re: Rationale for aligning data on even bytes in a Unix shell file?4David Brown
29 Apr 25 ii ii    i     `* Re: Rationale for aligning data on even bytes in a Unix shell file?3Muttley
29 Apr 25 ii ii    i      +- Re: Rationale for aligning data on even bytes in a Unix shell file?1David Brown
30 Apr 25 ii ii    i      `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Lawrence D'Oliveiro
28 Apr 25 ii ii    +* Re: Rationale for aligning data on even bytes in a Unix shell file?54Muttley
28 Apr 25 ii ii    i`* Re: Rationale for aligning data on even bytes in a Unix shell file?53Bonita Montero
28 Apr 25 ii ii    i +* Re: Rationale for aligning data on even bytes in a Unix shell file?41Bonita Montero
28 Apr 25 ii ii    i i`* Re: Rationale for aligning data on even bytes in a Unix shell file?40Bonita Montero
28 Apr 25 ii ii    i i +* Re: Rationale for aligning data on even bytes in a Unix shell file?37Michael S
28 Apr 25 ii ii    i i i+* Re: Rationale for aligning data on even bytes in a Unix shell file?11Kaz Kylheku
28 Apr 25 ii ii    i i ii`* Re: Rationale for aligning data on even bytes in a Unix shell file?10Michael S
28 Apr 25 ii ii    i i ii +- Re: Rationale for aligning data on even bytes in a Unix shell file?1Kaz Kylheku
29 Apr 25 ii ii    i i ii `* Re: Rationale for aligning data on even bytes in a Unix shell file?8Lawrence D'Oliveiro
29 Apr 25 ii ii    i i ii  `* Re: Rationale for aligning data on even bytes in a Unix shell file?7Janis Papanagnou
29 Apr 25 ii ii    i i ii   `* Re: Rationale for aligning data on even bytes in a Unix shell file?6Michael S
30 Apr 25 ii ii    i i ii    `* Re: Rationale for aligning data on even bytes in a Unix shell file?5Lawrence D'Oliveiro
30 Apr 25 ii ii    i i ii     `* Re: Rationale for aligning data on even bytes in a Unix shell file?4Janis Papanagnou
30 Apr 25 ii ii    i i ii      `* Re: Rationale for aligning data on even bytes in a Unix shell file?3Janis Papanagnou
1 May 25 ii ii    i i ii       `* Re: Rationale for aligning data on even bytes in a Unix shell file?2Lew Pitcher
2 May 25 ii ii    i i ii        `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
28 Apr 25 ii ii    i i i+- Re: Rationale for aligning data on even bytes in a Unix shell file?1Michael S
29 Apr 25 ii ii    i i i`* Re: Rationale for aligning data on even bytes in a Unix shell file?24Muttley
30 Apr 25 ii ii    i i i `* Re: Rationale for aligning data on even bytes in a Unix shell file?23Lawrence D'Oliveiro
30 Apr 25 ii ii    i i i  `* Re: Rationale for aligning data on even bytes in a Unix shell file?22Muttley
30 Apr 25 ii ii    i i i   +* Re: Rationale for aligning data on even bytes in a Unix shell file?20David Brown
30 Apr 25 ii ii    i i i   i+* Re: Rationale for aligning data on even bytes in a Unix shell file?18Muttley
30 Apr 25 ii ii    i i i   ii+* Re: Rationale for aligning data on even bytes in a Unix shell file?4Janis Papanagnou
30 Apr 25 ii ii    i i i   iii+* Re: Rationale for aligning data on even bytes in a Unix shell file?2David Brown
30 Apr 25 ii ii    i i i   iiii`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
30 Apr 25 ii ii    i i i   iii`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Muttley
30 Apr 25 ii ii    i i i   ii`* Re: Rationale for aligning data on even bytes in a Unix shell file?13David Brown
30 Apr 25 ii ii    i i i   ii +- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
30 Apr 25 ii ii    i i i   ii `* Re: Rationale for aligning data on even bytes in a Unix shell file?11Muttley
1 May 25 ii ii    i i i   ii  `* Re: Rationale for aligning data on even bytes in a Unix shell file?10Lawrence D'Oliveiro
1 May 25 ii ii    i i i   ii   +- Re: Rationale for aligning data on even bytes in a Unix shell file?1vallor
1 May 25 ii ii    i i i   ii   +* Re: Rationale for aligning data on even bytes in a Unix shell file?7David Brown
6 May 25 ii ii    i i i   ii   i`* Re: Rationale for aligning data on even bytes in a Unix shell file?6BGB
2 May 25 ii ii    i i i   ii   `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
30 Apr 25 ii ii    i i i   i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
30 Apr 25 ii ii    i i i   `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Lawrence D'Oliveiro
28 Apr 25 ii ii    i i `* Re: Rationale for aligning data on even bytes in a Unix shell file?2Bonita Montero
28 Apr 25 ii ii    i `* Re: Rationale for aligning data on even bytes in a Unix shell file?11Muttley
29 Apr 25 ii ii    `* Re: Rationale for aligning data on even bytes in a Unix shell file?9Lawrence D'Oliveiro
28 Apr 25 ii i`- Re: Rationale for aligning data on even bytes in a Unix shell file?1vallor
28 Apr 25 ii `- Re: Rationale for aligning data on even bytes in a Unix shell file?1Janis Papanagnou
27 Apr 25 i+* Re: Rationale for aligning data on even bytes in a Unix shell file?2Kaz Kylheku
28 Apr 25 i+- Re: Rationale for aligning data on even bytes in a Unix shell file?1Kenny McCormack
28 Apr 25 i+- Re: Rationale for aligning data on even bytes in a Unix shell file?1Bonita Montero
28 Apr 25 i`* Re: Rationale for aligning data on even bytes in a Unix shell file?8Lawrence D'Oliveiro
9 May 25 `* Re: Rationale for aligning data on even bytes in a Unix shell file?2Keith Thompson

Haut de la page

Les messages affichés proviennent d'usenet.

NewsPortal