On 12/2/2024 12:13 PM, Scott Lurndal wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 01.12.2024 17:42, Bart wrote:
On 01/12/2024 15:08, Janis Papanagnou wrote:
On 01.12.2024 12:52, Bart wrote:
makes typing easier because it is case-insensitive,
>
I don't think that case-insensitivity is a Good Thing. (I also don't
think it's a Bad Thing.)
I think it's a _real bad thing_ in almost every context related
to programming.
Agreed, even in contexts where it is usually standard, such as ASM programming and BASIC.
In my case, I have ended up with dialects where:
Identifiers and similar are case sensitive;
mnemonics / keywords are often case insensitive.
For things like file-systems, I much prefer case sensitivity.
If a shell or file-manager wants to allow case-insensitivity, it can do so itself, if purely as a UI feature for user convenience, but the underlying filesystem should ideally remain case-sensitive.
Though, it can lead to wonk with FAT:
For 8.3 names, only a limited range of options exists, IIRC:
FIRSTNAM.EXT
FIRSTNAM.ext
firstnam.EXT
firstnam.ext
Where, internally it is always upper-case codepage-1252;
And, a few control-bits can be used to encode the case.
Anything much different from this needs to be encoded with LFNs.
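As a rough sketch of how those control bits work: on Windows NT and later, two bits in the otherwise reserved byte at offset 0x0C of the directory entry mark the base name and the extension as lower case. The helper name `fat_sfn_to_string` is made up for illustration, and it ignores edge cases (the 0x05 lead-byte escape, embedded spaces).

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Case flags in the reserved byte at offset 0x0C of a FAT directory entry. */
#define FAT_LCASE_BASE 0x08  /* base name is shown in lower case */
#define FAT_LCASE_EXT  0x10  /* extension is shown in lower case */

/* Hypothetical helper: turn the 11-byte, space-padded on-disk name plus the
   flag byte into a printable "name.ext" string.
   out must hold at least 13 bytes (8 + '.' + 3 + NUL). */
static void fat_sfn_to_string(const uint8_t raw[11], uint8_t ntres, char out[13])
{
    size_t n = 0;

    /* Base name: up to 8 bytes, stops at the first padding space. */
    for (int i = 0; i < 8 && raw[i] != ' '; i++) {
        int c = raw[i];
        out[n++] = (char)((ntres & FAT_LCASE_BASE) ? tolower(c) : c);
    }

    /* Extension: up to 3 bytes, only emitted if it is non-empty. */
    if (raw[8] != ' ') {
        out[n++] = '.';
        for (int i = 8; i < 11 && raw[i] != ' '; i++) {
            int c = raw[i];
            out[n++] = (char)((ntres & FAT_LCASE_EXT) ? tolower(c) : c);
        }
    }
    out[n] = '\0';
}
```

With the on-disk bytes "FIRSTNAMEXT", the four flag combinations (0x00, 0x10, 0x08, 0x18) give exactly the four spellings listed above; mixed case within either part has no encoding here and falls through to an LFN.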
LFNs are nominally case-insensitive but case-preserving.
But, in my project, they are treated as case-sensitive. If LFNs are used, my FAT driver will fill the SFN mostly with a hashed value of the LFN (Base32 or similar IIRC), since, in this case, if an LFN exists the SFN doesn't matter (though the driver will use a plain SFN, with no LFN, if the SFN can exactly represent the filename). When walking the directory, if an LFN is present, the SFN is effectively ignored.
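Something along these lines would do for the hashed SFN. This is only a sketch under assumptions: the hash (64-bit FNV-1a), the digit alphabet, and filling all 11 bytes with hash digits are my picks for illustration, and the names `fnv1a64_str` and `sfn_from_lfn_hash` are hypothetical.

```c
#include <stdint.h>

/* 64-bit FNV-1a over a NUL-terminated string. Chosen only because it is
   short; the hash used by the actual driver is not specified. */
static uint64_t fnv1a64_str(const char *s)
{
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV offset basis */
    while (*s) {
        h ^= (uint8_t)*s++;
        h *= 0x100000001b3ULL;            /* FNV prime */
    }
    return h;
}

/* Fill the 11-byte SFN field with Base32 digits taken from the hash of the
   long name (11 digits * 5 bits = low 55 bits of the hash).
   The alphabet uses only characters that are legal in 8.3 names. */
static void sfn_from_lfn_hash(const char *lfn, uint8_t sfn[11])
{
    static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUV";
    uint64_t h = fnv1a64_str(lfn);

    for (int i = 0; i < 11; i++) {
        sfn[i] = (uint8_t)digits[h & 31];
        h >>= 5;
    }
}
```

Since the name is hashed byte-for-byte, "Makefile.am" and "makefile.am" get different SFNs, which fits treating LFNs as case-sensitive.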
Though, my Boot ROM's FAT driver differs in that it only knows about SFNs (to save ROM footprint), so things like the kernel image need to have a valid 8.3 name.
In contrast, NTFS has built-in wonk to support case insensitivity in various locales (effectively, collation-mapping tables need to be provided by the filesystem for dealing with things like metadata). This is kinda absurd IMO. The filesystem should not need to know or care about locale.
In an experimental filesystem design, I had gone over to 64-byte directory entries with 48 bytes for the name field.
Where:
48 is "usually sufficient";
Fixed-size directory entries are preferable to variable-length directory entries (e.g. in EXT2), though one will need multiple directory entries to encode LFNs (though usually fewer than FAT, since the filenames are usually entirely ASCII and use UTF-8 rather than UTF-16).
Though, it is this, or just assume a 48 byte name limit to simplify the filesystem driver...
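For a concrete picture, a 64-byte entry might be laid out as below. Only the 64-byte total and the 48-byte name come from the description above; every other field (and the name `exp_dirent`) is an assumption, picked to show that 16 bytes is enough for tree links and basic metadata.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk directory entry: 48-byte name + 16 bytes of metadata. */
typedef struct {
    uint8_t  name[48];  /* UTF-8 name, NUL-padded if shorter than 48 bytes */
    uint32_t left;      /* assumed: index of left child entry in the tree  */
    uint32_t right;     /* assumed: index of right child entry in the tree */
    uint32_t inode;     /* assumed: reference to the file's metadata       */
    uint8_t  height;    /* assumed: subtree height, for AVL rebalancing    */
    uint8_t  type;      /* assumed: file, directory, name continuation...  */
    uint16_t flags;     /* assumed: spare flag bits                        */
} exp_dirent;

/* Fail the build if padding ever changes the on-disk size. */
_Static_assert(sizeof(exp_dirent) == 64, "directory entry must be 64 bytes");

/* Number of fixed-size entries that fit in one block. */
static size_t dirents_per_block(size_t block_size)
{
    return block_size / sizeof(exp_dirent);
}
```

With fixed-size entries, entry N in a block sits at byte offset N * 64, so the driver never has to scan record lengths the way it would with variable-length entries.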
Though, possibly controversial was that I had decided to base the directory trees around AVL trees:
Can walk the directory in code-point sorted order;
Can allow for efficient lookups (vs linear search);
Less up-front cost and complexity vs B-Trees.
Linear search is arguably still better for small N:
If one assumes most directories have fewer than around 16 files,
linear search is likely the winner.
But, many will have more than 16 files.
Threshold where B-Tree wins over AVL is larger.
Though, a case could have been made for 1K 14-file B-Tree nodes (*).
*: In a denser tree or bigger directory, with fairly random node allocations, AVL is likely to result in more blocks needing to be accessed vs a B-Tree. But, OTOH, if 32K contains 512 files, this is likely to be larger than most directories (and still better than a linear search, as needed in FAT).
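As a back-of-envelope check on the tree-depth side of this, one can count the minimum number of levels a lookup has to descend for a given fan-out. This is a best-case sketch only: it assumes perfectly balanced, fully packed nodes, and it counts levels rather than actual block reads. The function name `min_levels` is made up.

```c
/* Smallest number of levels such that a tree whose nodes have `fanout`
   children (and fanout - 1 keys) can hold n keys.
   A full tree of h levels holds fanout^h - 1 keys. */
static unsigned min_levels(unsigned long n, unsigned fanout)
{
    unsigned levels = 0;
    unsigned long full = 1;        /* fanout^levels */

    while (full - 1 < n) {
        full *= fanout;
        levels++;
    }
    return levels;
}
```

For 512 entries, `min_levels(512, 2)` is 10 for a binary tree, while `min_levels(512, 15)` is 3 for nodes holding 14 keys each. A real AVL tree can be somewhat deeper than the binary best case, and a real B-Tree is rarely fully packed, so treat these as lower bounds.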
Generally, for tree-walk, the names are essentially in "memcmp()" order (simple, effective, no collating table needed).
Though, for long names, it is preferable to be able to resolve the comparison purely within the first 48 bytes, so an idea here is, say:
First 40 bytes of first dirent name contain first 40 bytes of name;
Remaining 8 bytes contain a hash for the full name (raw binary).
Allowing the full-name to effectively be ignored in the tree walk.
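A minimal sketch of that key scheme follows. The hash function (FNV-1a), the big-endian byte order of the stored hash, and the names `make_name_key` / `name_key_cmp` are all assumptions; a real format would also need some flag to tell a hashed key apart from a plain name of 41 to 48 bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEY_SIZE   48   /* name field of one directory entry */
#define KEY_PREFIX 40   /* name bytes kept when the name is too long */

/* 64-bit FNV-1a over a byte buffer (stand-in hash, not from the post). */
static uint64_t fnv1a64_buf(const uint8_t *p, size_t n)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Build the 48-byte tree key for a name: short names are stored as-is and
   NUL-padded; long names become a 40-byte prefix plus an 8-byte hash of
   the full name. */
static void make_name_key(const char *name, uint8_t key[KEY_SIZE])
{
    size_t len = strlen(name);

    memset(key, 0, KEY_SIZE);
    if (len <= KEY_SIZE) {
        memcpy(key, name, len);
    } else {
        uint64_t h = fnv1a64_buf((const uint8_t *)name, len);
        memcpy(key, name, KEY_PREFIX);
        for (int i = 0; i < 8; i++)   /* raw binary, most significant byte first */
            key[KEY_PREFIX + i] = (uint8_t)(h >> (56 - 8 * i));
    }
}

/* Tree-walk comparison: plain memcmp() order, no collation table. */
static int name_key_cmp(const uint8_t a[KEY_SIZE], const uint8_t b[KEY_SIZE])
{
    return memcmp(a, b, KEY_SIZE);
}
```

One consequence worth noting: names that share their first 40 bytes end up ordered by hash rather than by code point, so the sorted walk is only exact for names that fit in the key. On a key match for a long name, the lookup would still need to verify the full name stored in the continuation entries, since hashes can collide.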
But, this filesystem still hasn't seen much use yet...
...
>
But I want my software maintainable and readable. So my experience
is that I want some lexical "accentuation"; common answers to that
are for identifiers (for example) Camel-Case (that I used in C++),
underscores (that I use in Unix shell, Awk, etc.), or spaces (like
in Algol 68, but which is practically irrelevant for me).
CamelCase reduces typing speed and adds little benefit when compared
with the alternatives (rational abbreviations, or even underscores).
Possibly true...
In my case I have an annoyance that my keyboard isn't super new, and right now the shift key is wobbly and doesn't always trigger reliably.
Checks, shift keycap post is broken, will need to epoxy it (many of the keyboard keys have needed to be repaired in this way; at some point I think I may also need to reflow some of the solder joints on some of the key switches as well, but this is a bigger hassle).
Checks, my specific model of keyboard is apparently dated from around 2012-2016 (with its "successor" released in 2014). And, seemingly a lot more expensive now than when it was new...
>
it's not fussy about semicolons,
>
Of the languages I know in detail and am experienced in, none
is "fussy" about semicolons. Rather it's a simple and well designed
syntactical token, whether used as separator or terminator. You've
just to put it where it's defined.
Indeed. One wonders at Bart's familiarity with formal grammars.
In my case, personally I haven't encountered much that doesn't work well enough with a recursive-descent parser.
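To illustrate with the semicolon case: in a recursive-descent parser, "semicolon as terminator" is just one check at the end of the statement rule. The toy grammar below (integers, `+`, `*`, parentheses, statements ending in `;`) and all of its names are invented for the example; it is a sketch, not taken from any particular compiler.

```c
#include <ctype.h>

typedef struct {
    const char *p;   /* current position in the source text */
    int ok;          /* cleared on the first syntax error */
} parser;

static void skip_ws(parser *ps)
{
    while (isspace((unsigned char)*ps->p))
        ps->p++;
}

static long parse_expr(parser *ps);

/* primary := NUMBER | '(' expr ')' */
static long parse_primary(parser *ps)
{
    long v = 0;

    skip_ws(ps);
    if (*ps->p == '(') {
        ps->p++;
        v = parse_expr(ps);
        skip_ws(ps);
        if (*ps->p == ')') ps->p++; else ps->ok = 0;
        return v;
    }
    if (!isdigit((unsigned char)*ps->p)) {
        ps->ok = 0;
        return 0;
    }
    while (isdigit((unsigned char)*ps->p))
        v = v * 10 + (*ps->p++ - '0');
    return v;
}

/* term := primary { '*' primary } */
static long parse_term(parser *ps)
{
    long v = parse_primary(ps);

    for (;;) {
        skip_ws(ps);
        if (*ps->p != '*')
            return v;
        ps->p++;
        v *= parse_primary(ps);
    }
}

/* expr := term { '+' term } */
static long parse_expr(parser *ps)
{
    long v = parse_term(ps);

    for (;;) {
        skip_ws(ps);
        if (*ps->p != '+')
            return v;
        ps->p++;
        v += parse_term(ps);
    }
}

/* program := { expr ';' }
   Returns the value of the last statement; *ok is 0 on any syntax error,
   including a missing terminating semicolon. */
static long parse_program(const char *src, int *ok)
{
    parser ps = { src, 1 };
    long last = 0;

    skip_ws(&ps);
    while (ps.ok && *ps.p != '\0') {
        last = parse_expr(&ps);
        skip_ws(&ps);
        if (*ps.p == ';') ps.p++; else ps.ok = 0;   /* the terminator check */
        skip_ws(&ps);
    }
    *ok = ps.ok;
    return last;
}
```

Each grammar rule maps to one function, and making the semicolon a separator instead of a terminator would only change the loop in `parse_program`.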