Subject: Re: Rationale for aligning data on even bytes in a Unix shell file?
From: janis_papanagnou+ng (at) *nospam* hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.c
Date: 27. Apr 2025, 20:55:24
Organization: A noiseless patient Spider
Message-ID: <vum23d$1hkjs$1@dont-email.me>
References: 1 2 3
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

On 27.04.2025 20:32, Bonita Montero wrote:
> On 26.04.2025 17:00, Scott Lurndal wrote:
>> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
>>> In a "C" file (of the Kornshell software) I stumbled across this
>>> comment: "Each command in the history file starts on an even byte
>>> and is null-terminated."
>>> I wonder what's the reason behind that even-byte-alignment, on "C"
>>> level or on Unix/files level. Any ideas?
>> Possibly to support 16-bit character sets?
> Unix has a big problem that it doesn't support 16 bit character sets.
> Win32 supported UCS-2 from the beginning and UTF-16 afaik since Windows
> 2000.

What would be the advantage of a 16 bit encoding? (As opposed to,
say, UTF-8.)

> With Unix there's even not a standard charset for the filesystem;
> each filename character is just an octet.

I think we have to distinguish the technical base unit, an octet,
from the actual filenames. My Linux has no problem representing,
say, filenames with Chinese characters or German umlauts, each of
which needs two or more octets.
That this is possible in the first place (as I was told some
years ago) is a consequence of filenames not being tied to any
specific character set.
Janis