BGB <cr88192@gmail.com> writes:
>>> On 5/8/2025 6:13 AM, Janis Papanagnou wrote:
>>>> On 08.05.2025 05:30, BGB wrote:
>>>>> [...]
>>>>>
>>>>> Though, even for the Latin alphabet, once one goes much outside of ASCII
>>>>> and Latin-1, it gets messy.
>>>>
>>>> I noticed that in several places you were referring to Latin-1. Since
>>>> decades that has been replaced by the Latin-9 (ISO 8859-15) character
>>>> set[*] for practical reasons ('€' sign, for example).
>>>>
>>>> Why is your focus still on the old Latin-1 (ISO 8859-1) character
>>>> set?
>>>>
>>>> Janis, just curious
>>>>
>>>> [*] Unless Unicode and its encodings are used.
>>>
>>> https://en.wikipedia.org/wiki/Latin-1_Supplement
>>>
>>> U+00A0..U+00FF are designated as Latin-1 in Unicode.
>>
>> It's true that those characters have the same names in Unicode
>> as in Latin-1. Though the Wikipedia article says that the ranges
>> 0x00..0x1F and 0x7F..0x9F are *undefined*. (That doesn't match my
>> recollection; I thought they were defined as control characters.)
>
> 0000..001F, usually understood as C0 control codes.
>
>> In any case, Latin-1 and Latin-9 treat those ranges in the same way.
>> Both can be seen as encodings for small subsets of Unicode.
>
> Latin-9 does not exactly match up with U+00A0..U+00FF though, whereas
> for Latin-1, it does match up.
>
>> [...]
>>
>>> CP-1252, is the dominant remaining ASCII character set in use, is
>>> based on Latin-1, with a few characters from Latin-15 shoved into the
>>> places where control codes previously went.
>>
>> CP-1252 is not an ASCII character set. ASCII is a 7-bit character set.
>> CP-1252 is an 8-bit character set as are the Latin-* sets. Most 8-bit
>> sets are *based on* ASCII.
>
> It is 8-bit and byte-based, and informally I think, most extended-ASCII
> codepages were collectively known as ASCII even if only the low 7-bit
> range is ASCII proper (and I think more for sake of contrast with "Not
> Unicode", eg, UTF-8 / UTF-16 / UCS-2 / ...).

I don't think that's accurate. Do you have a reference for that?
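
For anyone who wants to check the mapping claims quoted above, here is a minimal sketch in Python. It assumes the codec tables bundled with CPython (`latin-1`, `iso8859-15`, `cp1252`) are faithful to the respective standards; the helper name `decode_or_none` and the variable names are made up for illustration and appear nowhere in the thread.

```python
def decode_or_none(value, codec):
    """Decode one byte value with the named codec; return None if unassigned."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


HIGH = range(0xA0, 0x100)  # printable upper half of the ISO 8859 sets
C1 = range(0x80, 0xA0)     # C1 control range in ISO 8859-1 and ISO 8859-15

# Latin-1: each byte in 0xA0..0xFF decodes to the same-numbered code point.
latin1_matches = all(decode_or_none(b, "latin-1") == chr(b) for b in HIGH)

# Latin-9: collect the bytes whose character is NOT the same-numbered code point.
latin9_diffs = {
    b: decode_or_none(b, "iso8859-15")
    for b in HIGH
    if decode_or_none(b, "iso8859-15") != chr(b)
}

# CP-1252: agrees with ASCII below 0x80 and with Latin-1 from 0xA0 upward.
cp1252_low_ok = all(decode_or_none(b, "cp1252") == chr(b) for b in range(0x80))
cp1252_high_ok = all(decode_or_none(b, "cp1252") == chr(b) for b in HIGH)

# In 0x80..0x9F, CP-1252 assigns printable characters where Latin-1 has C1 controls.
cp1252_c1 = {b: decode_or_none(b, "cp1252") for b in C1}
unassigned = sorted(b for b, ch in cp1252_c1.items() if ch is None)

# Characters found both among Latin-9's differences and in CP-1252's 0x80..0x9F.
shared = set(latin9_diffs.values()) & set(cp1252_c1.values())

print("Latin-1 matches U+00A0..U+00FF:", latin1_matches)
for b, ch in sorted(latin9_diffs.items()):
    print(f"Latin-9 0x{b:02X} -> U+{ord(ch):04X}")
print("CP-1252 matches ASCII in 0x00..0x7F:", cp1252_low_ok)
print("CP-1252 matches Latin-1 in 0xA0..0xFF:", cp1252_high_ok)
print("CP-1252 unassigned in 0x80..0x9F:", [f"0x{b:02X}" for b in unassigned])
print("Latin-9 differences also present in CP-1252 0x80..0x9F:", len(shared))
```

With CPython's tables this should report eight positions where Latin-9 departs from U+00A0..U+00FF (0xA4 mapping to U+20AC, the euro sign, among them), and the same eight characters turn up in CP-1252's 0x80..0x9F range. It says nothing about which character set borrowed from which, only that the tables overlap.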