Sujet : Re: Byte Addressability And Beyond
De : monnier (at) *nospam* iro.umontreal.ca (Stefan Monnier)
Groupes : comp.archDate : 04. Jun 2024, 21:28:00
Autres entêtes
Organisation : A noiseless patient Spider
Message-ID : <jwv7cf4mpug.fsf-monnier+comp.arch@gnu.org>
References : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
User-Agent : Gnus/5.13 (Gnus v5.13)
If I want to validate combiner codes or normalize characters I need
UTF-32 because I have to work with the whole character as a unit.
You can read the code points directly from the UTF-8 sequence almost
as easily as you can from a UTF-32 sequence.
Most of the cost will be in the memory accesses and then in looking up the
various tables to decide how to normalize or whether it's valid, so the
difference between reading the info from UTF-32 or UTF-8 should be lost in
the noise.
UTF-32 might be marginally faster at this specific operation in some
cases (definitely not if your text is mostly ASCII), but I'd be very
surprised if the difference is ever large enough to pay for a conversion
from UTF-8 to UTF-32.
I was just trying to get people thinking of ways that malformed
characters might be used to bypass other validation checks in
their software.
Another issue with Unicode is the so-called "confusables": things that
may look identical (or close enough) on screen yet are different (and
not just because of normalization). E.g. Β vs B, А vs A, or ∕ vs / vs ⁄.
Unicode comes with a 700kB `confusables.txt` listing such issues.
Stefan
Date | Sujet | # | | Auteur |
1 May 24 | Byte Addressability And Beyond | 590 | | Lawrence D'Oliveiro |
1 May 24 | Re: Byte Addressability And Beyond | 431 | | John Levine |
1 May 24 | Re: Byte Addressability And Beyond | 409 | | Lawrence D'Oliveiro |
1 May 24 | Re: Byte Addressability And Beyond | 3 | | John Levine |
1 May 24 | Re: Byte Addressability And Beyond | 1 | | John Levine |
1 May 24 | Re: Byte Addressability And Beyond | 1 | | Lawrence D'Oliveiro |
1 May 24 | Re: Byte Addressability And Beyond | 1 | | Michael S |
1 May 24 | Re: Byte Addressability And Beyond | 404 | | John Levine |
2 May 24 | Re: Byte Addressability And Beyond | 382 | | Lawrence D'Oliveiro |
2 May 24 | Re: Byte Addressability And Beyond | 4 | | John Levine |
2 May 24 | Re: Byte Addressability And Beyond | 3 | | Lawrence D'Oliveiro |
2 May 24 | Re: Byte Addressability And Beyond | 2 | | John Levine |
5 May 24 | Re: Byte Addressability And Beyond | 1 | | Lawrence D'Oliveiro |
2 May 24 | Re: Byte Addressability And Beyond | 367 | | John Savard |
2 May 24 | Re: Byte Addressability And Beyond | 2 | | MitchAlsup1 |
11 May 24 | Re: Byte Addressability And Beyond | 1 | | John Savard |
4 May 24 | Re: Byte Addressability And Beyond | 364 | | Lawrence D'Oliveiro |
8 May 24 | Re: Byte Addressability And Beyond | 363 | | John Savard |
8 May 24 | Re: Byte Addressability And Beyond | 2 | | Lawrence D'Oliveiro |
10 May 24 | Re: Byte Addressability And Beyond | 1 | | David Brown |
8 May 24 | Re: Byte Addressability And Beyond | 360 | | MitchAlsup1 |
8 May 24 | Re: Byte Addressability And Beyond | 359 | | John Levine |
8 May 24 | Re: Byte Addressability And Beyond | 357 | | Lawrence D'Oliveiro |
9 May 24 | Re: Byte Addressability And Beyond | 356 | | John Levine |
10 May 24 | Re: Byte Addressability And Beyond | 354 | | David Brown |
10 May 24 | Re: Byte Addressability And Beyond | 353 | | Anton Ertl |
11 May 24 | Re: Byte Addressability And Beyond | 352 | | David Brown |
11 May 24 | Re: Byte Addressability And Beyond | 351 | | Anton Ertl |
11 May 24 | Re: Byte Addressability And Beyond | 158 | | David Brown |
11 May 24 | Re: Byte Addressability And Beyond | 1 | | Anton Ertl |
27 May 24 | Re: Byte Addressability And Beyond | 156 | | Lawrence D'Oliveiro |
27 May 24 | Re: Byte Addressability And Beyond | 155 | | John Levine |
27 May 24 | Re: Byte Addressability And Beyond | 154 | | Lawrence D'Oliveiro |
27 May 24 | Re: Byte Addressability And Beyond | 153 | | John Levine |
27 May 24 | Re: Byte Addressability And Beyond | 149 | | John Levine |
27 May 24 | Re: Byte Addressability And Beyond | 1 | | MitchAlsup1 |
28 May 24 | Re: Byte Addressability And Beyond | 147 | | Lawrence D'Oliveiro |
28 May 24 | Re: encoding conversion, Byte Addressability And Beyond | 1 | | John Levine |
28 May 24 | Re: Byte Addressability And Beyond | 145 | | Thomas Koenig |
29 May 24 | Re: Byte Addressability And Beyond | 137 | | Lawrence D'Oliveiro |
29 May 24 | Re: Byte Addressability And Beyond | 136 | | Anton Ertl |
29 May 24 | Re: Byte Addressability And Beyond | 12 | | Stefan Monnier |
29 May 24 | Re: Byte Addressability And Beyond | 10 | | Stefan Monnier |
29 May 24 | Re: Byte Addressability And Beyond | 3 | | John Levine |
30 May 24 | Re: Byte Addressability And Beyond | 2 | | George Neuner |
4 Jun 24 | Re: Byte Addressability And Beyond | 1 | | George Neuner |
30 May 24 | Re: Byte Addressability And Beyond | 6 | | Anton Ertl |
4 Jun 24 | Re: Byte Addressability And Beyond | 1 | | Lawrence D'Oliveiro |
4 Jun 24 | Re: Byte Addressability And Beyond | 4 | | Stefan Monnier |
7 Jun 24 | Re: Byte Addressability And Beyond | 1 | | Terje Mathisen |
7 Jun 24 | Re: Character non-equivalence, was Byte Addressability And Beyond | 2 | | John Levine |
9 Jun 24 | Re: Character non-equivalence, was Byte Addressability And Beyond | 1 | | Lawrence D'Oliveiro |
30 May 24 | Re: Byte Addressability And Beyond | 1 | | Lawrence D'Oliveiro |
30 May 24 | Re: Byte Addressability And Beyond | 117 | | Lawrence D'Oliveiro |
30 May 24 | Re: architectural goals, Byte Addressability And Beyond | 66 | | John Levine |
30 May 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Stephen Fuld |
30 May 24 | Re: architectural goals, Byte Addressability And Beyond | 22 | | Anton Ertl |
30 May 24 | Re: architectural goals, Byte Addressability And Beyond | 21 | | Thomas Koenig |
30 May 24 | Re: architectural goals, Byte Addressability And Beyond | 8 | | Michael S |
30 May 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Thomas Koenig |
30 May 24 | Re: IBM architectural goals, Byte Addressability And Beyond | 5 | | John Levine |
30 May 24 | Re: IBM architectural goals, Byte Addressability And Beyond | 2 | | Michael S |
30 May 24 | Re: IBM architectural goals, Byte Addressability And Beyond | 1 | | John Levine |
30 May 24 | Re: IBM architectural goals, Byte Addressability And Beyond | 2 | | Thomas Koenig |
30 May 24 | Re: IBM architectural goals, Byte Addressability And Beyond | 1 | | John Levine |
30 May 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Anton Ertl |
30 May 24 | Re: architectural goals, Byte Addressability And Beyond | 3 | | Anton Ertl |
30 May 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | John Levine |
30 May 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Thomas Koenig |
31 May 24 | Re: architectural goals, Byte Addressability And Beyond | 5 | | Terje Mathisen |
1 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 4 | | Thomas Koenig |
1 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 3 | | Anton Ertl |
2 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 2 | | John Levine |
4 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Stefan Monnier |
4 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 4 | | Lawrence D'Oliveiro |
4 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | MitchAlsup1 |
4 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Lynn Wheeler |
4 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Stefan Monnier |
31 May 24 | Re: architectural goals, Byte Addressability And Beyond | 42 | | John Savard |
31 May 24 | Re: architectural goals, Byte Addressability And Beyond | 41 | | John Levine |
1 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 31 | | John Savard |
1 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 20 | | Thomas Koenig |
2 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 6 | | John Savard |
2 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 5 | | Thomas Koenig |
2 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 3 | | John Levine |
3 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 2 | | OrangeFish |
3 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | John Levine |
4 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Lawrence D'Oliveiro |
4 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 13 | | Lawrence D'Oliveiro |
5 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 12 | | Lawrence D'Oliveiro |
5 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Lawrence D'Oliveiro |
6 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 10 | | George Neuner |
6 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 6 | | John Levine |
7 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 4 | | Lawrence D'Oliveiro |
7 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 3 | | Stephen Fuld |
7 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 2 | | Lawrence D'Oliveiro |
7 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Stephen Fuld |
7 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Terje Mathisen |
6 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Lynn Wheeler |
6 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | OrangeFish |
7 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Lawrence D'Oliveiro |
2 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 10 | | John Dallman |
2 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | Michael S |
2 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 1 | | John Dallman |
4 Jun 24 | Re: architectural goals, Byte Addressability And Beyond | 7 | | Lawrence D'Oliveiro |
30 May 24 | Re: Byte Addressability And Beyond | 49 | | Stephen Fuld |
30 May 24 | Re: Byte Addressability And Beyond | 1 | | Anton Ertl |
30 May 24 | Re: Byte Addressability And Beyond | 2 | | Lawrence D'Oliveiro |
30 May 24 | Re: Byte Addressability And Beyond | 4 | | Terje Mathisen |
30 May 24 | Re: Byte Addressability And Beyond | 7 | | Terje Mathisen |
28 May 24 | Re: Byte Addressability And Beyond | 3 | | Lawrence D'Oliveiro |
12 May 24 | Re: python text, Byte Addressability And Beyond | 14 | | John Levine |
12 May 24 | Re: Byte Addressability And Beyond | 178 | | Thomas Koenig |
27 May 24 | Re: Byte Addressability And Beyond | 1 | | Lawrence D'Oliveiro |
8 May 24 | Re: Byte Addressability And Beyond | 1 | | Michael S |
2 May 24 | Re: Byte Addressability And Beyond | 10 | | MitchAlsup1 |
2 May 24 | Re: Byte Addressability And Beyond | 3 | | Michael S |
2 May 24 | Re: Byte Addressability And Beyond | 18 | | Anton Ertl |
1 May 24 | Byte Order (was: Byte Addressability And Beyond) | 4 | | Anton Ertl |
1 May 24 | Re: Byte Addressability And Beyond | 17 | | Stefan Monnier |
1 May 24 | Re: Byte Addressability And Beyond | 40 | | MitchAlsup1 |
1 May 24 | Re: Byte Addressability And Beyond | 15 | | Thomas Koenig |
1 May 24 | Re: Byte Addressability And Beyond | 3 | | Michael S |
2 May 24 | Re: Byte Addressability And Beyond | 4 | | Lawrence D'Oliveiro |
3 May 24 | Re: Byte Addressability And Beyond | 75 | | Anton Ertl |
5 May 24 | Re: Byte Addressability And Beyond | 20 | | John Savard |
5 May 24 | Re: Byte Addressability And Beyond | 1 | | John Savard |