Re: Tcl9: source files are interpreted as utf-8 by default

Subject: Re: Tcl9: source files are interpreted as utf-8 by default
From: rich (at) *nospam* example.invalid (Rich)
Newsgroups: comp.lang.tcl
Date: 9 Jan 2025, 04:57:14
Organization: A noiseless patient Spider
Message-ID: <vlnheq$36t4o$1@dont-email.me>
User-Agent: tin/2.6.1-20211226 ("Convalmore") (Linux/5.15.139 (x86_64))

Luc <luc@sep.invalid> wrote:
> On Wed, 8 Jan 2025 22:53:40 -0000 (UTC), Rich wrote:
>
>>> Won't that cause problems if the system is iso-8859-1?
>>
>> Only if Windows tries to interpret the UTF-8 data as iso-8859
>> characters.  But as far as the Tcl scripts go, once the scripts are
>> UTF-8, and [source] is using UTF-8 to read them, the fact that the
>> Windows system might be iso-8859 is irrelevant.
>
> I was thinking that if the Windows user edits the file on Windows,
> maybe Windows will write it as iso-8859. I honestly don't know.

I don't know what Windows does either.  A /reasonable/ approach (which
likely means Windows deliberately does not do this) is to write it as
the "system encoding" unless the user explicitly says to use something
else, or unless something in the file indicates it was originally some
other encoding.
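
A minimal sketch of that "reasonable" policy, written in Python purely
for illustration.  The function name choose_save_encoding and its
parameters are hypothetical, and using a UTF-8 byte order mark as the
"something in the file" hint is an assumption; it is one possible hint,
not a description of what any particular editor does.

```python
import codecs
import locale


def choose_save_encoding(original_bytes, user_choice=None, system_encoding=None):
    """Pick the encoding to use when saving an edited file (hypothetical helper)."""
    # 1. An explicit choice by the user always wins.
    if user_choice:
        return user_choice
    # 2. A UTF-8 byte order mark suggests the file was UTF-8 to begin with.
    if original_bytes.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # 3. Otherwise fall back to the system encoding.
    return system_encoding or locale.getpreferredencoding(False)


# A file with no hint, on a system configured for iso-8859-1:
print(choose_save_encoding(b"puts hello", system_encoding="iso-8859-1"))
# The same system, but the file started with a UTF-8 BOM:
print(choose_save_encoding(codecs.BOM_UTF8 + b"puts hello",
                           system_encoding="iso-8859-1"))
```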

>> 8.6.6 handled Unicode fine.  In fact, 8.5 handled Unicode (so long as
>> one stuck to the BMP) just fine.
>
> I am positive that 8.6.6 only partially supports Unicode.

What were the codepoint values at issue?  8.6.6 worked fine with the
BMP (code points 0000 to FFFF) characters.

> I found many characters that would not display correctly on a text
> widget

Display depends upon whether the font being used has a glyph for the
code point: no glyph in the font, no display in the text widget (even
though 8.6.6 likely transparently handled the code point properly,
assuming it was within the BMP).

> and would be saved as garbled content if captured in the widget and
> written to file.

That also depends upon what your system encoding was set to, and
whether you forced a specific encoding when writing the file.  If the
code points were in the BMP, and you explicitly set utf-8 encoding
before writing to the file, then the file's contents were properly
encoded even as far back as 8.5 (I know this one because I processed
millions of utf-8 files with only BMP code points through 8.5 for $work
with zero utf-8 encoding issues).
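
The effect of forcing the encoding can be shown with a short sketch,
in Python for illustration only (the sample text and file name are
made up).  The same BMP-only text is written as UTF-8, then read back
once as UTF-8 and once as iso-8859-1.  The second read does not fail;
it just produces the kind of garbled content described above.

```python
import os
import tempfile

# Every character in this sample is inside the BMP.
text = "Grüße, café, 日本語"

path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Force UTF-8 on write instead of relying on the system encoding.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Reading with the same encoding round-trips exactly.
with open(path, encoding="utf-8") as f:
    round_trip = f.read()

# Reading the same bytes as iso-8859-1 succeeds but misinterprets
# every multi-byte sequence as several one-byte characters.
with open(path, encoding="iso-8859-1") as f:
    garbled = f.read()

print(round_trip == text)   # the UTF-8 read matches the original
print(garbled == text)      # the iso-8859-1 read does not
```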

> I even had problems with glob and other commands when applied to
> some file names. For example, some html page I had downloaded from
> somewhere had something to do with countries and the page title had
> Unicode flags in the title, so the title and the flags carried over
> to the file name when I saved it.

Country flags are very likely characters that are beyond the BMP, and
yes, 8.6.6 likely did not handle those properly.

> The complete implementation of Unicode begins in 8.6.10 or 8.6.13, I
> can't remember which, I think it's 8.6.13.

That is probably when support for the extended Unicode characters
(planes beyond the BMP) started to be added.

