Subject : Re: Decoding bytes to text strings in Python 2
From : usenet202101 (at) *nospam* magic-cookie.co.ukNOSPAMPLEASE (Rayner Lucas)
Newsgroups : comp.lang.python
Date : 22. Jun 2024, 14:26:00
Organization : The Lumber Cartel (TINLC)
Message-ID : <MPG.40e0d331681f012e9896e1@news.eternal-september.org>
References : 1 2
User-Agent : MicroPlanet-Gravity/3.0.4
In article <Text-20240621184010@ram.dialup.fu-berlin.de>,
ram@zedat.fu-berlin.de says...
> I didn't really do a super thorough deep dive on this,
> but I'm just giving the initial impression without
> actually being familiar with Tkinter under Python 2,
> so I might be wrong!
>
> The Text widget typically expects text in Tcl encoding,
> which is usually UTF-8.
>
> This is independent of the result returned by
> sys.getdefaultencoding()!
>
> If a UTF-8 string is inserted directly as a bytes object,
> its code points will be displayed correctly by the Text
> widget as long as they are in the BMP (Basic Multilingual
> Plane), as you already found out yourself.
Many thanks, you've helped me greatly in understanding what's happening.
When I tried running my example code on a different system (Python
2.7.18 on Linux, with Tcl/Tk 8.5), I got the error:
_tkinter.TclError: character U+1f40d is above the range (U+0000-U+FFFF) allowed by Tcl
So, as your reply suggests, the problem is ultimately a limitation of
Tcl/Tk itself. Perhaps I should have spent more time studying the docs
for that instead of puzzling over the details of character encodings in
Python! I'm not sure why it doesn't give the same error on Windows, but
at least now I know where the root of the issue is.
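One possible way to sidestep that error is to decode the bytes explicitly and then swap out anything above U+FFFF before the text reaches the widget. The following is only a minimal sketch under assumptions that are not stated in this thread: it targets Python 3 (where each element of a str is a full code point), the input is valid UTF-8, and losing the affected characters is acceptable. The helper name replace_non_bmp and the choice of U+FFFD as placeholder are hypothetical, not part of Tkinter.

```python
def replace_non_bmp(text, placeholder="\ufffd"):
    """Return text with every code point above U+FFFF replaced.

    Hypothetical helper: U+FFFD (REPLACEMENT CHARACTER) is an
    arbitrary choice of placeholder, not something Tk requires.
    """
    return "".join(ch if ord(ch) <= 0xFFFF else placeholder for ch in text)


# UTF-8 bytes ending in U+1F40D, the character from the error message.
raw = b"snake: \xf0\x9f\x90\x8d"

# Decode explicitly instead of relying on sys.getdefaultencoding().
text = raw.decode("utf-8")

# Every remaining character is inside the BMP.
safe = replace_non_bmp(text)
```

The resulting string could then be passed to something like the Text widget's insert() method; whether a given Tcl/Tk build actually needs this filtering is something to check on the target system.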
I am now much better informed about how to migrate the code I'm working
on, so I am very grateful for your help.
Thanks,
Rayner