On 1/17/2023 8:46 PM, rbowman wrote:
On Tue, 17 Jan 2023 12:47:29 +0000, Stephen Tucker wrote:

2. Does the IDLE in Python 3.x behave the same way?

fwiw

Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license()" for more information.
str = ""
for c in range(157, 169):
     str += chr(c) + ""

print(str)
žŸ ¡¢£¤¥¦§¨
str = ""
for c in range(140, 169):
     str += chr(c) + " "

print(str)
Œ  Ž   ‘ ’ “ ” • – — ˜ ™ š › œ  ž Ÿ   ¡ ¢ £ ¤ ¥
¦ § ¨


I don't know how this will appear since Pan is showing the icon for a
character not in its set.  However, even with more undefined characters
the printable ones do not change. I get the same output running Python3
from the terminal so it's not an IDLE thing.

I'm not sure what explanation is being asked for here. Let's take Python3, so we can be sure that the strings are Unicode. The font being used by the console isn't mentioned, but there's no reason it should have glyphs for any random Unicode character. In my case, I see the same missing and printable characters as in the previous post (above). The font is Source Code Pro Medium.
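To see why some of these code points have nothing to display, it can help to ask Python what Unicode says about each one. This is only a sketch using the standard unicodedata module; the helper name describe is made up for illustration. Code points 128 through 159 are C1 control characters (category "Cc"), so a font has no reason to carry a glyph for them.

```python
import unicodedata

def describe(cp):
    """Return (code point, general category, name) for one code point.

    'describe' is a hypothetical helper, not something from the original post.
    """
    ch = chr(cp)
    return cp, unicodedata.category(ch), unicodedata.name(ch, "<no name>")

# Same range as the second loop in the quoted session.
rows = [describe(cp) for cp in range(140, 169)]
for cp, cat, name in rows:
    print(f"U+{cp:04X} {cat} {name}")
```

Anything listed with category "Cc" is a control character, and how a console or editor renders it (nothing, a box, a placeholder icon) is up to that program and its font.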

Changing the console's code page won't magically provide the missing glyphs.

I wrote these characters to a file using UTF-8 encoding and opened it in an editor that recognized the content as UTF-8 (EditPlus). It displayed the same characters but had fewer leading spaces (i.e., missing glyphs), and did not show any default "missing-character" glyphs. The editor is using the Cousine font.
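For anyone who wants to repeat the file experiment, something along these lines should do it. This is a rough sketch, not the exact script I ran: the helper name round_trip is invented, and a temporary file stands in for whatever file name you would actually use.

```python
import os
import tempfile

# Same code points as the quoted session, separated by single spaces.
text = " ".join(chr(c) for c in range(140, 169))

def round_trip(s):
    """Write s to a temporary file as UTF-8 and return the raw bytes.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write(s)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

raw = round_trip(text)
print(len(text), "characters ->", len(raw), "bytes")
```

Each of these code points takes two bytes in UTF-8, so the file is longer in bytes than the string is in characters. What an editor then draws for those bytes depends on its font, not on Python.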

The second factor that could be in play is what the default character encoding is, which is set by Windows and could be different in different places (locales). I don't recall just now how Python3 handles this. Since Python2 strings are not Unicode unless specified, and Python2 probably handles the locale/default encoding differently from Python3, it would not be a surprise if the two give different results.
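One way to find out what encodings a given Python3 installation has picked up is to ask it. A small sketch, using only standard-library calls; the function name encoding_report is made up here, and the values will differ from machine to machine.

```python
import locale
import sys

def encoding_report():
    """Collect the encodings Python is currently using.

    Hypothetical helper; results depend on the OS, locale and Python version.
    """
    return {
        # May be None if stdout has been replaced by something without an encoding.
        "stdout": getattr(sys.stdout, "encoding", None),
        "filesystem": sys.getfilesystemencoding(),
        "locale": locale.getpreferredencoding(False),
    }

report = encoding_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

Running this in the console and again inside IDLE is a quick way to check whether the two environments agree.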

If you print such a Python2 string, you will get glyphs for (non-ASCII) ord(chr) > 127 that come from the Windows code page table, which will be different from what Python3 will display.
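The code page effect can be imitated in Python3 by treating the values as bytes and decoding them with different code pages. This is only an illustration: cp1252 and cp437 are picked as two common Windows code pages, and the original question didn't say which one was actually in effect.

```python
# The same byte values a Python2 str would hold for range(140, 169).
raw = bytes(range(140, 169))

# Interpreted through two (assumed) Windows code pages.
# cp1252 leaves a few byte values undefined, hence errors="replace".
as_cp1252 = raw.decode("cp1252", errors="replace")
as_cp437 = raw.decode("cp437")

# What Python3's chr() gives: the Unicode code points with the same numbers.
as_unicode = "".join(chr(b) for b in raw)

for label, s in [("cp1252", as_cp1252), ("cp437", as_cp437), ("chr()", as_unicode)]:
    print(label, [f"U+{ord(ch):04X}" for ch in s[:4]])
```

Byte 140 comes out as U+0152 under cp1252 but chr(140) is the control character U+008C, so the same number can end up as quite different characters depending on which path the data took.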

Python3 uses Windows Unicode API functions, and isn't subject to the same limitations as Python2 was - Python2 had to go through the Windows code page apparatus and didn't use the Unicode API. See PEP 528 (https://peps.python.org/pep-0528/).

IDLE sets up its own window, and probably uses a different font from the default Windows console, so there could be some differences there too, especially as to whether missing glyphs show a visible symbol or not.

Code Page 65001 was often claimed to be for UTF-8. That's not really correct in general, but it's OK for many UTF-8 characters. But in Python2, the codecs module does not know about code page 65001 - unless you apply a simple patch - so if you try to set the console to cp65001, you cannot get anything printed. You get an exception raised instead.
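Whether a particular Python knows about cp65001 can be checked with codecs.lookup. A minimal sketch; the helper name codec_name is invented, and what it returns for "cp65001" depends on the Python version and platform, so treat any specific result as something to verify locally.

```python
import codecs

def codec_name(label):
    """Return the canonical codec name for label, or None if it is unknown.

    Hypothetical helper wrapping codecs.lookup().
    """
    try:
        return codecs.lookup(label).name
    except LookupError:
        return None

for label in ("utf-8", "cp1252", "cp65001"):
    print(label, "->", codec_name(label))
```

A result of None corresponds to the situation described above, where the interpreter has no codec registered under that name and encoding to the console fails.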

Yes, it's all confusing, and especially with Python2.


--
https://mail.python.org/mailman/listinfo/python-list
