Terry J. Reedy <tjre...@udel.edu> added the comment:

[Yes, indexing will still be O(1), though I personally consider that less 
important than most make it out to be. Consistency across platforms and the 
total time and space performance of typical apps should be the concern. There 
is ongoing work on improving the new implementation. Some operations already 
take less space and run faster.]

The traceback may very well be helpful. It implies that copying a supplemental 
char does not produce proper utf-8 encoded bytes. Or if it does, tkinter (or tk 
underneath it) does not recognize them. But then the problem should be the 
initial byte, not the continuation bytes, which are the same for all chars and 
which all have 10 for their two high order bits. See
https://secure.wikimedia.org/wikipedia/en/wiki/Utf-8
for a fuller explanation.
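
To make the point about continuation bytes concrete, here is a small sketch 
that encodes one supplementary char and checks the bit patterns. U+1D11E is an 
arbitrary choice for illustration, not the char from the report, and the 
helper names are made up.

```python
def utf8_layout(ch):
    """Return (lead_byte, continuation_bytes) for a single character."""
    data = ch.encode('utf-8')
    return data[0], list(data[1:])


def is_continuation(byte):
    # Every utf-8 continuation byte has the form 10xxxxxx.
    return (byte & 0b11000000) == 0b10000000


# U+1D11E is an arbitrary supplementary (non-BMP) char used for illustration.
lead, cont = utf8_layout('\U0001D11E')
print(hex(lead), [hex(b) for b in cont])
print(all(is_continuation(b) for b in cont))
```

Only the initial byte (11110xxx for a 4-byte sequence) says how long the 
sequence is, which is why a complaint about a continuation byte is surprising.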

Line 1009 is the definition of Misc.mainloop(). I believe self.tk represents 
the embedded tcl interpreter, which is a black box from Python's viewpoint. 
Perhaps we should wrap the call with

try:
    self.tk.mainloop(n)
except Exception:
    # Print the error message with all info attached to the
    # exception before exiting (requires "import traceback").
    traceback.print_exc()
    raise

This should catch any miscellaneous crashes which are not otherwise caught and 
maybe turn the crash issues into bug reports -- the same way that running from 
the command line did. (It will still be good to catch what we can at error 
sites and give better, more specific messages.) (What I am not familiar with is 
how the command line interpreter might turn a tcl error into a python exception 
and why IDLE does not.)
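
A self-contained sketch of that wrapping idea, runnable without a display, 
might look like the following. The names guarded_mainloop and fake_mainloop are 
hypothetical, and the UnicodeDecodeError raised by the stand-in is only an 
assumed example of what the real call might raise, not what the traceback in 
this issue necessarily shows.

```python
import io
import sys
import traceback


def guarded_mainloop(mainloop, n=0, stream=None):
    """Call mainloop(n); on any Exception, report it and return False."""
    if stream is None:
        stream = sys.stderr
    try:
        mainloop(n)
    except Exception as e:
        # Report everything attached to e instead of dying silently.
        print('mainloop failed: {!r}'.format(e), file=stream)
        traceback.print_exception(type(e), e, e.__traceback__, file=stream)
        return False
    return True


def fake_mainloop(n):
    # Hypothetical stand-in for self.tk.mainloop(n) that fails while decoding.
    raise UnicodeDecodeError('utf-8', b'\xed\xa0\x80', 0, 1,
                             'invalid continuation byte')


report = io.StringIO()
ok = guarded_mainloop(fake_mainloop, stream=report)
print(ok)
print(report.getvalue())
```

In real use the callable would be self.tk.mainloop. This only helps if the 
failure reaches Python as an exception rather than as a hard crash inside tcl.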

When I copy 'š’¢' and paste into the command line interpreter or Notepad++, I get 
'??'. I am guessing that '??' represents a surrogate pair and that Windows 
encodes each surrogate separately. The result would be 'illegal' utf-8 with 
illegal continuation bytes. An application can choose to decode the 'illegal' 
utf-8 -- or not. Python can when errors='surrogatepass' is specified. It might 
be possible to access the raw undecoded bytes of the clipboard with the third 
party pythonwin module. I do not know if there is any way to do so with tk.
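
The guess about separately encoded surrogates can be reproduced in Python 
itself. This is only a sketch of the hypothesis, not a claim about what Windows 
or tk actually put on the clipboard. U+1D11E is again an arbitrary example 
char. The error handler that accepts such bytes is spelled 'surrogatepass', 
and using it with the utf-16 codecs assumes Python 3.4 or later.

```python
ch = '\U0001D11E'  # arbitrary supplementary char, for illustration only

# Split the code point into its UTF-16 surrogate pair.
offset = ord(ch) - 0x10000
high = 0xD800 + (offset >> 10)
low = 0xDC00 + (offset & 0x3FF)

# Encode each surrogate on its own, as the guess supposes.
bad = b''.join(chr(s).encode('utf-8', 'surrogatepass') for s in (high, low))
print(bad)

# A strict utf-8 decoder rejects the result.
try:
    bad.decode('utf-8')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient decode yields two lone surrogates ...
lenient = bad.decode('utf-8', 'surrogatepass')

# ... which a round trip through utf-16 joins back into one char.
repaired = lenient.encode('utf-16-le', 'surrogatepass').decode('utf-16-le')
print(strict_ok, len(lenient), repaired == ch)
```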

I wonder if tcl is calling back to Python for decoding and whether there was a 
change in the default for errors or the callback specification that would 
explain a change from 2.7 to 3.2.

Ezio, do you know anything about these speculations?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13153>
_______________________________________