Dave Malcolm <dmalc...@redhat.com> added the comment:

I'm attaching a new version of the patch, for the py3k branch.

I've gone back to putting the breakpoint on "id" / "builtin_id", as in my 
original patch.  I prefer it because it takes a single argument, which makes it 
very convenient to work with in the various tests.  textiowrapper_write takes 
an args tuple, which makes things like corrupting the pointer slightly 
trickier.

The big change here is that I've changed the output format throughout to try to 
emulate Python 3 literals: a PyLongObject instance is now printed as digits, 
without a trailing "L".  I feel that the fact that gdb is running python 2 is 
really just an implementation detail, and that the pretty-printer ought to 
print in a format reflecting the language being debugged.

This also removes the 'u' prefix from strings, and I've added tests for 'bytes' 
(which get a "b" prefix).  I've also (I believe) correctly implemented 
Python 3's literal representation for empty and non-empty sets and frozensets 
(e.g. "{1, 2, 3}", as opposed to Python 2's "set([1, 2, 3])").
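
To illustrate the target format, here is a minimal sketch of the kind of 
rendering described above.  It is not the code from the patch: the name 
"py3_literal" is hypothetical, it runs under Python 3 itself (so str and bytes 
can simply defer to repr(), whereas the real hooks have to emulate this from 
gdb's Python 2), and it sorts set members purely so the output is reproducible 
(the real repr() follows iteration order).

```python
def _join_members(members):
    # Sorting the rendered members is an assumption made for deterministic
    # output; Python 3's own repr() uses the set's iteration order.
    return ", ".join(sorted(py3_literal(m) for m in members))


def py3_literal(value):
    """Render a value in Python 3 literal style (hypothetical helper)."""
    if isinstance(value, frozenset):
        if not value:
            return "frozenset()"
        return "frozenset({%s})" % _join_members(value)
    if isinstance(value, set):
        # "{}" would be an empty dict, so an empty set is spelled "set()".
        if not value:
            return "set()"
        return "{%s}" % _join_members(value)
    if isinstance(value, int):
        # Plain digits, with no trailing "L" as in Python 2 longs.
        return str(value)
    # Under Python 3, repr() gives 'text' for str and b'data' for bytes.
    return repr(value)
```

For example, py3_literal({1, 2, 3}) gives "{1, 2, 3}" and py3_literal(set()) 
gives "set()".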

More controversially, a PyUnicodeObject instance is printed using an emulation 
of Python 3's unicode_repr algorithm.  This means that gdb writes unicode to 
sys.stdout and will potentially print non-ASCII characters, using the encoding 
of sys.stdout.  This will only work if gdb's encoding is set to something that 
can cope with said characters:

Python 3.2a0 (py3k:80312M, Apr 21 2010, 17:00:02) 
[GCC 4.4.3 20100127 (Red Hat 4.4.3-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> id('文字化け')

Breakpoint 1, builtin_id (self=<module at remote 0x7ffff7fd7df8>, v='文字化け') at 
Python/bltinmodule.c:912
912             return PyLong_FromVoidPtr(v);

Note the unicode characters in the rendering of "v" in the breakpoint.

I suspect that this is a change too far (for example, I'm assuming a UTF-8 
locale).

Any suggestions on what the output should look like for the unicode case?  

Would it be better if I coerce everything back to an escaped literal syntax 
that's encodable as ASCII?  That would probably avoid encoding and locale 
issues, but lose immediate readability for people able to read non-ASCII 
scripts.
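
One possible middle ground, sketched below under stated assumptions, is to keep 
the readable form when the output encoding can represent it and fall back to 
an ASCII-escaped literal otherwise.  This is only an illustration in Python 3, 
not what the patch does: "render_str" is a hypothetical name, and the encoding 
is passed in explicitly rather than read from sys.stdout.

```python
def render_str(text, encoding):
    """Return a Python 3 style literal for text (hypothetical helper).

    The readable repr() is used if `encoding` can represent it; otherwise
    the result is the backslash-escaped, pure-ASCII form from ascii().
    """
    readable = repr(text)
    try:
        # A missing encoding (None) is treated as ASCII, the safest guess.
        readable.encode(encoding or "ascii")
    except UnicodeEncodeError:
        return ascii(text)
    return readable
```

With a UTF-8 encoding, render_str('文字化け', 'utf-8') keeps the characters as 
they are; with 'ascii' the same string comes back as \uXXXX escapes.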

All tests pass with both UCS2 and UCS4 builds on this Fedora 12 x86_64 box, 
building with --with-pydebug in both cases.

----------
Added file: http://bugs.python.org/file17031/port-gdb7-hooks-to-py3k-002.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8380>
_______________________________________
