Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-06 Thread Victor Stinner
Hi,

2015-02-06 0:27 GMT+01:00 Francis Giraldeau :
> I need to access frame members from within a signal handler for tracing
> purpose.

IMO you have a big technical or design issue here. Accessing Python
internals in a signal handler is not reliable. A signal can occur
anytime, between two instructions.

> However, the function PyUnicode_AsUTF8String() calls PyObject_Malloc(),
> which is not reentrant. If the signal handler nest over PyObject_Malloc(),
> it causes a segfault, and it could also deadlock.

Yes, the list of async-signal-safe functions is very, very short :-)
It's something like: read(), write(), and use the stack (but not too
much stack or you will get a stack overflow).

I spent many weeks implementing the faulthandler module (trying to write
a safe and portable implementation). To write the traceback, I only
use write(). But to read the traceback, I inspect Python internals,
which is completely unsafe. faulthandler is written to only be called
when something really bad happens (a "crash"), so it's not so important
if it does crash too :-)
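
As an illustration of how little a handler can safely do, here is a minimal sketch (not faulthandler's actual code) of a handler that only bumps a `volatile sig_atomic_t` counter and calls write(). The names `on_signal`, `install_handler` and `signal_count` are made up for the example, and a POSIX system is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical counter: sig_atomic_t is the one integer type the C
   standard lets a handler write to safely. */
static volatile sig_atomic_t signal_count = 0;

static void on_signal(int signum)
{
    static const char msg[] = "signal received\n";
    ssize_t n;

    (void)signum;
    signal_count++;
    /* write() is async-signal-safe; printf() and malloc() are not. */
    n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)n;  /* nothing safe to do here if the write fails */
}

/* Install the handler; returns 0 on success, -1 on error. */
static int install_handler(int signum)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signum, &sa, NULL);
}
```

Calling `install_handler(SIGUSR1)` and then `raise(SIGUSR1)` should print the message once and leave `signal_count` at 1. Anything beyond this (formatting, allocation, touching interpreter state) is where the trouble starts.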

See also the tracemalloc module which also inspects the traceback, but
it does *not* use signals (which would be unsafe). It uses hooks on
the memory allocator.

Python has sys.settrace() and sys.setprofile(). Why not use these functions?

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-06 Thread M.-A. Lemburg
On 06.02.2015 00:27, Francis Giraldeau wrote:
> I need to access frame members from within a signal handler for tracing
> purpose. My first attempt to access co_filename was like this (omitting
>  error checking):
> 
> PyFrameObject *frame = PyEval_GetFrame();
> PyObject *ob = PyUnicode_AsUTF8String(frame->f_code->co_filename);
> char *str = PyBytes_AsString(ob);
> 
> However, the function PyUnicode_AsUTF8String() calls PyObject_Malloc(),
> which is not reentrant. If the signal handler nest over PyObject_Malloc(),
> it causes a segfault, and it could also deadlock.
> 
> Instead, I access members directly:
> char *str = PyUnicode_DATA(frame->f_code->co_filename);
> size_t len = PyUnicode_GET_DATA_SIZE(frame->f_code->co_filename);
> 
> Is it safe to assume that unicode objects co_filename and co_name are
> always UTF-8 data for loaded code? I looked at the PyTokenizer_FromString()
> and it seems to convert everything to UTF-8 upfront, and I would like to
> make sure this assumption is valid.

The macros won't work in all cases, as they don't pay attention
to the different kinds used in the Unicode implementation.

I don't think there's any API you can use to extract the
underlying data without going through PyObject_Malloc()
at some point (you may be lucky if there already is a
UTF-8 version available, but it's not guaranteed).

I guess your best bet is to write your own UTF-8
codec which then copies the data to a buffer that
you can control. Have a look at Objects/stringlib/codecs.h:
utf8_encode.
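
To make the "own codec, own buffer" idea concrete, here is a rough sketch of an encoder that writes into a caller-supplied buffer and never allocates. It is not the stringlib code: the function name `ucs4_to_utf8` and its return convention are assumptions for the example, and it works on a plain `uint32_t` array rather than on a unicode object.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode n code points from src as UTF-8 into dst, which holds cap bytes
   including room for the terminating NUL. Returns the number of bytes
   written (excluding the NUL), or -1 if the buffer is too small or a
   code point is not encodable. No allocation, no library calls. */
static ptrdiff_t ucs4_to_utf8(const uint32_t *src, size_t n,
                              char *dst, size_t cap)
{
    size_t out = 0;
    size_t i;

    if (cap == 0)
        return -1;
    for (i = 0; i < n; i++) {
        uint32_t ch = src[i];
        size_t need = ch < 0x80 ? 1 : ch < 0x800 ? 2 : ch < 0x10000 ? 3 : 4;

        /* Reject surrogates and values beyond the Unicode range. */
        if (ch > 0x10FFFF || (ch >= 0xD800 && ch <= 0xDFFF))
            return -1;
        /* Always keep one byte free for the terminating NUL. */
        if (cap - out <= need)
            return -1;
        if (need == 1) {
            dst[out++] = (char)ch;
        } else if (need == 2) {
            dst[out++] = (char)(0xC0 | (ch >> 6));
            dst[out++] = (char)(0x80 | (ch & 0x3F));
        } else if (need == 3) {
            dst[out++] = (char)(0xE0 | (ch >> 12));
            dst[out++] = (char)(0x80 | ((ch >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (ch & 0x3F));
        } else {
            dst[out++] = (char)(0xF0 | (ch >> 18));
            dst[out++] = (char)(0x80 | ((ch >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((ch >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (ch & 0x3F));
        }
    }
    dst[out] = '\0';
    return (ptrdiff_t)out;
}
```

Because the destination is a fixed buffer (a static array, say), the encoder itself never reaches PyObject_Malloc(). Reading the code points out of the unicode object is a separate problem.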

Alternatively, you can copy the data to a Py_UCS4 buffer
which you allocate using code such as this (untested,
adapted from the UTF-8 encoder):

    Py_UCS4 *p;
    enum PyUnicode_Kind repkind;
    void *repdata;
    Py_ssize_t repsize, k;

    if (PyUnicode_READY(rep) < 0)
        goto error;
    repkind = PyUnicode_KIND(rep);
    repdata = PyUnicode_DATA(rep);
    repsize = PyUnicode_GET_LENGTH(rep);

    p = malloc((repsize + 1) * sizeof(Py_UCS4));
    if (p == NULL)
        goto error;
    /* Copy each code point, whatever the string's internal kind. */
    for (k = 0; k < repsize; k++)
        p[k] = PyUnicode_READ(repkind, repdata, k);
    p[repsize] = 0;  /* terminate the buffer */

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-06 Thread Armin Rigo
Hi,

On 6 February 2015 at 08:24, Maciej Fijalkowski  wrote:
> I don't think it's safe to assume f_code is properly filled by the
> time you might read it, depending a bit where you find the frame
> object. Are you sure it's not full of garbage?

Yes, before discussing how to do the utf8 decoding, we should realize
that it is really unsafe code starting from the line before.  From a
signal handler you're only supposed to read data that was written to
"volatile" fields.  So even PyEval_GetFrame(), which is done by
reading the thread state's "frame" field, is not safe: this is not a
volatile.  This means that the compiler is free to do crazy things
like *first* write into this field and *then* initialize the actual
content of the frame.  The uninitialized content may be garbage, not
just NULLs.


A bientôt,

Armin.


[Python-Dev] Summary of Python tracker Issues

2015-02-06 Thread Python tracker

ACTIVITY SUMMARY (2015-01-30 - 2015-02-06)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    4765 ( +1)
  closed 30398 (+47)
  total  35163 (+48)

Open issues with patches: 2224 


Issues opened (29)
==

#16632: Enable DEP and ASLR
http://bugs.python.org/issue16632  reopened by haypo

#23354: Loading 2 GiLOC file which raises exception causes wrong trace
http://bugs.python.org/issue23354  opened by SoniEx2

#23356: In argparse docs simplify example about argline
http://bugs.python.org/issue23356  opened by py.user

#23357: pyvenv help shows incorrect usage
http://bugs.python.org/issue23357  opened by raulcd

#23359: Speed-up set_lookkey()
http://bugs.python.org/issue23359  opened by rhettinger

#23360: Content-Type when sending data with urlopen()
http://bugs.python.org/issue23360  opened by vadmium

#23361: integer overflow in winapi_createprocess
http://bugs.python.org/issue23361  opened by pkt

#23362: integer overflow in string translate
http://bugs.python.org/issue23362  opened by pkt

#23367: integer overflow in unicodedata.normalize
http://bugs.python.org/issue23367  opened by pkt

#23368: integer overflow in _PyUnicode_AsKind
http://bugs.python.org/issue23368  opened by pkt

#23371: mimetypes initialization fails on Windows because of TypeError
http://bugs.python.org/issue23371  opened by Slion

#23372: defaultdict.fromkeys should accept a callable factory
http://bugs.python.org/issue23372  opened by justanr

#23374: pydoc 3.x raises UnicodeEncodeError on sqlite3 package
http://bugs.python.org/issue23374  opened by skip.montanaro

#23375: test_py3kwarn fails on Windows
http://bugs.python.org/issue23375  opened by serhiy.storchaka

#23376: getargs.c: redundant C-contiguity check
http://bugs.python.org/issue23376  opened by skrah

#23377: HTTPResponse may drop buffer holding next response
http://bugs.python.org/issue23377  opened by vadmium

#23378: argparse.add_argument action parameter should allow value exte
http://bugs.python.org/issue23378  opened by the.mulhern

#23382: Maybe can not shutdown ThreadPoolExecutor when call the method
http://bugs.python.org/issue23382  opened by miles

#23383: Clean up bytes formatting
http://bugs.python.org/issue23383  opened by serhiy.storchaka

#23384: urllib.proxy_bypass_registry slow down under Windows if websit
http://bugs.python.org/issue23384  opened by aristotel

#23387: test_urllib2 fails with HTTP Error 502: Bad Gateway
http://bugs.python.org/issue23387  opened by berker.peksag

#23388: datetime.strftime('%s') does not take timezone into account
http://bugs.python.org/issue23388  opened by cameris

#23389: pkgutil.find_loader raises an ImportError on PEP 420 implicit 
http://bugs.python.org/issue23389  opened by alynn

#23391: Documentation of EnvironmentError (OSError) arguments disappea
http://bugs.python.org/issue23391  opened by vadmium

#23394: No garbage collection at end of main thread
http://bugs.python.org/issue23394  opened by François.Trahan

#23395: _thread.interrupt_main() errors if SIGINT handler in SIG_DFL, 
http://bugs.python.org/issue23395  opened by takluyver

#23397: PEP 431 implementation
http://bugs.python.org/issue23397  opened by berker.peksag

#23400: Inconsistent behaviour of multiprocessing.Queue() if sem_open 
http://bugs.python.org/issue23400  opened by olebole

#23401: Add pickle support of Mapping views
http://bugs.python.org/issue23401  opened by serhiy.storchaka



Most recent 15 issues with no replies (15)
==

#23400: Inconsistent behaviour of multiprocessing.Queue() if sem_open 
http://bugs.python.org/issue23400

#23395: _thread.interrupt_main() errors if SIGINT handler in SIG_DFL, 
http://bugs.python.org/issue23395

#23391: Documentation of EnvironmentError (OSError) arguments disappea
http://bugs.python.org/issue23391

#23387: test_urllib2 fails with HTTP Error 502: Bad Gateway
http://bugs.python.org/issue23387

#23384: urllib.proxy_bypass_registry slow down under Windows if websit
http://bugs.python.org/issue23384

#23378: argparse.add_argument action parameter should allow value exte
http://bugs.python.org/issue23378

#23377: HTTPResponse may drop buffer holding next response
http://bugs.python.org/issue23377

#23368: integer overflow in _PyUnicode_AsKind
http://bugs.python.org/issue23368

#23367: integer overflow in unicodedata.normalize
http://bugs.python.org/issue23367

#23354: Loading 2 GiLOC file which raises exception causes wrong trace
http://bugs.python.org/issue23354

#23331: Add non-interactive version of Bdb.runcall
http://bugs.python.org/issue23331

#23330: h2py.py regular expression missing
http://bugs.python.org/issue23330

#23325: Turn SIG_DFL and SIG_IGN into functions
http://bugs.python.org/issue23325

#23319: Missing SWAP_INT in I_set_sw
http://bugs.python.org/issue23319

#23314: Disabling CRT asserts in de

Re: [Python-Dev] Encoding of PyFrameObject members

2015-02-06 Thread Francis Giraldeau
2015-02-06 6:04 GMT-05:00 Armin Rigo :

> Hi,
>
> On 6 February 2015 at 08:24, Maciej Fijalkowski  wrote:
> > I don't think it's safe to assume f_code is properly filled by the
> > time you might read it, depending a bit where you find the frame
> > object. Are you sure it's not full of garbage?


> Yes, before discussing how to do the utf8 decoding, we should realize
> that it is really unsafe code starting from the line before.  From a
> signal handler you're only supposed to read data that was written to
> "volatile" fields.  So even PyEval_GetFrame(), which is done by
> reading the thread state's "frame" field, is not safe: this is not a
> volatile.  This means that the compiler is free to do crazy things
> like *first* write into this field and *then* initialize the actual
> content of the frame.  The uninitialized content may be garbage, not
> just NULLs.
>

Thanks for these comments. Of course, accessing frames within a signal
handler is racy. I confirm that code encoded in non-ASCII is not accessible
from the utf8 buffer pointer. However, a call to PyUnicode_AsUTF8() encodes
the data and caches it in the unicode object. Later accesses return the byte
buffer without memory allocation or re-encoding.

I think it is possible to solve both safety problems by registering a
handler with PyEval_SetProfile(). On function entry, the handler will
call PyUnicode_AsUTF8() on the required frame members to make sure the utf8
encoded string is available. Then, we increment the refcount of the frame
and assign it to a thread-local pointer. On function return, the refcount
is decremented. These operations occur in the normal context and they are
not racy. The signal handler will use the thread-local frame pointer
instead of calling PyEval_GetFrame(). Does that sound good?

Thanks again for your feedback!

Francis


[Python-Dev] installing python 2.7.9 on a Mac

2015-02-06 Thread Laura Creighton
webmaster just got mail from a novice who is trying to learn Python in
an introductory class.  She got a "The version of Tcl/Tk (8.5.7) in
use may be unstable" message.

I think that the download page should have a link.
If you get 
download and install .  Any reason we cannot do that?

Laura
