Bugs item #1579370, was opened at 2006-10-17 19:23
Message generated for change (Comment added) made by nnorwitz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1579370&group_id=5470

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Python Interpreter Core
Group: Python 2.5
Status: Open
Resolution: None
>Priority: 9
Private: No
Submitted By: Mike Klaas (mklaas)
Assigned to: Nobody/Anonymous (nobody)
Summary: Segfault provoked by generators and exceptions

Initial Comment:
A reproducible segfault when using heavily-nested generators and exceptions.

Unfortunately, I haven't yet been able to provoke this behaviour with a standalone python2.5 script. There are, however, no third-party C extensions running in the process, so I'm fairly confident that it is a problem in the core.

The gist of the code is a series of nested generators which leave scope when an exception is raised. This exception is caught and re-raised in an outer loop. The old exception was holding on to the frame which was keeping the generators alive, and the sequence of generator destruction and new finalization caused the segfault.

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2007-01-16 23:01

Message:
Logged In: YES
user_id=33168
Originator: NO

Bumping priority to see if this should go into 2.5.1.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2007-01-04 02:42

Message:
Logged In: YES
user_id=21627
Originator: NO

Why do frame objects have a thread state in the first place? In particular, why does PyTraceBack_Here get the thread state from the frame, instead of using the current thread?

Introduction of f_tstate goes back to r7882, but it is not clear why it was done that way.

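The mechanism described in the initial comment (a traceback holding on to a frame, which in turn keeps a generator alive until the traceback is dropped) can be seen in isolation. The following is a minimal sketch in Python 3 syntax, not the 2.5 syntax used in this thread; the names inner, work and demo are hypothetical and are not taken from the reporter's application. It shows only the ordering of events, not the crash itself.

```python
import gc


def inner(log):
    # A generator with cleanup code; the finally block runs when the
    # generator is closed or finalized (GeneratorExit is raised at the yield).
    try:
        while True:
            yield None
    finally:
        log.append("inner finalized")


def work(log):
    gen = inner(log)   # local variable: lives in work()'s frame
    next(gen)          # start the generator so it is suspended at the yield
    raise RuntimeError("boom")


def demo():
    log = []
    try:
        work(log)
    except RuntimeError as exc:
        # The traceback references work()'s frame, and that frame
        # references the suspended generator.
        tb = exc.__traceback__
    before = list(log)   # generator is still alive here
    del tb               # drop the traceback: frame and generator can go away
    gc.collect()         # not needed on CPython's refcounting, but harmless
    after = list(log)
    return before, after


before, after = demo()
```

In this sketch the generator is finalized at the moment the old traceback is released, which is the same point at which the backtraces in this report show gen_dealloc being reached from tb_dealloc.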
----------------------------------------------------------------------

Comment By: Andrew Waters (awaters)
Date: 2007-01-04 01:35

Message:
Logged In: YES
user_id=1418249
Originator: NO

This fixes the segfault problem that I was able to reliably reproduce on Linux.

We need to get this applied (assuming it is the correct fix) to the source to make Python 2.5 usable for me in production code.

----------------------------------------------------------------------

Comment By: Mike Klaas (mklaas)
Date: 2006-11-27 10:41

Message:
Logged In: YES
user_id=1611720
Originator: YES

The following patch resets the thread state of the generator when it is resumed, which prevents the segfault for me:

Index: Objects/genobject.c
===================================================================
--- Objects/genobject.c (revision 52849)
+++ Objects/genobject.c (working copy)
@@ -77,6 +77,7 @@
        Py_XINCREF(tstate->frame);
        assert(f->f_back == NULL);
        f->f_back = tstate->frame;
+       f->f_tstate = tstate;
 
        gen->gi_running = 1;
        result = PyEval_EvalFrameEx(f, exc);

----------------------------------------------------------------------

Comment By: Eric Noyau (eric_noyau)
Date: 2006-11-27 10:07

Message:
Logged In: YES
user_id=1388768
Originator: NO

We are experiencing the same segfault in our application, reliably. Running our unit test suite segfaults every time on both Linux and Mac OS X. Applying Martin's patch fixes the segfault, and makes everything fine and dandy, at the cost of some memory leaks if I understand properly.

This particular bug prevents us from upgrading to Python 2.5 in production.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2006-10-27 22:18

Message:
Logged In: YES
user_id=31435

> I tried Tim's hope.py on Linux x86_64 and
> Mac OS X 10.4 with debug builds and neither
> one crashed. Tim's guess looks pretty damn
> good too.

Neal, note that it's the /Windows/ malloc that fills freed memory with "dangerous bytes" in a debug build -- this really has nothing to do with the fact that it's a debug build of /Python/, apart from that on Windows a debug build of Python also links in the debug version of Microsoft's malloc.

The valgrind report is pointing at the same thing. Whether this leads to a crash is purely an accident of when and how the system malloc happens to reuse the freed memory.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2006-10-27 21:56

Message:
Logged In: YES
user_id=33168

Mike, what platform are you having the problem on? I tried Tim's hope.py on Linux x86_64 and Mac OS X 10.4 with debug builds and neither one crashed. Tim's guess looks pretty damn good too. Here's the result of valgrind:

Invalid read of size 8
   at 0x4CEBFE: PyTraceBack_Here (traceback.c:117)
   by 0x49C1F1: PyEval_EvalFrameEx (ceval.c:2515)
   by 0x4F615D: gen_send_ex (genobject.c:82)
   by 0x4F6326: gen_close (genobject.c:128)
   by 0x4F645E: gen_del (genobject.c:163)
   by 0x4F5F00: gen_dealloc (genobject.c:31)
   by 0x44D207: _Py_Dealloc (object.c:1928)
   by 0x44534E: dict_dealloc (dictobject.c:801)
   by 0x44D207: _Py_Dealloc (object.c:1928)
   by 0x4664FF: subtype_dealloc (typeobject.c:686)
   by 0x44D207: _Py_Dealloc (object.c:1928)
   by 0x42325D: instancemethod_dealloc (classobject.c:2287)
 Address 0x56550C0 is 88 bytes inside a block of size 152 free'd
   at 0x4A1A828: free (vg_replace_malloc.c:233)
   by 0x4C3899: tstate_delete_common (pystate.c:256)
   by 0x4C3926: PyThreadState_DeleteCurrent (pystate.c:282)
   by 0x4D4043: t_bootstrap (threadmodule.c:448)
   by 0x4B24C48: pthread_start_thread (in /lib/libpthread-0.10.so)

The only way I can think of to fix this is to keep a set of active generators in the PyThreadState and call gen_send_ex(exc=1) for all the active generators before killing the tstate in t_bootstrap.

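The idea in the last paragraph (track the generators a thread has live, and finalize them before that thread's state is torn down) can be modelled at the Python level. This is only a rough sketch in Python 3 syntax under stated assumptions: ActiveGenerators, track and close_all are hypothetical names, not a CPython API, and the real proposal would live in C inside PyThreadState and t_bootstrap. Calling close() on a generator is the Python-level counterpart of gen_send_ex(exc=1) as reached through gen_close in the traces in this report.

```python
import threading
import weakref


class ActiveGenerators:
    """Hypothetical per-thread registry of live generators (not a CPython API)."""

    def __init__(self):
        # Weak references, so the registry itself does not keep generators alive.
        self._gens = weakref.WeakSet()

    def track(self, gen):
        self._gens.add(gen)
        return gen

    def close_all(self):
        # close() raises GeneratorExit inside each suspended generator,
        # so its cleanup code runs now rather than at some later dealloc.
        for gen in list(self._gens):
            gen.close()


events = []


def getdocs():
    try:
        while True:
            yield None
    finally:
        events.append(("cleanup", threading.current_thread().name))


def worker(holder):
    registry = ActiveGenerators()
    gen = registry.track(getdocs())
    holder.append(gen)    # the generator object outlives the thread
    next(gen)             # suspend it at the yield
    registry.close_all()  # finalize it while the owning thread still exists


holder = []
thread = threading.Thread(target=worker, args=(holder,), name="worker")
thread.start()
thread.join()
```

After the thread has been joined, the generator in holder is already finished, so destroying it later from another thread has no frame left to resume. Whether bookkeeping of this kind is acceptable in the interpreter core is a separate question that this sketch does not address.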
----------------------------------------------------------------------

Comment By: Michael Hudson (mwh)
Date: 2006-10-19 00:58

Message:
Logged In: YES
user_id=6656

> and for some reason Python uses the system malloc directly
> to obtain memory for thread states.

This bit is fairly easy: they are allocated without the GIL being held, which breaks an assumption of PyMalloc.

No idea about the real problem, sadly.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2006-10-18 17:38

Message:
Logged In: YES
user_id=31435

I've attached a much simplified pure-Python script (hope.py) that reproduces a problem very quickly, on Windows, in a /debug/ build of current trunk. It typically prints:

    exiting generator
    joined thread

at most twice before crapping out. At the time, the `next` argument to newtracebackobject() is 0xdddddddd, and tracing back a level shows that, in PyTraceBack_Here(), frame->f_tstate is entirely filled with 0xdd bytes.

Note that this is not a debug-build obmalloc gimmick! This is Microsoft's similar debug-build gimmick for their malloc, and for some reason Python uses the system malloc directly to obtain memory for thread states. The Microsoft debug free() fills newly-freed memory with 0xdd, which has the same meaning as the debug-build obmalloc's DEADBYTE (0xdb).

So somebody is accessing a thread state here after it's been freed. Best guess is that the generator is getting "cleaned up" after the thread that created it has gone away, so the generator's frame's f_tstate is trash.

Note that a PyThreadState (a frame's f_tstate) is /not/ a Python object -- it's just a raw C struct, and its lifetime isn't controlled by refcounts.

----------------------------------------------------------------------

Comment By: Mike Klaas (mklaas)
Date: 2006-10-18 17:12

Message:
Logged In: YES
user_id=1611720

Despite Tim's reassurance, I'm afraid that Martin's patch does in fact prevent the segfault.
Sounds like it also introduces a memleak.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2006-10-18 14:57

Message:
Logged In: YES
user_id=31435

> Can anybody tell why gi_frame *isn't* incref'ed when
> the generator is created?

As documented (in concrete.tex), PyGen_New(f) steals a reference to the frame passed to it. Its only call site (well, in the core) is in ceval.c, which returns immediately after PyGen_New takes over ownership of the frame the caller created:

"""
    /* Create a new generator that owns the ready to run frame
     * and return that as the value. */
    return PyGen_New(f);
"""

In short, that PyGen_New() doesn't incref the frame passed to it is intentional. It's possible that the intent is flawed ;-), but offhand I don't see how.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2006-10-18 14:05

Message:
Logged In: YES
user_id=21627

Can you please review/try attached patch? Can anybody tell why gi_frame *isn't* incref'ed when the generator is created?

----------------------------------------------------------------------

Comment By: Mike Klaas (mklaas)
Date: 2006-10-18 12:47

Message:
Logged In: YES
user_id=1611720

I cannot yet produce a Python-only script which reproduces the problem, but I can give an overview.

There is a generator running in one thread, an exception being raised in another thread, and as a consequence, the generator in the first thread is garbage-collected (triggering an exception due to the new generator cleanup). The problem is extremely sensitive to timing -- often the insertion/removal of print statements, or reordering the code, causes the problem to vanish, which is confounding my ability to create a simple test script.

import threading
import time

def getdocs():
    def f():
        pass  # <some somewhat time-consuming operation>
    while True:
        f()
        yield None

# -----------------------------------------------------------------------------

class B(object):
    def __init__(self,):
        pass

    def doit(self):
        # must be an instance var to trigger segfault
        self.docIter = getdocs()

        print self.docIter # this is the generator referred-to in the traceback
        for i, item in enumerate(self.docIter):
            if i > 9:
                break

        print 'exiting generator'

class A(object):
    """ Process entry point / main thread """
    def __init__(self):
        while True:
            try:
                self.func()
            except Exception, e:
                print 'right after raise'

    def func(self):
        b = B()
        thread = threading.Thread(target=b.doit)
        thread.start()

        start_t = time.time()
        while True:
            try:
                if time.time() - start_t > 1:
                    raise Exception
            except Exception:
                print 'right before raise'
                # SIGSEGV here.  If this is changed to
                # 'break', no segfault occurs
                raise

if __name__ == '__main__':
    A()

----------------------------------------------------------------------

Comment By: Mike Klaas (mklaas)
Date: 2006-10-18 12:37

Message:
Logged In: YES
user_id=1611720

I've produced a simplified traceback with a single generator. Note the frame being used in the traceback (#0) is the same frame being dealloc'd (#11).
The relevant call in traceback.c is:

PyTraceBack_Here(PyFrameObject *frame)
{
    PyThreadState *tstate = frame->f_tstate;
    PyTracebackObject *oldtb = (PyTracebackObject *) tstate->curexc_traceback;
    PyTracebackObject *tb = newtracebackobject(oldtb, frame);

and I can verify that oldtb contains garbage:

(gdb) print frame
$1 = (PyFrameObject *) 0x8964d94
(gdb) print frame->f_tstate
$2 = (PyThreadState *) 0x895b178
(gdb) print $2->curexc_traceback
$3 = (PyObject *) 0x66

#0 0x080e4296 in PyTraceBack_Here (frame=0x8964d94) at Python/traceback.c:94
#1 0x080b9ab7 in PyEval_EvalFrameEx (f=0x8964d94, throwflag=1) at Python/ceval.c:2459
#2 0x08101a40 in gen_send_ex (gen=0xb7cca4ac, arg=0x81333e0, exc=1) at Objects/genobject.c:82
#3 0x08101c0f in gen_close (gen=0xb7cca4ac, args=0x0) at Objects/genobject.c:128
#4 0x08101cde in gen_del (self=0xb7cca4ac) at Objects/genobject.c:163
#5 0x0810195b in gen_dealloc (gen=0xb7cca4ac) at Objects/genobject.c:31
#6 0x080815b9 in dict_dealloc (mp=0xb7cc913c) at Objects/dictobject.c:801
#7 0x080927b2 in subtype_dealloc (self=0xb7cca76c) at Objects/typeobject.c:686
#8 0x0806028d in instancemethod_dealloc (im=0xb7d07f04) at Objects/classobject.c:2285
#9 0x080815b9 in dict_dealloc (mp=0xb7cc90b4) at Objects/dictobject.c:801
#10 0x080927b2 in subtype_dealloc (self=0xb7cca86c) at Objects/typeobject.c:686
#11 0x081028c5 in frame_dealloc (f=0x8964a94) at Objects/frameobject.c:416
#12 0x080e41b1 in tb_dealloc (tb=0xb7cc1fcc) at Python/traceback.c:34
#13 0x080e41c2 in tb_dealloc (tb=0xb7cc1f7c) at Python/traceback.c:33
#14 0x08080dca in insertdict (mp=0xb7f99824, key=0xb7ccd020, hash=1492466088, value=0xb7ccd054) at Objects/dictobject.c:394
#15 0x080811a4 in PyDict_SetItem (op=0xb7f99824, key=0xb7ccd020, value=0xb7ccd054) at Objects/dictobject.c:619
#16 0x08082dc6 in PyDict_SetItemString (v=0xb7f99824, key=0x8129284 "exc_traceback", item=0xb7ccd054) at Objects/dictobject.c:2103
#17 0x080e2837 in PySys_SetObject (name=0x8129284 "exc_traceback",
    v=0xb7ccd054) at Python/sysmodule.c:82
#18 0x080bc9e5 in PyEval_EvalFrameEx (f=0x895f934, throwflag=0) at Python/ceval.c:2954
#19 0x080bfda3 in PyEval_EvalCodeEx (co=0xb7f6ade8, globals=0xb7fafa44, locals=0x0, args=0xb7cc5ff8, argcount=1, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2833
#20 0x08104083 in function_call (func=0xb7cc7294, arg=0xb7cc5fec, kw=0x0) at Objects/funcobject.c:517
#21 0x0805a660 in PyObject_Call (func=0xb7cc7294, arg=0xb7cc5fec, kw=0x0) at Objects/abstract.c:1860

----------------------------------------------------------------------

Comment By: Mike Klaas (mklaas)
Date: 2006-10-17 19:23

Message:
Logged In: YES
user_id=1611720

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1208400192 (LWP 26235)]
0x080e4296 in PyTraceBack_Here (frame=0x9c2d7b4) at Python/traceback.c:94
94          if ((next != NULL && !PyTraceBack_Check(next)) ||
(gdb) bt
#0 0x080e4296 in PyTraceBack_Here (frame=0x9c2d7b4) at Python/traceback.c:94
#1 0x080b9ab7 in PyEval_EvalFrameEx (f=0x9c2d7b4, throwflag=1) at Python/ceval.c:2459
#2 0x08101a40 in gen_send_ex (gen=0xb64f880c, arg=0x81333e0, exc=1) at Objects/genobject.c:82
#3 0x08101c0f in gen_close (gen=0xb64f880c, args=0x0) at Objects/genobject.c:128
#4 0x08101cde in gen_del (self=0xb64f880c) at Objects/genobject.c:163
#5 0x0810195b in gen_dealloc (gen=0xb64f880c) at Objects/genobject.c:31
#6 0x080b9912 in PyEval_EvalFrameEx (f=0x9c2802c, throwflag=1) at Python/ceval.c:2491
#7 0x08101a40 in gen_send_ex (gen=0xb64f362c, arg=0x81333e0, exc=1) at Objects/genobject.c:82
#8 0x08101c0f in gen_close (gen=0xb64f362c, args=0x0) at Objects/genobject.c:128
#9 0x08101cde in gen_del (self=0xb64f362c) at Objects/genobject.c:163
#10 0x0810195b in gen_dealloc (gen=0xb64f362c) at Objects/genobject.c:31
#11 0x080815b9 in dict_dealloc (mp=0xb64f4a44) at Objects/dictobject.c:801
#12 0x080927b2 in subtype_dealloc (self=0xb64f340c) at
    Objects/typeobject.c:686
#13 0x0806028d in instancemethod_dealloc (im=0xb796a0cc) at Objects/classobject.c:2285
#14 0x080815b9 in dict_dealloc (mp=0xb64f78ac) at Objects/dictobject.c:801
#15 0x080927b2 in subtype_dealloc (self=0xb64f810c) at Objects/typeobject.c:686
#16 0x081028c5 in frame_dealloc (f=0x9c272bc) at Objects/frameobject.c:416
#17 0x080e41b1 in tb_dealloc (tb=0xb799166c) at Python/traceback.c:34
#18 0x080e41c2 in tb_dealloc (tb=0xb4071284) at Python/traceback.c:33
#19 0x080e41c2 in tb_dealloc (tb=0xb7991824) at Python/traceback.c:33
#20 0x08080dca in insertdict (mp=0xb7f56824, key=0xb3fb9930, hash=1492466088, value=0xb3fb9914) at Objects/dictobject.c:394
#21 0x080811a4 in PyDict_SetItem (op=0xb7f56824, key=0xb3fb9930, value=0xb3fb9914) at Objects/dictobject.c:619
#22 0x08082dc6 in PyDict_SetItemString (v=0xb7f56824, key=0x8129284 "exc_traceback", item=0xb3fb9914) at Objects/dictobject.c:2103
#23 0x080e2837 in PySys_SetObject (name=0x8129284 "exc_traceback", v=0xb3fb9914) at Python/sysmodule.c:82
#24 0x080bc9e5 in PyEval_EvalFrameEx (f=0x9c10e7c, throwflag=0) at Python/ceval.c:2954
#25 0x080bfda3 in PyEval_EvalCodeEx (co=0xb7bbc890, globals=0xb7bbe57c, locals=0x0, args=0x9b8e2ac, argcount=1, kws=0x9b8e2b0, kwcount=0, defs=0xb7b7aed8, defcount=1, closure=0x0) at Python/ceval.c:2833
#26 0x080bd62a in PyEval_EvalFrameEx (f=0x9b8e16c, throwflag=0) at Python/ceval.c:3662
#27 0x080bfda3 in PyEval_EvalCodeEx (co=0xb7bbc848, globals=0xb7bbe57c, locals=0x0, args=0xb7af9d58, argcount=1, kws=0x9b7a818, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2833
#28 0x08104083 in function_call (func=0xb7b79c34, arg=0xb7af9d4c, kw=0xb7962c64) at Objects/funcobject.c:517
#29 0x0805a660 in PyObject_Call (func=0xb7b79c34, arg=0xb7af9d4c, kw=0xb7962c64) at Objects/abstract.c:1860
#30 0x080bcb4b in PyEval_EvalFrameEx (f=0x9b82c0c, throwflag=0) at Python/ceval.c:3846
#31 0x080bfda3 in PyEval_EvalCodeEx (co=0xb7cd6608, globals=0xb7cd4934, locals=0x0,
    args=0x9b7765c, argcount=2, kws=0x9b77664, kwcount=0, defs=0x0, defcount=0, closure=0xb7cfe874) at Python/ceval.c:2833
#32 0x080bd62a in PyEval_EvalFrameEx (f=0x9b7751c, throwflag=0) at Python/ceval.c:3662
#33 0x080bdf70 in PyEval_EvalFrameEx (f=0x9a9646c, throwflag=0) at Python/ceval.c:3652
#34 0x080bfda3 in PyEval_EvalCodeEx (co=0xb7f39728, globals=0xb7f6ca44, locals=0x0, args=0x9b7a00c, argcount=0, kws=0x9b7a00c, kwcount=0, defs=0x0, defcount=0, closure=0xb796410c) at Python/ceval.c:2833
#35 0x080bd62a in PyEval_EvalFrameEx (f=0x9b79ebc, throwflag=0) at Python/ceval.c:3662
#36 0x080bfda3 in PyEval_EvalCodeEx (co=0xb7f39770, globals=0xb7f6ca44, locals=0x0, args=0x99086c0, argcount=0, kws=0x99086c0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2833
#37 0x080bd62a in PyEval_EvalFrameEx (f=0x9908584, throwflag=0) at Python/ceval.c:3662
#38 0x080bfda3 in PyEval_EvalCodeEx (co=0xb7f397b8, globals=0xb7f6ca44, locals=0xb7f6ca44, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2833
#39 0x080bff32 in PyEval_EvalCode (co=0xb7f397b8, globals=0xb7f6ca44, locals=0xb7f6ca44) at Python/ceval.c:494
#40 0x080ddff1 in PyRun_FileExFlags (fp=0x98a4008, filename=0xbfffd4a3 "scoreserver.py", start=257, globals=0xb7f6ca44, locals=0xb7f6ca44, closeit=1, flags=0xbfffd298) at Python/pythonrun.c:1264
#41 0x080de321 in PyRun_SimpleFileExFlags (fp=Variable "fp" is not available.
) at Python/pythonrun.c:870
#42 0x08056ac4 in Py_Main (argc=1, argv=0xbfffd334) at Modules/main.c:496
#43 0x00a69d5f in __libc_start_main () from /lib/libc.so.6
#44 0x08056051 in _start ()

----------------------------------------------------------------------