Anders Wegge Keller <we...@wegge.dk> writes:
> ...
> I have an ongoing issue with my usenet setup. I'm that one dude who don't
> want to learn perl. That means that I have to build inn from source, so I
> can enable the python interpreter. That's not so bad, and the errors that
> show up have been something that I have been able to figure out by myself.
> At least up until now. I have an almost 100% repeatable crash, when nnrpd
> performs the user authentication step. Backtracing the core dump gives this:
>
> #0 0x0000564a864e2d63 in ?? ()
> #1 0x00007f9609567091 in call_function (oparg=<optimized out>,
> pp_stack=0x7ffda2d801b0) at ../Python/ceval.c:4352
>
> Note: Line 4352 C_TRACE(x, PyCFunction_Call(func,callargs,NULL));
>
> #2 PyEval_EvalFrameEx (
> f=Frame 0x7f9604758050, for file /etc/news/filter/nnrpd_auth.py,
> line 67, in __init__ (self=<AUTH(dbCursor=<Cursor(_result=None,
> description=None, rownumber=None, messages=[], _executed=None,
>
> ...
>
> Weird observation #1: Sometimes the reason is SIGSEGV, sometimes it's
> SIGILL.

Python tends to be sensitive to the stack size. In the past, there have
often been problems because the stack size for threads was not large
enough. I am not sure whether "nnrpd" is multi-threaded and, if it is,
whether it provides a sufficiently large stack for its threads.

A "SIGILL" often occurs because a function call has destroyed part of the
stack content, so that the return is erroneous (returning into the middle
of an instruction).

> ...
> I'm not ready to give up yet, but I need some help proceeding from here.
> What do the C_TRACE really do,

The casing (all upper case letters) indicates a C preprocessor macro.
Search the "*.h" files, and "ceval.c" itself, for its definition. I
suppose that with a normal Python build (no debug build), the macro will
just call "PyCFunction_Call". Alternatively, it might provide support for
debugging or tracing (activated by e.g. "pdb.set_trace()").

> and is there some way of getting a level
> deeper, to see what cause the SEGV. Also, how can the C code end up with an
> illegal instruction?

A likely cause for both "SIGSEGV" and "SIGILL" is stack corruption
leading to a bad return or to badly restored register values. I would look
at the machine instructions (i.e. at the assembler level rather than the
C level) to find out precisely which instruction caused the signal.

Unfortunately, stack corruption is a non-local problem: the point where
the problem is caused is usually far away from the point where it is
observed. If the problem is not "too small stack size", you might need a
tool that analyses memory overwrites.
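
To test the "too small stack size" theory from the Python side, something
like the following minimal sketch might help. It only uses the standard
"resource" and "threading" modules (so it is Unix-only); the helper names
"stack_report" and "run_with_stack" are made up for this illustration.
Whether the interpreter embedded in "nnrpd" runs your code in the main
thread or in a separate thread is an assumption you would have to check.

```python
import resource
import threading


def describe_limit(value):
    """Render an rlimit value, which may be RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return "%d KiB" % (value // 1024)


def stack_report():
    """Collect the stack limits that apply to the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    # threading.stack_size() returns 0 when the platform default is in use.
    thread_size = threading.stack_size()
    if thread_size == 0:
        thread_text = "platform default"
    else:
        thread_text = "%d KiB" % (thread_size // 1024)
    return {
        "main_soft": describe_limit(soft),
        "main_hard": describe_limit(hard),
        "thread": thread_text,
    }


def run_with_stack(func, size_bytes):
    """Run func in a new thread created with a stack of size_bytes."""
    previous = threading.stack_size()
    # The new size only applies to threads created after this call.
    threading.stack_size(size_bytes)
    result = []
    try:
        worker = threading.Thread(target=lambda: result.append(func()))
        worker.start()
        worker.join()
    finally:
        # Restore the previous setting (0 means "platform default").
        threading.stack_size(previous)
    return result[0]


if __name__ == "__main__":
    for name, text in sorted(stack_report().items()):
        print("%-10s %s" % (name, text))
```

Calling "stack_report()" early in the authentication script would show
which limits the process actually runs with. If the crash disappears when
the failing step is wrapped in "run_with_stack(..., 16 * 1024 * 1024)",
that would point towards the stack size; if it does not, stack corruption
from C code becomes the more likely explanation.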