I've been trying to debug this for two days now, and it's a long shot, but I'm hoping that someone here might recognize a solution. I've got a Python C extension which calls a function in a C library, which calls another function in another library, which calls another function, which calls fmod from the standard C math library. All of these are shared libraries on Linux (x86 Gentoo 2.6.9). In other words, the calling looks like this:

Python -> Python C extension -> Function from library 1 -> Function from library 2 -> fmod from libm.so

The critical line of C code looks like this:

    ans = fmod ( x, 2.0 );

where x is a double with value 2.0 (confirmed by gdb) and ans is a double. Now, fmod(2.0, 2.0) should be 0.0. The problem? ans is getting assigned nan!

I have stepped through it in the debugger dozens of times now. Either fmod is putting the wrong return value on the stack, or the stack is getting corrupted by something else and "ans" is getting assigned the wrong value.

This happens only inside of the layered Python extension mess; if I compile a test C program that makes similar calls to fmod, they work just fine. Likewise, a very simple Python wrapper around fmod also works (e.g. Python -> Python C extension -> fmod).

This all runs in a single thread, so it doesn't seem like it would be a threading issue unless Python is making some threads under the hood. All of the intermediary libraries were compiled with (-g -fPIC), no optimization.

The intermediary libraries represent thousands of lines of very old code. It is very possible that all sorts of memory leaks and other subtle bugs exist, but what kind of memory leak could even cause this kind of glitch!? How can I even approach debugging it?

My next step right now is going to be stepping through the individual instructions... arrrrrrrrrrrrrgggh.

Versions: Python 2.4.1, gcc 3.3.6, glibc 2.3.5, Gentoo Linux 2.6.9.