[ https://issues.apache.org/jira/browse/PYLUCENE-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376483#comment-17376483 ]
Erik Groeneveld commented on PYLUCENE-58:
-----------------------------------------

Hi Andy,

thanks for your efforts! I had a lengthy holiday, so I am a bit late with this response.

Too bad you were not able to reproduce it. It is persistent here. Stepping with gdb told me that the SEGV was in PyErr_BadInternalCall(), which tried to format an error message because PyDict_Check(op) said op was invalid. But 'op' in this case was a freshly minted dict for the new lucene module. How can that be invalid? The only hypothesis I can think of is that memory/gc was corrupted earlier, because the code that crashes does nothing with (pylucene) input parameters; it just creates a dict. That means that the missing INCREF could very well be the problem.

So I reran the test with trunk. Building the trunk did not work because:
# the Makefile assumes ~/apache/lucene.git
# it generates the 8.9.0 version
# it does not have the lucene jars

To work around this, I downloaded the 8.9.0 sources and patched the trunk over them with svn export --force. Indeed, there is an INCREF in makeType now.

Unfortunately, the SEGV still happens. I tracked it down to a point where a Py_DECREF is done on a temporary unicode string. That object's type has invalid values for most of its tp_xxx fields (0x27, 0x47, etc.). It SEGVs when trying to retrieve the tp_dealloc function. It looks like the type object for unicode strings is gone and ob_type (of the unicode object) is pointing to garbage.

This could be a bug in Python; however, the code that creates a dict is completely generic (no args related to pylucene whatsoever), so I think that is not the case. I suspect one DECREF too many or some other sort of bug in the initialisation of PyLucene. However, I cannot pinpoint it at the moment and I have few clues about how to proceed from here. Maybe you have a suggestion?
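To make the suspected INCREF/DECREF imbalance concrete, here is a minimal sketch in plain Python. It is not taken from PyLucene or JCC: `Probe` and `refcount_deltas` are hypothetical names, and it only assumes that CPython exports `Py_IncRef` and `Py_DecRef`, reached here through `ctypes.pythonapi`. It shows the balanced case, where every extra reference is released exactly once.

```python
import ctypes
import sys

# Bind the C-level refcount helpers exported by CPython.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None

_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


class Probe:
    """Hypothetical stand-in for an object owned by an extension module."""


def refcount_deltas():
    """Return the refcount change after one INCREF and after the matching DECREF."""
    obj = Probe()
    base = sys.getrefcount(obj)

    _incref(obj)  # take one extra (owned) reference
    after_incref = sys.getrefcount(obj) - base

    _decref(obj)  # release it again: the count must return to the baseline
    after_decref = sys.getrefcount(obj) - base

    return after_incref, after_decref


if __name__ == "__main__":
    print(refcount_deltas())
```

If the second call were issued without the first, the count would drop below what the remaining owners expect, and the object could be freed while still in use. That is the kind of late, unrelated-looking crash described above.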
Best regards,
Erik

> SEGV on import lucene
> ---------------------
>
> Key: PYLUCENE-58
> URL: https://issues.apache.org/jira/browse/PYLUCENE-58
> Project: PyLucene
> Issue Type: Bug
> Environment: Debian Buster, Python 3.7
> Reporter: Erik Groeneveld
> Priority: Critical
>
> Hi Andy,
> Thanks again for your great work on PyLucene and JCC!
> Recently, after porting everything to python3, we get occasional SEGVs on shutdown. It happens very late, when the garbage collector starts cleaning up.
> Using python3-dbg exposed another problem, however. With python3-dbg, "import lucene" already triggers a SEGV. Here is the top of the backtrace:
>
> {code:bash}
> #0 0x0000000000000060 in ?? ()
> #1 0x00007fe8aee51d6e in unicode_fromformat_write_cstr (writer=writer@entry=0x7ffdc0dcd170, str=<optimized out>, width=width@entry=-1, precision=<optimized out>) at ../Objects/unicodeobject.c:2596
> #2 0x00007fe8aee525ec in unicode_fromformat_arg (vargs=0x7ffdc0dcd150, f=<optimized out>, writer=0x7ffdc0dcd170) at ../Objects/unicodeobject.c:2797
> #3 PyUnicode_FromFormatV (format=<optimized out>, vargs=<optimized out>) at ../Objects/unicodeobject.c:2914
> #4 0x00007fe8aedca3dd in PyErr_FormatV (exception=<type at remote 0x811cc0>, format=0x7fe8aefe2568 "%s:%d: bad argument to internal function", vargs=vargs@entry=0x7ffdc0dcd210) at ../Python/errors.c:835
> #5 0x00007fe8aedca4a4 in PyErr_Format (exception=<optimized out>, format=<optimized out>) at ../Python/errors.c:852
> #6 0x00007fe8aee89fcd in PyDict_SetItem (op=<optimized out>, key=<optimized out>, value=<optimized out>) at ../Objects/dictobject.c:1448
> #7 PyDict_SetItem (op=<optimized out>, key=<optimized out>, value=<optimized out>, op=<optimized out>, key=<optimized out>, value=<optimized out>) at ../Objects/dictobject.c:1443
> #8 0x00007fe8aee76f4a in module_init_dict (md_dict=<unknown at remote 0x7fe8ae9f6060>, name=name@entry=<unknown at remote 0x7fe8ae9f5030>, doc=None, doc@entry=0x0, mod=<optimized out>) at ../Objects/moduleobject.c:72
> #9 0x00007fe8aee7da83 in PyModule_NewObject (name=name@entry=<unknown at remote 0x7fe8ae9f5030>) at ../Objects/moduleobject.c:103
> #10 0x00007fe8aee7de2a in PyModule_New (name=name@entry=0x7fe8b32bfa20 "lucene._lucene") at ../Objects/moduleobject.c:120
> #11 0x00007fe8aee7deec in _PyModule_CreateInitialized (module=0x7fe8b2612080 <_lucene_def>, module_api_version=<optimized out>) at ../Objects/moduleobject.c:215
> #12 0x00007fe8b1238de7 in PyInit__lucene () from /data/bouwen/van_kras/pylucene-8.6.1/build/test/lucene-8.6.1-py3.7-linux-x86_64.egg/lucene/_lucene.cpython-37m-x86_64-linux-gnu.so
> {code}
> It could be that this goes undetected with normal python, yet causes a SEGV on shutdown.
>
> The error above can be reproduced with the following script, which downloads the sources, builds JCC and PyLucene, and then executes: python3-dbg -c "import lucene"
>
> {code:bash}
> # Environment
> # debian buster
> # ant 1.10.5-2
> export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
> export PYTHON=/usr/bin/python3
> export PYLUCENE="pylucene-8.6.1"
> rm ${PYLUCENE}-src.tar.gz ${PYLUCENE} -rf
> wget https://ftp.nluug.nl/internet/apache/lucene/pylucene/${PYLUCENE}-src.tar.gz
> tar xzf ${PYLUCENE}-src.tar.gz
> (cd ${PYLUCENE}
>   (cd jcc
>     export JCC_JDK=${JAVA_HOME}
>     export JCC_INCLUDES=/usr/include/python3.7m:${JAVA_HOME}/include:${JAVA_HOME}/include/linux
>     ${PYTHON} setup.py build
>   )
>   export NUM_FILES=10
>   export ANT=/usr/bin/ant
>   export JCC="${PYTHON} -m jcc --shared"
>   make
>   make test
> )
> PYTHONPATH='pylucene-8.6.1/build/test/lucene-8.6.1-py3.7-linux-x86_64.egg' ${PYTHON}-dbg -c "import lucene"
> {code}
> Would you be so kind as to look into this? Perhaps our problem is solved, or it enables us to find another problem at shutdown.
> Best regards,
> Erik

--
This message was sent by Atlassian Jira (v8.3.4#803005)