[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3

2020-07-18 Thread Howard A. Landman


New submission from Howard A. Landman:

I have a program, qtd.py, that reliably dies with "free(): invalid pointer"
after about 13 hours of runtime (on a RPi3B+). This is hard to debug because:
(1) try:except: will not catch SIGABRT
(2) signal.signal(signal.SIGABRT, sigabrt_handler) also fails to catch SIGABRT
(3) python3-dbg does not work with libraries like RPi.GPIO and spidev
This happens under both Buster and Stretch, so I don't think it's OS-dependent.

I managed to get a core dump and gdb back trace gives:
warning: core file may not match specified executable file.
[New LWP 10277]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Core was generated by `python3 qtd.py'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x76c5e308 in __GI_abort () at abort.c:100
#2  0x76cae51c in __libc_message (action=action@entry=do_abort, 
fmt=) at ../sysdeps/posix/libc_fatal.c:181
#3  0x76cb5044 in malloc_printerr (str=) at malloc.c:5341
#4  0x76cb6d50 in _int_free (av=0x76d917d4 , 
p=0x43417c , have_lock=)
at malloc.c:4165
#5  0x001b3bb4 in list_ass_item (a=, i=, 
v=, a=, i=, v=)
at ../Objects/listobject.c:739
Backtrace stopped: Cannot access memory at address 0x17abeff8

So, as far as I can tell, it looks like list_ass_item() is where the error 
happens. I'm willing to help debug and maybe even fix this, but is there any 
way to remove any of the roadblocks (1-3) above?
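
Roadblocks (1) and (2) can be reproduced without any hardware. The following is a minimal sketch, assuming a Linux system; it is not taken from qtd.py. A child interpreter installs a Python-level SIGABRT handler, wraps os.abort() in try/except, and still dies from the signal. The handler name and the printed markers are made up for the illustration.

```python
import signal
import subprocess
import sys

# Source for a child interpreter. It installs a Python-level SIGABRT
# handler and wraps the abort in try/except, mirroring roadblocks (1)
# and (2). Names and messages here are illustrative only.
CHILD = """
import os, signal

def sigabrt_handler(signum, frame):
    print("handler ran", flush=True)

signal.signal(signal.SIGABRT, sigabrt_handler)
try:
    os.abort()          # same C abort() that glibc calls after free() fails
except BaseException:
    print("except ran", flush=True)
"""

result = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True,
    text=True,
)

# On Linux a negative return code means "killed by that signal".
print("return code:", result.returncode)
print("stdout:", repr(result.stdout))
```

Neither marker is printed: abort() raises the signal from C, and the Python-level handler would only run once control returned to the bytecode loop, which never happens.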

NOTE: qtd.py will exit unless there is a Texas Instruments TDC7201 chip 
attached to the RPI's SPI pins.

--
components: Interpreter Core
hgrepos: 389
messages: 373911
nosy: Howard_Landman
priority: normal
severity: normal
status: open
title: free(): invalid pointer in list_ass_item() in Python 3.7.3
type: crash
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue41335>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3

2020-07-18 Thread Howard A. Landman


Change by Howard A. Landman:


--
hgrepos:  -389




[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3

2020-07-22 Thread Howard A. Landman


Howard A. Landman added the comment:

This is not a memory leak problem. "top" reports VIRT 21768 RES 13516 for the 
whole run, and Python internal resource reporting says 13564 kB for the whole 
run. So that's less than 1 kB leaked in 118.6M measurement cycles; most likely 
zero.
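
A rough way to make this kind of check from inside the program is sketched below. It assumes the standard-library resource module on Linux, where ru_maxrss is reported in kilobytes; the post does not say which reporting call was actually used, and the helper names are hypothetical.

```python
import resource


def peak_rss_kb():
    """Peak resident set size of this process (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def rss_growth_kb(work, cycles):
    """Run work() `cycles` times and return the change in peak RSS."""
    before = peak_rss_kb()
    for _ in range(cycles):
        work()
    return peak_rss_kb() - before


# Stand-in workload; a real run would call the measurement routine here.
growth = rss_growth_kb(lambda: sum(range(100)), 10_000)
print("peak RSS growth:", growth, "kB")
```

A growth figure that stays at or near zero over many cycles is consistent with "no leak", but it says nothing about heap corruption, which does not change the process size.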

--




[issue12423] signal handler doesn't handle SIGABRT from os.abort

2020-07-22 Thread Howard A. Landman


Howard A. Landman added the comment:

I don't think changing the documentation makes this not be a bug. My situation: 
I have a Python 3.7.3 program that reliably dies (after about 13 hours, having 
called its measure() method between 118.6M and 118.7M times) with free(): 
invalid pointer, which calls abort(). I believe that this is a bug in Python; 
and it's NOT a memory leak, since the size of the program doesn't change at all 
over time and is under 14 MB (real). I would like to debug it. This basically 
says "you're screwed". I can't catch the abort, and I can't use python3-dbg 
because it won't bind with the RPi.GPIO or spidev libraries. So all I can get 
from a core dump is that free() is being called by list_ass_item() at 
../Objects/listobject.c line 739. Assuming that I'm right and that this is a 
bug in Python, how do you expect anyone to ever debug it? At a bare minimum, 
there needs to be an easy way to get a full stack trace through both Python and 
C.
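
For the Python half of such a trace, the standard-library faulthandler module is one option. The sketch below assumes Linux and uses a made-up function name, measure, in the child script. It enables faulthandler and then aborts, and the Python call stack is written to stderr before the process dies. It reports Python frames only, so it complements a gdb backtrace rather than replacing it.

```python
import subprocess
import sys

# Child script: enable faulthandler, then abort from inside a function
# so that the dumped traceback has a recognizable frame.
CHILD = """
import faulthandler, os
faulthandler.enable()

def measure():
    os.abort()

measure()
"""

result = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True,
    text=True,
)

# faulthandler writes the traceback of the aborting thread to stderr.
print(result.stderr)
```

The same effect is available without editing the program by running it as "python3 -X faulthandler qtd.py" or by setting the PYTHONFAULTHANDLER environment variable.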

--
nosy: +Howard_Landman

___
Python tracker 
<https://bugs.python.org/issue12423>
___



[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3

2020-07-26 Thread Howard A. Landman


Howard A. Landman added the comment:

After a quick glance at the source code for the spidev library, I think it is 
unlikely, but not impossible, that the bug lives there. It does call malloc() 
and free(), but only for data longer than 256 bytes; short tx and rx data is 
kept in static local buffers. Also, these calls do not match the partial stack 
trace I got. There is a small amount of allocating and deallocating of Python 
objects, however, including calls to PyList_New(), PyList_SET_ITEM(), and 
Py_TYPE(self)->tp_free((PyObject *)self), so it's possible that the bug is 
buried under one of those.

--
nosy:  -christian.heimes, stestagg




[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3

2020-07-27 Thread Howard A. Landman


Howard A. Landman added the comment:

I'm running under 32-bit Raspbian, so let's assume the magic number is 13. 
There are only two places in my own code where the number 13 appears:

(1) My result_list is 14 items long, i.e. 0 to 13. Relevant code from qtd.py:

    cum_results = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    ...
    while batches != 0:
        ...
        result_list = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        for m in range(ITERS):
            ...
            result = tdc.measure(...)
            result_list[result] += 1
            ...
        for i in range(len(result_list)):
            cum_results[i] += result_list[i]

I notice that result_list is getting thrown away each cycle, and thus must be 
garbage-collected. I could try changing that and see if it has any effect.

(2) in the tdc7201 library, 13 occurs as a possible (error) result code from 
measure(). However several of the failing runs had zero errors of this class, 
which means the code with 13 in it never got executed even once. I am not using 
pin 13 of the RPi's header, so I never send that number to RPi.GPIO. GPIO13 is 
pin 33 of the header, and I am not using that pin either, so I don't think 
RPi.GPIO is translating one of my pin number arguments to 13 internally. (But I 
should check Broadcom Mode numbers.)
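
For item (1), one way to try the change mentioned above is to allocate result_list once and reset it in place, so no list is discarded per batch. This is only a sketch of that experiment: fake_measure, NUM_CODES, the ITERS value, and run_batches are stand-ins invented for the illustration, not code from qtd.py or the tdc7201 library.

```python
NUM_CODES = 14   # result codes 0..13, matching the 14-item lists above
ITERS = 1000     # hypothetical batch size


def fake_measure(m):
    """Stand-in for tdc.measure(); returns a result code in 0..13."""
    return m % NUM_CODES


def run_batches(batches, measure=fake_measure):
    cum_results = [0] * NUM_CODES
    result_list = [0] * NUM_CODES      # allocated once, reused every batch
    for _ in range(batches):
        for i in range(NUM_CODES):
            result_list[i] = 0         # reset in place, no new list object
        for m in range(ITERS):
            result_list[measure(m)] += 1
        for i in range(NUM_CODES):
            cum_results[i] += result_list[i]
    return cum_results


totals = run_batches(3)
print(totals)
```

If the crash still occurs with this variant, the per-batch deallocation of result_list is unlikely to be the trigger.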

--




[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3

2020-08-02 Thread Howard A. Landman


Howard A. Landman added the comment:

Made some progress. By running it under gdb and breaking in malloc_printerr(), 
I got a better stack trace:

Breakpoint 1, malloc_printerr (str=0x76e028f8 "free(): invalid pointer")
    at malloc.c:5341
5341    malloc.c: No such file or directory.
(gdb) bt
#0  malloc_printerr (str=0x76e028f8 "free(): invalid pointer") at malloc.c:5341
#1  0x76d44d50 in _int_free (av=0x76e1f7d4 , 
    p=0x43417c , have_lock=)
    at malloc.c:4165
#2  0x001b4f40 in list_dealloc (op=0x760ef288) at ../Objects/listobject.c:324
#3  0x001be784 in frame_dealloc (
    f=Frame 0x765fcc30, for file
    /home/pi/src/QTD/src/tdc7201/tdc7201/__init__.py, line 719, in read_regs24
    (i=40, reg=28)) at ../Objects/frameobject.c:470
#4  function_code_fastcall (globals=, nargs=, 
    args=, co=) at ../Objects/call.c:291
#5  _PyFunction_FastCallKeywords (func=, stack=, 
    nargs=, kwnames=) at ../Objects/call.c:408
#6  0x0011f984 in call_function (kwnames=0x0, oparg=, 
    pp_stack=0x7effed40) at ../Python/ceval.c:4554
#7  _PyEval_EvalFrameDefault (f=, throwflag=)
    at ../Python/ceval.c:3110
#8  0x0011d63c in PyEval_EvalFrameEx (throwflag=0,
    f=Frame 0x760e8960, for file
    /home/pi/src/QTD/src/tdc7201/tdc7201/__init__.py, line 962, in measure
    (self=, reg1=[130, 66, 0, 7, 255, 255, 6, 64, 0, 1, 65535, 1600,
    1, None, None, None, 1284, 322, 2371, 0, 0, 0, 0, 0, 0, 0, 0, 2322, 23172,
    None], chip_select=1, sclk=23, miso=21, mosi=19, cs1=24, cs2=26, enable=12,
    osc_enable=16, trig1=7, int1=37, trig2=None, int2=None, start=18, stop=22,
    meas_mode=2, cal_pers=10, cal_count=, 
    norm_lsb=, tof1=, 
    tof2=, tof3=0, tof4=0, tof5=0) at remote
    0x76191750>, simulate=True, error_prefix='59835 ', log_file=<_io.TextIOWrapper
    at remote 0x761df4b0>, cf1=130, timeout=, n_stop=3, 
    threshold=, pulse=2))
    at ../Python/ceval.c:3930
...

read_regs24 (i=40, reg=28) performs an SPI read of 13 24-bit registers (so 39 
bytes); the corresponding Python code is:

    def read_regs24(self):
        """Read all 24-bit chip registers, using auto-increment feature."""
        result24 = self._spi.xfer([self.MINREG24|self._AI,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00])
        #print("AI read 24-bits =", result24)
        #print("length =", len(result24))
        i = 1
        for reg in range(self.MINREG24, self.MAXREG24+1):
            # Data comes in MSB first.
            self.reg1[reg] = ((result24[i] << 16) | (result24[i+1] << 8)
                              | result24[i+2])
            i += 3

The zero padding is necessary to make sure that enough SPI clock cycles are 
sent for the data bytes to get clocked back in. The _AI flag turns on 
auto-increment, so each byte is from the next address.
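
The framing described above (one command byte, 39 padding bytes, MSB-first reassembly) can be sketched without any SPI hardware. The helper names build_read_command and unpack_regs24 are hypothetical. The first-register value 0x10 is inferred from the trace (reg=28 after 13 registers), and the auto-increment flag 0x80 is an assumed placeholder, not a value taken from the tdc7201 source.

```python
FIRST_REG = 0x10   # inferred: 13 registers ending at reg=28 start at 16
AI_FLAG = 0x80     # assumed placeholder for the auto-increment bit
N_REGS = 13


def build_read_command(first_reg, ai_flag, n_regs):
    """Command byte followed by three zero padding bytes per register."""
    return [first_reg | ai_flag] + [0x00] * (3 * n_regs)


def unpack_regs24(raw, n_regs):
    """Reassemble MSB-first 24-bit values from a transfer result."""
    values = []
    i = 1   # byte 0 was clocked in while the command byte went out
    for _ in range(n_regs):
        values.append((raw[i] << 16) | (raw[i + 1] << 8) | raw[i + 2])
        i += 3
    return values


command = build_read_command(FIRST_REG, AI_FLAG, N_REGS)

# Simulated reply: a dummy first byte, then 0x01 0x02 0x03 per register.
fake_reply = [0x00] + [0x01, 0x02, 0x03] * N_REGS
registers = unpack_regs24(fake_reply, N_REGS)
print(len(command), hex(registers[0]))
```

With 13 registers the command is 40 bytes long, and the index ends at 40, which agrees with the i=40 shown for read_regs24 in the backtrace.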

So it looks like the bug is either in the spidev library, or in Python 
deallocation of this f

[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3

2020-08-03 Thread Howard A. Landman


Howard A. Landman added the comment:

Getting closer to isolating this. A small program that does nothing but call 
read_regs24() over and over dies with the same error after about 107.4M 
iterations. So it seems to be definitely in that method. But that only takes a 
few hours, not half a day. Going to try to separate the SPI operation from the 
remaining Python code next.

--




[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3

2020-08-03 Thread Howard A. Landman


Howard A. Landman added the comment:

It appears to be in the spidev library xfer method, when used for reading. 
Calling that 107.4M times (using the same code as inside my read_regs24() 
method) causes the free() error.

Breakpoint 1, malloc_printerr (str=0x76e028f8 "free(): invalid pointer")
    at malloc.c:5341
5341    malloc.c: No such file or directory.
(gdb) bt
#0  malloc_printerr (str=0x76e028f8 "free(): invalid pointer") at malloc.c:5341
#1  0x76d44d50 in _int_free (av=0x76e1f7d4 , 
    p=0x43417c , have_lock=) at malloc.c:4165
#2  0x001b4f40 in list_dealloc (op=0x766d0f58) at ../Objects/listobject.c:324
#3  0x7669b660 in ?? ()
    from /usr/lib/python3/dist-packages/spidev.cpython-37m-arm-linux-gnueabihf.so
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Should I be worried about the "corrupt stack" message?

--




[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3

2020-08-03 Thread Howard A. Landman


Howard A. Landman added the comment:

OK, this has been filed against the spidev library:
https://github.com/doceme/py-spidev/issues/107

Do you want it closed, or left open until that gets resolved?

--
resolution:  -> third party




[issue41335] free(): invalid pointer in list_ass_item() in Python 3.7.3

2020-08-04 Thread Howard A. Landman


Howard A. Landman added the comment:

As far as we can tell, this is a known Py_DECREF problem with spidev==3.4. 
Testing on spidev==3.5 has not triggered the bug so far, so it appears to be 
already fixed.

Under 3.4, changing the list to a tuple did not affect the behavior.
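
Reference-count imbalances of this general kind can sometimes be spotted from Python long before they become fatal. The sketch below is a generic check, not a reproduction of the spidev 3.4 defect: refcount_drift, leaky_call, and clean_call are hypothetical names, and the "leak" is simulated in pure Python by stashing an extra reference. It uses a fresh list as the probe because small integers and None may have fixed reference counts on newer interpreters.

```python
import sys


def refcount_drift(func, arg, calls=1000):
    """Return how much arg's reference count changed over `calls` calls."""
    before = sys.getrefcount(arg)
    for _ in range(calls):
        func(arg)
    return sys.getrefcount(arg) - before


_stash = []


def leaky_call(seq):
    """Simulates a missing Py_DECREF: keeps one extra reference per call."""
    _stash.append(seq)


def clean_call(seq):
    """Balanced call: takes no lasting reference to its argument."""
    return len(seq)


probe = [0x00] * 40
leak = refcount_drift(leaky_call, probe, calls=1000)
balanced = refcount_drift(clean_call, probe, calls=1000)
print("leaky drift:", leak, "balanced drift:", balanced)
```

A positive drift suggests a missing decrement and a negative drift an extra one. Running the same check against the real xfer call with the list or tuple argument would be the hardware-side equivalent.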

--
