Also, is it possible that the field is not an Int64Array?
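
The reason I ask: std::static_pointer_cast does no runtime type check, so if the column held some other type, raw_values() would be reading memory through the wrong class. A minimal sketch of a checked lookup, using hypothetical stand-in classes (Array, Int64Array, StringArray and checked_get below are illustrative only, not Arrow's classes or API):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for a polymorphic array hierarchy.
// These are NOT Arrow's classes; they only mimic the shape of the problem.
struct Array {
    virtual ~Array() = default;
};

struct Int64Array : Array {
    std::vector<int64_t> values;
    explicit Int64Array(std::vector<int64_t> v) : values(std::move(v)) {}
    const int64_t* raw_values() const { return values.data(); }
};

struct StringArray : Array {
    std::vector<std::string> values;
};

// Checked variant of the lookup: verify the dynamic type and the bounds
// before touching the raw buffer. (Hypothetical helper.)
int64_t checked_get(const std::shared_ptr<Array>& column, std::size_t index) {
    assert(column != nullptr);
    // dynamic_pointer_cast yields an empty pointer on a type mismatch,
    // whereas static_pointer_cast would blindly reinterpret the object.
    auto typed = std::dynamic_pointer_cast<Int64Array>(column);
    if (!typed) {
        throw std::invalid_argument("column is not an Int64Array");
    }
    if (index >= typed->values.size()) {
        throw std::out_of_range("index past end of column");
    }
    return typed->raw_values()[index];
}
```

Swapping something like this in temporarily (with the real types) would turn a wrong-type or out-of-range access into an exception instead of a crash, which could at least rule that cause in or out.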

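On the question of what triggers the shared_ptr destruction in thread 2: frames #1 to #4 of that backtrace are a std::shared_ptr<arrow::Array> destructor called from getCoord. That would be consistent with a temporary being released at the end of the return expression, assuming column() hands back its shared_ptr by value. A minimal sketch of that lifetime rule, with hypothetical stand-in types (Column, Batch, count_during_call and get are illustrative names, not Arrow's):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-ins; NOT Arrow's classes.
struct Column {
    std::vector<int64_t> values;
};

struct Batch {
    std::shared_ptr<Column> col;
    // Returns by value: every call copies the shared_ptr, which
    // increments the reference count in the shared control block.
    std::shared_ptr<Column> column() const { return col; }
};

// Reports the reference count observed while the temporary returned
// by column() is still alive (the batch's own copy plus the temporary).
long count_during_call(const Batch& batch) {
    return batch.column().use_count();
}

int64_t get(const Batch& batch, std::size_t index) {
    // The temporary shared_ptr is destroyed at the end of this full
    // expression, so a shared_ptr destructor (refcount decrement) runs
    // inside this function on every call.
    return batch.column()->values[index];
}
```

So a shared_ptr destructor showing up under getCoord is expected by itself; the part that looks wrong is the release jumping to a bad address, which suggests the control block (or the pointer to it) was already corrupted or freed by then.
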
On Wed, May 19, 2021 at 10:19 PM Yibo Cai <yibo....@arm.com> wrote:
>
> On 5/20/21 4:15 AM, Rares Vernica wrote:
> > Hello,
> >
> > I'm using Arrow for accessing data outside the SciDB database engine. It
> > generally works fine but we are running into Segmentation Faults in a
> > corner multi-threaded case. I identified two threads that work on the same
> > Record Batch. I wonder if there is something internal about RecordBatch
> > that might help solve the mystery.
> >
> > We are using Arrow 0.16.0. The backtrace of the triggering thread looks
> > like this:
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7fdad5fb4700 (LWP 3748)]
> > 0x00007fdaa805abe0 in ?? ()
> > (gdb) thread
> > [Current thread is 2 (Thread 0x7fdad5fb4700 (LWP 3748))]
> > (gdb) bt
> > #0  0x00007fdaa805abe0 in ?? ()
> > #1  0x0000000000850212 in
> > std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() ()
> > #2  0x00007fdae4b1fbf1 in
> > std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count
> > (this=0x7fdad5fb1ae8, __in_chrg=<optimized out>) at
> > /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/bits/shared_ptr_base.h:666
> > #3  0x00007fdae4b39d74 in std::__shared_ptr<arrow::Array,
> > (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fdad5fb1ae0,
> > __in_chrg=<optimized out>) at
> > /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/bits/shared_ptr_base.h:914
> > #4  0x00007fdae4b39da8 in std::shared_ptr<arrow::Array>::~shared_ptr
> > (this=0x7fdad5fb1ae0, __in_chrg=<optimized out>) at
> > /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/bits/shared_ptr.h:93
> > #5  0x00007fdae4b6a8e1 in scidb::XChunkIterator::getCoord
> > (this=0x7fdaa807f9f0, dim=1, index=1137) at XArray.cpp:358
> > #6  0x00007fdae4b68ecb in scidb::XChunkIterator::XChunkIterator
> > (this=0x7fdaa807f9f0, chunk=..., iterationMode=0, arrowBatch=<error reading
> > variable: Cannot access memory at address 0xd5fb1b90>) at XArray.cpp:157
>
> FWIW, this "error reading variable" looks suspicious. Maybe the argument
> 'arrowBatch' is trashed accidentally (stack overflow)?
> https://github.com/Paradigm4/bridge/blob/master/src/XArray.cpp#L132
>
> > ...
> >
> > The backtrace of the other thread working on exactly the same Record Batch
> > looks like this:
> >
> > (gdb) thread
> > [Current thread is 3 (Thread 0x7fdad61b5700 (LWP 3746))]
> > (gdb) bt
> > #0  0x00007fdae3bc1ec7 in arrow::SimpleRecordBatch::column(int) const ()
> > from /lib64/libarrow.so.16
> > #1  0x00007fdae4b6a888 in scidb::XChunkIterator::getCoord
> > (this=0x7fdab00c0bb0, dim=0, index=71) at XArray.cpp:357
> > #2  0x00007fdae4b6a5a2 in scidb::XChunkIterator::operator++
> > (this=0x7fdab00c0bb0) at XArray.cpp:305
> > ...
> >
> > In both cases, the last non-Arrow code is the getCoord function
> > https://github.com/Paradigm4/bridge/blob/master/src/XArray.cpp#L355
> >
> >      int64_t XChunkIterator::getCoord(size_t dim, int64_t index)
> >      {
> >          return std::static_pointer_cast<arrow::Int64Array>(
> >              _arrowBatch->column(_nAtts + dim))->raw_values()[index];
> >      }
> > ...
> > std::shared_ptr<const arrow::RecordBatch> _arrowBatch;
> >
> > Do you see anything suspicious about this code? What would trigger the
> > shared_ptr destruction which takes place in thread 2?
> >
> > Thank you!
> > Rares
> >
