> /opt/rh/devtoolset-3/root/usr/bin/g++ -v
Using built-in specs.
COLLECT_GCC=/opt/rh/devtoolset-3/root/usr/bin/g++
COLLECT_LTO_WRAPPER=/opt/rh/devtoolset-3/root/usr/libexec/gcc/x86_64-redhat-linux/4.9.2/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/opt/rh/devtoolset-3/root/usr
--mandir=/opt/rh/devtoolset-3/root/usr/share/man
--infodir=/opt/rh/devtoolset-3/root/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-linker-build-id
--enable-languages=c,c++,fortran,lto --enable-plugin
--with-linker-hash-style=gnu --enable-initfini-array --disable-libgcj
--with-isl=/builddir/build/BUILD/gcc-4.9.2-20150212/obj-x86_64-redhat-linux/isl-install
--with-cloog=/builddir/build/BUILD/gcc-4.9.2-20150212/obj-x86_64-redhat-linux/cloog-install
--enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC)

Installed Packages
Name        : glibc
Arch        : x86_64
Version     : 2.17
Release     : 307.el7.1
Size        : 13 M
Repo        : installed
From repo   : base


On Wed, May 19, 2021 at 2:22 PM Weston Pace <weston.p...@gmail.com> wrote:

> What compiler / glibc version are you using?
> arrow::SimpleRecordBatch::column does some non-trivial caching which
> uses std::atomic_load[1] which is not implemented properly on gcc < 5
> so our behavior is different depending on the compiler version.
>
> [1] https://en.cppreference.com/w/cpp/atomic/atomic_load
>
> On Wed, May 19, 2021 at 10:15 AM Rares Vernica <rvern...@gmail.com> wrote:
> >
> > Hello,
> >
> > I'm using Arrow for accessing data outside the SciDB database engine. It
> > generally works fine but we are running into Segmentation Faults in a
> > corner multi-threaded case. I identified two threads that work on the same
> > Record Batch. I wonder if there is something internal about RecordBatch
> > that might help solve the mystery.
> >
> > We are using Arrow 0.16.0. The backtrace of the triggering thread looks
> > like this:
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7fdad5fb4700 (LWP 3748)]
> > 0x00007fdaa805abe0 in ?? ()
> > (gdb) thread
> > [Current thread is 2 (Thread 0x7fdad5fb4700 (LWP 3748))]
> > (gdb) bt
> > #0  0x00007fdaa805abe0 in ?? ()
> > #1  0x0000000000850212 in
> > std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() ()
> > #2  0x00007fdae4b1fbf1 in
> > std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count
> > (this=0x7fdad5fb1ae8, __in_chrg=<optimized out>) at
> > /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/bits/shared_ptr_base.h:666
> > #3  0x00007fdae4b39d74 in std::__shared_ptr<arrow::Array,
> > (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fdad5fb1ae0,
> > __in_chrg=<optimized out>) at
> > /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/bits/shared_ptr_base.h:914
> > #4  0x00007fdae4b39da8 in std::shared_ptr<arrow::Array>::~shared_ptr
> > (this=0x7fdad5fb1ae0, __in_chrg=<optimized out>) at
> > /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/bits/shared_ptr.h:93
> > #5  0x00007fdae4b6a8e1 in scidb::XChunkIterator::getCoord
> > (this=0x7fdaa807f9f0, dim=1, index=1137) at XArray.cpp:358
> > #6  0x00007fdae4b68ecb in scidb::XChunkIterator::XChunkIterator
> > (this=0x7fdaa807f9f0, chunk=..., iterationMode=0, arrowBatch=<error reading
> > variable: Cannot access memory at address 0xd5fb1b90>) at XArray.cpp:157
> > ...
> >
> > The backtrace of the other thread working on exactly the same Record Batch
> > looks like this:
> >
> > (gdb) thread
> > [Current thread is 3 (Thread 0x7fdad61b5700 (LWP 3746))]
> > (gdb) bt
> > #0  0x00007fdae3bc1ec7 in arrow::SimpleRecordBatch::column(int) const ()
> > from /lib64/libarrow.so.16
> > #1  0x00007fdae4b6a888 in scidb::XChunkIterator::getCoord
> > (this=0x7fdab00c0bb0, dim=0, index=71) at XArray.cpp:357
> > #2  0x00007fdae4b6a5a2 in scidb::XChunkIterator::operator++
> > (this=0x7fdab00c0bb0) at XArray.cpp:305
> > ...
> >
> > In both cases, the last non-Arrow code is the getCoord function:
> > https://github.com/Paradigm4/bridge/blob/master/src/XArray.cpp#L355
> >
> >     int64_t XChunkIterator::getCoord(size_t dim, int64_t index)
> >     {
> >         return std::static_pointer_cast<arrow::Int64Array>(
> >             _arrowBatch->column(_nAtts + dim))->raw_values()[index];
> >     }
> >     ...
> >     std::shared_ptr<const arrow::RecordBatch> _arrowBatch;
> >
> > Do you see anything suspicious about this code? What would trigger the
> > shared_ptr destruction which takes place in thread 2?
> >
> > Thank you!
> > Rares
>
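
For reference, the kind of lazy caching described above can be sketched
roughly as follows. This is only a minimal illustration under stated
assumptions, not Arrow's actual code: CachedBatch, Column and
all_threads_agree are hypothetical names, and a plain std::vector stands in
for arrow::Array. The point is that column() returns a shared_ptr by value,
so each call in getCoord creates a temporary that is destroyed at the end of
the expression, and the cache slot is only safe to share between threads if
std::atomic_load / std::atomic_store on shared_ptr behave correctly.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a column is a plain vector here, not an arrow::Array.
using Column = std::vector<int64_t>;
using ColumnPtr = std::shared_ptr<Column>;

// Hypothetical batch that boxes each column lazily, on first access.
class CachedBatch {
 public:
  explicit CachedBatch(std::vector<Column> data)
      : data_(std::move(data)), boxed_(data_.size()) {}

  // Returns the boxed column by value; the caller's temporary shared_ptr is
  // released again when the calling expression ends.
  ColumnPtr column(std::size_t i) const {
    // Free-function atomic access to a shared_ptr (C++11; deprecated in C++20).
    ColumnPtr result = std::atomic_load(&boxed_[i]);
    if (!result) {
      // Two threads may both get here; each builds a valid copy and the
      // last store wins. The losing copy is freed once its users are done.
      result = std::make_shared<Column>(data_[i]);
      std::atomic_store(&boxed_[i], result);
    }
    return result;
  }

 private:
  std::vector<Column> data_;
  mutable std::vector<ColumnPtr> boxed_;  // cache slots, initially null
};

// Reads the same element from several threads and reports whether every
// thread saw the expected value.
bool all_threads_agree(const CachedBatch& batch, std::size_t col,
                       std::size_t index, int64_t expected, int n_threads) {
  assert(n_threads > 0);
  std::vector<int> ok(n_threads, 0);  // one slot per thread, no sharing
  std::vector<std::thread> workers;
  for (int t = 0; t < n_threads; ++t) {
    workers.emplace_back([&, t] {
      ok[t] = (batch.column(col)->at(index) == expected) ? 1 : 0;
    });
  }
  for (auto& w : workers) w.join();
  for (int v : ok) {
    if (!v) return false;
  }
  return true;
}
```

If the two atomic calls are replaced by plain reads and writes of the cache
slot, or if the library does not implement them correctly, one thread can
release a control block that another thread is still copying. That would be
consistent with a crash inside _M_release() as in the first backtrace,
although I have not confirmed that this is what happens here.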
