> /opt/rh/devtoolset-3/root/usr/bin/g++ -v
Using built-in specs.
COLLECT_GCC=/opt/rh/devtoolset-3/root/usr/bin/g++
COLLECT_LTO_WRAPPER=/opt/rh/devtoolset-3/root/usr/libexec/gcc/x86_64-redhat-linux/4.9.2/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/opt/rh/devtoolset-3/root/usr
  --mandir=/opt/rh/devtoolset-3/root/usr/share/man
  --infodir=/opt/rh/devtoolset-3/root/usr/share/info
  --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap
  --enable-shared --enable-threads=posix --enable-checking=release
  --enable-multilib --with-system-zlib --enable-__cxa_atexit
  --disable-libunwind-exceptions --enable-gnu-unique-object
  --enable-linker-build-id --enable-languages=c,c++,fortran,lto
  --enable-plugin --with-linker-hash-style=gnu --enable-initfini-array
  --disable-libgcj
  --with-isl=/builddir/build/BUILD/gcc-4.9.2-20150212/obj-x86_64-redhat-linux/isl-install
  --with-cloog=/builddir/build/BUILD/gcc-4.9.2-20150212/obj-x86_64-redhat-linux/cloog-install
  --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686
  --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC)

Installed Packages
Name        : glibc
Arch        : x86_64
Version     : 2.17
Release     : 307.el7.1
Size        : 13 M
Repo        : installed
From repo   : base

On Wed, May 19, 2021 at 2:22 PM Weston Pace <weston.p...@gmail.com> wrote:

> What compiler / glibc version are you using?
> arrow::SimpleRecordBatch::column does some non-trivial caching which
> uses std::atomic_load[1] which is not implemented properly on gcc < 5
> so our behavior is different depending on the compiler version.
>
> [1] https://en.cppreference.com/w/cpp/atomic/atomic_load
>
> On Wed, May 19, 2021 at 10:15 AM Rares Vernica <rvern...@gmail.com> wrote:
> >
> > Hello,
> >
> > I'm using Arrow for accessing data outside the SciDB database engine. It
> > generally works fine but we are running into Segmentation Faults in a
> > corner multi-threaded case. I identified two threads that work on the same
> > Record Batch. I wonder if there is something internal about RecordBatch
> > that might help solve the mystery.
> >
> > We are using Arrow 0.16.0. The backtrace of the triggering thread looks
> > like this:
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7fdad5fb4700 (LWP 3748)]
> > 0x00007fdaa805abe0 in ?? ()
> > (gdb) thread
> > [Current thread is 2 (Thread 0x7fdad5fb4700 (LWP 3748))]
> > (gdb) bt
> > #0 0x00007fdaa805abe0 in ?? ()
> > #1 0x0000000000850212 in
> >    std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() ()
> > #2 0x00007fdae4b1fbf1 in
> >    std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count
> >    (this=0x7fdad5fb1ae8, __in_chrg=<optimized out>) at
> >    /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/bits/shared_ptr_base.h:666
> > #3 0x00007fdae4b39d74 in std::__shared_ptr<arrow::Array,
> >    (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fdad5fb1ae0,
> >    __in_chrg=<optimized out>) at
> >    /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/bits/shared_ptr_base.h:914
> > #4 0x00007fdae4b39da8 in std::shared_ptr<arrow::Array>::~shared_ptr
> >    (this=0x7fdad5fb1ae0, __in_chrg=<optimized out>) at
> >    /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/bits/shared_ptr.h:93
> > #5 0x00007fdae4b6a8e1 in scidb::XChunkIterator::getCoord
> >    (this=0x7fdaa807f9f0, dim=1, index=1137) at XArray.cpp:358
> > #6 0x00007fdae4b68ecb in scidb::XChunkIterator::XChunkIterator
> >    (this=0x7fdaa807f9f0, chunk=..., iterationMode=0, arrowBatch=<error
> >    reading variable: Cannot access memory at address 0xd5fb1b90>) at
> >    XArray.cpp:157
> > ...
> >
> > The backtrace of the other thread working on exactly the same Record Batch
> > looks like this:
> >
> > (gdb) thread
> > [Current thread is 3 (Thread 0x7fdad61b5700 (LWP 3746))]
> > (gdb) bt
> > #0 0x00007fdae3bc1ec7 in arrow::SimpleRecordBatch::column(int) const ()
> >    from /lib64/libarrow.so.16
> > #1 0x00007fdae4b6a888 in scidb::XChunkIterator::getCoord
> >    (this=0x7fdab00c0bb0, dim=0, index=71) at XArray.cpp:357
> > #2 0x00007fdae4b6a5a2 in scidb::XChunkIterator::operator++
> >    (this=0x7fdab00c0bb0) at XArray.cpp:305
> > ...
> >
> > In both cases, the last non-Arrow code is the getCoord function
> > https://github.com/Paradigm4/bridge/blob/master/src/XArray.cpp#L355
> >
> >     int64_t XChunkIterator::getCoord(size_t dim, int64_t index)
> >     {
> >         return std::static_pointer_cast<arrow::Int64Array>(
> >             _arrowBatch->column(_nAtts + dim))->raw_values()[index];
> >     }
> >     ...
> >     std::shared_ptr<const arrow::RecordBatch> _arrowBatch;
> >
> > Do you see anything suspicious about this code? What would trigger the
> > shared_ptr destruction which takes place in thread 2?
> >
> > Thank you!
> > Rares
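
For reference, here is a minimal sketch of the kind of lazily filled, per-column cache that the std::atomic_load remark in this thread refers to. It is not Arrow's code: `Column` and `LazyBatch` are hypothetical names made up for illustration, and the only library calls used are the std::shared_ptr overloads of std::atomic_load and std::atomic_store from <memory>. The sketch assumes a standard library that implements those overloads correctly, which, per the comment quoted above, may not hold for gcc < 5.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a boxed array column (not an Arrow type).
struct Column {
  std::vector<std::int64_t> values;
};

// Hypothetical batch that builds each boxed column on first access
// and caches the resulting shared_ptr for later callers.
class LazyBatch {
 public:
  explicit LazyBatch(std::vector<std::vector<std::int64_t>> data)
      : data_(std::move(data)), boxed_(data_.size()) {}

  std::shared_ptr<Column> column(std::size_t i) const {
    assert(i < boxed_.size());
    // Atomically read the cache slot; a plain read here would race
    // with the atomic_store below when several threads call column().
    std::shared_ptr<Column> result = std::atomic_load(&boxed_[i]);
    if (!result) {
      // Cache miss: build the column, then publish it atomically.
      // Two threads may both build one; the last store wins, and
      // each caller still holds a valid pointer of its own.
      result = std::make_shared<Column>(Column{data_[i]});
      std::atomic_store(&boxed_[i], result);
    }
    return result;
  }

 private:
  std::vector<std::vector<std::int64_t>> data_;
  // mutable so that the const accessor can fill the cache.
  mutable std::vector<std::shared_ptr<Column>> boxed_;
};
```

The pattern is only safe when both the load and the store are genuinely atomic with respect to each other. If they are not, a reader can copy a shared_ptr while another thread is overwriting it, and the reference count can then be released on a control block that is no longer valid.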