Is there a better (safer) way of accessing a specific Int64 cell in a RecordBatch? Currently I'm doing something like this:
std::static_pointer_cast<arrow::Int64Array>(batch->column(i))->raw_values()[j]

On Wed, May 19, 2021 at 3:09 PM Rares Vernica <rvern...@gmail.com> wrote:

> /opt/rh/devtoolset-3/root/usr/bin/g++ -v
> Using built-in specs.
> COLLECT_GCC=/opt/rh/devtoolset-3/root/usr/bin/g++
> COLLECT_LTO_WRAPPER=/opt/rh/devtoolset-3/root/usr/libexec/gcc/x86_64-redhat-linux/4.9.2/lto-wrapper
> Target: x86_64-redhat-linux
> Configured with: ../configure --prefix=/opt/rh/devtoolset-3/root/usr
> --mandir=/opt/rh/devtoolset-3/root/usr/share/man
> --infodir=/opt/rh/devtoolset-3/root/usr/share/info
> --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap
> --enable-shared --enable-threads=posix --enable-checking=release
> --enable-multilib --with-system-zlib --enable-__cxa_atexit
> --disable-libunwind-exceptions --enable-gnu-unique-object
> --enable-linker-build-id --enable-languages=c,c++,fortran,lto
> --enable-plugin --with-linker-hash-style=gnu --enable-initfini-array
> --disable-libgcj
> --with-isl=/builddir/build/BUILD/gcc-4.9.2-20150212/obj-x86_64-redhat-linux/isl-install
> --with-cloog=/builddir/build/BUILD/gcc-4.9.2-20150212/obj-x86_64-redhat-linux/cloog-install
> --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686
> --build=x86_64-redhat-linux
> Thread model: posix
> gcc version 4.9.2 20150212 (Red Hat 4.9.2-6) (GCC)
>
> Installed Packages
> Name      : glibc
> Arch      : x86_64
> Version   : 2.17
> Release   : 307.el7.1
> Size      : 13 M
> Repo      : installed
> From repo : base
>
>
> On Wed, May 19, 2021 at 2:22 PM Weston Pace <weston.p...@gmail.com> wrote:
>
>> What compiler / glibc version are you using?
>> arrow::SimpleRecordBatch::column does some non-trivial caching which
>> uses std::atomic_load[1] which is not implemented properly on gcc < 5
>> so our behavior is different depending on the compiler version.
>>
>> [1] https://en.cppreference.com/w/cpp/atomic/atomic_load
>>
>> On Wed, May 19, 2021 at 10:15 AM Rares Vernica <rvern...@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > I'm using Arrow for accessing data outside the SciDB database engine. It
>> > generally works fine but we are running into Segmentation Faults in a
>> > corner multi-threaded case. I identified two threads that work on the same
>> > Record Batch. I wonder if there is something internal about RecordBatch
>> > that might help solve the mystery.
>> >
>> > We are using Arrow 0.16.0. The backtrace of the triggering thread looks
>> > like this:
>> >
>> > Program received signal SIGSEGV, Segmentation fault.
>> > [Switching to Thread 0x7fdad5fb4700 (LWP 3748)]
>> > 0x00007fdaa805abe0 in ?? ()
>> > (gdb) thread
>> > [Current thread is 2 (Thread 0x7fdad5fb4700 (LWP 3748))]
>> > (gdb) bt
>> > #0 0x00007fdaa805abe0 in ?? ()
>> > #1 0x0000000000850212 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() ()
>> > #2 0x00007fdae4b1fbf1 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fdad5fb1ae8, __in_chrg=<optimized out>) at /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/bits/shared_ptr_base.h:666
>> > #3 0x00007fdae4b39d74 in std::__shared_ptr<arrow::Array, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fdad5fb1ae0, __in_chrg=<optimized out>) at /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/bits/shared_ptr_base.h:914
>> > #4 0x00007fdae4b39da8 in std::shared_ptr<arrow::Array>::~shared_ptr (this=0x7fdad5fb1ae0, __in_chrg=<optimized out>) at /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/bits/shared_ptr.h:93
>> > #5 0x00007fdae4b6a8e1 in scidb::XChunkIterator::getCoord (this=0x7fdaa807f9f0, dim=1, index=1137) at XArray.cpp:358
>> > #6 0x00007fdae4b68ecb in scidb::XChunkIterator::XChunkIterator (this=0x7fdaa807f9f0, chunk=..., iterationMode=0, arrowBatch=<error reading variable: Cannot access memory at address 0xd5fb1b90>) at XArray.cpp:157
>> > ...
>> >
>> > The backtrace of the other thread working on exactly the same Record Batch
>> > looks like this:
>> >
>> > (gdb) thread
>> > [Current thread is 3 (Thread 0x7fdad61b5700 (LWP 3746))]
>> > (gdb) bt
>> > #0 0x00007fdae3bc1ec7 in arrow::SimpleRecordBatch::column(int) const () from /lib64/libarrow.so.16
>> > #1 0x00007fdae4b6a888 in scidb::XChunkIterator::getCoord (this=0x7fdab00c0bb0, dim=0, index=71) at XArray.cpp:357
>> > #2 0x00007fdae4b6a5a2 in scidb::XChunkIterator::operator++ (this=0x7fdab00c0bb0) at XArray.cpp:305
>> > ...
>> >
>> > In both cases, the last non-Arrow code is the getCoord function
>> > https://github.com/Paradigm4/bridge/blob/master/src/XArray.cpp#L355
>> >
>> > int64_t XChunkIterator::getCoord(size_t dim, int64_t index)
>> > {
>> >     return std::static_pointer_cast<arrow::Int64Array>(
>> >         _arrowBatch->column(_nAtts + dim))->raw_values()[index];
>> > }
>> > ...
>> > std::shared_ptr<const arrow::RecordBatch> _arrowBatch;
>> >
>> > Do you see anything suspicious about this code? What would trigger the
>> > shared_ptr destruction which takes place in thread 2?
>> >
>> > Thank you!
>> > Rares
>> >
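
For readers following the thread, the kind of lazy column caching Weston describes can be sketched with the standard library alone. This is only an illustration under stated assumptions, not Arrow's code: LazyColumns, Column and cell are hypothetical names, and a plain std::vector<int64_t> stands in for arrow::Int64Array. It shows a cached shared_ptr slot that is read with std::atomic_load and filled with std::atomic_compare_exchange_strong, plus a bounds-checked cell accessor in the spirit of the question at the top.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for an Int64 column (not an Arrow type).
using Column = std::vector<std::int64_t>;

// Hypothetical stand-in for a record batch that boxes each column lazily
// and caches the result. This is NOT Arrow's implementation.
class LazyColumns {
 public:
  explicit LazyColumns(std::vector<Column> raw)
      : raw_(std::move(raw)), boxed_(raw_.size()) {}

  // Returns the cached column, building it on first use.
  std::shared_ptr<Column> column(std::size_t i) const {
    if (i >= boxed_.size()) throw std::out_of_range("column index");
    // Atomic read of the cache slot (shared_ptr overload from <memory>).
    std::shared_ptr<Column> cached = std::atomic_load(&boxed_[i]);
    if (!cached) {
      auto fresh = std::make_shared<Column>(raw_[i]);
      std::shared_ptr<Column> expected;  // empty: the state we hope to replace
      if (std::atomic_compare_exchange_strong(&boxed_[i], &expected, fresh)) {
        cached = fresh;     // we published our copy
      } else {
        cached = expected;  // another caller published first; reuse its copy
      }
    }
    return cached;
  }

 private:
  std::vector<Column> raw_;                             // source data
  mutable std::vector<std::shared_ptr<Column>> boxed_;  // lazily filled cache
};

// Bounds-checked cell access: throws std::out_of_range instead of reading
// past the end of the column.
std::int64_t cell(const LazyColumns& batch, std::size_t col, std::size_t row) {
  return batch.column(col)->at(row);
}
```

The point of the sketch is that every read and write of a cache slot goes through the atomic free functions, so the correctness of concurrent column() calls depends on the standard library implementing them properly, which is the concern raised above for gcc < 5. These shared_ptr overloads are deprecated as of C++20 in favor of std::atomic<std::shared_ptr<T>>.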