It seems this could be due to our use of MAP_PRIVATE for read-only memory maps:

https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/file.cc#L393

Some more investigation would be required.

On Wed, May 22, 2019 at 7:43 PM John Muehlhausen <j...@jgm.org> wrote:
>
> Is there an example somewhere of referring to the RecordBatch data in a
> memory-mapped IPC File in a zero-copy manner?
>
> I tried to do this in Python and must be doing something wrong. (I don't
> really care whether the example is Python or C++)
>
> In the attached test, when I get to the first prompt and hit return, I get
> the same content again. Likewise when I hit return on the second prompt I
> get the same content again.
>
> However, if before hitting return on the first prompt I issue:
>
> dd conv=notrunc if=/dev/urandom of=/tmp/test.batch bs=478 count=1
>
> i.e. overwrite the contents of the file, I get a garbled result. (Replace
> 478 with the size of your file.)
>
> However, if I wait until the second prompt to issue the dd command before
> hitting return, I do not get an error. Instead, batch.to_pandas() works the
> same both before and after the data is overwritten. This was not expected as
> I thought that the batch object was looking at the file in-place, i.e.
> zero-copy?
>
> Am I tying together the memory-mapping and the batch construction in the
> wrong way?
>
> Thanks,
> John
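
One way to probe the MAP_PRIVATE idea independently of Arrow is a small experiment with Python's standard mmap module (Unix only). The sketch below maps a temporary file read-only, overwrites the file in place through a separate descriptor (much like the dd conv=notrunc command quoted above), and compares what the mapping shows before and after. The helper name read_after_overwrite is made up for illustration, and this is not how Arrow sets up its maps. POSIX leaves it unspecified whether a MAP_PRIVATE mapping sees later changes to the file, so only the MAP_SHARED result should be relied on; the MAP_PRIVATE result may differ between platforms.

```python
import mmap
import os
import tempfile


def read_after_overwrite(flags):
    """Map a file read-only with `flags`, overwrite the file in place
    through a separate descriptor, and return the first bytes seen
    through the mapping as a (before, after) tuple.

    Hypothetical helper for illustration only; not part of Arrow.
    """
    fd, path = tempfile.mkstemp()
    try:
        # Fill one page with a known pattern.
        os.write(fd, b"A" * mmap.PAGESIZE)
        os.close(fd)

        # Read-only mapping of the whole file (length 0 means "entire file").
        with open(path, "rb") as f:
            m = mmap.mmap(f.fileno(), 0, flags=flags, prot=mmap.PROT_READ)
        try:
            before = m[:4]  # touch the page before the overwrite

            # Overwrite in place without truncating, like dd conv=notrunc.
            with open(path, "r+b") as w:
                w.write(b"B" * mmap.PAGESIZE)
                w.flush()

            after = m[:4]
        finally:
            m.close()
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("MAP_SHARED: ", read_after_overwrite(mmap.MAP_SHARED))
    print("MAP_PRIVATE:", read_after_overwrite(mmap.MAP_PRIVATE))
```

If both mappings show the new bytes on your system, then the mapping flags alone probably do not explain why batch.to_pandas() kept returning the old data at the second prompt, and a copy made somewhere else would be the next thing to look for.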