For a system to use Arrow, it needs only to understand the columnar memory layout (and typically the metadata -- still to be defined). Separately, many of us are interested in developing tools to share Arrow data through various shared memory mechanisms (for example, memory-mapped files and other OS-level shared memory tools). You wouldn't be required to use a particular memory-sharing protocol, though; choosing one is the responsibility of the application developer.
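To make the shared-memory idea concrete, here is a minimal sketch in C of how a producer process could expose a fixed-width column to a consumer via POSIX shared memory (shm_open/mmap). The segment name and column layout here are illustrative only, not part of any Arrow specification:

    /* Sketch: sharing a fixed-width columnar buffer between processes
     * via POSIX shared memory. Names and layout are illustrative, not
     * part of the Arrow spec. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define SHM_NAME "/arrow_demo_column"   /* hypothetical segment name */
    #define NUM_VALUES 1024

    int main(void) {
        size_t size = NUM_VALUES * sizeof(int64_t);

        /* Producer: create the shared segment and size it. */
        int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("shm_open"); return 1; }
        if (ftruncate(fd, (off_t)size) != 0) { perror("ftruncate"); return 1; }

        /* Map it and write the column: value i lives at byte offset i * 8,
         * contiguous and typed, exactly the property a columnar layout gives. */
        int64_t *values = mmap(NULL, size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, fd, 0);
        if (values == MAP_FAILED) { perror("mmap"); return 1; }

        for (int64_t i = 0; i < NUM_VALUES; i++) {
            values[i] = i * i;
        }

        /* A consumer in another process would map the same name read-only:
         *   int fd = shm_open(SHM_NAME, O_RDONLY, 0);
         *   const int64_t *col = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
         * and could then scan the column with zero copies and no
         * deserialization step. */
        printf("values[10] = %lld\n", (long long)values[10]);

        munmap(values, size);
        close(fd);
        shm_unlink(SHM_NAME);
        return 0;
    }

(On older Linux systems you may need to link with -lrt.) The point of the sketch is that once two processes agree on the memory layout, no serialization protocol is needed; which sharing mechanism to use stays with the application.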
- Wes

On Wed, Mar 9, 2016 at 4:02 AM, Yiannis Gkoufas <johngou...@gmail.com> wrote:
> Hi Venkat,
>
> as far as I understand, Arrow works on infrastructures that already
> support RDMA; RDMA support is not included in the core Arrow library.
> Can the Arrow developers confirm that assumption?
>
> Thanks
>
> On 5 March 2016 at 01:54, Venkat Krishnamurthy <nivik...@gmail.com> wrote:
>
>> All
>>
>> I've been following along with great interest, and have a n00b question.
>>
>> What happens when any of the Arrow-capable processing tools needs to
>> work with a data set or structure that is bigger than a single node's
>> memory capacity? Does Arrow itself handle the distribution of the
>> resulting columnar representation over multiple nodes?
>>
>> Or is this the responsibility of the framework/tool/data service that
>> uses Arrow?
>>
>> (I know Arrow is defined to be a library, but I did see mentions of
>> scatter/gather reads and RDMA, so I'm somewhat confused.)
>>
>> On the same topic, is there any intent of using a distributed shared
>> memory like http://grappa.io? The PGAS (partitioned global address
>> space) model seems like a natural fit for Arrow's aspirations.
>>
>> Thanks
>> Venkat