All

I've been following along with great interest, and have a n00b question.

What happens when any of the Arrow-capable processing tools needs to work
with a data set or structure that is bigger than a single node's memory
capacity? Does Arrow itself handle the distribution of the resulting
columnar representation over multiple nodes?

Or is this the responsibility of the framework/tool/data service that uses
Arrow?

(I know Arrow is defined to be a library, but I did see mentions of
scatter-gather reads and RDMA, so I'm somewhat confused.)
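
To make the question concrete, here is a minimal sketch of the model I have
in mind, assuming the Python bindings and a hypothetical framework that owns
the shard assignment; the point is that each node only builds its own local
record batch, and Arrow just describes that per-node columnar layout:

    import pyarrow as pa

    # Hypothetical per-node view: this worker holds only its own shard of
    # the table. Arrow defines the columnar layout of the shard in memory;
    # the framework on top decides how shards are split across nodes.
    local_shard = pa.RecordBatch.from_arrays(
        [pa.array([1, 2, 3], type=pa.int64()),
         pa.array([0.5, 0.9, 0.1], type=pa.float64())],
        names=["user_id", "score"],
    )

    # Shipping a shard to another node can use the Arrow IPC stream format,
    # but (as I understand it) the framework, not Arrow, initiates the move.
    sink = pa.BufferOutputStream()
    with pa.ipc.new_stream(sink, local_shard.schema) as writer:
        writer.write_batch(local_shard)
    payload = sink.getvalue()  # bytes the framework would send over the wire

Is that the intended division of labor, or does Arrow take on more of it?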

On the same topic, is there any intent to use a distributed shared memory
system like http://grappa.io? The PGAS (partitioned global address space)
model seems like a natural fit for Arrow's aspirations.

Thanks
Venkat
