Re: [RFC] Enabling data frames in disaggregated shared memory

2024-04-12 Thread John Groves
Matt, (See my reply to Antoine for some clarifications about famfs that may or may not have been obvious.) In current-day workloads, sharding and reshuffling are unavoidable, and will probably never become fully avoidable. Your Arrow Communication Extensions look to me like sensible extensions to

Re: [RFC] Enabling data frames in disaggregated shared memory

2024-04-12 Thread John Groves
On 24/04/10 05:56PM, Antoine Pitrou wrote:
> Hello John,
>
> Arrow IPC files can be backed quite naturally by shared memory, simply by memory-mapping them for reading. So if you have some pieces of shared memory containing Arrow IPC files, and they are reachable using a filesystem mount p

Re: [RFC] Enabling data frames in disaggregated shared memory

2024-04-10 Thread Antoine Pitrou
Hello John, Arrow IPC files can be backed quite naturally by shared memory, simply by memory-mapping them for reading. So if you have some pieces of shared memory containing Arrow IPC files, and they are reachable using a filesystem mount point, you're pretty much done. You can see an exam

Re: [RFC] Enabling data frames in disaggregated shared memory

2024-04-10 Thread Matt Topol
Hi John, I recently proposed on the mailing list an experimental extension of the Arrow IPC protocol that would make it easier to leverage disaggregated shared memory along with non-CPU memory via utilities such as UCX and libfabric [1]. I'll be putting together a more formal description of it tha

[RFC] Enabling data frames in disaggregated shared memory

2024-04-09 Thread John Groves
This is a request for comments from the Arrow developer community. I’m reaching out to start making the Arrow community aware of work that my team at Micron has recently open-sourced. Because of the Compute Express Link (CXL) standard, sharable disaggregated memory is coming – this is memory share