I'd suggest explicitly chunking the table into batches of roughly 2 MiB (it appears the table is one contiguous chunk, and I believe Flight will just try to send that entire table as one chunk). IIRC the Flight benchmark over localhost should reach up to a couple GiB/s. (That said, that still doesn't match the sharedmemory results.)
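As a rough sketch of that chunking step (the helper names `rows_per_batch` and `chunk_table` are made up for illustration, and the 2 MiB target is just the ballpark above, not a tuned value): `Table.to_batches` takes a maximum row count rather than a byte size, so the sketch estimates rows per batch from the table's average row size.

```python
TARGET_BATCH_BYTES = 2 * 1024 * 1024  # ~2 MiB; an assumed starting point, not a tuned value


def rows_per_batch(total_bytes, num_rows, target_bytes=TARGET_BATCH_BYTES):
    """Estimate how many rows fit in a batch of roughly target_bytes."""
    if num_rows <= 0 or total_bytes <= 0:
        return 1
    avg_row_bytes = total_bytes / num_rows
    # Always return at least one row, even if a single row exceeds the target.
    return max(1, int(target_bytes // avg_row_bytes))


def chunk_table(table, target_bytes=TARGET_BATCH_BYTES):
    """Split a pyarrow.Table into record batches of roughly target_bytes each.

    Hypothetical helper: it only relies on Table.nbytes, Table.num_rows and
    Table.to_batches(max_chunksize=...), where max_chunksize is a row count.
    """
    n = rows_per_batch(table.nbytes, table.num_rows, target_bytes)
    return table.to_batches(max_chunksize=n)
```

The resulting batches could then be written one at a time to the Flight stream instead of sending the whole table as a single chunk. The estimate is only approximate for tables with variable-width columns, since row sizes there are not uniform.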
Flight-UCX with shared memory transport* was more like ~7 GiB/s, as I recall. But fundamentally Flight is a client/server RPC framework and not an interprocess shared memory cache, so while you can build a caching service with Flight, it won't be exactly the same as Plasma.

That said, if the goal is just convenience and not necessarily performance, I wonder if providing a reference implementation or recipe based on Flight, or even possibly redis or memcached, would suffice...

* IIRC, this means that UCX uses shared memory to copy buffers between processes, which is still different than Plasma just mapping the same (immutable) buffer into multiple processes.

On Thu, Mar 16, 2023, at 10:35, Antoine Pitrou wrote:
> 0.5 GB/second for local Flight transfer seems unexpectedly slow (one
> could expect 10x more), but perhaps tuning of default parameters needs
> to be improving. David Li can probably elaborate on that.
>
> I'll add that Unix sockets might not be the fastest anymore these days.
> It may be worth testing on TCP.
>
> Regards
>
> Antoine.
>
>
> On 15/03/2023 at 22:23, Will Jones wrote:
>> Hello all,
>>
>> First, a reminder that Plasma has been deprecated and will be removed in
>> the 12.0.0 release of the C++, Python, and Java Arrow libraries. [1]
>>
>> I know some used Plasma as a convenient way to share Arrow data between
>> Python processes, so I pulled together a quick performance comparison
>> against two supported alternatives: Flight over unix domain socket and the
>> Python sharedmemory module. [2] The shared memory example performs
>> comparably to Plasma, but I don't think is accessible from other languages.
>> The Flight test is slower than shared memory, but still fairly fast, and of
>> course works across languages. I wrote a little more about the shared
>> memory case in a stackoverflow answer [3].
>>
>> If you have migrated off of Plasma and want to share with other users what
>> you moved to, please do so in this thread.
>>
>> Best,
>>
>> Will Jones
>>
>> [1] https://github.com/apache/arrow/issues/33243
>> [2] https://github.com/wjones127/arrow-ipc-bench
>> [3] https://stackoverflow.com/a/75402621/2048858
>>