Dominik's point is that the IPC reader currently first reads the whole thing into a Vec<u8>, and then copies all of that into buffers using the IPC::Buffer offsets and lengths. Thus, it performs 2 memcopies of the whole data and needs to hold 2x the required memory (the Vec<u8> and the arrow::Buffers).
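To make the difference concrete, here is a minimal sketch of the single-copy alternative: seek to a buffer's offset and decode the values straight into a typed `Vec`, with no intermediate `Vec<u8>`. The helper `read_i32_buffer` and its parameters are hypothetical names chosen for illustration; they are not the arrow or arrow2 API, and a real reader would take the offset and length from the IPC metadata.

```rust
use std::io::{Cursor, Read, Seek, SeekFrom};

/// Hypothetical helper (not part of arrow or arrow2): reads `length` i32
/// values starting at byte `offset` of `reader` directly into a typed Vec,
/// so the data is copied once instead of twice.
fn read_i32_buffer<R: Read + Seek>(
    reader: &mut R,
    offset: u64,
    length: usize,
    is_little_endian: bool,
) -> std::io::Result<Vec<i32>> {
    reader.seek(SeekFrom::Start(offset))?;
    let mut values = Vec::with_capacity(length);
    let mut bytes = [0u8; 4];
    for _ in 0..length {
        reader.read_exact(&mut bytes)?;
        // The element type is known here, so the byte order can be handled
        // per value while decoding.
        values.push(if is_little_endian {
            i32::from_le_bytes(bytes)
        } else {
            i32::from_be_bytes(bytes)
        });
    }
    Ok(values)
}

fn main() -> std::io::Result<()> {
    // 8 bytes of padding followed by the values 1, 2, 3 in little-endian order.
    let mut data = vec![0u8; 8];
    for v in &[1i32, 2, 3] {
        data.extend_from_slice(&v.to_le_bytes());
    }
    // A Cursor over a byte slice implements both Read and Seek, so data that
    // is already in memory needs no file.
    let mut reader = Cursor::new(data.as_slice());
    let values = read_i32_buffer(&mut reader, 8, 3, true)?;
    assert_eq!(values, vec![1, 2, 3]);
    Ok(())
}
```

Decoding value by value keeps the sketch short; a tuned implementation would more likely read the whole buffer in one call and swap bytes only when the source and native byte orders differ.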
I noticed this while going through it on my proposal repo, and I rewrote it using `Read + Seek` <https://github.com/jorgecarleitao/arrow2/blob/main/src/io/ipc/read/deserialize.rs#L66> to write directly to typed buffers. Coincidentally, this also enabled reading from big endian, as we know what is on each buffer, and thus know how to handle endianness using `to_le` and `from_be` implemented on Rust's native types.

Best,
Jorge

On Mon, Mar 8, 2021 at 11:12 PM Andrew Lamb <al...@influxdata.com> wrote:

> Thank you for filing the ticket.
>
> I wonder if you mean this reader:
>
> https://docs.rs/arrow/3.0.0/arrow/ipc/reader/struct.FileReader.html#method.try_new
>
> If so, while it is called a `FileReader` I think that is somewhat misleading. It requires something that implements `std::io::Read` -- which `&[u8]` does.
>
> https://doc.rust-lang.org/std/io/trait.Read.html#impl-Read-2
>
> So you should be able to read directly from the `[u8]` without having to do any copies
>
> I may perhaps be missing something
>
> On Thu, Mar 4, 2021 at 10:53 AM Dominik Moritz <domor...@cmu.edu> wrote:
>
> > I just remembered a bigger issue I ran into. I wanted to read from IPC but I don’t have a file. I do have the data as [u8] already. The current API incurs more copies than necessary (I think) and therefore the performance of reading IPC is worse than in JS. (https://issues.apache.org/jira/projects/ARROW/issues/ARROW-11696).
> >
> > On Mar 1, 2021 at 23:29:18, Dominik Moritz <domor...@cmu.edu> wrote:
> >
> > > I am looking forward to speaking with you then. I’ll talk about the motivation.
> > >
> > > My experience with the library has been good. I ran into a few limitations that I filed Jiras for. I struggled a bit with some of the error handling and Arc types but that’s probably because I am not very experienced with Rust and wasm-bindgen doesn’t support all Rust features.
> > >
> > > I had some bigger issues with the DataFusion and Parquet libraries as they don’t support wasm right now (also filed Jiras for those).
> > >
> > > On Feb 27, 2021 at 11:14:27, Andrew Lamb <al...@influxdata.com> wrote:
> > >
> > >> Hi Dominik,
> > >>
> > >> That sounds really interesting -- thank you for the offer
> > >>
> > >> I for one would enjoy seeing a demo and suggest that 10 minutes might be a good length. The next call (details are also on the announcement [1]) is scheduled for Wednesday March 10, 2021 at 09:00 PST / 12:00 EST / 17:00 UTC. The link is https://meet.google.com/ctp-yujs-aee
> > >>
> > >> I would personally be interested in hearing about your experience as a user of the Rust library (what was good, what was challenging, how can we improve).
> > >>
> > >> Thanks!
> > >> Andrew
> > >>
> > >> [1] https://lists.apache.org/thread.html/raa72e1a8a3ad5dbb8366e9609a041eccca87f85545c3bc3d85170cfc%40%3Cdev.arrow.apache.org%3E
> > >>
> > >> On Fri, Feb 26, 2021 at 4:17 AM Fernando Herrera <fernando.j.herr...@gmail.com> wrote:
> > >>
> > >> Hi Dominik,
> > >>
> > >> I would be interested in a demo. I'm curious to see your implementation and what advantages you have seen over javascript
> > >>
> > >> thanks
> > >> Fernando
> > >>
> > >> On Thu, Feb 25, 2021 at 10:39 PM Dominik Moritz <domor...@cmu.edu> wrote:
> > >>
> > >> > Hello Rust Arrow Devs,
> > >> >
> > >> > I have been working on a wasm version of Arrow using the Rust library (https://github.com/domoritz/arrow-wasm). I was wondering whether you would be interested in having me demo it in the Arrow Rust sync call. If so, when would be the next one and how much time would you want to allocate for it?
> > >> > Also, would you be interested for me to dive into something in particular?
> > >> >
> > >> > Cheers,
> > >> > Dominik