Hi Jayjeet, have you run a profiler such as perf to see where those 1000 ms are being spent? How many arrays are there in total, that is, the sum of the number of chunks across all columns? I would guess that the problem is all the little Buffer memcopies. I don't think that the C Data Interface is going to help you.

- Wes

On Thu, Jun 10, 2021 at 1:48 PM Jayjeet Chakraborty <jayjeetchakrabort...@gmail.com> wrote:
>
> Hello Arrow Community,
>
> I am a student working on a project where I need to serialize an in-memory
> Arrow Table of size around 700MB to a uint8_t* buffer. I am currently using
> the arrow::ipc::RecordBatchStreamWriter API to serialize the table to a
> arrow::Buffer, but it is taking nearly 1000ms to serialize the whole table,
> and that is harming the performance of my performance-critical application. I
> basically want to get hold of the underlying memory of the table as bytes and
> send it over the network. How do you suggest I tackle this problem? I was
> thinking of using the C Data interface for this, so that I convert my
> arrow::Table to ArrowArray and ArrowSchema and serialize the structs to send
> them over the network, but seems like serializing structs is another complex
> problem on its own. It will be great to have some suggestions on this.
> Thanks a lot.
>
> Best,
> Jayjeet
>
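
The array count asked about in the reply (the sum of the number of chunks across all columns) could be computed along the lines below. This is only a minimal sketch, not code from the thread. The function is a template so that it compiles without Arrow installed; it relies only on num_columns() and column(i)->num_chunks(), which arrow::Table and arrow::ChunkedArray provide. TotalChunkCount, MockChunkedArray, MockTable and ExampleTotalChunkCount are hypothetical names introduced for illustration.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Sums num_chunks() over every column of a table-like object.
// Assumption: the type offers num_columns() and column(i)->num_chunks(),
// as arrow::Table does.
template <typename TableLike>
int64_t TotalChunkCount(const TableLike& table) {
  int64_t total = 0;
  for (int i = 0; i < table.num_columns(); ++i) {
    total += table.column(i)->num_chunks();
  }
  return total;
}

// Hypothetical stand-in for arrow::ChunkedArray (illustration only).
struct MockChunkedArray {
  int chunks;
  int num_chunks() const { return chunks; }
};

// Hypothetical stand-in for arrow::Table (illustration only).
struct MockTable {
  std::vector<std::shared_ptr<MockChunkedArray>> columns;
  int num_columns() const { return static_cast<int>(columns.size()); }
  std::shared_ptr<MockChunkedArray> column(int i) const { return columns[i]; }
};

// Builds a two-column mock table with 3 and 2 chunks and counts them.
int64_t ExampleTotalChunkCount() {
  MockTable table;
  table.columns.push_back(
      std::make_shared<MockChunkedArray>(MockChunkedArray{3}));
  table.columns.push_back(
      std::make_shared<MockChunkedArray>(MockChunkedArray{2}));
  return TotalChunkCount(table);
}
```

With Arrow available, the same template should accept a real table as TotalChunkCount(*table), where table is a std::shared_ptr<arrow::Table>. A very large result would be consistent with the guess that many small Buffer copies dominate the time, though a profile is still needed to confirm it.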