Hi Jayjeet, have you run a profiler to see where those 1000ms are
being spent? How many arrays (the sum of the number of chunks across
all columns) are there in total? I would guess that the problem is all
the little Buffer memcopies. I don't think that the C Data Interface
is going to help you.
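
To get that count, something along these lines may help. It is only a
minimal sketch: CountArrays is a hypothetical helper name, and it is
written as a template so that it assumes nothing beyond num_columns()
and column(i)->num_chunks(), which arrow::Table and
arrow::ChunkedArray provide.

```cpp
#include <cstdint>

// Counts the arrays in a table: the sum of num_chunks() over all columns.
// TableT is assumed to look like arrow::Table, i.e. to provide
// num_columns() and column(i), where column(i) returns a pointer-like
// object exposing num_chunks() (as arrow::ChunkedArray does).
template <typename TableT>
int64_t CountArrays(const TableT& table) {
  int64_t total = 0;
  for (int i = 0; i < table.num_columns(); ++i) {
    total += table.column(i)->num_chunks();
  }
  return total;
}
```

With a std::shared_ptr<arrow::Table> named table, this would be called
as CountArrays(*table).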

- Wes

On Thu, Jun 10, 2021 at 1:48 PM Jayjeet Chakraborty
<jayjeetchakrabort...@gmail.com> wrote:
>
> Hello Arrow Community,
>
> I am a student working on a project where I need to serialize an in-memory 
> Arrow Table of size around 700MB to a uint8_t* buffer. I am currently using 
> the arrow::ipc::RecordBatchStreamWriter API to serialize the table to a 
> arrow::Buffer, but it is taking nearly 1000ms to serialize the whole table, 
> and that is harming the performance of my performance-critical application. I 
> basically want to get hold of the underlying memory of the table as bytes and 
> send it over the network. How do you suggest I tackle this problem? I was 
> thinking of using the C Data interface for this, so that I convert my 
> arrow::Table to ArrowArray and ArrowSchema and serialize the structs to send 
> them over the network, but it seems like serializing structs is another
> complex problem on its own. It would be great to have some suggestions on this.
> Thanks a lot.
>
> Best,
> Jayjeet
>
