Hello,

I'm storing RecordBatch objects in a local cache to improve performance. I
want to keep track of the memory usage to stay within bounds. The arrays
stored in the batch are not nested.

The best way I have come up with to compute the size of a RecordBatch is:

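            size_t arrowSize = 0;
            for (auto i = 0; i < arrowBatch->num_columns(); ++i) {
                auto column = arrowBatch->column_data(i);
                // buffers[0]: validity bitmap (null when the column has no nulls)
                if (column->buffers[0])
                    arrowSize += column->buffers[0]->size();
                // buffers[1]: values (or offsets, for variable-length types)
                if (column->buffers[1])
                    arrowSize += column->buffers[1]->size();
            }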

Does this look reasonable? I guess we are overestimating a bit due to
buffer alignment, but that should be fine.
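
A more general variant I could imagine sums every buffer of a column instead of
only the first two, and recurses into child data in case nested types show up
later. The sketch below is only an illustration: FakeBuffer and FakeArrayData
are hypothetical stand-ins that mirror the shape of arrow::Buffer and
arrow::ArrayData so it compiles without Arrow, and TotalBufferBytes is a name I
made up.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-ins mirroring the shape of arrow::Buffer and
// arrow::ArrayData; they are NOT the real Arrow classes.
struct FakeBuffer {
    int64_t bytes;
    int64_t size() const { return bytes; }
};

struct FakeArrayData {
    std::vector<std::shared_ptr<FakeBuffer>> buffers;
    std::vector<std::shared_ptr<FakeArrayData>> child_data;
};

// Sums size() over every non-null buffer, then recurses into child_data.
// Works with any type exposing `buffers` and `child_data` in this shape.
template <typename ArrayDataT>
int64_t TotalBufferBytes(const ArrayDataT& data) {
    int64_t total = 0;
    for (const auto& buffer : data.buffers) {
        if (buffer) total += buffer->size();
    }
    for (const auto& child : data.child_data) {
        if (child) total += TotalBufferBytes(*child);
    }
    return total;
}
```

With the real library I would expect to call it as
TotalBufferBytes(*arrowBatch->column_data(i)) for each column, assuming
arrow::ArrayData keeps the same member layout. It does not count dictionaries,
and a buffer shared between several columns would be counted once per column.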

Thanks!
Rares
