Re: Create large IPC format record batch(es) in-place without copy or prior data analysis

2021-11-05 Thread Micah Kornfield
Hi John, > > Any thoughts on creating large IPC format record batch(es) in-place in a > single pre-allocated buffer, that could be used with mmap? This seems doable "by hand" today, and it seems like it would be valuable to potentially contribute. The idea was to allow > record batch lengths

Create large IPC format record batch(es) in-place without copy or prior data analysis

2021-10-20 Thread John Muehlhausen
Motivation: We have memory-mappable Arrow IPC files with N batches where column(s) are sorted to support binary search. Because log2(n) < log2(n/2)+log2(n/2) and binary search is required on each batch, we prefer the batches to be as large as possible to reduce total search time... perhaps larger
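
A rough illustration of the inequality in the motivation above, as a sketch and not anything taken from Arrow itself: it estimates worst-case binary-search probes when n sorted rows sit in one batch versus being split evenly across k batches that each have to be searched. The helper names (worst_case_probes, total_probes) are made up for this example, and the cost model (one probe per loop iteration of a bisect-style search) is an assumption.

```python
def worst_case_probes(n: int) -> int:
    """Worst-case loop iterations of a bisect-style search over n sorted items.

    Assumed cost model: each iteration at most halves the remaining range,
    so the worst case is floor(log2(n)) + 1, which equals n.bit_length().
    """
    return n.bit_length() if n > 0 else 0


def total_probes(n: int, k: int) -> int:
    """Total worst-case probes when n rows are split evenly into k batches
    and every batch must be searched (hypothetical helper for illustration)."""
    return k * worst_case_probes(n // k)


if __name__ == "__main__":
    n = 1 << 20  # about one million rows
    for k in (1, 2, 4, 8):
        print(f"{k} batch(es): {total_probes(n, k)} probes")
```

Under this model the single large batch needs the fewest probes, and the total grows roughly linearly with the number of batches, which is the reason given above for preferring batches that are as large as possible.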