Re: [Format] Passing selection masks with Arrow record batches

2019-01-28 Thread Francois Saint-Jacques
On Mon, Jan 28, 2019 at 12:53 AM Wes McKinney wrote:
> I was having a discussion recently about Arrow and the topic of
> server-side filtering vs. client-side filtering came up.
>
> The basic problem is this:
>
> If you have a RecordBatch that you wish to filter out some of the
> "rows", one way

Re: [Format] Passing selection masks with Arrow record batches

2019-01-27 Thread Paul Taylor
We’ve been doing this in a few different ways at Graphistry, mostly guided by use case and device characteristics. For temporary/in-memory/microservice CPU workloads, we’ll compute a set of valid row indices as one side of a DictionaryVector, with the original table/column as the dictionary sid

Re: [Format] Passing selection masks with Arrow record batches

2019-01-27 Thread Ravindra Pindikura
> On Jan 28, 2019, at 11:47 AM, Wes McKinney wrote:
>
> On Mon, Jan 28, 2019 at 12:05 AM Ravindra Pindikura wrote:
>>
>>> On Jan 28, 2019, at 11:22 AM, Wes McKinney wrote:
>>>
>>> I was having a discussion recently about Arrow and the topic of
>>> serve

Re: [Format] Passing selection masks with Arrow record batches

2019-01-27 Thread Wes McKinney
On Mon, Jan 28, 2019 at 12:05 AM Ravindra Pindikura wrote:
>
> > On Jan 28, 2019, at 11:22 AM, Wes McKinney wrote:
> >
> > I was having a discussion recently about Arrow and the topic of
> > server-side filtering vs. client-side filtering came up.
> >
> > The basic problem is this:
> >
> > If

Re: [Format] Passing selection masks with Arrow record batches

2019-01-27 Thread Ravindra Pindikura
> On Jan 28, 2019, at 11:22 AM, Wes McKinney wrote:
>
> I was having a discussion recently about Arrow and the topic of
> server-side filtering vs. client-side filtering came up.
>
> The basic problem is this:
>
> If you have a RecordBatch that you wish to filter out some of the
> "rows", on

[Format] Passing selection masks with Arrow record batches

2019-01-27 Thread Wes McKinney
I was having a discussion recently about Arrow and the topic of server-side filtering vs. client-side filtering came up. The basic problem is this: If you have a RecordBatch that you wish to filter out some of the "rows", one way to track this in-memory is to create a separate array of true/false
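
The selection-mask idea described in this message, together with the row-index variant mentioned in Paul Taylor's reply, can be sketched roughly as follows. This is a plain-Python illustration only, not the Arrow API: the batch is modeled as a dict of columns, and the helper names `mask_to_indices` and `take` are hypothetical and chosen for this example.

```python
from typing import Dict, List

# Hypothetical stand-in for a RecordBatch: column name -> list of values.
# This is NOT the Arrow data structure, only a model for illustration.
Batch = Dict[str, List]


def mask_to_indices(mask: List[bool]) -> List[int]:
    """Convert a true/false selection mask into the selected row indices."""
    return [i for i, keep in enumerate(mask) if keep]


def take(batch: Batch, indices: List[int]) -> Batch:
    """Materialize only the selected rows by copying them into a new batch."""
    return {name: [col[i] for i in indices] for name, col in batch.items()}


batch = {"id": [1, 2, 3, 4], "score": [0.5, 0.9, 0.1, 0.7]}

# Evaluate the filter predicate once; the original batch is left untouched.
mask = [s > 0.4 for s in batch["score"]]

# Alternative representation of the same selection: a list of row indices.
indices = mask_to_indices(mask)

# A consumer can keep (batch, mask) as-is, or materialize the filtered rows.
filtered = take(batch, indices)
```

Keeping the mask (or the indices) next to the unmodified batch defers the copy in `take` until a consumer actually needs contiguous filtered data.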