Re: [DISCUSS][C++] How about adding arrow::ArrayStatistics?

2024-07-11 Thread Sutou Kouhei
Hi,

I've updated the PoC of arrow::ArrayStatistics:
https://github.com/apache/arrow/pull/42133

Changes:

* Move arrow::ArrayStatistics from arrow::ArrayData to arrow::Array
  * Because arrow::ArrayData is mutable. If we change arrow::ArrayData, we should update or remove associated stat
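The reasoning in the message above (statistics attached to a mutable object can go stale, so they are kept on the read-only wrapper instead) can be illustrated with a minimal sketch. All types below (`Statistics`, `MutableData`, `ImmutableView`, `MakeView`) are hypothetical stand-ins invented for illustration. They are not the Arrow classes and not the code in the linked pull request.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are NOT the Arrow classes.
struct Statistics {
  std::optional<int64_t> null_count;
  std::optional<int64_t> min;
  std::optional<int64_t> max;
};

// Mutable container, playing the role the message gives to arrow::ArrayData.
// If statistics lived here, every write to `values` would have to update
// or drop them.
struct MutableData {
  std::vector<int64_t> values;
};

// Read-only view that owns its statistics, playing the role of arrow::Array.
// The data is only reachable as const, so the statistics cannot go stale
// through this object.
class ImmutableView {
 public:
  ImmutableView(std::shared_ptr<const MutableData> data, Statistics stats)
      : data_(std::move(data)), stats_(std::move(stats)) {}

  const Statistics& statistics() const { return stats_; }
  const std::vector<int64_t>& values() const { return data_->values; }

 private:
  std::shared_ptr<const MutableData> data_;
  Statistics stats_;
};

// Compute the statistics once, at construction time.
inline ImmutableView MakeView(std::vector<int64_t> values) {
  Statistics stats;
  stats.null_count = 0;  // this sketch has no notion of null values
  if (!values.empty()) {
    auto mm = std::minmax_element(values.begin(), values.end());
    stats.min = *mm.first;
    stats.max = *mm.second;
  }
  auto data = std::make_shared<MutableData>();
  data->values = std::move(values);
  return ImmutableView(std::move(data), stats);
}
```

Calling `MakeView({3, -1, 7})` yields a view whose `statistics().min` is -1 and `statistics().max` is 7. Nothing reachable through the view can modify the values afterwards.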

Re: [DISCUSS] Statistics through the C data interface

2024-07-11 Thread Sutou Kouhei
Hi,

>for non-standard statistics from open-source products the key=0
> combined with string label is the way to go

Where do we store the string label?

I think that we're considering the following schema:

>> map<
>>   // The column index or null if the statistics refer to whole table

Re: [DISCUSS] Statistics through the C data interface

2024-07-11 Thread Felipe Oliveira Carvalho
On Thu, Jul 11, 2024 at 5:04 AM Sutou Kouhei wrote:

> Hi,
>
> >for non-standard statistics from open-source products the key=0
> > combined with string label is the way to go
>
> Where do we store the string label?
>
> I think that we're considering the following schema:
>
> >> map<

Re: [DISCUSS] Statistics through the C data interface

2024-07-11 Thread Sutou Kouhei
Hi,

>> map,
>>   dense_union<...needed types based on stat kinds in the keys...>>
>>
>
> Yes. That's my suggestion. And to leverage the fact that libraries handle
> unions gracefully, this could be:
>
> map, dense_union<...needed types based on stat kinds
> in the keys...>>
>
> X is either s
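As a rough illustration of the shape being discussed (a map from statistic kinds to values drawn from a union of types, grouped by column index or null for the whole table), here is a minimal in-memory sketch in plain C++. It is only an analogy: `StatKind`, its members, `StatValue`, `TableStats`, and `Lookup` are hypothetical names, `std::variant` merely stands in for a dense union, and the key type actually proposed in the thread is elided in the previews above and is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <variant>

// Hypothetical statistic kinds; the thread's real key encoding is not shown.
enum class StatKind { kRowCount, kMin, kMax };

// Stand-in for a dense union: each value holds exactly one of these types.
using StatValue = std::variant<int64_t, double, std::string>;

// Statistics for one target, keyed by kind.
using StatMap = std::map<StatKind, StatValue>;

// Outer key: the column index, or std::nullopt when the statistics refer
// to the whole table.
using TableStats = std::map<std::optional<int32_t>, StatMap>;

// Return the requested statistic, or std::nullopt if it was not recorded.
inline std::optional<StatValue> Lookup(const TableStats& stats,
                                       std::optional<int32_t> column,
                                       StatKind kind) {
  auto col = stats.find(column);
  if (col == stats.end()) return std::nullopt;
  auto it = col->second.find(kind);
  if (it == col->second.end()) return std::nullopt;
  return it->second;
}
```

A consumer would check which alternative a value holds (for example with `std::holds_alternative<double>`) before reading it, much as a reader of a union column would dispatch on the type code.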