Re: Adding a Map logical type to the Arrow metadata

2017-07-20 Thread Wes McKinney
I created a PR to assist with further discussion: https://github.com/apache/arrow/pull/876 On Wed, Jul 19, 2017 at 4:14 PM, Julian Hyde wrote: > I see. It took me a while to understand, but it all made sense when I > realized that we are not looking at one Map instance but multiple > rows, each w

Re: Adding a Map logical type to the Arrow metadata

2017-07-19 Thread Julian Hyde
I see. It took me a while to understand, but it all made sense when I realized that we are not looking at one Map instance but multiple rows, each with a Map instance, and the constituent parts of those Maps are stored end-to-end. Julian On Wed, Jul 19, 2017 at 11:42 AM, Wes McKinney wrote: >

Re: Adding a Map logical type to the Arrow metadata

2017-07-19 Thread Wes McKinney
The only structural difference between List<Struct<K, V>> and Struct<List<K>, List<V>> is that in the latter case, the "key" value and the "value" value have different offset vectors and thus can have different lengths. So in the first case we have buffer structure: - list null bitmap (map value is null / not null)
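To make the comparison concrete, here is a minimal sketch in plain Python of the two layouts, written here as List<Struct<K, V>> and Struct<List<K>, List<V>> (K and V are placeholder type names). It does not use the Arrow libraries; the function names and the use of Python lists in place of buffers are assumptions made only for illustration. The point it shows is that the first layout carries a single offsets vector shared by keys and values, while the second carries one per child list, so nothing forces the two to agree.

```python
def encode_list_of_struct(rows):
    """First layout: one validity bitmap and one offsets vector shared by keys and values."""
    validity, offsets, keys, values = [], [0], [], []
    for row in rows:
        validity.append(row is not None)  # map value is null / not null
        for k, v in (row or {}).items():
            keys.append(k)
            values.append(v)
        offsets.append(len(keys))  # end of this row's entries
    return validity, offsets, keys, values


def encode_struct_of_lists(rows):
    """Second layout: keys and values are separate lists, each with its own offsets."""
    validity, key_offsets, value_offsets, keys, values = [], [0], [0], [], []
    for row in rows:
        validity.append(row is not None)
        for k, v in (row or {}).items():
            keys.append(k)
            values.append(v)
        key_offsets.append(len(keys))      # offsets for the key list
        value_offsets.append(len(values))  # independent offsets for the value list
    return validity, key_offsets, value_offsets, keys, values


def read_row(validity, offsets, keys, values, i):
    """Rebuild the map stored in row i of the first layout."""
    if not validity[i]:
        return None
    start, end = offsets[i], offsets[i + 1]
    return dict(zip(keys[start:end], values[start:end]))


# Three rows, each holding one Map instance (the middle one is null).
rows = [{"a": 1, "b": 2}, None, {"c": 3}]
validity, offsets, keys, values = encode_list_of_struct(rows)
```

The keys and values of all rows end up stored end-to-end, and the offsets mark where each row's map begins and ends. In this sketch the second encoder always produces identical key and value offsets, but the layout itself does not guarantee that, which is the structural difference described above.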

Re: Adding a Map logical type to the Arrow metadata

2017-07-19 Thread Julian Hyde
List<Struct<K, V>> isn’t the only physical representation that makes sense, because it doesn’t take advantage of the fact that (a) keys can be re-ordered and (b) keys are unique. So, another viable physical representation would be Struct<List<K>, List<V>>, with the keys sorted. If keys are constant width and in contiguou
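One thing sorted keys make possible is binary search within a row. The following is a rough sketch in plain Python, assuming a keys list and a values list that each have their own offsets and that keys are sorted within every row; the `lookup` helper and the sample data are hypothetical and are not taken from the thread or from the Arrow libraries.

```python
from bisect import bisect_left


def lookup(keys, key_offsets, values, value_offsets, row, key):
    """Find `key` in one row's sorted key range; return its value or None."""
    lo, hi = key_offsets[row], key_offsets[row + 1]
    pos = bisect_left(keys, key, lo, hi)  # searches only keys[lo:hi]
    if pos == hi or keys[pos] != key:
        return None  # key absent (this sketch does not distinguish null values)
    # The value sits at the same position within the row's value range.
    return values[value_offsets[row] + (pos - lo)]


# Two rows: {"a": 1, "c": 3} and {"b": 2, "d": 4, "e": 5}, keys sorted per row.
keys = ["a", "c", "b", "d", "e"]
key_offsets = [0, 2, 5]
values = [1, 3, 2, 4, 5]
value_offsets = [0, 2, 5]

found = lookup(keys, key_offsets, values, value_offsets, 1, "d")
missing = lookup(keys, key_offsets, values, value_offsets, 0, "b")
```

The search cost per row is logarithmic in the number of entries in that row, which relies on the keys being both sorted and unique.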