I created a PR to assist with further discussion:
https://github.com/apache/arrow/pull/876
On Wed, Jul 19, 2017 at 4:14 PM, Julian Hyde wrote:
I see. It took me a while to understand, but it all made sense when I
realized that we are not looking at one Map instance but multiple
rows, each with a Map instance, and the constituent parts of those
Maps are stored end-to-end.
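
As a rough illustration of the end-to-end storage described above, here is a minimal Python sketch. The example rows and the names offsets, keys, values and map_at are assumptions made up for illustration; this is plain Python, not Arrow's API, and not necessarily the layout the PR settles on.

```python
# Three rows, each holding its own map (hypothetical example data).
rows = [{"a": 1, "b": 2}, {}, {"c": 3}]

# Flatten all rows into shared child arrays; offsets[i]..offsets[i+1]
# marks the slice of keys/values that belongs to row i.
offsets = [0]
keys = []
values = []
for row in rows:
    for k, v in row.items():
        keys.append(k)
        values.append(v)
    offsets.append(len(keys))


def map_at(i):
    """Rebuild the map stored in row i from the flattened arrays."""
    start, end = offsets[i], offsets[i + 1]
    return dict(zip(keys[start:end], values[start:end]))
```

Here the keys of all rows sit in one array and the values in another; only the offsets say where one row's map ends and the next begins.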
Julian
On Wed, Jul 19, 2017 at 11:42 AM, Wes McKinney wrote:
The only structural difference between
List>
and
Struct, List>
is that in the latter case, the "key" value and the "value" value have
different offset vectors and thus can have different lengths.
So in the first case we have the buffer structure:
- list null bitmap (map value is null / not null)
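
To make the offset-vector difference concrete, here is a small Python sketch. It assumes the first type is a list of key/value structs and the second is a struct holding a key list and a value list; the type parameters are missing from the text above, so that reading is an assumption, as are the field names and data below.

```python
# Layout 1 (assumed: list of key/value structs).
# One offsets vector is shared, so each row has as many keys as values.
layout1 = {
    "offsets": [0, 2, 3],
    "keys": ["a", "b", "c"],
    "values": [1, 2, 3],
}

# Layout 2 (assumed: struct of a key list and a value list).
# Keys and values carry separate offsets, so nothing in the structure
# forces the per-row lengths to agree.
layout2 = {
    "key_offsets": [0, 2, 3],
    "keys": ["a", "b", "c"],
    "value_offsets": [0, 1, 3],
    "values": [1, 2, 3],
}


def lengths(offsets):
    """Per-row lengths implied by an offsets vector."""
    return [end - start for start, end in zip(offsets, offsets[1:])]


key_lengths = lengths(layout2["key_offsets"])
value_lengths = lengths(layout2["value_offsets"])
```

In this made-up data the second layout has two keys but only one value in row 0, which the first layout cannot express at all.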
List> isn’t the only physical representation that makes sense,
because it doesn’t take advantage of the facts that (a) keys can be re-ordered
and (b) keys are unique.
So, another viable physical representation would be Struct, List>,
with the keys sorted. If keys are constant width and in contiguou