Thanks for the quick response. I'll update this thread if there is progress.
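
For reference, here is a minimal Java sketch of the layout discussed in
the quoted messages below: flattened keys and values plus an offsets
array, with the keys sorted inside each map so that a lookup can use
binary search instead of a linear scan. The class name MapColumnSketch,
its methods, and the String/long element types are assumptions made up
for illustration; this is not the Arrow or Drill vector API.

```java
import java.util.Arrays;

// Hypothetical sketch, not an Arrow or Drill class.
final class MapColumnSketch {
    // Row i owns the entries in [offsets[i], offsets[i + 1]) of keys/values.
    private final int[] offsets;
    private final String[] keys;   // assumed sorted within each row
    private final long[] values;

    MapColumnSketch(int[] offsets, String[] keys, long[] values) {
        if (offsets.length == 0 || keys.length != values.length
                || offsets[offsets.length - 1] != keys.length) {
            throw new IllegalArgumentException("inconsistent offsets/keys/values");
        }
        this.offsets = offsets;
        this.keys = keys;
        this.values = values;
    }

    int rowCount() {
        return offsets.length - 1;
    }

    // Binary search over one row's key range; returns null if the key is absent.
    Long get(int row, String key) {
        int idx = Arrays.binarySearch(keys, offsets[row], offsets[row + 1], key);
        if (idx < 0) {
            return null;
        }
        return values[idx];
    }

    // Linear scan, usable when the keys of a row are not sorted.
    Long getLinear(int row, String key) {
        for (int i = offsets[row]; i < offsets[row + 1]; i++) {
            if (keys[i].equals(key)) {
                return values[i];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Three rows: {a=1, b=2}, {}, {k=7, x=8, z=9}
        MapColumnSketch col = new MapColumnSketch(
                new int[] {0, 2, 2, 5},
                new String[] {"a", "b", "k", "x", "z"},
                new long[] {1, 2, 7, 8, 9});
        System.out.println(col.get(2, "x"));   // prints 8
        System.out.println(col.get(1, "a"));   // prints null (row 1 is empty)
    }
}
```

The binary search only works if each row's keys are sorted, so a real
implementation would need to record that property in the metadata and
fall back to something like getLinear otherwise.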

On Mon, Feb 25, 2019 at 6:01 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> hi Igor,
>
> We have Map as a top-level logical data type in the columnar metadata:
>
> https://github.com/apache/arrow/blob/master/format/Schema.fbs#L55
>
> There isn't anything more than this right now. We have not yet
> implemented container types for the Map type in Java or C++, but I
> don't expect it to be an extremely large project, because Map is an
> alias for a list of structs. If you'd like to contribute this to the
> Java library, it would be appreciated.
>
> For fast key retrieval, the keys should be sorted (and this property
> would be set in the metadata) so that a binary search can be used
> instead of linear search.
>
> Thanks,
> Wes
>
> On Mon, Feb 25, 2019 at 8:39 AM Ihor Huzenko <ihor.huzenko....@gmail.com> wrote:
> >
> > Hello Arrow Team,
> >
> > My name is Igor Guzenko. I'm currently working on a task related to
> > complex types in Apache Drill [1], and ran into the issue that Drill
> > has no appropriate vector for representing the canonical (Java-like)
> > Map data type [2]. So I'm looking for inspiration on how an efficient
> > columnar map vector can be implemented. I believe that such a map can
> > be composed of three value vectors (as in Hive):
> >   1) a keys vector;
> >   2) a values vector;
> >   3) an offsets vector that points to the start index of each map in
> > the two previous vectors.
> > But there is a major issue with such an implementation: it's hard to
> > quickly retrieve values by key, and some advanced tricks are required
> > to do this efficiently.
> >
> > I would be happy if you could share your expertise on this topic.
> > After reading through the history of changes in Arrow, I found
> > that the old map vector was renamed to struct and the map data type
> > was declared as a list of structs, each of them containing vectors
> > for keys and values.
> > I'm still very interested in how Maps work internally in Arrow, and
> > I'd like to implement a similar one in Drill (so that future
> > integration with Arrow goes more smoothly). Also, if you need a new
> > vector for Map too, I would be happy to contribute it to both the
> > Drill and Arrow projects.
> >
> > [1] : https://issues.apache.org/jira/browse/DRILL-3290
> > [2] : https://github.com/paul-rogers/drill/wiki/Drill-Maps
> >
> > Thanks for your attention,
> > Igor Guzenko
