Thanks Bryan. Let's have a discussion about this and the other format-1.0 issues after 0.10.0 goes out.
On Wed, Jul 11, 2018 at 12:01 PM, Bryan Cutler <cutl...@gmail.com> wrote:
> Thanks Wes, sure I will add a section in the wiki.
>
> On Tue, Jul 10, 2018 at 3:07 PM, Wes McKinney <wesmck...@gmail.com> wrote:
>
>> hi Bryan,
>>
>> Thanks for bringing this up again. I will reply in some more detail,
>> but to help could you create a major section in
>>
>> https://cwiki.apache.org/confluence/display/ARROW/Columnar+Format+1.0+Milestone
>>
>> and include these details? We are falling significantly short of
>> hardening a v1.0 iteration of the columnar format, and having a single
>> document listing out all the work that needs to be done (including the
>> Map type) is a good way to help herd the cats.
>>
>> Thanks
>> Wes
>>
>> On Tue, Jul 10, 2018 at 5:30 PM, Bryan Cutler <cutl...@gmail.com> wrote:
>> > Hello All,
>> >
>> > I would like to start moving forward with Map type support and begin
>> > working on implementations. I believe we just need to define the specifics
>> > of the metadata representation before getting started. Previously, there
>> > was a thread [1] that discussed adding Map as a logical type and I'll try
>> > to summarize where we are currently.
>> >
>> > Map has been added as a logical type and defined in the Flatbuffer schema
>> > format with 1 field "keysSorted" which indicates if the child keys vector
>> > has been presorted. A Map is a nested type that is represented as
>> > List<entry: Struct<key: K, value: V>>.
>> >
>> > I think these are the 2 main issues of the metadata that need to be agreed
>> > upon:
>> >
>> > - Same memory layout as List<entry: Struct<key: K, value: V>>. This is so
>> > implementations lacking Map can alias as repeated struct values.
>> >
>> > - `Struct` and `K` fields are constrained to be non-nullable, other fields
>> > can be nullable
>> >
>> >
>> > Here is a sample JSON metadata representation:
>> >
>> > {
>> >   "name" : "MapName",
>> >   "nullable" : true|false,
>> >   "type" : {
>> >     "name" : "map",
>> >     "keysSorted" : true|false
>> >   },
>> >   "children" : [{
>> >     "name" : "entry",
>> >     "nullable" : false,
>> >     "type" : {
>> >       "name" : "struct"
>> >     },
>> >     "children" : [{
>> >       "name" : "key",
>> >       "nullable" : false,
>> >       "type" : {
>> >         "name" : K
>> >       },
>> >       "children" : []
>> >     },{
>> >       "name" : "value",
>> >       "nullable" : true|false,
>> >       "type" : {
>> >         "name" : V
>> >       },
>> >       "children" : []
>> >     }]
>> >   }]
>> > }
>> >
>> >
>> > Any concerns or objections to the above? Hopefully that covers what needs
>> > to be discussed, please correct me if I missed something. Thanks!
>> >
>> > Bryan
>> >
>> >
>> > [1]:
>> > https://lists.apache.org/thread.html/d61f21924159718fb31d27f5c85d58d393a88708f76dff510c8da322@%3Cdev.arrow.apache.org%3E
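
For reference while this is discussed, here is a rough Python sketch of the metadata proposed in the quoted message. It only builds the JSON shape as a plain dict and checks the two proposed constraints (the entry struct and the key are non-nullable). The helper names map_field and check_map_field are made up for illustration and are not part of any Arrow implementation, and "utf8" / "bool" are just placeholder choices for K and V.

```python
import json


def map_field(name, key_type, value_type, nullable=True,
              keys_sorted=False, value_nullable=True):
    """Build the proposed Map metadata as a dict (hypothetical helper)."""
    return {
        "name": name,
        "nullable": nullable,
        "type": {"name": "map", "keysSorted": keys_sorted},
        "children": [{
            "name": "entry",
            "nullable": False,  # proposal: the entry struct is never null
            "type": {"name": "struct"},
            "children": [
                {"name": "key", "nullable": False,  # proposal: keys are never null
                 "type": {"name": key_type}, "children": []},
                {"name": "value", "nullable": value_nullable,
                 "type": {"name": value_type}, "children": []},
            ],
        }],
    }


def check_map_field(field):
    """Return a list of violations of the proposed constraints (hypothetical helper)."""
    if field["type"]["name"] != "map":
        return ["type is not map"]
    children = field.get("children", [])
    if len(children) != 1 or children[0]["type"]["name"] != "struct":
        return ["map must have exactly one struct child"]
    entry = children[0]
    errors = []
    if entry["nullable"]:
        errors.append("entry struct must be non-nullable")
    if len(entry["children"]) != 2:
        errors.append("entry must have exactly two children")
        return errors
    if entry["children"][0]["nullable"]:
        errors.append("key must be non-nullable")
    return errors


# Placeholder key/value types, chosen only for the example.
field = map_field("MapName", "utf8", "bool", keys_sorted=False)
print(json.dumps(field, indent=2))
```

Because the children are laid out exactly as a list of structs, a reader without Map support could, under this proposal, ignore the "map" type name and treat the single child as the element of a List.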