Hey Li,

In Ray we need the second type of union, since there can be arbitrary nesting.
-- Philipp.

On Thu, Jan 25, 2018 at 8:56 AM, Li Jin <ice.xell...@gmail.com> wrote:

> Hi All,
>
> I'd like to bump this thread to get some more feedbacks from other people.
> I think what Wes says makes sense, there seems to be two requirement for
> union types and it might make sense to make them different types.
>
> I think Dremio has more use case for the first type of union. I think Ray
> also has use case for union but I am not sure if it's closer to the first
> or the second. How do people feel about spec out details for the first
> union type?
>
> On Thu, Jan 11, 2018 at 2:39 PM, Wes McKinney <wesmck...@gmail.com> wrote:
>
> > hi all,
> >
> > So one of the conflicts that keeps coming up re: unions is the
> > following two notions:
> >
> > * A union as a "variant of primitives" type. Here, values are
> > constrained to be one of Arrow's primitive types (integer, floating
> > point, string, boolean, etc.). The value types are statically declared
> > and thus the union type codes have a fixed interpretation (e.g. 0 is
> > always boolean, 1 always int8, etc. and so on).
> >
> > * A union as a composition of any child types (including nested
> > types). In this model, a union internally is like a struct plus type
> > codes, which refer to a collection of any fields, which may include
> > other nested types
> >
> > IMHO, these are two different and totally valid things to support. The
> > former can be viewed as a special case of the latter, but there are
> > benefits to computation engines to rely on the assumptions of the
> > former (like the type codes having a static interpretation rather than
> > a dynamic one).
> >
> > Not having the latter union type seems troublesome to me. For example,
> > other data serialization systems support this
> >
> > * oneof in Protocol Buffers
> > https://developers.google.com/protocol-buffers/docs/proto#oneof
> > * union in Flatbuffers
> > https://google.github.io/flatbuffers/md__schemas.html
> > * union in Thrift (not documented very well unfortunately)
> > * union in Avro (I think this is the same)
> >
> > Thanks
> > Wes
> >
> > On Thu, Jan 11, 2018 at 11:16 AM, Li Jin <ice.xell...@gmail.com> wrote:
> > > Hi All,
> > >
> > > Here is a summary of the state and issue of union vector (to the best of my
> > > knowledge).
> > >
> > > I have summarized some possible solutions based on the discussion so far.
> > > However, this is not a proposal as there are still a lot of things that are
> > > not clear at this moment.
> > >
> > > I'd like to share this as a base for further discussion and move towards a
> > > proposal. Thank you.
> > >
> > > https://docs.google.com/document/d/1zSwSZDVxgmoDol_PKfyTDHD5wbw1eALs5eTS9kyjtYU/edit?usp=sharing
> > >
> > > Li
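
To make the two notions of union in Wes's message concrete, here is a minimal Python sketch. It is not Arrow's memory layout or API: the names DenseUnion, PRIMITIVE_CODES and primitive_union are hypothetical, plain Python lists stand in for Arrow arrays, and the code table is an arbitrary example rather than anything the thread agreed on. The first union fixes what each type code means for every array; the second declares its children per union, so the codes only make sense next to that declaration and the children can be nested.

```python
from dataclasses import dataclass
from typing import Any, List

# Assumption for illustration: a fixed, globally shared code table for the
# "variant of primitives" union. Every array interprets code 0 as bool, etc.
PRIMITIVE_CODES = {0: bool, 1: int, 2: float, 3: str}


@dataclass
class DenseUnion:
    """Toy dense union: one type code and one offset per slot."""
    type_codes: List[int]      # which child each slot refers to
    offsets: List[int]         # index into that child's values
    children: List[List[Any]]  # child value lists; values may be nested

    def get(self, i: int) -> Any:
        return self.children[self.type_codes[i]][self.offsets[i]]

    def to_list(self) -> List[Any]:
        return [self.get(i) for i in range(len(self.type_codes))]


def primitive_union(values: List[Any]) -> DenseUnion:
    """Build a union whose codes follow the static PRIMITIVE_CODES table."""
    code_of = {t: c for c, t in PRIMITIVE_CODES.items()}
    children: List[List[Any]] = [[] for _ in PRIMITIVE_CODES]
    codes, offsets = [], []
    for v in values:
        code = code_of[type(v)]  # raises KeyError for non-primitive values
        codes.append(code)
        offsets.append(len(children[code]))
        children[code].append(v)
    return DenseUnion(codes, offsets, children)


# First notion: codes have a static meaning, so an engine can dispatch on
# the code alone without looking at any per-array type declaration.
prim = primitive_union([True, 7, "a", 2.5, 8])

# Second notion: children are declared per union, so code 0 means whatever
# this particular union says it means. Here the children are nested.
nested = DenseUnion(
    type_codes=[0, 1, 0],
    offsets=[0, 0, 1],
    children=[
        [[1, 2], [3]],                # child 0: list of ints
        [{"x": 1.5, "tags": ["a"]}],  # child 1: struct holding a nested list
    ],
)
```

In this sketch the first union is just the second with the child list pinned in advance, which matches the remark that the former is a special case of the latter. The nested case is the one Ray needs, because the static table has no slot for a list or struct.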