I don't see any other reasonable interpretation. (One could use this as an argument to only support one field at a time, to make the potential explosion in data size all the more obvious.)
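
To make the cross-product reading concrete, a worked example on a
hypothetical row (unnesting two fields is the same as stacking two
single-field Unnests):

  Input row: {id: "r1", a: [1, 2], b: ["x", "y"]}
  Unnest(a, b) -- equivalently Unnest(a) then Unnest(b) -- yields:
    {id: "r1", a: 1, b: "x"}
    {id: "r1", a: 1, b: "y"}
    {id: "r1", a: 2, b: "x"}
    {id: "r1", a: 2, b: "y"}
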
On Thu, Jan 14, 2021 at 11:30 AM Reuven Lax <re...@google.com> wrote:

> And the result is essentially a cross product of all the different
> array elements?
>
> On Thu, Jan 14, 2021 at 11:25 AM Robert Bradshaw <rober...@google.com>
> wrote:
>
>> I think it makes sense to allow specifying more than one, if desired.
>> This is equivalent to just stacking multiple Unnests. (Possibly one
>> could even have a special syntax like "*" for all array fields.)
>>
>> On Thu, Jan 14, 2021 at 10:05 AM Reuven Lax <re...@google.com> wrote:
>>
>>> Should Unnest be allowed to specify multiple array fields, or just
>>> one?
>>>
>>> On Wed, Jan 13, 2021 at 11:59 PM Manninger, Matyas <
>>> matyas.mannin...@veolia.com> wrote:
>>>
>>>> I would also not unnest arrays nested in arrays, just the top-level
>>>> array of the specified fields.
>>>>
>>>> On Wed, 13 Jan 2021 at 20:58, Reuven Lax <re...@google.com> wrote:
>>>>
>>>>> Nested fields are not part of standard SQL AFAIK. Beam goes
>>>>> further and supports arrays of arrays, etc.
>>>>>
>>>>> On Wed, Jan 13, 2021 at 11:42 AM Kenneth Knowles <k...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Just the fields specified, IMO. When in doubt, copy SQL. (And I
>>>>>> mean SQL generally, not just Beam SQL.)
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Wed, Jan 13, 2021 at 11:17 AM Reuven Lax <re...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Definitely could be a top-level transform. Should it
>>>>>>> automatically unnest all arrays, or just the fields specified?
>>>>>>>
>>>>>>> We do have to define the semantics for nested arrays as well.
>>>>>>>
>>>>>>> On Wed, Jan 13, 2021 at 10:57 AM Robert Bradshaw <
>>>>>>> rober...@google.com> wrote:
>>>>>>>
>>>>>>>> Ah, thanks for the clarification. UNNEST does sound like what
>>>>>>>> you want here, and would likely make sense as a top-level
>>>>>>>> relational transform as well as being supported by SQL.
>>>>>>>>
>>>>>>>> On Wed, Jan 13, 2021 at 10:53 AM Tao Li <t...@zillow.com> wrote:
>>>>>>>>
>>>>>>>>> @Kyle Weaver <kcwea...@google.com> sure thing! So the
>>>>>>>>> input/output definition for Flatten.Iterables
>>>>>>>>> <https://beam.apache.org/releases/javadoc/2.25.0/org/apache/beam/sdk/transforms/Flatten.Iterables.html>
>>>>>>>>> is:
>>>>>>>>>
>>>>>>>>> Input: PCollection<Iterable<T>>
>>>>>>>>> Output: PCollection<T>
>>>>>>>>>
>>>>>>>>> The input/output for an explode transform would look like this:
>>>>>>>>>
>>>>>>>>> Input: PCollection<Row>, where the row schema has a field that
>>>>>>>>> is an array of T.
>>>>>>>>> Output: PCollection<Row>, where the array-type field from the
>>>>>>>>> input schema is replaced with a new field of type T. The
>>>>>>>>> elements of the array field are flattened into multiple rows
>>>>>>>>> in the new table (the other fields of the input table are just
>>>>>>>>> duplicated).
>>>>>>>>>
>>>>>>>>> Hope this clarification helps!
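
A minimal Beam Java sketch of the explode signature described above,
built on FlatMapElements (as Tao suggests further down the thread). The
schemas and field names ("id", "tags") are hypothetical, and this is a
sketch of the idea, not a built-in transform:

import java.util.ArrayList;
import java.util.List;
import org.apache.beam.sdk.schemas.Schema;
import org.apache.beam.sdk.schemas.Schema.FieldType;
import org.apache.beam.sdk.transforms.FlatMapElements;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.Row;
import org.apache.beam.sdk.values.TypeDescriptor;

// Assumed input schema: {id: STRING, tags: ARRAY<STRING>}.
Schema inputSchema =
    Schema.builder()
        .addStringField("id")
        .addArrayField("tags", FieldType.STRING)
        .build();

// Output schema: the array field is replaced by an element-typed field.
Schema outputSchema =
    Schema.builder().addStringField("id").addStringField("tag").build();

// `input` is an assumed PCollection<Row> with the schema above.
PCollection<Row> exploded =
    input
        .apply(
            FlatMapElements.into(TypeDescriptor.of(Row.class))
                .via(
                    (Row row) -> {
                      // Emit one output row per array element; the other
                      // fields (here just "id") are duplicated onto each.
                      List<Row> out = new ArrayList<>();
                      for (String tag : row.<String>getArray("tags")) {
                        out.add(
                            Row.withSchema(outputSchema)
                                .addValues(row.getString("id"), tag)
                                .build());
                      }
                      return out;
                    }))
        .setRowSchema(outputSchema);
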
>>>>>>>>> *From: *Kyle Weaver <kcwea...@google.com>
>>>>>>>>> *Reply-To: *"user@beam.apache.org" <user@beam.apache.org>
>>>>>>>>> *Date: *Tuesday, January 12, 2021 at 4:58 PM
>>>>>>>>> *To: *"user@beam.apache.org" <user@beam.apache.org>
>>>>>>>>> *Cc: *Reuven Lax <re...@google.com>
>>>>>>>>> *Subject: *Re: Is there an array explode function/transform?
>>>>>>>>>
>>>>>>>>> How is it different? It'd help if you could provide the
>>>>>>>>> signature (input and output PCollection types) of the
>>>>>>>>> transform you have in mind.
>>>>>>>>>
>>>>>>>>> On Tue, Jan 12, 2021 at 4:49 PM Tao Li <t...@zillow.com> wrote:
>>>>>>>>>
>>>>>>>>> @Reuven Lax <re...@google.com> yes I am aware of that
>>>>>>>>> transform, but that's different from the explode operation I
>>>>>>>>> was referring to:
>>>>>>>>> https://spark.apache.org/docs/latest/api/sql/index.html#explode
>>>>>>>>>
>>>>>>>>> *From: *Reuven Lax <re...@google.com>
>>>>>>>>> *Reply-To: *"user@beam.apache.org" <user@beam.apache.org>
>>>>>>>>> *Date: *Tuesday, January 12, 2021 at 2:04 PM
>>>>>>>>> *To: *user <user@beam.apache.org>
>>>>>>>>> *Subject: *Re: Is there an array explode function/transform?
>>>>>>>>>
>>>>>>>>> Have you tried Flatten.iterables?
>>>>>>>>>
>>>>>>>>> On Tue, Jan 12, 2021, 2:02 PM Tao Li <t...@zillow.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi community,
>>>>>>>>>
>>>>>>>>> Is there a Beam function to explode an array (similar to Spark
>>>>>>>>> SQL's explode())? I did some research but did not find
>>>>>>>>> anything.
>>>>>>>>>
>>>>>>>>> BTW I think we can potentially use FlatMap to implement the
>>>>>>>>> explode functionality, but a Beam-provided function would be
>>>>>>>>> very handy.
>>>>>>>>>
>>>>>>>>> Thanks a lot!
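
For contrast, a minimal sketch of the Flatten.iterables transform
suggested above. It only covers the PCollection<Iterable<T>> ->
PCollection<T> shape, with no sibling row fields to duplicate onto each
output element; `nested` is an assumed PCollection<Iterable<String>>,
e.g. derived from the values of a GroupByKey result:

import org.apache.beam.sdk.transforms.Flatten;
import org.apache.beam.sdk.values.PCollection;

// Flattens each Iterable's elements into the output PCollection.
PCollection<String> flat = nested.apply(Flatten.iterables());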