I don't see any other reasonable interpretation. (One could use this as an
argument to only support one field at a time, to make the potential
explosion in data size all the more obvious.)
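
To make the cross-product reading concrete, here is a minimal sketch in plain Python. It is not Beam's API: the `unnest` helper, the dict-based rows, and the sample data are all hypothetical and only illustrate the semantics discussed in this thread. It assumes that only the top-level array of each named field is unnested, and that a row with an empty array produces no output rows.

```python
from itertools import product


def unnest(rows, fields):
    """Hypothetical helper: yield one row per combination of elements
    drawn from the named array fields (a cross product per input row)."""
    for row in rows:
        arrays = [row[f] for f in fields]
        # product() enumerates every combination of one element per array;
        # an empty array yields no combinations, so that row is dropped.
        for combo in product(*arrays):
            out = dict(row)                 # other fields are duplicated
            out.update(zip(fields, combo))  # array fields become scalars
            yield out


rows = [{"id": 1, "a": [1, 2], "b": ["x", "y", "z"]}]

# Unnesting two fields at once gives 2 * 3 = 6 rows for this input.
result = list(unnest(rows, ["a", "b"]))

# Stacking two single-field unnests gives the same rows.
stacked = list(unnest(unnest(rows, ["a"]), ["b"]))
```

In this sketch the output size per input row is the product of the array lengths, which is the data-size explosion mentioned above.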

On Thu, Jan 14, 2021 at 11:30 AM Reuven Lax <re...@google.com> wrote:

> And the result is essentially a cross product of all the different array
> elements?
>
> On Thu, Jan 14, 2021 at 11:25 AM Robert Bradshaw <rober...@google.com>
> wrote:
>
>> I think it makes sense to allow specifying more than one, if desired.
>> This is equivalent to just stacking multiple Unnests. (Possibly one could
>> even have a special syntax like "*" for all array fields.)
>>
>> On Thu, Jan 14, 2021 at 10:05 AM Reuven Lax <re...@google.com> wrote:
>>
>>> Should Unnest be allowed to specify multiple array fields, or just one?
>>>
>>> On Wed, Jan 13, 2021 at 11:59 PM Manninger, Matyas <
>>> matyas.mannin...@veolia.com> wrote:
>>>
>>>> I would also not unnest arrays nested in arrays, just the top-level
>>>> array of the specified fields.
>>>>
>>>> On Wed, 13 Jan 2021 at 20:58, Reuven Lax <re...@google.com> wrote:
>>>>
>>>>> Nested fields are not part of standard SQL AFAIK. Beam goes further
>>>>> and supports arrays of arrays, etc.
>>>>>
>>>>> On Wed, Jan 13, 2021 at 11:42 AM Kenneth Knowles <k...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Just the fields specified, IMO. When in doubt, copy SQL. (and I mean
>>>>>> SQL generally, not just Beam SQL)
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Wed, Jan 13, 2021 at 11:17 AM Reuven Lax <re...@google.com> wrote:
>>>>>>
>>>>>>> Definitely could be a top-level transform. Should it automatically
>>>>>>> unnest all arrays, or just the fields specified?
>>>>>>>
>>>>>>> We do have to define the semantics for nested arrays as well.
>>>>>>>
>>>>>>> On Wed, Jan 13, 2021 at 10:57 AM Robert Bradshaw <
>>>>>>> rober...@google.com> wrote:
>>>>>>>
>>>>>>>> Ah, thanks for the clarification. UNNEST does sound like what you
>>>>>>>> want here, and would likely make sense as a top-level relational
>>>>>>>> transform as well as being supported by SQL.
>>>>>>>>
>>>>>>>> On Wed, Jan 13, 2021 at 10:53 AM Tao Li <t...@zillow.com> wrote:
>>>>>>>>
>>>>>>>>> @Kyle Weaver <kcwea...@google.com> sure thing! So the
>>>>>>>>> input/output definition for the Flatten.Iterables
>>>>>>>>> <https://beam.apache.org/releases/javadoc/2.25.0/org/apache/beam/sdk/transforms/Flatten.Iterables.html>
>>>>>>>>> is:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Input: PCollection<Iterable<T>>
>>>>>>>>>
>>>>>>>>> Output: PCollection<T>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The input/output for an explode transform would look like this:
>>>>>>>>>
>>>>>>>>> Input:  PCollection<Row> The row schema has a field which is an
>>>>>>>>> array of T
>>>>>>>>>
>>>>>>>>> Output: PCollection<Row> The array type field from input schema is
>>>>>>>>> replaced with a new field of type T. The elements from the array type
>>>>>>>>> field are flattened into multiple rows in the new table (other fields
>>>>>>>>> of the input table are just duplicated).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hope this clarification helps!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *From: *Kyle Weaver <kcwea...@google.com>
>>>>>>>>> *Reply-To: *"user@beam.apache.org" <user@beam.apache.org>
>>>>>>>>> *Date: *Tuesday, January 12, 2021 at 4:58 PM
>>>>>>>>> *To: *"user@beam.apache.org" <user@beam.apache.org>
>>>>>>>>> *Cc: *Reuven Lax <re...@google.com>
>>>>>>>>> *Subject: *Re: Is there an array explode function/transform?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> @Reuven Lax <re...@google.com> yes I am aware of that transform,
>>>>>>>>> but that’s different from the explode operation I was referring to:
>>>>>>>>> https://spark.apache.org/docs/latest/api/sql/index.html#explode
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> How is it different? It'd help if you could provide the signature
>>>>>>>>> (input and output PCollection types) of the transform you have in
>>>>>>>>> mind.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jan 12, 2021 at 4:49 PM Tao Li <t...@zillow.com> wrote:
>>>>>>>>>
>>>>>>>>> @Reuven Lax <re...@google.com> yes I am aware of that transform,
>>>>>>>>> but that’s different from the explode operation I was referring to:
>>>>>>>>> https://spark.apache.org/docs/latest/api/sql/index.html#explode
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *From: *Reuven Lax <re...@google.com>
>>>>>>>>> *Reply-To: *"user@beam.apache.org" <user@beam.apache.org>
>>>>>>>>> *Date: *Tuesday, January 12, 2021 at 2:04 PM
>>>>>>>>> *To: *user <user@beam.apache.org>
>>>>>>>>> *Subject: *Re: Is there an array explode function/transform?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Have you tried Flatten.iterables?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jan 12, 2021, 2:02 PM Tao Li <t...@zillow.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi community,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Is there a Beam function to explode an array (similar to Spark
>>>>>>>>> SQL’s explode())? I did some research but did not find anything.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> BTW I think we can potentially use FlatMap to implement the
>>>>>>>>> explode functionality, but a Beam-provided function would be very
>>>>>>>>> handy.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks a lot!
>>>>>>>>>
>>>>>>>>>
