@Kyle Weaver sure thing! So the input/output definition for Flatten.Iterables
(https://beam.apache.org/releases/javadoc/2.25.0/org/apache/beam/sdk/transforms/Flatten.Iterables.html)
is:

Input: PCollection<Iterable<T>>
Output: PCollection<T>

The input/output for an explode transform would look like this:
Input:  PCollection<Row>. The row schema has a field which is an array of T.
Output: PCollection<Row>. The array-type field from the input schema is
replaced with a new field of type T. The elements from the array-type field
are flattened into multiple rows in the new table (the other fields of the
input table are just duplicated).
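
To make the row-level semantics concrete, here is a minimal sketch in plain Java. It does not use any Beam API: a Map<String, Object> stands in for a Row, and the names ExplodeSketch, explodeRow and explode are hypothetical. It assumes an empty array produces no output rows, as Spark's explode() does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the explode semantics described above.
// Assumption: a Map<String, Object> stands in for a Beam Row; this is not Beam API.
final class ExplodeSketch {

    // Emits one output row per element of the array field.
    // All other fields are copied unchanged into every output row.
    static List<Map<String, Object>> explodeRow(Map<String, Object> row, String arrayField) {
        Object value = row.get(arrayField);
        if (!(value instanceof List)) {
            throw new IllegalArgumentException("Field is not an array: " + arrayField);
        }
        List<Map<String, Object>> out = new ArrayList<>();
        for (Object element : (List<?>) value) {
            Map<String, Object> copy = new LinkedHashMap<>(row);
            copy.put(arrayField, element); // array of T replaced by a single T
            out.add(copy);
        }
        return out;
    }

    // Applies explodeRow to every input row and concatenates the results.
    static List<Map<String, Object>> explode(List<Map<String, Object>> rows, String arrayField) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            out.addAll(explodeRow(row, arrayField));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("tags", List.of("a", "b"));
        // One input row with a two-element array becomes two output rows.
        System.out.println(explode(List.of(row), "tags"));
    }
}
```

In a pipeline, the body of explodeRow is the part that would presumably live inside a per-element flat-map step, with the output schema adjusted so the array field has type T.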

Hope this clarification helps!

From: Kyle Weaver <kcwea...@google.com>
Reply-To: "user@beam.apache.org" <user@beam.apache.org>
Date: Tuesday, January 12, 2021 at 4:58 PM
To: "user@beam.apache.org" <user@beam.apache.org>
Cc: Reuven Lax <re...@google.com>
Subject: Re: Is there an array explode function/transform?

@Reuven Lax yes I am aware of that transform, but that’s different from the
explode operation I was referring to:
https://spark.apache.org/docs/latest/api/sql/index.html#explode

How is it different? It'd help if you could provide the signature (input and 
output PCollection types) of the transform you have in mind.

On Tue, Jan 12, 2021 at 4:49 PM Tao Li <t...@zillow.com> wrote:
@Reuven Lax yes I am aware of that transform, but that’s different from the
explode operation I was referring to:
https://spark.apache.org/docs/latest/api/sql/index.html#explode

From: Reuven Lax <re...@google.com>
Reply-To: "user@beam.apache.org" <user@beam.apache.org>
Date: Tuesday, January 12, 2021 at 2:04 PM
To: user <user@beam.apache.org>
Subject: Re: Is there an array explode function/transform?

Have you tried Flatten.iterables?

On Tue, Jan 12, 2021, 2:02 PM Tao Li <t...@zillow.com> wrote:
Hi community,

Is there a Beam function to explode an array (similar to Spark SQL’s 
explode())? I did some research but did not find anything.

BTW I think we can potentially use FlatMap to implement the explode 
functionality, but a Beam-provided function would be very handy.

Thanks a lot!
