Thanks @Anant Damle for fixing the issues with
BEAM-11460 and BEAM-11527 so quickly!

From: Anant Damle <ana...@google.com>
Date: Friday, February 26, 2021 at 6:49 AM
To: Tao Li <t...@zillow.com>
Cc: "user@beam.apache.org" <user@beam.apache.org>, Brian Hulette 
<bhule...@google.com>
Subject: Re: Potential bug with BEAM-11460?

@Tao Li, I have added a unit test for your use case in this commit:
https://github.com/apache/beam/pull/14078/commits/f5459bb3533194de48712229957a555ef79f17ef

On Fri, Feb 26, 2021 at 10:13 PM Anant Damle <ana...@google.com> wrote:
Thanks Tao,
Let me try to turn this into a test case.
I am also looking into BEAM-11527
(https://issues.apache.org/jira/browse/BEAM-11527).

Thanks,
Anant

On Fri, Feb 26, 2021 at 9:30 AM Tao Li <t...@zillow.com> wrote:
@Brian Hulette I think the main issue I am trying to report is that I see the
error message “Specify it explicitly using withCoder().” but I could not find
a withCoder() API on ParquetIO. So maybe we need to add that method.
Getting back to your ask, here is roughly the code I was running. Hope this
helps.
PCollection<Row> inputDataTest =
    pipeline.apply(
        ParquetIO.parseGenericRecords(
                new SerializableFunction<GenericRecord, Row>() {
                  @Override
                  public Row apply(GenericRecord record) {
                    // With a null schema, the Beam schema is derived from the record.
                    return AvroUtils.toBeamRowStrict(record, null);
                  }
                })
            .from(path));

From: Brian Hulette <bhule...@google.com>
Reply-To: "user@beam.apache.org" <user@beam.apache.org>
Date: Thursday, February 25, 2021 at 3:11 PM
To: Anant Damle <ana...@google.com>
Cc: user <user@beam.apache.org>
Subject: Re: Potential bug with BEAM-11460?

Hi Tao,
Thanks for reporting this! Could you share more details about your use case?
Anant mentioned that he's having trouble coming up with a test case where
inferCoder doesn't work [1].

Brian

[1] https://github.com/apache/beam/pull/14078#issuecomment-786293576

On Wed, Feb 24, 2021 at 6:49 PM Anant Damle <ana...@google.com> wrote:
Hi Brian,
I think you are right. I created BEAM-11861
(https://issues.apache.org/jira/browse/BEAM-11861) and will send a PR today.
The present workaround is to call .setCoder directly on the output
PCollection.

On Thu, Feb 25, 2021 at 5:25 AM Brian Hulette <bhule...@google.com> wrote:
+Anant Damle, is this an oversight in
https://github.com/apache/beam/pull/13616? What would be the right way to fix
this?

On Tue, Feb 23, 2021 at 5:24 PM Tao Li <t...@zillow.com> wrote:
Hi Beam community,

I cannot log in to the Beam Jira, so I am asking this question here. I am
testing this new feature from Beam 2.28 and see the error below:

Exception in thread "main" java.lang.IllegalArgumentException: Unable to infer coder for output of parseFn. Specify it explicitly using withCoder().
    at org.apache.beam.sdk.io.parquet.ParquetIO$ParseFiles.inferCoder(ParquetIO.java:554)
    at org.apache.beam.sdk.io.parquet.ParquetIO$ParseFiles.expand(ParquetIO.java:521)
    at org.apache.beam.sdk.io.parquet.ParquetIO$ParseFiles.expand(ParquetIO.java:483)
    at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:547)

However, the ParquetIO builder does not have a withCoder() method. I think
this error message mimics the one in AvroIO:
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java#L1010

Should we add this method to ParquetIO? Thanks!
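
To make the request concrete, here is a minimal sketch of the behaviour being
asked for: use an explicitly supplied coder when there is one, otherwise try to
infer it, and only then fail with the message above. This is a toy model in
plain Java, not Beam code. CoderFallbackSketch, Registry, resolveCoder and the
coder names (plain strings here) are all hypothetical and exist only for
illustration; only the error text is taken from the stack trace in this thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model only: none of these types are Beam classes. */
final class CoderFallbackSketch {

  /** Hypothetical stand-in for a coder registry keyed by output type. */
  static final class Registry {
    private final Map<Class<?>, String> coders = new HashMap<>();

    Registry register(Class<?> type, String coderName) {
      coders.put(type, coderName);
      return this;
    }

    Optional<String> lookup(Class<?> type) {
      return Optional.ofNullable(coders.get(type));
    }
  }

  /**
   * Prefers the explicitly supplied coder; otherwise falls back to the registry
   * and throws if the registry has no coder for the output type.
   */
  static String resolveCoder(Registry registry, Class<?> outputType, String explicitCoder) {
    if (explicitCoder != null) {
      return explicitCoder;
    }
    return registry
        .lookup(outputType)
        .orElseThrow(
            () ->
                new IllegalArgumentException(
                    "Unable to infer coder for output of parseFn. "
                        + "Specify it explicitly using withCoder()."));
  }

  public static void main(String[] args) {
    Registry registry = new Registry().register(String.class, "StringUtf8Coder");
    // Inference succeeds because the registry knows the type.
    System.out.println(resolveCoder(registry, String.class, null));
    // Inference would fail for this type, but the explicit coder wins.
    System.out.println(resolveCoder(registry, Object.class, "ExplicitRowCoder"));
  }
}
```

The point of the sketch is only the ordering: an explicit coder short-circuits
inference, so the error message would then point at an option the caller can
actually use.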
