Re: Potential bug with ParquetIO.read when reading arrays

2021-02-26 Thread Tao Li
ant Damle @Brian Hulette thanks so much for your support and help! From: Tao Li Reply-To: "user@beam.apache.org" Date: Wednesday, February 3, 2021 at 10:51 PM To: "user@beam.apache.org" Subject: Re: Potenti

Re: Potential bug with ParquetIO.read when reading arrays

2021-02-03 Thread Tao Li
2021 at 11:55 AM To: "user@beam.apache.org" Subject: Re: Potential bug with ParquetIO.read when reading arrays Hi all, Thanks for all the discussions so far (including discussions in BEAM-11721 and offline discussions). We will use BEAM-11650 to track the request of making avr

Re: Potential bug with ParquetIO.read when reading arrays

2021-02-03 Thread Tao Li
s not the one I want to deal with in the following Beam transforms. Instead I want to retain the original schema defined in Spark, which is simply an array of integers. Is there an easy way to retain the original schema when using ParquetIO to read Spark-created fields? Has anyone run into this need?

Re: Potential bug with ParquetIO.read when reading arrays

2021-01-30 Thread Tao Li
chema, is it possible to make the avro schema specification optional for ParquetIO.read? Thanks! From: Tao Li Reply-To: "user@beam.apache.org" Date: Saturday, January 30, 2021 at 1:54 PM To: "user@beam.apache.org" Subject: Re: Potential bug with ParquetIO.read when reading arr

Re: Potential bug with ParquetIO.read when reading arrays

2021-01-30 Thread Tao Li
the schema? I briefly looked at the ParquetIO source code but have not figured it out yet. From: Tao Li Reply-To: "user@beam.apache.org" Date: Friday, January 29, 2021 at 3:37 PM To: Chamikara Jayalath, "user@beam.apache.org" Subject: Re: Potential bug with ParquetIO.read when r

Re: Potential bug with ParquetIO.read when reading arrays

2021-01-29 Thread Tao Li
at 7:45 AM To: "user@beam.apache.org" Subject: Re: Potential bug with ParquetIO.read when reading arrays Hi community, Can someone take a look at this issue? It is kind of a blocker to me right now. Really ap

Re: Potential bug with ParquetIO.read when reading arrays

2021-01-29 Thread Chamikara Jayalath
.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) From: Chamikara Jayalath Reply-To: "user@beam.apache.org" Date: Friday,

Re: Potential bug with ParquetIO.read when reading arrays

2021-01-29 Thread Tao Li
1 at 10:53 AM To: user Subject: Re: Potential bug with ParquetIO.read when reading arrays Thanks. It might be something good to document in case other users run into this as well. Can you file a JIRA with the details? On Fri, Jan 29, 2021 at 10:47 AM Tao Li wrote: OK

Re: Potential bug with ParquetIO.read when reading arrays

2021-01-29 Thread Chamikara Jayalath
ble. From: Tao Li Reply-To: "user@beam.apache.org" Date: Friday, January 29, 2021 at 7:45 AM To: "user@beam.apache.org" Subject: Re: Potential bug with ParquetIO.read when reading arrays Hi community,

Re: Potential bug with ParquetIO.read when reading arrays

2021-01-29 Thread Tao Li
basically using the Parquet jars from the Spark distributable directly, and now everything is compatible. From: Tao Li Reply-To: "user@beam.apache.org" Date: Friday, January 29, 2021 at 7:45 AM To: "user@beam.apache.org" Subject: Re: Potential bug with ParquetIO.read when reading arrays

Re: Potential bug with ParquetIO.read when reading arrays

2021-01-29 Thread Tao Li
ParquetIO.read when reading arrays BTW I tried Avro 1.8 and 1.9 and both give the same error. So we can probably rule out any Avro issue. From: Tao Li Reply-To: "user@beam.apache.org" Date: Thursday, January 28, 2021 at 9:07 AM To: "user@beam.apache.org" Subject: Potent

Re: Potential bug with ParquetIO.read when reading arrays

2021-01-28 Thread Tao Li
BTW I tried Avro 1.8 and 1.9 and both give the same error. So we can probably rule out any Avro issue. From: Tao Li Reply-To: "user@beam.apache.org" Date: Thursday, January 28, 2021 at 9:07 AM To: "user@beam.apache.org" Subject: Potential bug with ParquetIO.read when rea

Potential bug with ParquetIO.read when reading arrays

2021-01-28 Thread Tao Li
Hi Beam community, I am seeing an error when reading an array field using ParquetIO. I was using Beam 2.25 and the direct runner for testing. Is this a bug or a known issue? Am I missing anything here? Please help me root cause this issue. Thanks so much! Attached are the Avro schema and the pa