Hi Etienne,

Sorry for the late reply.

I just merged your bug fix. I think you can submit a PR for release-1.12.

Best,
Jingsong

On Fri, Mar 12, 2021 at 12:22 AM Etienne Chauchot <echauc...@apache.org> wrote:

> Hi,
>
> I forgot to mention that I submitted the new ParquetAvroInputFormat to
> master (1.13) but it is made to work for 1.12.x (last release) also and
> I'm using it with Flink 1.12.x.
>
> Maybe it could be a good candidate to be included in an upcoming 1.12.3
> release, WDYT ?
>
> Best
>
> Etienne
>
> On 11/03/2021 17:17, Etienne Chauchot wrote:
> >
> > Hi all,
> >
> > I just submitted another parquet PR that adds ParquetAvroInputFormat
> > (I'm using it in a benchmark I'm coding). If anyone is interested in
> > reviewing it, be my guest:
> >
> > https://github.com/apache/flink/pull/15156
> >
> > I have also an older parquet PR that fixes a format conversion bug
> > that is waiting for merge if anyone can review it also (already 1
> > approval of a non-committer, thanks @HuangZhenQiu
> > <https://github.com/HuangZhenQiu>):
> >
> > https://github.com/apache/flink/pull/14961
> >
> > If I have time, I'll also tackle the other parquet tickets that I
> > opened lately
> >
> > Best
> >
> > Etienne
> >
> > On 25/02/2021 08:34, Jingsong Li wrote:
> >> Hi Etienne,
> >>
> >> ParquetColumnarRowInputFormat is not fully functional yet, it has a good
> >> performance, but it is hard to support complex types, like array and map...
> >> So I think a migrated ParquetInputFormat version is required.
> >>
> >> Best,
> >> Jingsong
> >>
> >> On Wed, Feb 24, 2021 at 3:43 PM Etienne Chauchot<echauc...@apache.org>
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> Thanks guys for the comments !
> >>>
> >>> I did not know it was legacy. I will give the new sources a try.
> >>>
> >>> Jingsong, when you say "migrate ParquetInputFormat to the new BulkFormat
> >>> interface", do you mean that the new ParquetColumnarRowInputFormat is
> >>> not fully functional yet?
> >>>
> >>> In the meantime, if you agree, I think I'm still gonna submit a PR for
> >>> https://issues.apache.org/jira/browse/FLINK-21393 because I need it on
> >>> an urgent task I'm doing.
> >>>
> >>> Best
> >>>
> >>> Etienne
> >>>
> >>> On 24/02/2021 03:41, Peter Huang wrote:
> >>>> Hi Jingsong,
> >>>>
> >>>> Thanks for pointing this out. Actually, I planned to work on changing
> >>>> interfaces ParquetTableSource and ParquetInputFormat.
> >>>> After refactoring the code, I may also help to fix the issue in
> >>>> https://issues.apache.org/jira/browse/FLINK-21468.
> >>>>
> >>>> Best Regards
> >>>> Peter Huang
> >>>>
> >>>> On Tue, Feb 23, 2021 at 6:35 PM Jingsong Li<jingsongl...@gmail.com>
> >>>> wrote:
> >>>>> Hi Etienne,
> >>>>>
> >>>>> Thanks for your reporting.
> >>>>>
> >>>>> There are indeed many problems. There is no doubt that we need to improve
> >>>>> our current format implementation.
> >>>>>
> >>>>> But ParquetTableSource and ParquetInputFormat are legacy implementations
> >>>>> with legacy interfaces. We have introduced new interfaces for execution and
> >>>>> SQL. You can see:
> >>>>> - ParquetColumnarRowInputFormat with BulkFormat interface. It is just for
> >>>>> columnar row reading, not support complex types, we need
> >>>>> migrate ParquetInputFormat to the new BulkFormat interface.
> >>>>> - FileSystemTableSource with DynamicTableSource interface, It is a generic
> >>>>> FileSystem source for all formats, we can just use it for parquet too.
> >>>>>
> >>>>> Considering ParquetTableSource and ParquetInputFormat are legacy
> >>>>> interfaces, I think we can finish migration work first, what do you think?
> >>>>> Best,
> >>>>> Jingsong
> >>>>>
> >>>>> On Wed, Feb 24, 2021 at 12:46 AM Etienne Chauchot <echauc...@apache.org>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> I've been playing with Parquet with SQL and Avro lately. I've found some
> >>>>>> bugs:
> >>>>>>
> >>>>>> 1. https://issues.apache.org/jira/browse/FLINK-21388 : I already
> >>>>>> submitted a PR on this one (https://github.com/apache/flink/pull/14961)
> >>>>>> 2. https://issues.apache.org/jira/browse/FLINK-21389
> >>>>>>
> >>>>>> 3. https://issues.apache.org/jira/browse/FLINK-21468
> >>>>>>
> >>>>>> I've already started to work on this ticket:
> >>>>>> https://issues.apache.org/jira/browse/FLINK-21393
> >>>>>>
> >>>>>> I'd be happy to receive your comments on these tickets
> >>>>>>
> >>>>>> Best
> >>>>>>
> >>>>>> Etienne Chauchot
> >>>>>>
> >>>>> --
> >>>>> Best, Jingsong Lee
> >>>>>
>

--
Best, Jingsong Lee