Hi Shun, thanks for the contribution. I'll take a look first and then find some committers to help review and merge it.
Best regards,
Yuxia

----- Original Message -----
From: "sunshun18" <sunshu...@126.com>
To: "dev" <dev@flink.apache.org>
Sent: Monday, December 5, 2022, 11:54:38 AM
Subject: Patch to support Parquet schema evolution

Hi there,

I found a null-value issue when using Flink to read Parquet files written with multiple schema versions (V1->V2->V3->..->Vn).

Assume the Parquet schema has the two fields below, and field F2 exists only in version 2.

Version1: F1
Version2: F1, F2

Currently, the value of field F2 will be empty when reading data from the Parquet files using schema version 2.

I explored the implementation and found that Flink uses a collection named `unknownFieldsIndices` to track the nonexistent fields, and that this collection is applied to all Parquet files under the given path.

I drafted a patch with a unit test to fix this issue.

https://issues.apache.org/jira/browse/FLINK-29527
https://github.com/apache/flink/pull/21149

As this PR has been pending for a long time, I hope a committer can help review it and provide feedback if possible.

Thanks!
Shun
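
For readers following the thread, the behaviour described above can be illustrated with a small toy model. This is only a sketch and not Flink's actual code: the class `UnknownFieldsSketch` and the helpers `unknownFieldIndices` and `readRow` are hypothetical names, and rows are modelled as plain maps rather than Parquet files. It assumes the set of missing-field indices is derived from one file's schema (V1) and then reused for a file written with V2, and contrasts that with recomputing the set per file.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class UnknownFieldsSketch {

    // Hypothetical helper: positions of requested fields that one file's schema lacks.
    static Set<Integer> unknownFieldIndices(List<String> requested, Set<String> fileFields) {
        Set<Integer> unknown = new HashSet<>();
        for (int i = 0; i < requested.size(); i++) {
            if (!fileFields.contains(requested.get(i))) {
                unknown.add(i);
            }
        }
        return unknown;
    }

    // Hypothetical helper: project one row, emitting null for positions marked unknown.
    static List<String> readRow(List<String> requested, Map<String, String> row, Set<Integer> unknown) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < requested.size(); i++) {
            out.add(unknown.contains(i) ? null : row.get(requested.get(i)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> requested = List.of("F1", "F2");          // read with schema version 2
        Map<String, String> v1Row = Map.of("F1", "a");          // file written with version 1
        Map<String, String> v2Row = Map.of("F1", "b", "F2", "c"); // file written with version 2

        // Assumption: the index set is computed once (here from the V1 file) and shared.
        Set<Integer> shared = unknownFieldIndices(requested, v1Row.keySet());
        System.out.println(readRow(requested, v2Row, shared));   // prints [b, null]

        // Recomputing the set for each file keeps F2 where the file really has it.
        Set<Integer> perFile = unknownFieldIndices(requested, v2Row.keySet());
        System.out.println(readRow(requested, v2Row, perFile));  // prints [b, c]
    }
}
```

In this model the V2 row loses F2 only because the shared set still marks position 1 as unknown. Whether the linked PR takes the per-file approach is not stated in the message, so treat the second half as one possible direction rather than a description of the patch.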