Sounds good to me. I'll work on that this morning.

On Mon, Nov 4, 2024 at 10:16 AM Amogh Jahagirdar <2am...@gmail.com> wrote:

> Thanks for identifying this issue and bringing it up here Cheng, that's
> really appreciated! @Russell Spitzer <russell.spit...@gmail.com> I do
> think it is an issue. If I understand the issue correctly, there are
> certain cases in 1.14.3 where we may not be reading the dictionary filter
> in its entirety, leading to correctness issues/data loss. Concretely, I see
> we'd exercise that path in ParquetDictionaryRowGroupFilter when doing
> dictionaries.readDictionaryPage and in ParquetUtil#readDictionaryPage, but
> someone else please double check my read of it.
>
> I'd advocate for going back to Parquet 1.13.1 and cutting another release?
>
> Thanks,
>
> Amogh Jahagirdar
>
> On Mon, Nov 4, 2024 at 8:57 AM Russell Spitzer <russell.spit...@gmail.com>
> wrote:
>
>> We are currently including 1.14.3 as a build dependency, is that an issue?
>>
>> On Sun, Nov 3, 2024 at 12:47 PM Cheng Pan <pan3...@gmail.com> wrote:
>>
>>> FYI, I just identified a Parquet data loss issue (newly introduced in
>>> 1.14.0), and I confirmed it affects the Spark use cases. I’m not sure if it
>>> also affects Iceberg cases.
>>>
>>> https://github.com/apache/parquet-java/issues/3040
>>>
>>> Thanks,
>>> Cheng Pan
>>>
>>> On Oct 31, 2024, at 06:06, Russell Spitzer <russell.spit...@gmail.com>
>>> wrote:
>>>
>>> Hey Y'all,
>>>
>>> I propose that we release the following RC as the official Apache
>>> Iceberg 1.7.0 release.
>>>
>>> The commit ID is 91e04c9c88b63dc01d6c8e69dfdc8cd27ee811cc
>>> * This corresponds to the tag: apache-iceberg-1.7.0-rc0
>>> * https://github.com/apache/iceberg/commits/apache-iceberg-1.7.0-rc0
>>> * https://github.com/apache/iceberg/tree/91e04c9c88b63dc01d6c8e69dfdc8cd27ee811cc
>>>
>>> The release tarball, signature, and checksums are here:
>>> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.7.0-rc0
>>>
>>> You can find the KEYS file here:
>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>
>>> Convenience binary artifacts are staged on Nexus. The Maven repository
>>> URL is:
>>> * https://repository.apache.org/content/repositories/orgapacheiceberg-<ID>/
>>>
>>> Please download, verify, and test.
>>>
>>> Please vote in the next 72 hours.
>>>
>>> [ ] +1 Release this as Apache Iceberg 1.7.0
>>> [ ] +0
>>> [ ] -1 Do not release this because...
>>>
>>> Only PMC members have binding votes, but other community members are
>>> encouraged to cast non-binding votes. This vote will pass if there are
>>> 3 binding +1 votes and more binding +1 votes than -1 votes.