Re: Java 1.3.0 around mid May?

2023-05-22 Thread Anton Okolnychyi
Waiting for tests on PR #7637 and a few cherry-picks and should be ready to cut an RC.
- Anton
> On May 16, 2023, at 4:06 PM, Jahagirdar, Amogh wrote:
>
> Just following up on this thread, I was tracking the 1.3.0 release milestone
> https://github.com/apache/iceberg/milestone/26

Re: [Proposal] Partition stats in Iceberg

2023-05-22 Thread Ryan Blue
Thanks, Ajantha. I think it's safe to say that we should continue assuming that we will have one partition stats file. I agree that it should be small and we don't want to block the progress here.
On Mon, May 22, 2023 at 5:07 AM Ajantha Bhat wrote:
> Hi Anton and Ryan,
>
> The Partition stats sp

Re: Scan statistics

2023-05-22 Thread Russell Spitzer
Yeah does seem like we may have more use cases for this. The more Peter and I discuss this the more I think it makes sense to add in.
On Mon, May 22, 2023 at 8:24 AM Péter Váry wrote:
> The feature could be useful for Spark as well. See:
> https://github.com/apache/iceberg/pull/7636#pullrequestr

Re: Scan statistics

2023-05-22 Thread Péter Váry
The feature could be useful for Spark as well. See: https://github.com/apache/iceberg/pull/7636#pullrequestreview-1434981224 Maybe we should add this as a topic for the next Iceberg Community Sync. Also when trying out possible solutions, I have found that some of the statistics are modifiable. I

Re: [Proposal] Partition stats in Iceberg

2023-05-22 Thread Ajantha Bhat
Hi Anton and Ryan, The Partition stats spec PR didn't move forward as Anton wanted to conduct some experiments to conclude whether writing a single file or multiple files is better. I conducted the experiments myself and attached some numbers in the PR.