If it’s easy, would it make sense to also include Russell’s fix for the metadata tables query, since it affects Spark 3.1 (a regression from Spark 3.0)? https://github.com/apache/iceberg/pull/2877/files
The issue, https://github.com/apache/iceberg/issues/2783, was at some point marked for the 0.12 release. I had mentioned it was OK to drop it if the fix took too long, and now it is indeed fixed.

Thanks,
Szehon

> On 9 Aug 2021, at 11:36, Ryan Blue <b...@tabular.io> wrote:
>
> Thanks for pointing that one out, Jack! That would be good to get in as well.
>
> On Mon, Aug 9, 2021 at 11:02 AM Jack Ye <yezhao...@gmail.com> wrote:
>
> If we are considering recutting the branch, please also include this PR, https://github.com/apache/iceberg/pull/2943, which fixes the validation when creating a schema with identifier fields. Thank you!
>
> -Jack Ye
>
> On Mon, Aug 9, 2021 at 9:08 AM Wing Yew Poon <wyp...@cloudera.com.invalid> wrote:
>
> Ryan,
> Thanks for the review. Let me look into implementing your refactoring suggestion.
> - Wing Yew
>
> On Mon, Aug 9, 2021 at 8:41 AM Ryan Blue <b...@tabular.io> wrote:
>
> Yeah, I agree. We should fix this for the 0.12.0 release. That said, I plan to continue testing this RC because it won't change that much, since the issue affects the Spark extensions in 3.1. Other engines and Spark 3.0 or older should be fine.
>
> I left a comment on the PR. I think it looks good, but we should try to refactor to make sure we don't have more issues like this. I think when we update our extensions to be compatible with multiple Spark versions, we should introduce a factory method to create the Catalyst plan node and use that everywhere. That will hopefully cut down on the number of times this happens.
>
> Thank you, Wing Yew!
>
> On Sun, Aug 8, 2021 at 2:52 PM Carl Steinbach <cwsteinb...@gmail.com> wrote:
>
> Hi Wing Yew,
>
> I will create a new RC once this patch is committed.
>
> Thanks.
>
> - Carl
>
> On Sat, Aug 7, 2021 at 4:29 PM Wing Yew Poon <wyp...@cloudera.com.invalid> wrote:
>
> Sorry to bring this up so late, but this just came up: there is a Spark 3.1 (runtime) compatibility issue (not found by existing tests), which I have a fix for in https://github.com/apache/iceberg/pull/2954. I think it would be really helpful if it could go into 0.12.0.
> - Wing Yew
>
> On Fri, Aug 6, 2021 at 11:36 AM Jack Ye <yezhao...@gmail.com> wrote:
>
> +1 (non-binding)
>
> Verified the release test and the AWS integration test; an issue was found in testing but is not blocking for the release (https://github.com/apache/iceberg/pull/2948).
>
> Verified Spark 3.1 and 3.0 operations and the new SQL extensions and procedures on EMR.
>
> Thanks,
> Jack Ye
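A minimal sketch of the factory-method refactor Ryan Blue suggests above, which routes construction of version-sensitive Catalyst plan nodes through one place. This is illustrative only: the object name DeletePlanNodeFactory is hypothetical and not part of Iceberg, and DeleteFromTable is used merely as a stand-in for any Catalyst node whose construction may differ between Spark versions.

    import org.apache.spark.sql.catalyst.expressions.Expression
    import org.apache.spark.sql.catalyst.plans.logical.{DeleteFromTable, LogicalPlan}

    // Hypothetical factory: the Spark extensions would call this instead of
    // constructing the Catalyst node directly, so a constructor or behavior
    // change between Spark releases only has to be handled here.
    object DeletePlanNodeFactory {
      def deleteFrom(table: LogicalPlan, condition: Option[Expression]): LogicalPlan = {
        // Single construction point; adapt here if the node's signature
        // differs in the Spark version being targeted.
        DeleteFromTable(table, condition)
      }
    }

Every call site in the extensions would then go through DeletePlanNodeFactory.deleteFrom rather than instantiating the node inline, which is the "use that everywhere" part of the suggestion.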
> On Fri, Aug 6, 2021 at 1:19 AM Kyle Bendickson <kjbendick...@gmail.com> wrote:
>
> +1 (binding)
>
> I verified:
> - KEYS signature & checksum
> - ./gradlew clean build (tests, etc.)
> - Ran Spark jobs on Kubernetes after building from the tarball at https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.12.0-rc2/
>   - Spark 3.1.1 batch jobs against both Hadoop and Hive tables, using HMS for the Hive catalog
>   - Verified default FileIO and S3FileIO
>   - Basic reads and writes
>   - Jobs using Spark procedures (remove unreachable files)
> - Special mention: verified that Spark catalogs can override Hadoop configurations using configs prefixed with "spark.sql.catalog.(catalog-name).hadoop."
>   - one of my contributions to this release that has been asked about by several customers internally
>   - tested using `spark.sql.catalog.(catalog-name).hadoop.fs.s3a.impl` for two catalogs; both values were respected, as opposed to the default globally configured value
>
> Thank you Carl!
>
> - Kyle, Data OSS Dev @ Apple =)
>
> On Thu, Aug 5, 2021 at 11:49 PM Szehon Ho <szehon.apa...@gmail.com> wrote:
>
> +1 (non-binding)
>
> * Verify signature keys
> * Verify checksum
> * dev/check-license
> * Build
> * Run tests (though some timeout failures on the Hive MR tests)
>
> Thanks,
> Szehon
>
> On Thu, Aug 5, 2021 at 2:23 PM Daniel Weeks <dwe...@apache.org> wrote:
>
> +1 (binding)
>
> I verified sigs/sums, license, build, and test.
>
> -Dan
>
> On Wed, Aug 4, 2021 at 2:53 PM Ryan Murray <rym...@gmail.com> wrote:
>
> After some wrestling with Spark, I discovered that the problem was with my test. Some SparkSession APIs changed, so all good here now.
>
> +1 (non-binding)
>
> On Wed, Aug 4, 2021 at 11:29 PM Ryan Murray <rym...@gmail.com> wrote:
>
> Thanks for the help Carl, got it sorted out. The gpg check now works. For those who were interested, I used a canned wget command in my history and it pulled RC0 :-)
>
> Will have a PR to fix the Nessie catalog soon.
>
> Best,
> Ryan
>
> On Wed, Aug 4, 2021 at 9:21 PM Carl Steinbach <cwsteinb...@gmail.com> wrote:
>
> Hi Ryan,
>
> Can you please run the following command to see which keys in your public keyring are associated with my UID?
>
> % gpg --list-keys c...@apache.org
> pub   rsa4096/5A5C7F6EB9542945 2021-07-01 [SC]
>       160F51BE45616B94103ED24D5A5C7F6EB9542945
> uid   [ultimate] Carl W. Steinbach (CODE SIGNING KEY) <c...@apache.org>
> sub   rsa4096/4158EB8A4F03D2AA 2021-07-01 [E]
>
> Thanks.
>
> - Carl
>
> On Wed, Aug 4, 2021 at 11:12 AM Ryan Murray <rym...@gmail.com> wrote:
>
> Hi all,
>
> Unfortunately I have to give a -1.
>
> I had trouble with the keys:
>
> gpg: assuming signed data in 'apache-iceberg-0.12.0.tar.gz'
> gpg: Signature made Mon 02 Aug 2021 03:36:30 CEST
> gpg: using RSA key FAFEB6EAA60C95E2BB5E26F01FF0803CB78D539F
> gpg: Can't check signature: No public key
>
> I have also discovered a bug in NessieCatalog. It is unclear what is wrong, but the NessieCatalog doesn't play nice with Spark 3.1. I will raise a patch ASAP to fix it. Very sorry for the inconvenience.
>
> Best,
> Ryan
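To make Kyle's per-catalog Hadoop override above concrete, here is a minimal sketch of configuring two Iceberg catalogs with different fs.s3a.impl values. The catalog names (cat_a, cat_b) and the second FileSystem class (com.example.CustomS3AFileSystem) are made up for illustration; the "spark.sql.catalog.(catalog-name).hadoop." prefix is the feature that was verified.

    import org.apache.spark.sql.SparkSession

    // Properties under "spark.sql.catalog.<name>.hadoop." are applied to that
    // catalog's Hadoop configuration, so each catalog can use its own
    // fs.s3a.impl instead of the globally configured default.
    val spark = SparkSession.builder()
      .appName("per-catalog-hadoop-conf")
      .config("spark.sql.catalog.cat_a", "org.apache.iceberg.spark.SparkCatalog")
      .config("spark.sql.catalog.cat_a.type", "hive")
      .config("spark.sql.catalog.cat_a.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
      .config("spark.sql.catalog.cat_b", "org.apache.iceberg.spark.SparkCatalog")
      .config("spark.sql.catalog.cat_b.type", "hive")
      .config("spark.sql.catalog.cat_b.hadoop.fs.s3a.impl", "com.example.CustomS3AFileSystem") // hypothetical class
      .getOrCreate()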
> On Wed, Aug 4, 2021 at 3:20 AM Carl Steinbach <c...@apache.org> wrote:
>
> Hi everyone,
>
> I propose that we release RC2 as the official Apache Iceberg 0.12.0 release. Please note that RC0 and RC1 were DOA.
>
> The commit id for RC2 is 7c2fcfd893ab71bee41242b46e894e6187340070
> * This corresponds to the tag: apache-iceberg-0.12.0-rc2
> * https://github.com/apache/iceberg/commits/apache-iceberg-0.12.0-rc2
> * https://github.com/apache/iceberg/tree/7c2fcfd893ab71bee41242b46e894e6187340070
>
> The release tarball, signature, and checksums are here:
> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-0.12.0-rc2/
>
> You can find the KEYS file here:
> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>
> Convenience binary artifacts are staged in Nexus. The Maven repository URL is:
> * https://repository.apache.org/content/repositories/orgapacheiceberg-1017/
>
> Please download, verify, and test.
>
> Please vote in the next 72 hours.
>
> [ ] +1 Release this as Apache Iceberg 0.12.0
> [ ] +0
> [ ] -1 Do not release this because...