Re: [VOTE] Release Apache Iceberg 1.0.0 RC0

Steven Wu Mon, 10 Oct 2022 15:25:49 -0700

Never mind. I missed that this 1.0.0 is based on the latest 0.14.1
release, which doesn't contain PR 5318. I thought it was based on the
latest master branch.


+1 (non-binding)
- Verify signature
- Verify checksum
- Tried SQL insert and query with Flink 1.15
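
For anyone repeating the checksum step, here is a minimal Python sketch. It
assumes the .sha512 file holds "<hex digest>  <file name>" (the layout that
`shasum -a 512` writes); the function names are illustrative and not part of
any Iceberg tooling. It does not cover the signature check, which still needs
gpg and the KEYS file.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a large tarball is never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact, checksum_file):
    """Compare the artifact's digest with the first token of the checksum file.

    Assumption: the checksum file looks like "<hex digest>  <file name>".
    Other layouts would need different parsing.
    """
    expected = Path(checksum_file).read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

Called as checksum_matches(tarball_path, tarball_path + ".sha512"), it
returns True only when the digest recorded in the file equals the digest of
the downloaded tarball.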

On Mon, Oct 10, 2022 at 3:20 PM Steven Wu <[email protected]> wrote:

> Ryan,
>
> It seems that this PR (merged on July 28) for the Flink FLIP-27 source is
> not included. https://github.com/apache/iceberg/pull/5318
>
> The tree at the release commit still contains the old file in the old
> location. The new location should be "flink/source/IcebergTableSource.java"
> with FLIP-27 config support.
>
> https://github.com/apache/iceberg/blob/e2bb9ad7e792efca419fa7c4a1afde7c4c44fa01/flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/IcebergTableSource.java
>
> Thanks,
> Steven
>
> On Mon, Oct 10, 2022 at 11:37 AM Szehon Ho <[email protected]>
> wrote:
>
>> Whoops, sorry for the noise. I made a typo and was using the wrong Scala
>> version of the iceberg-spark-runtime jar. With the right jar, this works.
>>
>> +1 (non-binding)
>> - Verify signature
>> - Verify checksum
>> - Verify license documentation
>> - Tried with Spark 3.3
>> - Ran unit tests
>>
>> Thanks
>> Szehon
>>
>>
>>
>>
>> On Mon, Oct 10, 2022 at 11:26 AM Szehon Ho <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I get a NoClassDefFoundError from IcebergSparkSessionExtensions when
>>> running Spark 3.3 with iceberg-spark-runtime-3.3_2.12-1.0.0.jar. I noticed
>>> this jar doesn't contain Scala classes, unlike the previous jar,
>>> iceberg-spark-runtime-3.3_2.12-0.14.1.jar.
>>>
>>> scala> spark.sql("show databases").show
>>> java.lang.NoClassDefFoundError: scala/collection/SeqOps
>>>   at org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions.$anonfun$apply$2(IcebergSparkSessionExtensions.scala:50)
>>>   at org.apache.spark.sql.SparkSessionExtensions.$anonfun$buildResolutionRules$1(SparkSessionExtensions.scala:152)
>>>   at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
>>>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>>>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>>>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>>>   at scala.collection.TraversableLike.map(TraversableLike.scala:286)
>>>   at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
>>>   at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>>>   at org.apache.spark.sql.SparkSessionExtensions.buildResolutionRules(SparkSessionExtensions.scala:152)
>>>   at org.apache.spark.sql.internal.BaseSessionStateBuilder.customResolutionRules(BaseSessionStateBuilder.scala:216)
>>>   at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1.<init>(HiveSessionStateBuilder.scala:94)
>>>   at org.apache.spark.sql.hive.HiveSessionStateBuilder.analyzer(HiveSessionStateBuilder.scala:85)
>>>   at org.apache.spark.sql.internal.BaseSessionStateBuilder.$anonfun$build$2(BaseSessionStateBuilder.scala:360)
>>>   at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:87)
>>>   at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:87)
>>>   at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:76)
>>>   at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
>>>   at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
>>>   at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
>>>   at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
>>>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
>>>   at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
>>>   at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76)
>>>   at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
>>>   at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66)
>>>   at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
>>>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
>>>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
>>>   at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
>>>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
>>>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
>>>   ... 47 elided
>>> Caused by: java.lang.ClassNotFoundException: scala.collection.SeqOps
>>>   at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
>>>   at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
>>>   at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
>>>   ... 79 more
>>>
>>> Note, I usually verify by copying the spark-runtime jar to the Spark jars
>>> dir. I usually can't get the --packages flag to work as described at
>>> https://iceberg.apache.org/how-to-release/#verifying-with-spark, since
>>> the version is not released yet. Let me know if I am using the wrong jar.
>>>
>>> Thanks
>>> Szehon
>>>
>>> On Mon, Oct 10, 2022 at 9:22 AM Eduard Tudenhoefner <[email protected]>
>>> wrote:
>>>
>>>> +1 (non-binding)
>>>>
>>>>    - validated checksum and signature
>>>>    - checked license docs & ran RAT checks
>>>>    - ran build and tests with JDK11
>>>>
>>>>
>>>> Eduard
>>>>
>>>> On Mon, Oct 10, 2022 at 8:01 AM Ajantha Bhat <[email protected]>
>>>> wrote:
>>>>
>>>>> +1 (non-binding)
>>>>>
>>>>>
>>>>>    - Verified the Spark runtime jar contents.
>>>>>    - Checked license docs, ran RAT checks.
>>>>>    - Validated checksum and signature.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Ajantha
>>>>>
>>>>> On Mon, Oct 10, 2022 at 10:45 AM Prashant Singh <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hello Everyone,
>>>>>>
>>>>>> Wanted to know your thoughts on whether we should also include the
>>>>>> following bug fixes in this release:
>>>>>>
>>>>>> 1. MERGE INTO nullability fix; queries fail without it:
>>>>>> *Reported instances:*
>>>>>> a.
>>>>>> https://stackoverflow.com/questions/73424454/spark-iceberg-merge-into-issue-caused-by-org-apache-spark-sql-analysisexcep
>>>>>> b. https://github.com/apache/iceberg/issues/5739
>>>>>> c.
>>>>>> https://github.com/apache/iceberg/issues/5424#issuecomment-1220688298
>>>>>>
>>>>>> *PRs (merged):*
>>>>>> a. https://github.com/apache/iceberg/pull/5880
>>>>>> b. https://github.com/apache/iceberg/pull/5679
>>>>>>
>>>>>> 2. Query failure when running RewriteManifestProcedure on a Date /
>>>>>> Timestamp partitioned table when
>>>>>> `spark.sql.datetime.java8API.enabled` is true.
>>>>>> *Reported instances:*
>>>>>> a. https://github.com/apache/iceberg/issues/5104
>>>>>> b.
>>>>>> https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1663982635731469
>>>>>>
>>>>>> *PR:*
>>>>>> a. https://github.com/apache/iceberg/pull/5860
>>>>>>
>>>>>> Regards,
>>>>>> Prashant Singh
>>>>>>
>>>>>> On Mon, Oct 10, 2022 at 4:15 AM Ryan Blue <[email protected]> wrote:
>>>>>>
>>>>>>> +1 (binding)
>>>>>>>
>>>>>>>    - Checked license docs, ran RAT checks
>>>>>>>    - Validated checksum and signature
>>>>>>>    - Built and tested with Java 11
>>>>>>>    - Built binary artifacts with Java 8
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Oct 9, 2022 at 3:42 PM Ryan Blue <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Everyone,
>>>>>>>>
>>>>>>>> I propose that we release the following RC as the official Apache
>>>>>>>> Iceberg 1.0.0 release.
>>>>>>>>
>>>>>>>> The commit ID is e2bb9ad7e792efca419fa7c4a1afde7c4c44fa01
>>>>>>>> * This corresponds to the tag: apache-iceberg-1.0.0-rc0
>>>>>>>> *
>>>>>>>> https://github.com/apache/iceberg/commits/apache-iceberg-1.0.0-rc0
>>>>>>>> *
>>>>>>>> https://github.com/apache/iceberg/tree/e2bb9ad7e792efca419fa7c4a1afde7c4c44fa01
>>>>>>>>
>>>>>>>> The release tarball, signature, and checksums are here:
>>>>>>>> *
>>>>>>>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.0.0-rc0
>>>>>>>>
>>>>>>>> You can find the KEYS file here:
>>>>>>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>>>>>>
>>>>>>>> Convenience binary artifacts are staged on Nexus. The Maven
>>>>>>>> repository URL is:
>>>>>>>> *
>>>>>>>> https://repository.apache.org/content/repositories/orgapacheiceberg-1106/
>>>>>>>>
>>>>>>>> Please download, verify, and test.
>>>>>>>>
>>>>>>>> This release is based on the latest 0.14.1 release. It includes
>>>>>>>> changes to remove deprecated APIs and the following additional bug 
>>>>>>>> fixes:
>>>>>>>> * Increase metrics limit to 100 columns
>>>>>>>> * Bump Spark patch versions for CVE-2022-33891
>>>>>>>> * Exclude Scala from Spark runtime Jars
>>>>>>>>
>>>>>>>> Please vote in the next 72 hours.
>>>>>>>>
>>>>>>>> [ ] +1 Release this as Apache Iceberg 1.0.0
>>>>>>>> [ ] +0
>>>>>>>> [ ] -1 Do not release this because...
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Ryan Blue
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Ryan Blue
>>>>>>>
>>>>>>
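
A note on the NoClassDefFoundError quoted above: scala.collection.SeqOps
exists only in Scala 2.13 and later, while the TraversableLike frames in the
trace come from Scala 2.12. That is consistent with a 2.13 runtime jar being
loaded by a Spark build on Scala 2.12. A small sketch of a pre-flight check
follows. The naming pattern is inferred from the jar names quoted in this
thread, and the function names are hypothetical, not part of Iceberg or Spark.

```python
import re

# Assumed naming pattern, taken from the jar names quoted in the thread:
#   iceberg-spark-runtime-<spark>_<scala>-<iceberg>.jar
JAR_NAME = re.compile(
    r"^iceberg-spark-runtime-"
    r"(?P<spark>\d+\.\d+)_(?P<scala>\d+\.\d+)-(?P<iceberg>.+)\.jar$"
)


def parse_runtime_jar(name):
    """Split a runtime jar file name into Spark, Scala and Iceberg versions."""
    match = JAR_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized runtime jar name: {name}")
    return match.groupdict()


def scala_matches(jar_name, spark_scala_version):
    """Check the jar's Scala suffix against Spark's Scala version.

    `spark_scala_version` is a full version string such as "2.12.15";
    only the binary version (major.minor) is compared.
    """
    binary = ".".join(spark_scala_version.split(".")[:2])
    return parse_runtime_jar(jar_name)["scala"] == binary
```

The Scala version that Spark was built with is printed in the spark-shell
startup banner, so passing that string as the second argument to
scala_matches() flags a mismatched jar before it is copied into the jars dir.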
