Re: [VOTE] Release Spark 3.3.0 (RC5)

2022-06-08 Thread Jungtaek Lim
Apologies for the late participation.

I'm sorry, but -1 (non-binding) from me.

Unfortunately, I found a major user-facing issue that seriously hurts the
UX of the Kafka data source.

In some cases, the Kafka data source can throw IllegalStateException when
failOnDataLoss=true; that condition is bound to the state of the Kafka
topic (not a Spark issue). With a recent change in Spark,
IllegalStateException is now treated as an "internal error", and Spark
gives incorrect guidance to end users, telling them that Spark has a bug
and encouraging them to file a JIRA ticket, which is simply wrong.

Previously, the Kafka data source provided an error message with the
context of why it failed and how to work around it. I feel this is a
serious UX regression.

Please look into https://issues.apache.org/jira/browse/SPARK-39412 for more
details.
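For context, the setting in question is configured when creating the stream. A minimal PySpark configuration sketch (broker address and topic name are placeholders, and a reachable Kafka cluster plus the Kafka connector package are assumed):

```python
# Configuration sketch only: failOnDataLoss=true is the default. Setting it
# to "false" avoids the failure described above, but silently skips lost
# data, so it is a workaround rather than a fix.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-read-sketch").getOrCreate()

stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
    .option("subscribe", "events")                     # placeholder topic
    .option("failOnDataLoss", "true")  # the option whose failure path is discussed
    .load()
)
```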


On Wed, Jun 8, 2022 at 3:40 PM Hyukjin Kwon  wrote:

> Okay. Thankfully the binary release is fine per
> https://github.com/apache/spark/blob/v3.3.0-rc5/dev/create-release/release-build.sh#L268
> .
> The source package (and GitHub tag) has 3.3.0.dev0, and the binary package
> has 3.3.0. Technically this is not a blocker now because the PyPI upload
> can still be made correctly.
> I lowered the priority to critical. I switch my -1 to 0.
>
> On Wed, 8 Jun 2022 at 15:17, Hyukjin Kwon  wrote:
>
>> Arrrgh  .. I am very sorry that I found this problem late.
>> RC 5 does not have the correct version of PySpark, see
>> https://github.com/apache/spark/blob/v3.3.0-rc5/python/pyspark/version.py#L19
>> I think the release script was broken because the version now has 'str'
>> type, see
>> https://github.com/apache/spark/blob/v3.3.0-rc5/dev/create-release/release-tag.sh#L88
>> I filed a JIRA at https://issues.apache.org/jira/browse/SPARK-39411
>>
>> -1 from me
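(For illustration only: the versioning scheme described above, a `.dev0` suffix in the in-tree version string that must be dropped for a release, can be sketched as follows. This is not the actual release script; the function name is made up.)

```python
import re

def release_version(dev_version: str) -> str:
    """Strip a trailing PEP 440 .devN suffix: '3.3.0.dev0' -> '3.3.0'."""
    return re.sub(r"\.dev\d+$", "", dev_version)

# The source tree carries the dev version; the released artifact should not.
print(release_version("3.3.0.dev0"))  # -> 3.3.0
```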
>>
>>
>>
>> On Wed, 8 Jun 2022 at 13:16, Cheng Pan  wrote:
>>
>>> +1 (non-binding)
>>>
>>> * Verified SPARK-39313 has been addressed [1]
>>> * Passed integration test w/ Apache Kyuubi (Incubating)[2]
>>>
>>> [1] https://github.com/housepower/spark-clickhouse-connector/pull/123
>>> [2] https://github.com/apache/incubator-kyuubi/pull/2817
>>>
>>> Thanks,
>>> Cheng Pan
>>>
>>> On Wed, Jun 8, 2022 at 7:04 AM Chris Nauroth 
>>> wrote:
>>> >
>>> > +1 (non-binding)
>>> >
>>> > * Verified all checksums.
>>> > * Verified all signatures.
>>> > * Built from source, with multiple profiles, to full success, for Java
>>> 11 and Scala 2.13:
>>> > * build/mvn -Phadoop-3 -Phadoop-cloud -Phive-thriftserver
>>> -Pkubernetes -Pscala-2.13 -Psparkr -Pyarn -DskipTests clean package
>>> > * Tests passed.
>>> > * Ran several examples successfully:
>>> > * bin/spark-submit --class org.apache.spark.examples.SparkPi
>>> examples/jars/spark-examples_2.12-3.3.0.jar
>>> > * bin/spark-submit --class
>>> org.apache.spark.examples.sql.hive.SparkHiveExample
>>> examples/jars/spark-examples_2.12-3.3.0.jar
>>> > * bin/spark-submit
>>> examples/src/main/python/streaming/network_wordcount.py localhost 
>>> > * Tested some of the issues that blocked prior release candidates:
>>> > * bin/spark-sql -e 'SELECT (SELECT IF(x, 1, 0)) AS a FROM (SELECT
>>> true) t(x) UNION SELECT 1 AS a;'
>>> > * bin/spark-sql -e "select date '2018-11-17' > 1"
>>> > * SPARK-39293 ArrayAggregate fix
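The checksum and signature verification steps above can be sketched roughly as follows, using a locally generated file in place of the real artifacts (the file names are placeholders, not actual Spark artifact names):

```shell
# Sketch of release-artifact verification. Assumes GNU coreutils sha512sum.
set -e
workdir="$(mktemp -d)"
cd "$workdir"

# Stand-ins for a downloaded artifact and its published .sha512 digest file.
echo "example artifact contents" > spark-3.3.0-bin-example.tgz
sha512sum spark-3.3.0-bin-example.tgz > spark-3.3.0-bin-example.tgz.sha512

# The actual check: recompute the digest and compare with the published one.
sha512sum -c spark-3.3.0-bin-example.tgz.sha512

# The signature check would look like this (requires importing the KEYS file):
#   gpg --import KEYS
#   gpg --verify spark-3.3.0-bin-example.tgz.asc spark-3.3.0-bin-example.tgz
```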
>>> >
>>> > Chris Nauroth
>>> >
>>> >
>>> > On Tue, Jun 7, 2022 at 1:30 PM Cheng Su 
>>> wrote:
>>> >>
>>> >> +1 (non-binding). Built and ran some internal tests for Spark SQL.
>>> >>
>>> >>
>>> >>
>>> >> Thanks,
>>> >>
>>> >> Cheng Su
>>> >>
>>> >>
>>> >>
>>> >> From: L. C. Hsieh 
>>> >> Date: Tuesday, June 7, 2022 at 1:23 PM
>>> >> To: dev 
>>> >> Subject: Re: [VOTE] Release Spark 3.3.0 (RC5)
>>> >>
>>> >> +1
>>> >>
>>> >> Liang-Chi
>>> >>
>>> >> On Tue, Jun 7, 2022 at 1:03 PM Gengliang Wang 
>>> wrote:
>>> >> >
>>> >> > +1 (non-binding)
>>> >> >
>>> >> > Gengliang
>>> >> >
>>> >> > On Tue, Jun 7, 2022 at 12:24 PM Thomas Graves 
>>> wrote:
>>> >> >>
>>> >> >> +1
>>> >> >>
>>> >> >> Tom Graves
>>> >> >>
>>> >> >> On Sat, Jun 4, 2022 at 9:50 AM Maxim Gekk
>>> >> >>  wrote:
>>> >> >> >
>>> >> >> > Please vote on releasing the following candidate as Apache Spark
>>> version 3.3.0.
>>> >> >> >
>>> >> >> > The vote is open until 11:59pm Pacific time June 8th and passes
>>> if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>> >> >> >
>>> >> >> > [ ] +1 Release this package as Apache Spark 3.3.0
>>> >> >> > [ ] -1 Do not release this package because ...
>>> >> >> >
>>> >> >> > To learn more about Apache Spark, please see
>>> http://spark.apache.org/
>>> >> >> >
>>> >> >> > The tag to be voted on is v3.3.0-rc5 (commit
>>> 7cf29705272ab8e8c70e8885a3664ad8ae3cd5e9):
>>> >> >> > https://github.com/apache/spark/tree/v3.3.0-rc5
>>> >> >> >
>>> >> >> > The release files, including signatures, digests, etc. can be
>>> found at:
>>> >> >> > https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc5-bin/
>>> >> >> >
>>> >> >> > Signatures used for Spark RCs can 

Root group membership

2022-06-08 Thread Rodrigo
Hi Everyone,

My security team has raised concerns about the requirement for root group
membership when running Spark on Kubernetes. Does anyone know the reason
for that requirement, what the security implications are, and whether any
alternatives exist?

Thanks,
Rodrigo


Re: [VOTE] Release Spark 3.3.0 (RC5)

2022-06-08 Thread Jerry Peng
I agree with Jungtaek; -1 from me because of the recently introduced issue
of the Kafka source throwing an error with an incorrect error message.
This may mislead users and cause unnecessary confusion.


Re: [VOTE] Release Spark 3.3.0 (RC5)

2022-06-08 Thread Prashant Singh
-1 from my side as well; I found this today.

While testing Apache Iceberg with 3.3, I found a bug where, for a table
with partitions containing null values, we get an NPE on partition
discovery; earlier we used to get `DEFAULT_PARTITION_NAME`.

Please look into https://issues.apache.org/jira/browse/SPARK-39417 for
more details.
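As an illustrative sketch of the expected behavior (not Spark's actual implementation; the function name is made up): in Hive-style partition layouts, a null partition value is conventionally rendered with a default placeholder name rather than failing.

```python
# Sketch: rendering a partition directory name, substituting a default
# placeholder for null values instead of raising on None.
from typing import Optional

DEFAULT_PARTITION_NAME = "__HIVE_DEFAULT_PARTITION__"

def partition_dir(column: str, value: Optional[str]) -> str:
    """Render 'col=value', using the default name when the value is null."""
    rendered = value if value is not None else DEFAULT_PARTITION_NAME
    return f"{column}={rendered}"

print(partition_dir("dt", None))          # -> dt=__HIVE_DEFAULT_PARTITION__
print(partition_dir("dt", "2022-06-08"))  # -> dt=2022-06-08
```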

Regards,
Prashant Singh


Re: [VOTE] Release Spark 3.3.0 (RC5)

2022-06-08 Thread huaxin gao
Thanks, Dongjoon, for opening a JIRA to track this issue. I agree this is a
flaky test; I have seen the flakiness in our internal tests. I also agree
this is a non-blocker because the feature is disabled by default. I will
take a look to see if I can find the root cause.

Thanks,
Huaxin

On Mon, Jun 6, 2022 at 12:43 AM Dongjoon Hyun 
wrote:

> +1.
>
> I double-checked the following additionally.
>
> - Run unit tests on Apple Silicon with Java 17/Python 3.9.11/R 4.1.2
> - Run unit tests on Linux with Java11/Scala 2.12/2.13
> - K8s integration test (including Volcano batch scheduler) on K8s v1.24
> - Check S3 read/write with spark-shell with Scala 2.13/Java17.
>
> So far, it looks good except one flaky test from the new `Row-level
> Runtime Filters` feature. Actually, this has been flaky in the previous RCs
> too.
>
> Since the `Row-level Runtime Filters` feature is still disabled by default
> in Apache Spark 3.3.0, I filed it as a non-blocker flaky test bug.
>
> https://issues.apache.org/jira/browse/SPARK-39386
>
> If there is no other report on this test case, this could be my local
> environmental issue.
>
> I'm going to test RC5 more until the deadline (June 8th PST).
>
> Thanks,
> Dongjoon.
>
>
> On Sat, Jun 4, 2022 at 1:33 PM Sean Owen  wrote:
>
>> +1 looks good now on Scala 2.13
>>
>> On Sat, Jun 4, 2022 at 9:51 AM Maxim Gekk
>>  wrote:
>>
>>> Please vote on releasing the following candidate as
>>> Apache Spark version 3.3.0.
>>>
>>> The vote is open until 11:59pm Pacific time June 8th and passes if a
>>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>
>>> [ ] +1 Release this package as Apache Spark 3.3.0
>>> [ ] -1 Do not release this package because ...
>>>
>>> To learn more about Apache Spark, please see http://spark.apache.org/
>>>
>>> The tag to be voted on is v3.3.0-rc5 (commit
>>> 7cf29705272ab8e8c70e8885a3664ad8ae3cd5e9):
>>> https://github.com/apache/spark/tree/v3.3.0-rc5
>>>
>>> The release files, including signatures, digests, etc. can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc5-bin/
>>>
>>> Signatures used for Spark RCs can be found in this file:
>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1406
>>>
>>> The documentation corresponding to this release can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc5-docs/
>>>
>>> The list of bug fixes going into 3.3.0 can be found at the following URL:
>>> https://issues.apache.org/jira/projects/SPARK/versions/12350369
>>>
>>> This release is using the release script of the tag v3.3.0-rc5.
>>>
>>>
>>> FAQ
>>>
>>> =
>>> How can I help test this release?
>>> =
>>> If you are a Spark user, you can help us test this release by taking
>>> an existing Spark workload and running it on this release candidate,
>>> then reporting any regressions.
>>>
>>> If you're working in PySpark, you can set up a virtual env, install
>>> the current RC, and see if anything important breaks. For Java/Scala,
>>> you can add the staging repository to your project's resolvers and test
>>> with the RC (make sure to clean up the artifact cache before/after so
>>> you don't end up building with an out-of-date RC going forward).
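The PySpark part of that flow can be sketched as follows (the tarball name and exact URL are assumptions based on the RC bin directory pattern, not confirmed artifact names; the network-dependent steps are shown as comments):

```shell
# Sketch of setting up a throwaway environment to test the RC with PySpark.
set -e
python3 -m venv rc-test-env
. rc-test-env/bin/activate
rc-test-env/bin/python --version  # confirm the venv's interpreter works

# Network-dependent steps, for illustration:
# pip install "https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc5-bin/pyspark-3.3.0.tar.gz"
# python -c "import pyspark; print(pyspark.__version__)"
```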
>>>
>>> ===
>>> What should happen to JIRA tickets still targeting 3.3.0?
>>> ===
>>> The current list of open tickets targeted at 3.3.0 can be found at:
>>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>>> Version/s" = 3.3.0
>>>
>>> Committers should look at those and triage. Extremely important bug
>>> fixes, documentation, and API tweaks that impact compatibility should
>>> be worked on immediately. Everything else please retarget to an
>>> appropriate release.
>>>
>>> ==
>>> But my bug isn't fixed?
>>> ==
>>> In order to make timely releases, we will typically not hold the
>>> release unless the bug in question is a regression from the previous
>>> release. That being said, if there is something which is a regression
>>> that has not been correctly targeted please ping me or a committer to
>>> help target the issue.
>>>
>>> Maxim Gekk
>>>
>>> Software Engineer
>>>
>>> Databricks, Inc.
>>>
>>


Re: [VOTE] Release Spark 3.3.0 (RC5)

2022-06-08 Thread huaxin gao
I agree with Prashant; -1 from me too, because this may break Iceberg usage.

Thanks,
Huaxin
