Is this a behavior change in 2.4.x from an earlier version? Or are we proposing to introduce functionality to help with adoption?
Regards,
Mridul

On Wed, Jun 3, 2020 at 10:32 AM Xiao Li <gatorsm...@gmail.com> wrote:

> Yes. Spark 3.0 RC2 works well.
>
> I think the current behavior in Spark 2.4 affects adoption, especially
> for new users who want to try Spark in their local environment.
>
> It impacts all our built-in clients, like the Scala shell and PySpark.
> Should we consider back-porting it to 2.4?
>
> Although this fixes the bug, it will also introduce a behavior change.
> We should publicly document it and mention it in the release notes. Let
> us review it more carefully and understand the risk and impact.
>
> Thanks,
>
> Xiao
>
> On Wed, Jun 3, 2020 at 10:12 AM, Nicholas Chammas
> <nicholas.cham...@gmail.com> wrote:
>
>> I believe that was fixed in 3.0 and there was a decision not to
>> backport the fix: SPARK-31170
>> <https://issues.apache.org/jira/browse/SPARK-31170>
>>
>> On Wed, Jun 3, 2020 at 1:04 PM Xiao Li <gatorsm...@gmail.com> wrote:
>>
>>> Just downloaded it on my local MacBook and tried to create a table
>>> using the pre-built PySpark. It looks like the conf
>>> "spark.sql.warehouse.dir" does not take effect: it is trying to create
>>> a directory in "file:/user/hive/warehouse/t1". I have not done any
>>> investigation yet. Have any of you hit the same issue?
>>>
>>> C02XT0U7JGH5:bin lixiao$ ./pyspark --conf spark.sql.warehouse.dir="/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6"
>>> Python 2.7.16 (default, Jan 27 2020, 04:46:15)
>>> [GCC 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14)] on darwin
>>> Type "help", "copyright", "credits" or "license" for more information.
>>> 20/06/03 09:56:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
>>> Setting default log level to "WARN".
>>> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
>>> Welcome to
>>>       ____              __
>>>      / __/__  ___ _____/ /__
>>>     _\ \/ _ \/ _ `/ __/ '_/
>>>    /__ / .__/\_,_/_/ /_/\_\   version 2.4.6
>>>       /_/
>>>
>>> Using Python version 2.7.16 (default, Jan 27 2020 04:46:15)
>>> SparkSession available as 'spark'.
>>>
>>> >>> spark.sql("set spark.sql.warehouse.dir").show(truncate=False)
>>> +-----------------------+-------------------------------------------------+
>>> |key                    |value                                            |
>>> +-----------------------+-------------------------------------------------+
>>> |spark.sql.warehouse.dir|/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6|
>>> +-----------------------+-------------------------------------------------+
>>>
>>> >>> spark.sql("create table t1 (col1 int)")
>>> 20/06/03 09:56:29 WARN HiveMetaStore: Location: file:/user/hive/warehouse/t1 specified for non-external table:t1
>>> Traceback (most recent call last):
>>>   File "<stdin>", line 1, in <module>
>>>   File "/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6/python/pyspark/sql/session.py", line 767, in sql
>>>     return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
>>>   File "/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
>>>   File "/Users/lixiao/Downloads/spark-2.4.6-bin-hadoop2.6/python/pyspark/sql/utils.py", line 69, in deco
>>>     raise AnalysisException(s.split(': ', 1)[1], stackTrace)
>>> pyspark.sql.utils.AnalysisException: u'org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:file:/user/hive/warehouse/t1 is not a directory or unable to create one);'
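For anyone who wants to reproduce this outside the shell, here is a minimal self-contained sketch of the same check (the temp directory and app name are illustrative, not from the session above; the behavior under test is whether CREATE TABLE honors spark.sql.warehouse.dir):

    # Repro sketch: run against a local Spark 2.4.6 build with Hive support.
    import tempfile
    from pyspark.sql import SparkSession

    # Illustrative writable warehouse location.
    warehouse = tempfile.mkdtemp(prefix="spark-warehouse-")

    spark = (SparkSession.builder
             .appName("warehouse-dir-check")  # illustrative name
             .config("spark.sql.warehouse.dir", warehouse)
             .enableHiveSupport()
             .getOrCreate())

    # The conf is visible on the SQL side, exactly as in the session above...
    spark.sql("SET spark.sql.warehouse.dir").show(truncate=False)

    # ...but on 2.4.x the embedded Hive metastore can still place the table
    # under the default file:/user/hive/warehouse, which is what fails above
    # with "is not a directory or unable to create one".
    spark.sql("CREATE TABLE t1 (col1 INT)")
    spark.sql("DESCRIBE EXTENDED t1").show(truncate=False)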
>>>
>>> On Wed, Jun 3, 2020 at 9:18 AM, Dongjoon Hyun
>>> <dongjoon.h...@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> Bests,
>>>> Dongjoon
>>>>
>>>> On Wed, Jun 3, 2020 at 5:59 AM Tom Graves
>>>> <tgraves...@yahoo.com.invalid> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> Tom
>>>>>
>>>>> On Sunday, May 31, 2020, 06:47:09 PM CDT, Holden Karau
>>>>> <hol...@pigscanfly.ca> wrote:
>>>>>
>>>>> Please vote on releasing the following candidate as Apache Spark
>>>>> version 2.4.6.
>>>>>
>>>>> The vote is open until June 5th at 9 AM PST and passes if a majority
>>>>> of +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>>>
>>>>> [ ] +1 Release this package as Apache Spark 2.4.6
>>>>> [ ] -1 Do not release this package because ...
>>>>>
>>>>> To learn more about Apache Spark, please see http://spark.apache.org/
>>>>>
>>>>> There are currently no issues targeting 2.4.6 (try project = SPARK AND
>>>>> "Target Version/s" = "2.4.6" AND status in (Open, Reopened, "In
>>>>> Progress")).
>>>>>
>>>>> The tag to be voted on is v2.4.6-rc8 (commit
>>>>> 807e0a484d1de767d1f02bd8a622da6450bdf940):
>>>>> https://github.com/apache/spark/tree/v2.4.6-rc8
>>>>>
>>>>> The release files, including signatures, digests, etc., can be found at:
>>>>> https://dist.apache.org/repos/dist/dev/spark/v2.4.6-rc8-bin/
>>>>>
>>>>> Signatures used for Spark RCs can be found in this file:
>>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>>>
>>>>> The staging repository for this release can be found at:
>>>>> https://repository.apache.org/content/repositories/orgapachespark-1349/
>>>>>
>>>>> The documentation corresponding to this release can be found at:
>>>>> https://dist.apache.org/repos/dist/dev/spark/v2.4.6-rc8-docs/
>>>>>
>>>>> The list of bug fixes going into 2.4.6 can be found at the following URL:
>>>>> https://issues.apache.org/jira/projects/SPARK/versions/12346781
>>>>>
>>>>> This release is using the release script of the tag v2.4.6-rc8.
>>>>>
>>>>> FAQ
>>>>>
>>>>> =========================
>>>>> What happened to the other RCs?
>>>>> =========================
>>>>>
>>>>> The parallel Maven build caused some flakiness, so I wasn't
>>>>> comfortable releasing them. I backported the fix from the 3.0 branch
>>>>> for this release. I've got a proposed change to the build script so
>>>>> that, going forward, we only push tags once the build is a success,
>>>>> but it does not block this release.
>>>>>
>>>>> =========================
>>>>> How can I help test this release?
>>>>> =========================
>>>>>
>>>>> If you are a Spark user, you can help us test this release by taking
>>>>> an existing Spark workload, running it on this release candidate, and
>>>>> reporting any regressions.
>>>>>
>>>>> If you're working in PySpark, you can set up a virtual env, install
>>>>> the current RC, and see if anything important breaks; in Java/Scala,
>>>>> you can add the staging repository to your project's resolvers and
>>>>> test with the RC (make sure to clean up the artifact cache before and
>>>>> after so you don't end up building with an out-of-date RC going
>>>>> forward).
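For the PySpark path, a minimal smoke test might look like the sketch below. The install command and tarball name are assumptions; check the -bin/ directory listing above for the actual artifact:

    # RC smoke test, assuming the RC's pyspark tarball has been installed
    # into a fresh virtualenv first, e.g. with something like:
    #   pip install https://dist.apache.org/repos/dist/dev/spark/v2.4.6-rc8-bin/pyspark-2.4.6.tar.gz
    # (tarball name is an assumption; use whatever the -bin/ listing shows).
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("rc-smoke-test").getOrCreate()

    # Confirm we are actually running the candidate version.
    assert spark.version == "2.4.6", spark.version

    # Exercise a simple end-to-end DataFrame job.
    df = spark.range(1000).withColumnRenamed("id", "n")
    total = df.selectExpr("sum(n) AS total").collect()[0]["total"]
    assert total == sum(range(1000)), total

    print("OK: basic DataFrame job works on", spark.version)
    spark.stop()

For Java/Scala, the analogue is adding the orgapachespark-1349 staging URL above to your build's resolvers and running your existing tests against version 2.4.6.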
>>>>>
>>>>> ===========================================
>>>>> What should happen to JIRA tickets still targeting 2.4.6?
>>>>> ===========================================
>>>>>
>>>>> The current list of open tickets targeted at 2.4.6 can be found at
>>>>> https://issues.apache.org/jira/projects/SPARK by searching for
>>>>> "Target Version/s" = 2.4.6.
>>>>>
>>>>> Committers should look at those and triage. Extremely important bug
>>>>> fixes, documentation, and API tweaks that impact compatibility should
>>>>> be worked on immediately. Everything else, please retarget to an
>>>>> appropriate release.
>>>>>
>>>>> ==================
>>>>> But my bug isn't fixed?
>>>>> ==================
>>>>>
>>>>> In order to make timely releases, we will typically not hold the
>>>>> release unless the bug in question is a regression from the previous
>>>>> release. That being said, if there is something which is a regression
>>>>> that has not been correctly targeted, please ping me or a committer
>>>>> to help target the issue.
>>>>>
>>>>> --
>>>>> Twitter: https://twitter.com/holdenkarau
>>>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau