RE: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar versions

2025-01-27 Thread Balaji Sudharsanam V
Hi Mich, True, the vulnerable jar (hive-metastore-2.3.9.jar) is not directly related to Spark. And I completely agree: “Spark does not run a Hive metastore itself nor use Hive for executing queries.” Like Nicholas said, when looking at vulnerabilities, many security teams, including ours, have b

Re: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar versions

2025-01-27 Thread Sean Owen
Can you connect the CVE to Spark? Spark does not run a Hive metastore itself nor use Hive for executing queries. It is a Hive client in general. That seems to be what is affected. We ask people reporting issues to at least provide a plausible theory for a vulnerability. Just because A depends on B

RE: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar versions

2025-01-27 Thread Balaji Sudharsanam V
Sean, The vulnerability is explained here: Apache Hive security bypass CVE-2021-34538 Vulnerability Report. Its CVSS base score is 7.5, and it is not AI-generated content for sure. We can dig into the vulnerability though, but it can be a

Re: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar versions

2025-01-27 Thread Mich Talebzadeh
Thanks for the clarification, Nicholas. Now the point is that often, when using Spark with Hive, the spark_session is created as below: spark_session = SparkSession.builder.enableHiveSupport().appName(appName).getOrCreate() and if I go to the jars directory /opt/spark/jars> ls -l hi* -rw-r--r--. 1 hdus

Re: FYI: SPARK-49700 Unified Scala Interface for Connect and Classic

2025-01-27 Thread Dongjoon Hyun
Did you see the PR, Martin? SBT is also broken like the following and we've been waiting for actions over two days on the original PR. $ build/sbt clean "catalyst/testOnly org.apache.spark.sql.catalyst.encoders.EncoderResolutionSuite" ... [info] *** 1 SUITE ABORTED *** [error] Error during tests:

RE: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar versions

2025-01-27 Thread NICHOLAS MARION
I am not entirely sure how Apache Hive is structured, but this CVE refers to HIVE-25468 and Backport of HIVE-25468 into Hive 3.x, which indicates that the Hive Metastore-server and Hive Standalone

Re: FYI: SPARK-49700 Unified Scala Interface for Connect and Classic

2025-01-27 Thread Martin Grund
Would it not have been mindful to wait for the original author to investigate the PR and do a forward fix instead of reverting such a big change? Since this was only blocking the Maven test we could have waited probably a few more days without any issues. On Mon, Jan 27, 2025 at 8:32 PM Dongjoon H

Re: FYI: SPARK-49700 Unified Scala Interface for Connect and Classic

2025-01-27 Thread Dongjoon Hyun
This is reverted from branch-4.0 via the following. - https://github.com/apache/spark/pull/49696 Revert "[SPARK-49700][CONNECT][SQL] Unified Scala Interface for Connect and Classic" Dongjoon. On 2025/01/26 16:58:45 Dongjoon Hyun wrote: > Thank you! > > Dongjoon > > On Sat, Jan 25, 2025 at 20:

Re: Proposal to improve data skew debugging

2025-01-27 Thread Rob Reeves
The counting does use count-min sketch and publishes the top K keys above a skew threshold to an accumulator. The core implementation in my prototype is in InlineApproxCountExec
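
For readers unfamiliar with the technique mentioned above, here is a minimal sketch of count-min counting with a skew threshold and a top-K cut. It is not the prototype's InlineApproxCountExec and uses no Spark APIs; the names CountMinSketch and skewed_keys, the table dimensions, and the plain dict standing in for an accumulator are all assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size approximate counter; estimates never undercount."""

    def __init__(self, width=2048, depth=5):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One independent hash per row, derived by salting BLAKE2b with the row id.
        digest = hashlib.blake2b(
            key.encode("utf-8"), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def estimate(self, key):
        # The minimum across rows is the least-collided (tightest) estimate.
        return min(self.table[row][self._index(row, key)] for row in range(self.depth))


def skewed_keys(keys, threshold_fraction, top_k):
    """Return up to top_k (key, estimated_count) pairs whose estimated share
    of all rows is at least threshold_fraction, largest first."""
    sketch = CountMinSketch()
    candidates = {}  # hypothetical stand-in for an accumulator
    total = 0
    for key in keys:
        sketch.add(key)
        total += 1
        est = sketch.estimate(key)
        # Remember any key that currently looks heavy; re-checked at the end.
        if est >= threshold_fraction * total:
            candidates[key] = est
    heavy = [
        (key, sketch.estimate(key))
        for key in candidates
        if sketch.estimate(key) >= threshold_fraction * total
    ]
    heavy.sort(key=lambda pair: pair[1], reverse=True)
    return heavy[:top_k]


if __name__ == "__main__":
    rows = ["a"] * 900 + [f"k{i}" for i in range(100)]
    print(skewed_keys(rows, threshold_fraction=0.5, top_k=3))
```

Because a count-min estimate can only overcount, a key that truly exceeds the threshold is always reported; the trade-off is that hash collisions may occasionally promote a key that sits just below it.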

Re: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar versions

2025-01-27 Thread Mich Talebzadeh
To answer your question, I did not read this CVE, but I am responding solely from my previous experiences with vulnerabilities and the thread owner implications, having used Spark in conjunction with Hive for many years. Mich Talebzadeh, Architect | Data Science | Financial Crime | Forensic Analy

Re: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar versions

2025-01-27 Thread Sean Owen
Mich: did you read the CVE? I'm not clear, as this contains no reference to the Hive functionality that is affected, or how it might relate to a metastore. Please explain. Otherwise this looks like a generic AI-generated response with no particularly relevant content. "In summary"... On Mon, Jan 2

Re: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar versions

2025-01-27 Thread Mich Talebzadeh
I think the thread owner's point is valid. The default use of the Hive Metastore by Spark further gives credence to the importance of addressing this Hive vulnerability to ensure the security and reliability of Spark applications. I use Hive as the default metastore for Spark as well. Spark relies

Re: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar versions

2025-01-27 Thread Sean Owen
It looks like that affects Hive, and not the metastore. I do not see that it is relevant to Spark at first glance. On Mon, Jan 27, 2025 at 1:21 AM Balaji Sudharsanam V wrote: > Hi All, > > There is a vulnerability with ‘High’ severity found in the *Apache Spark > 3.x and 4.0.0 preview (2) relea