I am not entirely sure how Apache Hive is structured, but this CVE refers to 
HIVE-25468<https://issues.apache.org/jira/browse/HIVE-25468> and the backport of 
HIVE-25468 into Hive 3.x<https://github.com/apache/hive/pull/4309/files>, which 
indicates that the Hive metastore-server and Hive standalone-metastore modules 
were updated for the fix.

When looking at vulnerabilities, many security teams, including ours, have 
begun to classify them as Vulnerable or Affected. Vulnerable means the product 
is directly impacted by the vulnerability and exploitable, while Affected means 
a vulnerable dependency/package/jar is delivered with the product. In this 
case, Spark is delivering these Hive jars within the distribution:


  183633 Dec 16 23:33 hive-beeline-2.3.9.jar
   44704 Dec 16 23:33 hive-cli-2.3.9.jar
  436169 Dec 16 23:33 hive-common-2.3.9.jar
10840949 Dec 16 23:33 hive-exec-2.3.9-core.jar
  116364 Dec 16 23:33 hive-jdbc-2.3.9.jar
  326585 Dec 16 23:33 hive-llap-common-2.3.9.jar
 8195966 Dec 16 23:33 hive-metastore-2.3.9.jar
  916630 Dec 16 23:33 hive-serde-2.3.9.jar
 1679366 Dec 16 23:33 hive-service-rpc-3.1.3.jar
   53902 Dec 16 23:33 hive-shims-0.23-2.3.9.jar
    8786 Dec 16 23:33 hive-shims-2.3.9.jar
  120293 Dec 16 23:33 hive-shims-common-2.3.9.jar
   12923 Dec 16 23:33 hive-shims-scheduler-2.3.9.jar
  258346 Dec 16 23:33 hive-storage-api-2.8.1.jar
  577200 Dec 16 23:33 spark-hive-thriftserver_2.13-3.5.4.jar
  735193 Dec 16 23:33 spark-hive_2.13-3.5.4.jar


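For anyone wanting to repeat this kind of "Affected" inventory, here is a rough sketch of how the jar names in a listing like the one above could be checked against a minimum Hive version. The helper names (parse_jar, hive_jars_below) are made up for illustration, the file-name pattern is an assumption about how these jars are named, and the 3.1.3 threshold is simply the upgrade target mentioned in this thread. I am also assuming hive-storage-api is versioned separately from Hive itself, so it is skipped.

```python
import re

# Assumed naming scheme: "<artifact>-<version>[-qualifier].jar".
JAR_NAME = re.compile(
    r"^(?P<artifact>.+?)-(?P<version>\d+(?:\.\d+)+)(?:-[A-Za-z0-9]+)?\.jar$"
)

# Assumption: hive-storage-api follows its own release numbering, so comparing
# it against a Hive release number would be misleading.
SEPARATELY_VERSIONED = {"hive-storage-api"}


def parse_jar(name):
    """Split a jar file name into (artifact, version tuple), or None if it does not match."""
    match = JAR_NAME.match(name)
    if match is None:
        return None
    version = tuple(int(part) for part in match.group("version").split("."))
    return match.group("artifact"), version


def hive_jars_below(names, minimum=(3, 1, 3)):
    """Return the Hive jar names whose version is lower than `minimum`."""
    flagged = []
    for name in names:
        parsed = parse_jar(name)
        if parsed is None:
            continue
        artifact, version = parsed
        if (artifact.startswith("hive-")
                and artifact not in SEPARATELY_VERSIONED
                and version < minimum):
            flagged.append(name)
    return flagged
```

This only says which jars are delivered at an older version (Affected); it says nothing about whether the vulnerable code path is reachable (Vulnerable).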

And to extend that further, these outdated Apache Hive dependencies pull in 
other older dependencies:



  75567 Dec 16 23:33 jackson-annotations-2.15.2.jar
 549207 Dec 16 23:33 jackson-core-2.15.2.jar
 232248 Dec 16 23:33 jackson-core-asl-1.9.13.jar
1620088 Dec 16 23:33 jackson-databind-2.15.2.jar
  54630 Dec 16 23:33 jackson-dataformat-yaml-2.15.2.jar
 122937 Dec 16 23:33 jackson-datatype-jsr310-2.15.2.jar
 780664 Dec 16 23:33 jackson-mapper-asl-1.9.13.jar
 518681 Dec 16 23:33 jackson-module-scala_2.13-2.15.2.jar
  37085 Dec 16 23:33 json4s-jackson_2.13-3.7.0-M11.jar
2017388 Dec 16 23:33 parquet-jackson-1.13.1.jar


Looking at a dependency like jackson-mapper-asl-1.9.13.jar, it has a Critical 
and a High CVE against it. With that said, if a user accidentally uses one of 
these dependencies in their Spark application, will the Java CLASSPATH give 
$SPARK_HOME/jars precedence and in turn expose the unknowing end user to a 
vulnerability that way?
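
One way to at least see where such a class could come from is to scan a jar directory for the class file. The sketch below is only an illustration: jars_providing is a hypothetical helper, and it reports which jars contain a class, not which one the JVM would actually load first.

```python
import zipfile
from pathlib import Path


def jars_providing(class_name, jar_dir):
    """Return the sorted names of jars in `jar_dir` that contain `class_name`."""
    # A class such as a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that carry a .jar suffix but are not valid archives.
            continue
    return hits
```

For example, calling jars_providing("org.codehaus.jackson.map.ObjectMapper", jar_dir) with jar_dir pointing at $SPARK_HOME/jars would list the jars there that ship the Jackson 1.x ObjectMapper, and the same call against an application's own lib directory would show whether the class is provided twice.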

With all of that said, there is a Jira item, 
SPARK-30466<https://issues.apache.org/jira/browse/SPARK-30466>, to remove a 
dependency like jackson-mapper-asl-1.9.13, but it is stuck behind 
SPARK-44114<https://issues.apache.org/jira/browse/SPARK-44114>, which in turn is 
blocked by HIVE-27508<https://issues.apache.org/jira/browse/HIVE-27508>. Does 
the Apache Spark community have enough push to encourage the Apache Hive team 
to release a Hive 3.1.4, which would solve both these old CVEs and the one 
Balaji brought up in this thread? This would be especially valuable, as the 
next likely opportunity to upgrade to Hive 3.x would not occur until Spark 5.x.

Sincerely,


Nicholas T. Marion
Senior AI and Analytics Development Lead | IBM zDNN Product Owner
Mobile: 1 845 649 3592
E-mail: nmar...@us.ibm.com <mailto:nmar...@us.ibm.com>

IBM

From: Mich Talebzadeh <mich.talebza...@gmail.com>
Date: Monday, January 27, 2025 at 10:11 AM
To: Sean Owen <sro...@gmail.com>
Cc: Balaji Sudharsanam V <balaji.sudharsa...@ibm.com>, dev@spark.apache.org 
<dev@spark.apache.org>
Subject: [EXTERNAL] Re: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar 
versions
To answer your question, I did not read this CVE, but I am responding solely 
from my previous experience with vulnerabilities and the implications raised by 
the thread owner, having used Hive in conjunction with Spark for many years.


Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR

view my Linkedin 
profile<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>




On Mon, 27 Jan 2025 at 15:03, Sean Owen 
<sro...@gmail.com<mailto:sro...@gmail.com>> wrote:
Mich: did you read the CVE? I'm not clear, as this contains no reference to the 
Hive functionality that is affected, or how it might relate to a metastore. 
Please explain. Otherwise this looks like a generic AI-generated response with 
no particularly relevant content. "In summary"...

On Mon, Jan 27, 2025 at 8:57 AM Mich Talebzadeh 
<mich.talebza...@gmail.com<mailto:mich.talebza...@gmail.com>> wrote:
I think the thread owner's point is valid. The default use of the Hive 
Metastore by Spark further gives credence to the importance of addressing this 
Hive vulnerability to ensure the security and reliability of Spark 
applications. I use Hive as the default metastore for Spark as well. Spark 
relies heavily on the Hive Metastore for managing critical metadata, such as 
table schemas, data locations, and access control, unless you are using a 
platform like Databricks with a unified catalog. In summary, this dependency 
makes it essential to address any vulnerabilities within the Hive Metastore, as 
they can indirectly impact the security and stability of Spark applications 
among other things.

HTH

Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR

view my Linkedin 
profile<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>




On Mon, 27 Jan 2025 at 13:37, Sean Owen 
<sro...@gmail.com<mailto:sro...@gmail.com>> wrote:
It looks like that affects Hive, and not the metastore. I do not see that it is 
relevant to Spark at first glance.


On Mon, Jan 27, 2025 at 1:21 AM Balaji Sudharsanam V 
<balaji.sudharsa...@ibm.com.invalid> wrote:
Hi All,

There is a vulnerability with ‘High’ severity found in the Apache Spark 3.x and 
4.0.0 preview (2) releases, with the hive-metastore-2.3.x.jar.
This is defined here: Apache Hive security bypass CVE-2021-34538 Vulnerability 
Report<https://exchange.xforce.ibmcloud.com/vulnerabilities/231404>

The recommendation is to upgrade to the latest version of Apache Hive 
(3.1.3, 4.0 or later), available from the Apache Web site.

Can we expect this to be fixed in the Apache Spark 4.0 GA?

Thanks,
Balaji

