Thanks for the clarification, Nicholas. The point is that when
using Spark with Hive, the spark_session is often created as below:


spark_session = SparkSession.builder.enableHiveSupport().appName(appName).getOrCreate()

and if I go to the jars directory:

/opt/spark/jars> ls -l hi*
-rw-r--r--. 1 hduser hadoop   258346 Oct 21 03:29 hive-storage-api-2.8.1.jar
-rw-r--r--. 1 hduser hadoop    12923 Oct 21 03:29 hive-shims-scheduler-2.3.9.jar
-rw-r--r--. 1 hduser hadoop   120293 Oct 21 03:29 hive-shims-common-2.3.9.jar
-rw-r--r--. 1 hduser hadoop     8786 Oct 21 03:29 hive-shims-2.3.9.jar
-rw-r--r--. 1 hduser hadoop    53902 Oct 21 03:29 hive-shims-0.23-2.3.9.jar
-rw-r--r--. 1 hduser hadoop  1679366 Oct 21 03:29 hive-service-rpc-3.1.3.jar
-rw-r--r--. 1 hduser hadoop   916630 Oct 21 03:29 hive-serde-2.3.9.jar
-rw-r--r--. 1 hduser hadoop  8195966 Oct 21 03:29 hive-metastore-2.3.9.jar
-rw-r--r--. 1 hduser hadoop   326585 Oct 21 03:29 hive-llap-common-2.3.9.jar
-rw-r--r--. 1 hduser hadoop   116364 Oct 21 03:29 hive-jdbc-2.3.9.jar
-rw-r--r--. 1 hduser hadoop 10840949 Oct 21 03:29 hive-exec-2.3.9-core.jar
-rw-r--r--. 1 hduser hadoop   436169 Oct 21 03:29 hive-common-2.3.9.jar
-rw-r--r--. 1 hduser hadoop    44704 Oct 21 03:29 hive-cli-2.3.9.jar
-rw-r--r--. 1 hduser hadoop   183633 Oct 21 03:29 hive-beeline-2.3.9.jar

I have all these jars there, but are you implying that the potential
vulnerability comes from hive-metastore-2.3.9.jar alone, or from all of
the Hive jars?
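
As a quick way to see exactly which Hive artifacts and versions a given
Spark distribution ships, something like the following could be used. It is
only a sketch: the function names are my own (hypothetical), and it assumes
the jar names follow the artifact-version[-classifier].jar pattern seen in
the listing above. It produces an inventory only and says nothing about
whether a given jar is actually exploitable.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches names such as hive-metastore-2.3.9.jar or hive-exec-2.3.9-core.jar.
# Assumption: <artifact>-<dotted version>[-<classifier>].jar naming.
JAR_NAME = re.compile(
    r"^(?P<artifact>hive-.+?)-(?P<version>\d+(?:\.\d+)+)(?:-[A-Za-z]+)?\.jar$"
)


def list_hive_jar_names(jars_dir):
    """Return the sorted file names of hive-*.jar files in jars_dir."""
    return sorted(p.name for p in Path(jars_dir).glob("hive-*.jar"))


def group_hive_jars_by_version(jar_names):
    """Map each Hive version string to the sorted artifacts shipped at it."""
    groups = defaultdict(list)
    for name in jar_names:
        match = JAR_NAME.match(name)
        if match:  # non-Hive or oddly named jars are skipped
            groups[match.group("version")].append(match.group("artifact"))
    return {version: sorted(artifacts) for version, artifacts in groups.items()}
```

For the directory above, this would be called as
group_hive_jars_by_version(list_hive_jar_names("/opt/spark/jars")).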

Cheers

Mich Talebzadeh,

Architect | Data Science | Financial Crime | Forensic Analysis | GDPR


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>





On Mon, 27 Jan 2025 at 20:33, NICHOLAS MARION <nmar...@us.ibm.com> wrote:

> I am not entirely sure how Apache Hive is structured, but this CVE
> refers to HIVE-25468 <https://issues.apache.org/jira/browse/HIVE-25468>
> and Backport of HIVE-25468 into Hive 3.x
> <https://github.com/apache/hive/pull/4309/files>; which indicates that
> the Hive Metastore-server and Hive Standalone-metastore were updated for
> the fix.
>
>
>
> When looking at vulnerabilities, many security teams, including ours, have
> begun to classify them as Vulnerable or Affected. Vulnerable means directly
> impacted by the vulnerability and exploitable, while Affected means
> that a vulnerable dependency/package/jar is delivered with a product.
> In this case, Spark is delivering these hive jars within the distribution:
>
>
>
> *  183633 Dec 16 23:33 hive-beeline-2.3.9.jar*
>
> *   44704 Dec 16 23:33 hive-cli-2.3.9.jar*
>
> *  436169 Dec 16 23:33 hive-common-2.3.9.jar*
>
> *10840949 Dec 16 23:33 hive-exec-2.3.9-core.jar*
>
> *  116364 Dec 16 23:33 hive-jdbc-2.3.9.jar*
>
>  *326585 Dec 16 23:33 hive-llap-common-2.3.9.jar*
>
> *8195966 Dec 16 23:33 hive-metastore-2.3.9.jar*
>
> * 916630 Dec 16 23:33 hive-serde-2.3.9.jar*
>
> *1679366 Dec 16 23:33 hive-service-rpc-3.1.3.jar*
>
>   *53902 Dec 16 23:33 hive-shims-0.23-2.3.9.jar*
>
>    *8786 Dec 16 23:33 hive-shims-2.3.9.jar*
>
> * 120293 Dec 16 23:33 hive-shims-common-2.3.9.jar*
>
> *  12923 Dec 16 23:33 hive-shims-scheduler-2.3.9.jar*
>
> * 258346 Dec 16 23:33 hive-storage-api-2.8.1.jar*
>
> * 577200 Dec 16 23:33 spark-hive-thriftserver_2.13-3.5.4.jar*
>
> * 735193 Dec 16 23:33 spark-hive_2.13-3.5.4.jar*
>
>
>
> And to extend that further, these outdated Apache Hive dependencies pull
> in other older dependencies:
>
>
>
> *  75567 Dec 16 23:33 jackson-annotations-2.15.2.jar*
>
> *549207 Dec 16 23:33 jackson-core-2.15.2.jar*
>
> *232248 Dec 16 23:33 jackson-core-asl-1.9.13.jar*
>
> *1620088 Dec 16 23:33 jackson-databind-2.15.2.jar*
>
> *  54630 Dec 16 23:33 jackson-dataformat-yaml-2.15.2.jar*
>
> *122937 Dec 16 23:33 jackson-datatype-jsr310-2.15.2.jar*
>
> *780664 Dec 16 23:33 jackson-mapper-asl-1.9.13.jar*
>
> *518681 Dec 16 23:33 jackson-module-scala_2.13-2.15.2.jar*
>
> *  37085 Dec 16 23:33 json4s-jackson_2.13-3.7.0-M11.jar*
>
> *2017388 Dec 16 23:33 parquet-jackson-1.13.1.jar*
>
>
>
> Looking at a dependency like *jackson-mapper-asl-1.9.13.jar*, it
> has a Critical and a High CVE against it. With that said, if a user
> accidentally uses one of these dependencies in their Spark application, will
> the Java CLASSPATH give $SPARK_HOME/jars precedence and in turn expose
> the unknowing end user to a vulnerability that way?
>
>
>
> With all of that said, there is a Jira item, SPARK-30466
> <https://issues.apache.org/jira/browse/SPARK-30466>, to remove a
> dependency like *jackson-mapper-asl-1.9.13*, but it is stuck behind
> SPARK-44114 <https://issues.apache.org/jira/browse/SPARK-44114>, which in
> turn is blocked by HIVE-27508
> <https://issues.apache.org/jira/browse/HIVE-27508>. Does the Apache Spark
> community have enough influence to encourage the Apache Hive team to
> release a Hive 3.1.4, which would resolve both these old CVEs and the
> one Balaji brought up in this thread? This would be especially welcome, as
> the next opportunity to upgrade to Hive 3.x would not occur until Spark 5.x.
>
>
>
> Sincerely,
>
>
>
> *Nicholas T. Marion *
> Senior AI and Analytics Development Lead | IBM zDNN Product Owner
> * Mobile:* 1 845 649 3592
> * E-mail:* nmar...@us.ibm.com
>
> IBM
>
>
>
> *From: *Mich Talebzadeh <mich.talebza...@gmail.com>
> *Date: *Monday, January 27, 2025 at 10:11 AM
> *To: *Sean Owen <sro...@gmail.com>
> *Cc: *Balaji Sudharsanam V <balaji.sudharsa...@ibm.com>,
> dev@spark.apache.org <dev@spark.apache.org>
> *Subject: *[EXTERNAL] Re: Spark 4.0 vulnerable with
> hive-metastore-2.3.x.jar versions
>
> To answer your question, I did not read this CVE, but I am responding
> solely from my previous experience with vulnerabilities and the thread owner's
> implications, having used Spark in conjunction with Hive for many years.
>
>
>
>
>
> Mich Talebzadeh,
>
> Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
>
>
>
>  view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>
>
>
>
>
>
>
> On Mon, 27 Jan 2025 at 15:03, Sean Owen <sro...@gmail.com> wrote:
>
> Mich: did you read the CVE? I'm not clear, as this contains no reference
> to the Hive functionality that is affected, or how it might relate to a
> metastore. Please explain. Otherwise this looks like a generic AI-generated
> response with no particularly relevant content. "In summary"...
>
>
>
> On Mon, Jan 27, 2025 at 8:57 AM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
> I think the thread owner's point is valid. The default use of the Hive
> Metastore by Spark further gives credence to the importance of addressing
> this Hive vulnerability to ensure the security and reliability of Spark
> applications. I use Hive as the default metastore for Spark as well. Spark
> relies heavily on the Hive Metastore for managing critical metadata, such
> as table schemas, data locations, and access control, unless you are using
> a platform like Databricks with a unified catalog. In summary, this
> dependency makes it essential to address any vulnerabilities within the
> Hive Metastore, as they can indirectly impact the security and stability of
> Spark applications, among other things.
>
>
>
> HTH
>
>
>
> Mich Talebzadeh,
>
> Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
>
>
>
>  view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>
>
>
>
>
>
>
> On Mon, 27 Jan 2025 at 13:37, Sean Owen <sro...@gmail.com> wrote:
>
> It looks like that affects Hive, and not the metastore. I do not see that
> it is relevant to Spark at first glance.
>
>
>
>
>
> On Mon, Jan 27, 2025 at 1:21 AM Balaji Sudharsanam V
> <balaji.sudharsa...@ibm.com.invalid> wrote:
>
> Hi All,
>
> There is a vulnerability of ‘High’ severity in the *Apache Spark
> 3.x and 4.0.0 preview (2) releases*, which ship hive-metastore-2.3.x.jar.
> It is described here: Apache Hive security bypass CVE-2021-34538
> Vulnerability Report
> <https://exchange.xforce.ibmcloud.com/vulnerabilities/231404>
>
>
>
> The recommendation is to upgrade to the latest version of Apache Hive
> (*3.1.3, 4.0 or later*), available from the Apache Web site.
>
>
>
> Can we expect this to be fixed in the Apache Spark 4.0 GA?
>
> Thanks,
>
> Balaji
>
>
>
>
