I am not entirely sure how Apache Hive is structured, but this CVE refers to HIVE-25468<https://issues.apache.org/jira/browse/HIVE-25468> and Backport of HIVE-25468 into Hive 3.x<https://github.com/apache/hive/pull/4309/files>, which indicates that the Hive metastore-server and standalone-metastore modules were updated for the fix.
When looking at vulnerabilities, many security teams, including ours, have begun to classify them as either Vulnerable or Affected: Vulnerable means the product is directly impacted by the vulnerability and exploitable, while Affected means a vulnerable dependency/package/jar is delivered with the product. In this case, Spark delivers these Hive jars within the distribution:

  183633 Dec 16 23:33 hive-beeline-2.3.9.jar
   44704 Dec 16 23:33 hive-cli-2.3.9.jar
  436169 Dec 16 23:33 hive-common-2.3.9.jar
10840949 Dec 16 23:33 hive-exec-2.3.9-core.jar
  116364 Dec 16 23:33 hive-jdbc-2.3.9.jar
  326585 Dec 16 23:33 hive-llap-common-2.3.9.jar
 8195966 Dec 16 23:33 hive-metastore-2.3.9.jar
  916630 Dec 16 23:33 hive-serde-2.3.9.jar
 1679366 Dec 16 23:33 hive-service-rpc-3.1.3.jar
   53902 Dec 16 23:33 hive-shims-0.23-2.3.9.jar
    8786 Dec 16 23:33 hive-shims-2.3.9.jar
  120293 Dec 16 23:33 hive-shims-common-2.3.9.jar
   12923 Dec 16 23:33 hive-shims-scheduler-2.3.9.jar
  258346 Dec 16 23:33 hive-storage-api-2.8.1.jar
  577200 Dec 16 23:33 spark-hive-thriftserver_2.13-3.5.4.jar
  735193 Dec 16 23:33 spark-hive_2.13-3.5.4.jar

To extend that further, these outdated Apache Hive dependencies pull in other older dependencies:

   75567 Dec 16 23:33 jackson-annotations-2.15.2.jar
  549207 Dec 16 23:33 jackson-core-2.15.2.jar
  232248 Dec 16 23:33 jackson-core-asl-1.9.13.jar
 1620088 Dec 16 23:33 jackson-databind-2.15.2.jar
   54630 Dec 16 23:33 jackson-dataformat-yaml-2.15.2.jar
  122937 Dec 16 23:33 jackson-datatype-jsr310-2.15.2.jar
  780664 Dec 16 23:33 jackson-mapper-asl-1.9.13.jar
  518681 Dec 16 23:33 jackson-module-scala_2.13-2.15.2.jar
   37085 Dec 16 23:33 json4s-jackson_2.13-3.7.0-M11.jar
 2017388 Dec 16 23:33 parquet-jackson-1.13.1.jar

Looking at a dependency like jackson-mapper-asl-1.9.13.jar, it has a Critical and a High CVE against it. With that said, if a user accidentally uses one of these dependencies in their Spark application, will the Java classpath give the jars under $SPARK_HOME/jars precedence and in turn expose the unsuspecting end user to a vulnerability that way?
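To make that question concrete, here is a minimal, purely illustrative sketch of how an application could check which copy of a Jackson 1.x class actually wins on the classpath, and of the experimental userClassPathFirst options that flip the precedence. The object name is made up for the example; the class name is one shipped in jackson-mapper-asl.

  // Illustrative sketch only. Prints where a Jackson 1.x class is loaded from,
  // to see whether the copy under $SPARK_HOME/jars took precedence over the
  // application's own jars.
  import org.apache.spark.sql.SparkSession

  object ClasspathPrecedenceCheck {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("classpath-precedence-check")
        // Experimental option asking executors to prefer user-supplied jars.
        // The driver-side equivalent, spark.driver.userClassPathFirst, has to be
        // set before the driver JVM starts (spark-submit --conf or spark-defaults.conf).
        .config("spark.executor.userClassPathFirst", "true")
        .getOrCreate()

      // jackson-mapper-asl-1.9.13.jar provides org.codehaus.jackson.map.ObjectMapper;
      // printing its code source shows which jar supplied it on the driver.
      val clazz = Class.forName("org.codehaus.jackson.map.ObjectMapper")
      println(clazz.getProtectionDomain.getCodeSource.getLocation)

      spark.stop()
    }
  }

With those options left at their defaults, Spark's own classpath, including everything under $SPARK_HOME/jars, takes precedence over user-added jars, which is exactly what prompts the question above.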
With all of that said, there is a Jira item SPARK-30466<https://issues.apache.org/jira/browse/SPARK-30466> to remove a dependency like jackson-mapper-asl-1.9.13, but it is stuck behind SPARK-44114<https://issues.apache.org/jira/browse/SPARK-44114>, which in turn is blocked by HIVE-27508<https://issues.apache.org/jira/browse/HIVE-27508>. Does the Apache Spark community have enough influence to encourage the Apache Hive team to release a Hive 3.1.4, which would address both these old CVEs and the one Balaji brought up in this thread? This would be especially welcome, as the next realistic opportunity to upgrade to Hive 3.x would not come until Spark 5.x.

Sincerely,

Nicholas T. Marion
Senior AI and Analytics Development Lead | IBM zDNN Product Owner
Mobile: 1 845 649 3592  E-mail: nmar...@us.ibm.com <mailto:nmar...@us.ibm.com>
IBM

From: Mich Talebzadeh <mich.talebza...@gmail.com>
Date: Monday, January 27, 2025 at 10:11 AM
To: Sean Owen <sro...@gmail.com>
Cc: Balaji Sudharsanam V <balaji.sudharsa...@ibm.com>, dev@spark.apache.org <dev@spark.apache.org>
Subject: [EXTERNAL] Re: Spark 4.0 vulnerable with hive-metastore-2.3.x.jar versions

To answer your question, I did not read this CVE, but I am responding solely from my previous experience with vulnerabilities and the implications raised by the thread owner, having used Hive in conjunction with Spark for many years.

Mich Talebzadeh, Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
view my Linkedin profile<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

On Mon, 27 Jan 2025 at 15:03, Sean Owen <sro...@gmail.com<mailto:sro...@gmail.com>> wrote:

Mich: did you read the CVE? I'm not clear, as this contains no reference to the Hive functionality that is affected, or how it might relate to a metastore. Please explain. Otherwise this looks like a generic AI-generated response with no particularly relevant content. "In summary"...

On Mon, Jan 27, 2025 at 8:57 AM Mich Talebzadeh <mich.talebza...@gmail.com<mailto:mich.talebza...@gmail.com>> wrote:

I think the thread owner's point is valid. The default use of the Hive Metastore by Spark further underlines the importance of addressing this Hive vulnerability to ensure the security and reliability of Spark applications. I use Hive as the default metastore for Spark as well. Spark relies heavily on the Hive Metastore for managing critical metadata, such as table schemas, data locations, and access control, unless you are using a platform like Databricks with a unified catalog. In summary, this dependency makes it essential to address any vulnerabilities within the Hive Metastore, as they can indirectly impact the security and stability of Spark applications, among other things.

HTH

Mich Talebzadeh, Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
view my Linkedin profile<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

On Mon, 27 Jan 2025 at 13:37, Sean Owen <sro...@gmail.com<mailto:sro...@gmail.com>> wrote:

It looks like that affects Hive, and not the metastore. I do not see that it is relevant to Spark at first glance.

On Mon, Jan 27, 2025 at 1:21 AM Balaji Sudharsanam V <balaji.sudharsa...@ibm.com.invalid> wrote:

Hi All,

There is a vulnerability with ‘High’ severity found in the Apache Spark 3.x and 4.0.0 preview (2) releases, with the hive-metastore-2.3.x.jar. It is described here: Apache Hive security bypass CVE-2021-34538 Vulnerability Report<https://exchange.xforce.ibmcloud.com/vulnerabilities/231404>. The recommendation is to upgrade to the latest version of Apache Hive (3.1.3, 4.0 or later), available from the Apache web site. Can we expect this to be fixed in the Apache Spark 4.0 GA?

Thanks,
Balaji
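One practical aside on the upgrade recommendation and the metastore dependency discussed above: a Spark application that talks to an external Hive metastore can already be pointed at a newer metastore client without waiting for the builtin jars to change. A minimal sketch, assuming the documented spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars options; the object name, the 3.1.3 version, and the use of Maven resolution are illustrative choices, and this affects only the metastore client, not the other bundled Hive jars.

  // Illustrative sketch only: use a Hive 3.1.3 metastore client resolved from
  // Maven instead of the builtin client backed by hive-metastore-2.3.9.jar.
  import org.apache.spark.sql.SparkSession

  object NewerMetastoreClient {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("newer-hive-metastore-client")
        .config("spark.sql.hive.metastore.version", "3.1.3")
        .config("spark.sql.hive.metastore.jars", "maven")
        .enableHiveSupport()
        .getOrCreate()

      // Catalog operations such as listing tables now go through the newer
      // metastore client.
      spark.sql("SHOW TABLES").show()

      spark.stop()
    }
  }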