I looked back into this today. Last week I made some changes to the application so that it would be compatible with Spark 1.5.2 while remaining backwards compatible with Spark 1.4.1 (the version our current deployment uses). The changes mostly involved moving dependencies from compile to provided scope and removing dependencies that conflict with what's bundled in the Spark assembly JAR, particularly the Scala and SLF4J libraries. With those changes, the application now works fine with Spark 1.6.0; the NPE no longer occurs, so no patch is necessary. Unfortunately, that means I won't be able to help determine the root cause, as I can no longer reproduce the issue.
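For reference, the dependency changes looked roughly like the following (a minimal SBT-style sketch; the module list, versions, and exclusions are illustrative rather than our exact build, and the equivalent scope/exclusion edits apply if the build is Maven):

    // Spark modules move to "provided" scope so the cluster's Spark assembly JAR
    // supplies them at runtime instead of bundling them with the application.
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.6.0" % "provided",
      "org.apache.spark" %% "spark-sql"  % "1.6.0" % "provided",
      "org.apache.spark" %% "spark-hive" % "1.6.0" % "provided",
      // Hypothetical application dependency, shown only to illustrate excluding
      // the Scala and SLF4J artifacts that conflict with what Spark bundles.
      ("com.example" % "some-library" % "1.0.0")
        .exclude("org.scala-lang", "scala-library")
        .exclude("org.slf4j", "slf4j-log4j12")
    )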
Thanks for your help.

From: Ted Yu <yuzhih...@gmail.com>
Date: Friday, February 5, 2016 at 5:40 PM
To: Jay Shipper <shipper_...@bah.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: [External] Re: Spark 1.6.0 HiveContext NPE

Was there any other exception(s) in the client log? Just want to find the cause for this NPE.

Thanks

On Wed, Feb 3, 2016 at 8:33 AM, Shipper, Jay [USA] <shipper_...@bah.com> wrote:

I'm upgrading an application from Spark 1.4.1 to Spark 1.6.0, and I'm getting a NullPointerException from HiveContext. It happens while the application tries to load some tables via JDBC from an external database (not Hive), using context.read().jdbc():

—
java.lang.NullPointerException
    at org.apache.spark.sql.hive.client.ClientWrapper.conf(ClientWrapper.scala:205)
    at org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:552)
    at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:551)
    at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:538)
    at org.apache.spark.sql.hive.HiveContext$$anonfun$configure$1.apply(HiveContext.scala:537)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at org.apache.spark.sql.hive.HiveContext.configure(HiveContext.scala:537)
    at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:250)
    at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
    at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:457)
    at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:457)
    at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:456)
    at org.apache.spark.sql.hive.HiveContext$$anon$3.<init>(HiveContext.scala:473)
    at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:473)
    at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:472)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
    at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
    at org.apache.spark.sql.SQLContext.baseRelationToDataFrame(SQLContext.scala:442)
    at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:223)
    at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
—

Even though the application does not use Hive, HiveContext is used instead of SQLContext for the additional functionality it provides. There's no hive-site.xml for the application, but this did not cause an issue with Spark 1.4.1. Does anyone have an idea what changed from 1.4.1 to 1.6.0 that could explain this NPE? The only obvious change I've noticed in HiveContext is that the default warehouse location is different (1.4.1: current directory; 1.6.0: /user/hive/warehouse), but I verified that this NPE happens even when /user/hive/warehouse exists and is readable/writable for the application.
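To be concrete, the failing call is essentially the following (a Scala sketch; the JDBC URL, table name, and credentials below are placeholders, not the application's actual values):

    import java.util.Properties
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("jdbc-load"))
    // HiveContext rather than SQLContext, purely for the extra functionality;
    // no Hive metastore or hive-site.xml is configured.
    val context = new HiveContext(sc)

    val props = new Properties()
    props.setProperty("user", "placeholder_user")
    props.setProperty("password", "placeholder_password")

    // The NPE is thrown here, while HiveContext lazily initializes its internal
    // Hive client (ClientWrapper/metadataHive) during analysis of the DataFrame.
    val df = context.read.jdbc("jdbc:postgresql://dbhost:5432/appdb", "some_table", props)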
In terms of changes made to the application to work with Spark 1.6.0, the only one that might be relevant to this issue is the upgrade of the Hadoop dependencies to match what Spark 1.6.0 uses (2.6.0-cdh5.7.0-SNAPSHOT).
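That change amounts to roughly the following (again an SBT-style sketch; the hadoop-client artifact and the Cloudera repository URL are assumptions about the build rather than confirmed details):

    // Pin the Hadoop client libraries to the CDH version that matches the
    // cluster's Spark 1.6.0 build; "provided" because the cluster supplies Hadoop.
    resolvers += "cloudera" at "https://repository.cloudera.com/artifactory/cloudera-repos/"
    libraryDependencies +=
      "org.apache.hadoop" % "hadoop-client" % "2.6.0-cdh5.7.0-SNAPSHOT" % "provided"

Thanks,
Jay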