Greetings, I'm trying to get a working dataflow stack on a 6-node cluster (2 masters + 4 slaves, no Kerberos) using Kafka (2.10_0.10), Storm (1.0.1) and Hive2 (1.2.1). Storm is able to communicate with Kafka, but seemingly can't operate on Hive (on master-1), even though it manages to connect to its metastore.
I originally thought it was a permissions problem on either HDFS or the local filesystem, but the issue persists even after setting 777 permissions on /tmp/hive.

In core-site.xml, the following are all set to '*':

- hadoop.proxyuser.hcat.group
- hadoop.proxyuser.hcat.hosts
- hadoop.proxyuser.hdfs.groups
- hadoop.proxyuser.hdfs.hosts
- hadoop.proxyuser.hive.groups
- hadoop.proxyuser.hive.hosts
- hadoop.proxyuser.root.groups
- hadoop.proxyuser.root.hosts

As far as I can see, Hive2 is correctly set up to work with transactions: the target table has transactional=true, is stored as ORC and is bucketed. In hive-site.xml:

- hive.compactor.worker.threads = 1
- hive.compactor.initiator.on = true
- hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager

I get a NullPointerException; you can find the stack trace among the attached files.

From what I can gather, the NullPointerException is thrown in the following method inside SessionState:

    public static Path getHDFSSessionPath(Configuration conf) {
      SessionState ss = SessionState.get();
      if (ss == null) {
        // No current SessionState: fall back to the path stored in conf.
        String sessionPathString = conf.get(HDFS_SESSION_PATH_KEY);
        Preconditions.checkNotNull(sessionPathString, "Conf non-local session path expected to be non-null");
        return new Path(sessionPathString);
      }
      // A SessionState exists, so its HDFS session path must already be set.
      Preconditions.checkNotNull(ss.hdfsSessionPath, "Non-local session path expected to be non-null");
      return ss.hdfsSessionPath;
    }

Specifically, by:

    Preconditions.checkNotNull(ss.hdfsSessionPath, "Non-local session path expected to be non-null");

So it seems to be an HDFS-related issue, but I can't understand why it's happening. From what I gather, this occurs when Hive tries to retrieve the local path of the session, which is stored in the _hive.local.session.path configuration variable. The value of this variable is assigned each time a new Hive session is created, and it is formed by joining the path for user temporary files (hive.exec.local.scratchdir) with the session ID (hive.session.id).
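To make the two branches of the quoted method easier to compare with the error message, here is a minimal stand-alone sketch. It does not use Hive at all: FakeSessionState, the CURRENT holder, the Map-based conf and the key value are hypothetical stand-ins, and java.util.Objects.requireNonNull is swapped in for Guava's Preconditions.checkNotNull (both throw a NullPointerException carrying the given message). It also assumes that SessionState.get() returns a per-thread object, which the post does not state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Stand-alone sketch; none of these types are Hive's own classes.
class SessionPathSketch {

    // Placeholder key value: the real value of HDFS_SESSION_PATH_KEY is not shown in the post.
    static final String HDFS_SESSION_PATH_KEY = "example.hdfs.session.path";

    // Hypothetical stand-in for SessionState; the path stays null until assigned.
    static final class FakeSessionState {
        String hdfsSessionPath;
    }

    // Assumption: the current session is tracked per thread.
    static final ThreadLocal<FakeSessionState> CURRENT = new ThreadLocal<>();

    // Mirrors the control flow of the quoted getHDFSSessionPath method.
    static String getHdfsSessionPath(Map<String, String> conf) {
        FakeSessionState ss = CURRENT.get();
        if (ss == null) {
            // Branch 1: no session on this thread, read the path from conf.
            return Objects.requireNonNull(conf.get(HDFS_SESSION_PATH_KEY),
                    "Conf non-local session path expected to be non-null");
        }
        // Branch 2: a session exists, so its path must already have been set.
        return Objects.requireNonNull(ss.hdfsSessionPath,
                "Non-local session path expected to be non-null");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(HDFS_SESSION_PATH_KEY, "/tmp/hive/example-session");

        // No session registered: the value comes from conf.
        System.out.println(getHdfsSessionPath(conf));

        // Session registered but its path was never assigned: branch 2 throws.
        CURRENT.set(new FakeSessionState());
        try {
            getHdfsSessionPath(conf);
        } catch (NullPointerException e) {
            System.out.println("NPE: " + e.getMessage());
        } finally {
            CURRENT.remove();
        }
    }
}
```

In this sketch, the message without the "Conf" prefix can only come from the second branch, that is, when a session object is present but its path field is still null. Whether the same holds in the real deployment depends on how the session is created there, which the sketch does not model.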
If it is indeed a permissions issue, what should I look into to find its origin?

Thanks for your help,
Federico
2017-07-08 10:02:36.896 o.a.s.h.t.HiveState [INFO] Creating Writer to Hive end point : {metaStoreUri='thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083', database='data_stream', table='air_traffic_test', partitionVals=[2017/07/189] }
2017-07-08 10:02:36.911 h.metastore [INFO] Trying to connect to metastore with URI thrift://master-1.localdomain:9083
2017-07-08 10:02:36.912 h.metastore [INFO] Connected to metastore.
2017-07-08 10:02:36.923 o.a.h.h.q.l.PerfLogger [INFO] <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
2017-07-08 10:02:36.923 o.a.h.h.q.l.PerfLogger [INFO] <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
2017-07-08 10:02:36.923 o.a.h.h.q.l.PerfLogger [INFO] <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
2017-07-08 10:02:36.923 STDIO [ERROR] FAILED: NullPointerException Non-local session path expected to be non-null
2017-07-08 10:02:36.923 o.a.h.h.q.Driver [ERROR] FAILED: NullPointerException Non-local session path expected to be non-null
java.lang.NullPointerException: Non-local session path expected to be non-null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:229)
    at org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:590)
    at org.apache.hadoop.hive.ql.Context.<init>(Context.java:129)
    at org.apache.hadoop.hive.ql.Context.<init>(Context.java:116)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:382)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:303)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1067)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1129)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:994)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.runDDL(HiveEndPoint.java:404)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.createPartitionIfNotExists(HiveEndPoint.java:369)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:276)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.<init>(HiveEndPoint.java:243)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnectionImpl(HiveEndPoint.java:180)
    at org.apache.hive.hcatalog.streaming.HiveEndPoint.newConnection(HiveEndPoint.java:157)
    at org.apache.storm.hive.common.HiveWriter$5.call(HiveWriter.java:238)
    at org.apache.storm.hive.common.HiveWriter$5.call(HiveWriter.java:235)
    at org.apache.storm.hive.common.HiveWriter$9.call(HiveWriter.java:366)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2017-07-08 10:02:36.923 o.a.h.h.q.l.PerfLogger [INFO] </PERFLOG method=compile start=1499504556923 end=1499504556923 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2017-07-08 10:02:36.924 o.a.s.h.t.HiveState [WARN] hive streaming failed.
java.lang.NullPointerException