Thanks for your reply! Your comment made me realize that the table I was trying to write to doesn't have any partitions, while I was trying to write into a specific partition:
val mapper: DelimitedRecordHiveMapper = new DelimitedRecordHiveMapper()
    .withColumnFields(new Fields(colNames))
    .withTimeAsPartitionField("YYYY/MM/DD")

Could that be the problem?

Anyway, I tried commenting out the withTimeAsPartitionField call and I am now getting a completely different error, which could well be the actual issue (the complete stack trace is attached):

java.io.IOException: No FileSystem for scheme: hdfs

This makes me think I am bundling the wrong HDFS jar in the application jar I'm building. Still, the bundled version is HDFS 2.6.1, while the version on the cluster is 2.7.3.2.5.5.0-157 (HDP 2.5): shouldn't they be compatible? Any suggestion?

2017-07-10 20:02 GMT+02:00 Eugene Koifman <ekoif...@hortonworks.com>:

> Are you able to write to Hive to an existing partition? (The stack trace
> shows that it’s being created)
>
> *From: *Federico D'Ambrosio <fedex...@gmail.com>
> *Reply-To: *"d...@hive.apache.org" <d...@hive.apache.org>
> *Date: *Monday, July 10, 2017 at 7:38 AM
> *To: *"user@hive.apache.org" <user@hive.apache.org>, "d...@hive.apache.org" <d...@hive.apache.org>
> *Subject: *Non-local session path expected to be non-null trying to write on Hive using storm-hive
>
> Greetings,
>
> I'm trying to get a working dataflow stack on a 6-node cluster (2 masters +
> 4 slaves, no Kerberos) using Kafka (2.10_0.10), Storm (1.0.1) and Hive2
> (1.2.1). Storm is able to communicate with Kafka, but seemingly can't
> operate on Hive (on master-1), even though it manages to connect to its
> metastore.
>
> I originally thought it was a permissions problem on either HDFS or the
> local filesystem, but even though I set 777 permissions on /tmp/hive,
> the issue is still there.
>
> In core-site.xml:
>
> - hadoop.proxyuser.hcat.groups
> - hadoop.proxyuser.hcat.hosts
> - hadoop.proxyuser.hdfs.groups
> - hadoop.proxyuser.hdfs.hosts
> - hadoop.proxyuser.hive.groups
> - hadoop.proxyuser.hive.hosts
> - hadoop.proxyuser.root.groups
> - hadoop.proxyuser.root.hosts
>
> are all set to '*'.
>
> Hive2, as far as I can see, is correctly set up to work with transactions:
> the target table has transactional=true, is stored as ORC and is bucketed.
> In hive-site.xml:
>
> - hive.compactor.worker.threads = 1
> - hive.compactor.initiator.on = true
> - hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
>
> I get a NullPointerException; you may find the stack trace among the
> attached files.
>
> From what I can gather, the NullPointerException is thrown in the
> following method inside SessionState:
>
>   public static Path getHDFSSessionPath(Configuration conf) {
>     SessionState ss = SessionState.get();
>     if (ss == null) {
>       String sessionPathString = conf.get(HDFS_SESSION_PATH_KEY);
>       Preconditions.checkNotNull(sessionPathString, "Conf non-local session path expected to be non-null");
>       return new Path(sessionPathString);
>     }
>     Preconditions.checkNotNull(ss.hdfsSessionPath, "Non-local session path expected to be non-null");
>     return ss.hdfsSessionPath;
>   }
>
> Specifically, by:
>
>   Preconditions.checkNotNull(ss.hdfsSessionPath, "Non-local session path expected to be non-null");
>
> So it seems to be an HDFS-related issue, but I can't understand why it's
> happening.
>
> From what I gather, this occurs when Hive tries to retrieve the local path
> of the session, which is stored in the _hive.local.session.path
> configuration variable. The value of this variable is assigned each time a
> new Hive session is created, and it is formed by joining the path for user
> temporary files (hive.exec.local.scratchdir) with the session ID
> (hive.session.id).
>
> If it is indeed a permissions issue, what should I look into to find the
> origin of the problem?
>
> Thanks for your help,
>
> Federico
>
> --
> Federico D'Ambrosio
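For completeness, this is roughly how the mapper and endpoint options are being built now, with the time-based partitioning left out since air_traffic_test has no partition columns (just a sketch: the column names and batch settings below are placeholders, not the real topology values):

    import org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper
    import org.apache.storm.hive.common.HiveOptions
    import org.apache.storm.tuple.Fields

    // column names of the target table (placeholders here)
    val colNames = java.util.Arrays.asList("field1", "field2")

    // no withTimeAsPartitionField: the table is not partitioned,
    // so partitionVals stays empty, as seen in the log below
    val mapper: DelimitedRecordHiveMapper = new DelimitedRecordHiveMapper()
      .withColumnFields(new Fields(colNames))

    val hiveOptions = new HiveOptions(
        "thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083",
        "data_stream", "air_traffic_test", mapper)
      .withTxnsPerBatch(10)   // placeholder values
      .withBatchSize(1000)

Here is the complete stack trace: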
2017-07-10 20:07:40.212 o.a.s.h.t.HiveState [INFO] Creating Writer to Hive end point : {metaStoreUri='thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083', database='data_stream', table='air_traffic_test', partitionVals=[] }
2017-07-10 20:07:40.262 h.metastore [INFO] Trying to connect to metastore with URI thrift://master-1.localdomain:9083
2017-07-10 20:07:40.273 h.metastore [INFO] Connected to metastore.
2017-07-10 20:07:40.290 h.metastore [INFO] Trying to connect to metastore with URI thrift://master-1.localdomain:9083
2017-07-10 20:07:40.298 h.metastore [INFO] Connected to metastore.
2017-07-10 20:07:40.416 o.a.h.h.s.AbstractRecordWriter [ERROR] Failed creating record updater
java.io.IOException: No FileSystem for scheme: hdfs
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) ~[stormjar.jar:?]
    at org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.<init>(OrcRecordUpdater.java:215) ~[stormjar.jar:?]
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getRecordUpdater(OrcOutputFormat.java:282) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.createRecordUpdater(AbstractRecordWriter.java:137) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.newBatch(AbstractRecordWriter.java:117) [stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.DelimitedInputWriter.newBatch(DelimitedInputWriter.java:47) [stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:506) [stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:458) [stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatchImpl(HiveEndPoint.java:345) [stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatch(HiveEndPoint.java:325) [stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter$6.call(HiveWriter.java:256) [stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter$6.call(HiveWriter.java:253) [stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter$9.call(HiveWriter.java:366) [stormjar.jar:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_77]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_77]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_77]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
2017-07-10 20:07:40.417 o.a.s.h.t.HiveState [ERROR] Failed to create HiveWriter for endpoint: {metaStoreUri='thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083', database='data_stream', table='air_traffic_test', partitionVals=[] }
org.apache.storm.hive.common.HiveWriter$ConnectFailure: Failed connecting to EndPoint {metaStoreUri='thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083', database='data_stream', table='air_traffic_test', partitionVals=[] }
    at org.apache.storm.hive.common.HiveWriter.<init>(HiveWriter.java:80) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveUtils.makeHiveWriter(HiveUtils.java:50) ~[stormjar.jar:?]
    at org.apache.storm.hive.trident.HiveState.getOrCreateWriter(HiveState.java:206) [stormjar.jar:?]
    at org.apache.storm.hive.trident.HiveState.writeTuples(HiveState.java:125) [stormjar.jar:?]
    at org.apache.storm.hive.trident.HiveState.updateState(HiveState.java:112) [stormjar.jar:?]
    at org.apache.storm.hive.trident.HiveUpdater.updateState(HiveUpdater.java:30) [stormjar.jar:?]
    at org.apache.storm.hive.trident.HiveUpdater.updateState(HiveUpdater.java:27) [stormjar.jar:?]
    at org.apache.storm.trident.planner.processor.PartitionPersistProcessor.finishBatch(PartitionPersistProcessor.java:98) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at org.apache.storm.trident.planner.SubtopologyBolt.finishBatch(SubtopologyBolt.java:151) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at org.apache.storm.trident.topology.TridentBoltExecutor.finishBatch(TridentBoltExecutor.java:266) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at org.apache.storm.trident.topology.TridentBoltExecutor.checkFinish(TridentBoltExecutor.java:299) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at org.apache.storm.trident.topology.TridentBoltExecutor.execute(TridentBoltExecutor.java:373) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484) [storm-core-1.0.1.2.5.5.0-157.jar:1.0.1.2.5.5.0-157]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
Caused by: org.apache.storm.hive.common.HiveWriter$TxnBatchFailure: Failed acquiring Transaction Batch from EndPoint: {metaStoreUri='thrift://master-1.localdomain:9083,thrift://master-2.localdomain:9083', database='data_stream', table='air_traffic_test', partitionVals=[] }
    at org.apache.storm.hive.common.HiveWriter.nextTxnBatch(HiveWriter.java:264) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter.<init>(HiveWriter.java:72) ~[stormjar.jar:?]
    ... 21 more
Caused by: org.apache.hive.hcatalog.streaming.StreamingIOFailure: Unable to get new record Updater
    at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.newBatch(AbstractRecordWriter.java:120) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.DelimitedInputWriter.newBatch(DelimitedInputWriter.java:47) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:506) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:458) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatchImpl(HiveEndPoint.java:345) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatch(HiveEndPoint.java:325) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter$6.call(HiveWriter.java:256) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter$6.call(HiveWriter.java:253) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter$9.call(HiveWriter.java:366) ~[stormjar.jar:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_77]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_77]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_77]
    ... 1 more
Caused by: java.io.IOException: No FileSystem for scheme: hdfs
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370) ~[stormjar.jar:?]
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) ~[stormjar.jar:?]
    at org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.<init>(OrcRecordUpdater.java:215) ~[stormjar.jar:?]
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getRecordUpdater(OrcOutputFormat.java:282) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.createRecordUpdater(AbstractRecordWriter.java:137) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.newBatch(AbstractRecordWriter.java:117) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.DelimitedInputWriter.newBatch(DelimitedInputWriter.java:47) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:506) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:458) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatchImpl(HiveEndPoint.java:345) ~[stormjar.jar:?]
    at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatch(HiveEndPoint.java:325) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter$6.call(HiveWriter.java:256) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter$6.call(HiveWriter.java:253) ~[stormjar.jar:?]
    at org.apache.storm.hive.common.HiveWriter$9.call(HiveWriter.java:366) ~[stormjar.jar:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_77]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_77]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_77]
    ... 1 more
2017-07-10 20:07:40.417 o.a.s.h.t.HiveState [WARN] hive streaming failed.
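For reference, from what I've read so far, "No FileSystem for scheme: hdfs" is often a packaging problem rather than a version problem: when the uber jar is assembled, the META-INF/services/org.apache.hadoop.fs.FileSystem descriptors from hadoop-common and hadoop-hdfs can overwrite each other, so DistributedFileSystem never gets registered for the hdfs scheme. If that's the case here, something along these lines in the build should help (just a sketch, assuming the topology jar is built with sbt-assembly; with Maven, the shade plugin's ServicesResourceTransformer plays the same role):

    // build.sbt (sketch, not yet verified on this cluster)

    // make sure the jar actually contains hadoop-hdfs, which provides
    // org.apache.hadoop.hdfs.DistributedFileSystem; 2.7.3 matches the HDP 2.5 Hadoop line
    libraryDependencies += "org.apache.hadoop" % "hadoop-hdfs" % "2.7.3"

    // keep every FileSystem service registration instead of letting one
    // META-INF/services file overwrite the others during assembly
    assemblyMergeStrategy in assembly := {
      case PathList("META-INF", "services", xs @ _*) => MergeStrategy.filterDistinctLines
      case x =>
        val oldStrategy = (assemblyMergeStrategy in assembly).value
        oldStrategy(x)
    }

Does that sound like a plausible cause, or should the 2.6.1 vs 2.7.3 mismatch itself already be enough to break it?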