[ https://issues.apache.org/jira/browse/HIVE-6115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863510#comment-13863510 ]
Sushanth Sowmyan commented on HIVE-6115:
----------------------------------------

There are two purposes served here. One is to check that hbase-default.xml and hbase-site.xml are accessible, which HiveHBaseStorageHandler.addHBaseResources achieves. The other is to add those as requisite resources for the current job, which is achieved by the inner call directly to HBaseConfiguration on the jobconf.

From an HCat perspective, if I remember correctly, the second is needed to set up and ship the job correctly; otherwise we'd wind up failing with errors indicating that we are not able to talk to zookeeper or the master.

Per your contention, the problem is that if you do have a local override hbase-site.xml, it still winds up pulling in a default hbase-default.xml/hbase-site.xml and thus fails? I'm a little confused as to how this might be a problem, since when those resources are added, they're added by name, without any associated path, and thus would need to be present as resolved in the classpath anyway.

Or I was barking up the wrong tree with that interpretation, and the problem is that the update semantic that HiveHBaseStorageHandler.addHBaseResources takes care of is abused, and we wind up nuking other conf values by replacing, rather than strictly updating only for values where the values do not exist.
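
To make the difference between replacing and updating only where no value exists concrete, here is a minimal sketch. It uses plain java.util maps as stand-ins for the job conf and for the values an imported resource would bring in, not Hadoop's Configuration class. The class name, method names, and sample values are hypothetical and exist only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: plain maps model the job conf and the imported resource values.
final class ConfMergeSketch {

    // Replace semantics: every incoming key overwrites whatever the job conf already holds.
    static Map<String, String> mergeReplacing(Map<String, String> jobConf,
                                              Map<String, String> incoming) {
        Map<String, String> merged = new LinkedHashMap<>(jobConf);
        merged.putAll(incoming);
        return merged;
    }

    // Update-if-absent semantics: incoming values only fill keys the job conf does not set.
    static Map<String, String> mergeIfAbsent(Map<String, String> jobConf,
                                             Map<String, String> incoming) {
        Map<String, String> merged = new LinkedHashMap<>(jobConf);
        for (Map.Entry<String, String> e : incoming.entrySet()) {
            merged.putIfAbsent(e.getKey(), e.getValue());
        }
        return merged;
    }

    public static void main(String[] args) {
        // Illustrative values only; not taken from any real cluster or resource file.
        Map<String, String> jobConf = new LinkedHashMap<>();
        jobConf.put("hbase.zookeeper.quorum", "zk1.example.com");

        Map<String, String> imported = new LinkedHashMap<>();
        imported.put("hbase.zookeeper.quorum", "localhost");
        imported.put("hbase.zookeeper.property.clientPort", "2181");

        System.out.println("replacing: " + mergeReplacing(jobConf, imported));
        System.out.println("if absent: " + mergeIfAbsent(jobConf, imported));
    }
}
```

With replace semantics the locally set quorum is lost, while the update-if-absent variant keeps it and only picks up the client port.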

In which case it makes sense to have a segment there which goes something like this:

{code}
tundra:hive sush$ git diff hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
diff --git a/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java b/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
index fc63970..d76abe8 100644
--- a/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
+++ b/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
@@ -333,7 +333,11 @@ public void configureTableJobProperties(
     // check to see if this an input job or an outputjob
     if (this.configureInputJobProps) {
       try {
-        HBaseConfiguration.addHbaseResources(jobConf);
+        for (String k : jobProperties.keySet()){
+          jobConf.set(k, jobProperties.get(k));
+        }
+        jobConf.addResource("hbase-default.xml");
+        jobConf.addResource("hbase-site.xml");
         addHBaseDelegationToken(jobConf);
       }//try
       catch (IOException e) {
{code}

This, then, would be functionally equivalent and satisfy the need for those resources to be present, and not pollute jobconf with the rest of the parameters? This would then, however, be forcing visibility of hbase's internals out onto here, and looks hacky.

What parameters get overridden by hbase's resource import that should not be overridden? This might be something to fix on HBaseConfiguration.addHbaseResources' end instead, then.

> Remove redundant code in HiveHBaseStorageHandler
> ------------------------------------------------
>
>                 Key: HIVE-6115
>                 URL: https://issues.apache.org/jira/browse/HIVE-6115
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 0.12.0
>            Reporter: Brock Noland
>            Assignee: Brock Noland
>         Attachments: HIVE-6115.patch
>
>

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)