[ https://issues.apache.org/jira/browse/HIVE-6115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863510#comment-13863510 ]

Sushanth Sowmyan commented on HIVE-6115:
----------------------------------------

There are two purposes served here. One is to check that hbase-default.xml and 
hbase-site.xml are accessible, which HiveHBaseStorageHandler.addHBaseResources 
achieves. The other is to add those as requisite resources for the current 
job, which is achieved by the inner call directly to HBaseConfiguration on the 
jobconf.

From an HCat perspective, if I remember correctly, the second is needed to 
set up and ship the job correctly; otherwise we'd wind up failing with errors 
indicating that we're not able to talk to ZooKeeper or the master.

Per your contention, the problem is that if you do have a local override 
hbase-site.xml, it still winds up pulling in a default 
hbase-default.xml/hbase-site.xml and thus fails? I'm a little confused as to 
how this might be a problem, since when those resources are added, they're 
added by name, without any associated path, and thus would need to be 
resolvable on the classpath anyway.
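
To check which copies of those files a name-only lookup can actually see, 
something like the following rough sketch could be run with the job's 
classpath. It uses only the JDK; the class name ClasspathResourceProbe and the 
method locate are made up for illustration and are not part of Hive or HBase. 
I'm assuming here that name-based resources get resolved through the context 
class loader, which is worth verifying against the Hadoop version in use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic helper; not part of Hive, HBase, or Hadoop.
class ClasspathResourceProbe {

  /** Returns every classpath location that provides the named resource. */
  static List<URL> locate(String name) {
    ClassLoader cl = Thread.currentThread().getContextClassLoader();
    if (cl == null) {
      cl = ClassLoader.getSystemClassLoader();
    }
    List<URL> found = new ArrayList<>();
    try {
      Enumeration<URL> urls = cl.getResources(name);
      while (urls.hasMoreElements()) {
        found.add(urls.nextElement());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return found;
  }

  public static void main(String[] args) {
    for (String name : new String[] {"hbase-default.xml", "hbase-site.xml"}) {
      List<URL> hits = locate(name);
      if (hits.isEmpty()) {
        System.out.println(name + ": not found on the classpath");
      }
      // More than one hit means several copies compete for the same name.
      for (URL hit : hits) {
        System.out.println(name + ": " + hit);
      }
    }
  }
}
```

If a local override and a bundled default both show up for hbase-site.xml, 
that would at least confirm that two copies are competing for the same name.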

Or perhaps I was barking up the wrong tree with that interpretation, and the 
problem is that the update semantics that 
HiveHBaseStorageHandler.addHBaseResources takes care of are being abused, so we 
wind up nuking other conf values by replacing them, rather than strictly 
setting only those values that do not already exist. In which case it makes 
sense to have a segment there which goes something like this:

{code}
tundra:hive sush$ git diff hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
diff --git a/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java b/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
index fc63970..d76abe8 100644
--- a/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
+++ b/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
@@ -333,7 +333,11 @@ public void configureTableJobProperties(
     // check to see if this an input job or an outputjob
     if (this.configureInputJobProps) {
       try {
-        HBaseConfiguration.addHbaseResources(jobConf);
+        for (String k : jobProperties.keySet()){
+          jobConf.set(k, jobProperties.get(k));
+        }
+        jobConf.addResource("hbase-default.xml");
+        jobConf.addResource("hbase-site.xml");
         addHBaseDelegationToken(jobConf);
       }//try
       catch (IOException e) {
{code}

This, then, would be functionally equivalent and satisfy the need for those 
resources to be present, and not pollute jobconf with the rest of the 
parameters?

This would then, however, force visibility of HBase's internals out onto 
here, and it looks hacky. Which parameters get overridden by HBase's resource 
import that should not be overridden? This might be something to fix on 
HBaseConfiguration.addHbaseResources' end instead, then.
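
To make the replace-versus-update distinction concrete, here is a minimal 
sketch of the two merge behaviours. Plain java.util maps stand in for the 
jobconf and for the values pulled in from HBase's resources, so this does not 
model Hadoop's actual Configuration precedence rules. The class and method 
names (ConfMergeSketch, mergeReplacing, mergeIfAbsent) and the sample values 
are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: plain maps stand in for JobConf and HBase's defaults.
class ConfMergeSketch {

  /** Replace semantics: every incoming entry wins and clobbers existing values. */
  static void mergeReplacing(Map<String, String> target, Map<String, String> incoming) {
    target.putAll(incoming);
  }

  /** Update-if-absent semantics: values already present in target are preserved. */
  static void mergeIfAbsent(Map<String, String> target, Map<String, String> incoming) {
    for (Map.Entry<String, String> e : incoming.entrySet()) {
      target.putIfAbsent(e.getKey(), e.getValue());
    }
  }

  /** A job-level setting that should survive the merge (sample value is made up). */
  static Map<String, String> sampleJobConf() {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("hbase.zookeeper.quorum", "zk1.example.com");
    return conf;
  }

  /** Stand-in for what the resource import brings along (sample values are made up). */
  static Map<String, String> sampleHBaseDefaults() {
    Map<String, String> defaults = new LinkedHashMap<>();
    defaults.put("hbase.zookeeper.quorum", "localhost");
    defaults.put("hbase.client.retries.number", "35");
    return defaults;
  }

  public static void main(String[] args) {
    Map<String, String> replaced = sampleJobConf();
    mergeReplacing(replaced, sampleHBaseDefaults());
    System.out.println("replacing: " + replaced);

    Map<String, String> updated = sampleJobConf();
    mergeIfAbsent(updated, sampleHBaseDefaults());
    System.out.println("if absent: " + updated);
  }
}
```

With replace semantics the job's quorum setting is lost to the default; with 
update-if-absent it survives while the missing retries setting is still filled 
in. Which of the two the real code paths exhibit is exactly the open question.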

> Remove redundant code in HiveHBaseStorageHandler
> ------------------------------------------------
>
>                 Key: HIVE-6115
>                 URL: https://issues.apache.org/jira/browse/HIVE-6115
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 0.12.0
>            Reporter: Brock Noland
>            Assignee: Brock Noland
>         Attachments: HIVE-6115.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
