[ 
https://issues.apache.org/jira/browse/HIVE-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422589#comment-13422589
 ] 

Vikram Dixit K commented on HIVE-3025:
--------------------------------------

After digging more into this with @hashutosh's help, we see the following 
issues:

1. The hadoop archive command line has changed.
2. There is no way in the current set of commands supported by hive for a user 
to specify a parent directory for the archive.
3. The createHadoopArchive api has the same implementation in every shim, which 
is counter-intuitive.

The hadoop archive command line changed between version 0.20 and versions 
0.20S/1.0/0.23. The later versions require a mandatory command line parameter, 
-p. Because Hive issues the same command line on these versions as it does on 
0.20 (without the -p), the archive command fails there. This needs to be fixed 
in the createHadoopArchive api.
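To make the difference concrete, here is a minimal sketch of the two argument 
lists. HarArgsSketch and its method names are hypothetical and are not taken 
from the Hive source; the sketch only illustrates the 
"-archiveName NAME <src> <dest>" form used with 0.20 and the 
"-archiveName NAME -p <parent> <src>* <dest>" form expected by the later 
versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not Hive code): builds the argument list that would be
// handed to "hadoop archive" for each command line form.
class HarArgsSketch {

    // Hadoop 0.20 form: -archiveName NAME <src> <dest>
    static List<String> legacyArgs(String archiveName, String srcDir, String destDir) {
        return new ArrayList<>(Arrays.asList("-archiveName", archiveName, srcDir, destDir));
    }

    // 0.20S/1.0/0.23 form: -archiveName NAME -p <parent> <src>* <dest>
    // Each source path is given relative to the parent directory.
    static List<String> parentDirArgs(String archiveName, String parentDir,
                                      List<String> relativeSrcs, String destDir) {
        List<String> args = new ArrayList<>(
            Arrays.asList("-archiveName", archiveName, "-p", parentDir));
        args.addAll(relativeSrcs);
        args.add(destDir);
        return args;
    }
}
```

The only structural difference is the "-p <parent>" pair and the fact that the 
sources become relative paths, which is why reusing the 0.20 argument list on 
the later versions is rejected.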

The createHadoopArchive api also has the issue that it checks 
hive.archive.har.parentdir.settable. With the current set of commands, however, 
the user has no way of setting a parent directory for the creation of the 
archive. So, when that ability is added in the future, we will need to revisit 
the createHadoopArchive api itself or derive the parent directory from conf.

The createHadoopArchive api is the same across all the shims, i.e. 
Hadoop20Shims.java and HadoopShimsSecure.java have exactly the same 
implementation of this api. That is counter-intuitive, considering the shims 
are supposed to be specific to versions of hadoop.

So I propose that, for now, we fix createHadoopArchive in HadoopShimsSecure to 
adhere to the new command line expected by those versions of Hadoop. We should 
also change createHadoopArchive in Hadoop20Shims so that it does not deal with 
the -p parameter, since 0.20 cannot use it.
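A rough sketch of what the proposed split could look like is below. The 
interface and class names are hypothetical stand-ins for the shim classes, the 
signature is simplified and does not match the real createHadoopArchive, and 
using the source directory itself as the -p parent is an assumption made for 
illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the shim interface; the real signature differs.
interface ArchiveShimSketch {
    List<String> archiveArgs(String archiveName, String sourceDir, String destDir);
}

// Stand-in for Hadoop20Shims: never emits -p, since 0.20 does not accept it.
class LegacyArchiveShimSketch implements ArchiveShimSketch {
    @Override
    public List<String> archiveArgs(String archiveName, String sourceDir, String destDir) {
        return Arrays.asList("-archiveName", archiveName, sourceDir, destDir);
    }
}

// Stand-in for HadoopShimsSecure: always emits -p.
// Assumption: the source directory is passed as the parent directory.
class SecureArchiveShimSketch implements ArchiveShimSketch {
    @Override
    public List<String> archiveArgs(String archiveName, String sourceDir, String destDir) {
        return Arrays.asList("-archiveName", archiveName, "-p", sourceDir, destDir);
    }
}
```

With this shape, neither implementation needs to consult 
hive.archive.har.parentdir.settable; the choice of command line follows from 
which shim is loaded for the running Hadoop version.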

Please let me know if I am missing something.
                
> Fix Hive ARCHIVE command on 0.22 and 0.23
> -----------------------------------------
>
>                 Key: HIVE-3025
>                 URL: https://issues.apache.org/jira/browse/HIVE-3025
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.9.0
>            Reporter: Carl Steinbach
>            Assignee: Carl Steinbach
>         Attachments: HIVE-3025.D3195.1.patch
>
>
> archive.q and archive_multi.q fail when Hive is run on top of Hadoop 0.22 or 
> 0.23.

