[jira] Commented: (HIVE-1813) Hive should be able to run on multiple data centers

Namit Jain (JIRA) Mon, 29 Nov 2010 22:59:39 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965126#action_12965126
 ]


Namit Jain commented on HIVE-1813:
----------------------------------

The data can be copied from one DFS to another using distcp; later on, a 
wrapper for this can be developed in Hive.
Something like:

alter table <T> partition <P> copy <src> to <dst>;
alter table <T> partition <P> move <src> to <dst>;
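
As a rough sketch of what such a wrapper might run under the hood, the snippet below only builds the command lines for a copy and a move; it does not execute anything. The function names, the example URIs, and the choice to treat a move as copy-then-delete are assumptions for illustration. Only distcp itself comes from the comment above.

```python
from typing import List


def copy_commands(src: str, dst: str) -> List[List[str]]:
    """Commands for a hypothetical 'copy <src> to <dst>': one distcp run."""
    return [["hadoop", "distcp", src, dst]]


def move_commands(src: str, dst: str) -> List[List[str]]:
    """Commands for a hypothetical 'move <src> to <dst>'.

    Assumption: a move is a copy followed by a recursive delete of the
    source. '-rm -r' is the current Hadoop shell form; older releases
    used '-rmr' instead.
    """
    return copy_commands(src, dst) + [["hadoop", "fs", "-rm", "-r", src]]


if __name__ == "__main__":
    # Hypothetical namenode addresses and partition path.
    src = "hdfs://dc1-nn:8020/warehouse/t/ds=2010-11-29"
    dst = "hdfs://dc2-nn:8020/warehouse/t/ds=2010-11-29"
    for cmd in move_commands(src, dst):
        print(" ".join(cmd))
```

A real wrapper would also need to update the partition location in the metastore after the copy succeeds; that step is left out here because the comment does not specify it.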

> Hive should be able to run on multiple data centers
> ---------------------------------------------------
>
>                 Key: HIVE-1813
>                 URL: https://issues.apache.org/jira/browse/HIVE-1813
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Namit Jain
>             Fix For: 0.7.0
>
>
> Currently, Hive assumes a single metastore, and HADOOP_HOME is passed as an 
> environment variable. 
> It would be desirable to support Hive on top of multiple data centers (DFS + 
> MR).
> For example, there could be two metastores: primary and secondary. They would 
> have different DFSs, and a DFS->MR mapping would be maintained by the 
> metastore.
> Hive would be enhanced to support multiple metastores, and all operations (DDL 
> + query) would span multiple metastores.
> Different pluggable consistency policies can be employed. For example, if a 
> table/partition is present in both metastores with different
> last modification times, either the most recent one can be used or an error 
> can be thrown.
> It will be up to the application (outside Hive) to copy the data from one 
> metastore to another, and to maintain consistency inside.
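
The two consistency policies named in the description (use the latest copy, or throw an error) could be sketched as pluggable functions along the following lines. This is a minimal illustration, not Hive code: the `PartitionEntry` type, its fields, the policy names, and the tie-breaking rule are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class PartitionEntry:
    """Hypothetical view of one table/partition as seen by one metastore."""
    metastore: str      # e.g. "primary" or "secondary"
    dfs: str            # DFS that holds the data
    last_modified: int  # last modification time, e.g. epoch seconds


# A policy takes the two conflicting entries and returns the one to use.
Policy = Callable[[PartitionEntry, PartitionEntry], PartitionEntry]


def use_latest(a: PartitionEntry, b: PartitionEntry) -> PartitionEntry:
    """Pick the entry modified most recently; ties go to the first argument."""
    return a if a.last_modified >= b.last_modified else b


def fail_on_conflict(a: PartitionEntry, b: PartitionEntry) -> PartitionEntry:
    """Raise if the two metastores disagree on the last modification time."""
    if a.last_modified != b.last_modified:
        raise ValueError(
            f"conflict between {a.metastore} and {b.metastore}: "
            f"{a.last_modified} != {b.last_modified}"
        )
    return a


def resolve(primary: Optional[PartitionEntry],
            secondary: Optional[PartitionEntry],
            policy: Policy) -> Optional[PartitionEntry]:
    """Apply the policy only when both metastores know the partition."""
    if primary is None:
        return secondary
    if secondary is None:
        return primary
    return policy(primary, secondary)
```

Passing the policy in as a plain function keeps it swappable per table or per deployment, which is one way to read "pluggable" in the description.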

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
