[ https://issues.apache.org/jira/browse/HIVE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965126#action_12965126 ]
Namit Jain commented on HIVE-1813:
----------------------------------
Data can be copied from one DFS to another using distcp. Later on, a
wrapper for this can be developed in Hive.
Something like:
alter table <T> partition <P> copy <src> to <dst>;
alter table <T> partition <P> move <src> to <dst>;
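
Until such a wrapper exists, the copy step can be scripted outside Hive. The
following is a minimal sketch, assuming only that the `hadoop distcp <src> <dst>`
command line is available. The helper name `distcp_command` and the example
URIs are hypothetical and not part of Hive. The function only builds the
argument list and does not run anything.

```python
from typing import List


def distcp_command(src: str, dst: str) -> List[str]:
    """Build the argv for copying src to dst with Hadoop distcp.

    Hypothetical helper: it only constructs the command, it does not run it.
    """
    if not src or not dst:
        raise ValueError("src and dst must be non-empty URIs")
    if src == dst:
        raise ValueError("src and dst must differ")
    return ["hadoop", "distcp", src, dst]


if __name__ == "__main__":
    # Example URIs are invented for illustration.
    cmd = distcp_command(
        "hdfs://nn-primary:8020/warehouse/t/ds=2010-12-01",
        "hdfs://nn-secondary:8020/warehouse/t/ds=2010-12-01",
    )
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. How the
proposed "move" variant would differ from "copy" is not specified here, so the
sketch covers the copy only.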
> Hive should be able to run on multiple data centers
> ---------------------------------------------------
>
> Key: HIVE-1813
> URL: https://issues.apache.org/jira/browse/HIVE-1813
> Project: Hive
> Issue Type: New Feature
> Reporter: Namit Jain
> Fix For: 0.7.0
>
>
> Currently, Hive assumes a single metastore, and HADOOP_HOME is passed as an
> environment variable.
> It would be desirable to support Hive on top of multiple data centers (dfs +
> mr).
> For example, there could be two metastores: primary and secondary. They would
> have different DFSs, and a dfs->mr mapping would be maintained by the
> metastore.
> Hive would be enhanced to support multiple metastores, and all operations (ddl
> + query) would span multiple metastores.
> Different pluggable consistency policies can be employed. For example, if a
> table/partition is present in both metastores with different
> last modification times, either the most recent one can be used or an error
> can be thrown.
> It will be up to the application (outside Hive) to copy the data from one
> metastore to another, and to maintain consistency inside.
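
The pluggable consistency policies and the dfs->mr mapping described above
could be modelled roughly as follows. This is only a sketch under stated
assumptions: `TableVersion`, `MetastoreConflict`, the policy names, and the
host names are all hypothetical, and none of this is an existing Hive
interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class TableVersion:
    """One metastore's view of a table/partition (hypothetical record)."""
    metastore: str      # e.g. "primary" or "secondary"
    dfs: str            # DFS that holds the data
    last_modified: int  # last modification time, seconds since epoch


class MetastoreConflict(Exception):
    """Raised when two metastores disagree and the policy forbids it."""


def use_latest(a: TableVersion, b: TableVersion) -> TableVersion:
    # Policy 1: the most recently modified copy wins (ties keep `a`).
    return a if a.last_modified >= b.last_modified else b


def fail_on_mismatch(a: TableVersion, b: TableVersion) -> TableVersion:
    # Policy 2: differing modification times are treated as an error.
    if a.last_modified != b.last_modified:
        raise MetastoreConflict(
            f"{a.metastore} and {b.metastore} disagree on last_modified"
        )
    return a


Policy = Callable[[TableVersion, TableVersion], TableVersion]
POLICIES: Dict[str, Policy] = {
    "use_latest": use_latest,
    "fail_on_mismatch": fail_on_mismatch,
}

# The dfs->mr mapping from the description; host names are invented.
DFS_TO_MR: Dict[str, str] = {
    "hdfs://nn-primary:8020": "jt-primary:8021",
    "hdfs://nn-secondary:8020": "jt-secondary:8021",
}


def resolve(
    a: TableVersion, b: TableVersion, policy: str = "use_latest"
) -> Tuple[TableVersion, str]:
    """Pick the version to read and the MR cluster mapped to its DFS."""
    chosen = POLICIES[policy](a, b)
    return chosen, DFS_TO_MR[chosen.dfs]
```

Keeping the policies in a name-to-function table means a new policy can be
added without touching `resolve`, which is one way to read "pluggable" here.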