[ https://issues.apache.org/jira/browse/HIVE-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sushanth Sowmyan reassigned HIVE-7973:
--------------------------------------

    Assignee: Sushanth Sowmyan

> Hive Replication Support
> ------------------------
>
>                 Key: HIVE-7973
>                 URL: https://issues.apache.org/jira/browse/HIVE-7973
>             Project: Hive
>          Issue Type: Bug
>          Components: Import/Export
>            Reporter: Sushanth Sowmyan
>            Assignee: Sushanth Sowmyan
>
> A need for replication is a common one in many database management systems,
> and it's important for Hive to evolve support for such a tool as part of its
> ecosystem. Hive already supports EXPORT and IMPORT commands, which can be
> used to dump out tables, distcp them to another cluster, and import/create
> from that. If we had a mechanism by which exports and imports could be
> automated, it would establish the base on which replication can be
> developed.
> One place where this kind of automation can be developed is with the aid of
> the HiveMetaStoreEventHandler mechanisms, to generate notifications when
> certain changes are committed to the metastore, and then translate those
> notifications to export actions, distcp actions and import actions on
> another cluster.
> Part of that already exists in the Notification system that is part of
> hcatalog-server-extensions. Initially, this was developed to be able to
> trigger a JMS notification, which an Oozie workflow can use to start off
> actions keyed on the finishing of a job that used HCatalog to write to a
> table. While this currently lives under hcatalog, the primary reason for its
> existence has a scope well past hcatalog alone, and it can be used as-is
> without the use of HCatalog IF/OF. This can be extended with the help of a
> library which does the aforementioned translation. I also think that these
> sections should live in a core Hive module, rather than being tucked away
> inside hcatalog.
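To make the translation step above concrete, here is a minimal sketch of how one metastore notification might be turned into an export action, a distcp action and an import action. It is an illustration under stated assumptions, not the design tracked by this issue: the `MetastoreEvent` class, the `plan_replication` function, the event-type label and the staging paths are all hypothetical. Only the `EXPORT TABLE ... TO`, `IMPORT TABLE ... FROM` and `hadoop distcp <src> <dst>` command forms are taken from existing Hive and Hadoop tooling.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MetastoreEvent:
    """Hypothetical, simplified stand-in for a metastore notification."""
    event_type: str                              # illustrative label only
    db: str
    table: str
    partition: Optional[Dict[str, str]] = None   # e.g. {"ds": "2015-04-01"}


def partition_clause(partition: Optional[Dict[str, str]]) -> str:
    """Render a PARTITION (...) clause, or an empty string for a whole table."""
    if not partition:
        return ""
    parts = ", ".join(f"{k}='{v}'" for k, v in sorted(partition.items()))
    return f" PARTITION ({parts})"


def plan_replication(event: MetastoreEvent,
                     src_staging: str,
                     dst_staging: str) -> List[str]:
    """Translate one event into export, distcp and import steps.

    The steps are returned as command strings; running them on the source
    and destination clusters is left to the caller.
    """
    target = f"{event.db}.{event.table}{partition_clause(event.partition)}"
    return [
        f"EXPORT TABLE {target} TO '{src_staging}';",   # on the source cluster
        f"hadoop distcp {src_staging} {dst_staging}",   # copy between clusters
        f"IMPORT TABLE {target} FROM '{dst_staging}';", # on the destination
    ]


if __name__ == "__main__":
    event = MetastoreEvent("ADD_PARTITION", "sales", "orders",
                           {"ds": "2015-04-01"})
    for step in plan_replication(event,
                                 "hdfs://src/staging/orders",
                                 "hdfs://dst/staging/orders"):
        print(step)
```

Returning the steps as plain strings keeps the sketch independent of any particular scheduler; a real implementation would presumably hang this logic off a metastore event listener and hand the steps to something like an Oozie workflow.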
> Once we have rudimentary support for table & partition replication, we can
> then move on to further requirements of replication, such as metadata
> replication (for example, replication of changes to roles), and/or optimize
> away the requirement to distcp and use webhdfs instead, etc.
> This Story tracks all the bits that go into development of such a system.
> I'll create multiple smaller tasks inside this as we go on.
> Please also see HIVE-10264 for documentation-related links for this, and
> https://cwiki.apache.org/confluence/display/Hive/HiveReplicationDevelopment
> for the associated wiki (currently in progress).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)