Discussion on this topic has moved over to the JIRA ticket: https://issues.apache.org/jira/browse/HIVE-2612
On Tue, Jan 17, 2012 at 4:01 PM, yongqiang he <heyongqiang...@gmail.com> wrote:

> Hi hive-dev,
>
> We are planning to make Hive run across multiple data centers
> (physical clusters). We prefer to use the Hive metastore to provide a
> unified namespace.
>
> Tables/partitions can exist in more than one cluster, and one cluster
> is defined as the primary cluster. The primary cluster is a table-level
> property. A table T1's primary cluster being C1 means:
> 1) C1 contains all data that is available in all other clusters.
> 2) Writes to table T1 are only allowed in this cluster, but we need to
>    allow exceptions here.
> 3) New partitions are only allowed to be created in C1.
> 4) All data changes to T1 that happen in the primary cluster should be
>    replicated to the other clusters if there are any secondary
>    clusters, but there should be a conf to disable this, as there are
>    some exceptional situations.
>
> The first thing that needs to be done is to give the Hive metastore a
> concept of cluster. That also means all Thrift calls to the metastore
> need to provide a cluster parameter. So we have three options here:
> 1) add a cluster parameter to the existing Thrift interfaces
> or
> 2) add new interfaces that provide exactly the same functionality as
>    the old ones but under a different name (maybe with an _on_cluster
>    suffix) and with a cluster parameter
> or
> 3) reuse the database name as the cluster name, and allow a table to
>    co-exist in multiple databases. But that requires promoting the
>    table to a top-level citizen and demoting the database. For example,
>    "show tables" used to scan all tables in the current db, but would
>    now need to scan all tables in all databases.
>
> We would like to get more ideas about which one to choose, and we are
> definitely open to other alternatives that we missed here.
>
> We are also looking for other systems that have solved similar
> problems. If anyone knows of such a system, we would like to hear
> about it. We appreciate it!
>
> This is tracked in JIRA: https://issues.apache.org/jira/browse/HIVE-2612
>
> Thanks
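
To make the primary-cluster rules in the quoted proposal concrete, here is a minimal sketch of rules 3 and 4 combined with the option-2 naming idea. It is not Hive code: the Table record, PrimaryClusterError, and add_partition_on_cluster are all hypothetical names made up for illustration, and the real metastore API may look quite different.

```python
from dataclasses import dataclass, field


class PrimaryClusterError(Exception):
    """Raised when a change is attempted outside the table's primary cluster."""


@dataclass
class Table:
    """Hypothetical metastore record for a table that lives in several clusters."""
    name: str
    primary_cluster: str
    secondary_clusters: set = field(default_factory=set)
    # Rule 4: a per-table switch standing in for the proposed conf that
    # disables replication (assumed name).
    replicate: bool = True
    # Maps cluster name -> set of partition names present in that cluster.
    partitions: dict = field(default_factory=dict)


def add_partition_on_cluster(table, partition, cluster):
    """Option-2 style call: an add-partition variant that takes a cluster argument."""
    # Rule 3: new partitions may only be created in the primary cluster.
    if cluster != table.primary_cluster:
        raise PrimaryClusterError(
            f"{table.name}: partitions must be created in {table.primary_cluster}, "
            f"not {cluster}"
        )
    table.partitions.setdefault(cluster, set()).add(partition)
    # Rule 4: propagate the change to every secondary cluster unless disabled.
    if table.replicate:
        for secondary in table.secondary_clusters:
            table.partitions.setdefault(secondary, set()).add(partition)
    return table.partitions


t1 = Table(name="T1", primary_cluster="C1", secondary_clusters={"C2"})
add_partition_on_cluster(t1, "ds=2012-01-17", "C1")
```

Under option 1 the same check would sit behind the existing call with an extra cluster argument; under option 3 the cluster argument would be replaced by the database name.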