Hi Mich, Thanks for your reply. The cloud cluster is to be used for read-only analytics, so effectively one-way, stand-by. I'll take a look at your suggested technologies as I'm not familiar with them.
Thanks - Elliot.

On 17 December 2015 at 16:57, Mich Talebzadeh <m...@peridale.co.uk> wrote:

> Sounds like one-way replication of the metastore. Depending on your
> metastore platform, that could be achieved pretty easily.
>
> Mine is Oracle and I use Materialised View replication, which is pretty
> good but not the latest technology. Other options would be GoldenGate or
> SAP Replication Server.
>
> HTH,
>
> Mich
>
> *From:* Mich Talebzadeh [mailto:m...@peridale.co.uk]
> *Sent:* 17 December 2015 16:47
> *To:* user@hive.apache.org
> *Subject:* RE: Synchronizing Hive metastores across clusters
>
> Are both clusters in active/active mode, or is the cloud-based cluster a
> standby?
>
> *From:* Elliot West [mailto:tea...@gmail.com]
> *Sent:* 17 December 2015 16:21
> *To:* user@hive.apache.org
> *Subject:* Synchronizing Hive metastores across clusters
>
> Hello,
>
> I'm thinking about the steps required to repeatedly push Hive datasets out
> from a traditional Hadoop cluster into a parallel cloud-based cluster. This
> is not a one-off; it needs to be a constantly running sync process. As new
> tables and partitions are added in one cluster, they need to be synced to
> the cloud cluster. Assuming for a moment that I have the HDFS data syncing
> working, I'm wondering what steps I need to take to reliably ship the
> HCatalog metadata across. I use HCatalog as the point of truth as to when
> data is available and where it is located, so I think that metadata is a
> critical element to replicate in the cloud-based cluster.
>
> Does anyone have any recommendations on how to achieve this in practice?
> One issue (of many, I suspect) is that Hive appears to store table/partition
> locations internally as absolute, fully qualified URLs. Unless the target
> cloud cluster is similarly named and configured, some path transformation
> step will therefore be needed as part of the synchronisation process.
>
> I'd appreciate any suggestions, thoughts, or experiences related to this.
>
> Cheers - Elliot.
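
For what it's worth, the path transformation step mentioned in the original message could be sketched roughly as below. This is only an illustration under stated assumptions, not something taken from Hive or HCatalog: the function name `rewrite_location`, the host `onprem-nn:8020`, and the bucket `example-bucket` are all hypothetical, and the sketch assumes every location lives under a single known warehouse root on the source cluster.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_location(location, source_root, target_root):
    """Rebase a fully qualified table/partition location onto target_root.

    Hypothetical helper: all three arguments are absolute URLs, and
    `location` is expected to sit at or below `source_root`.
    """
    src = urlsplit(source_root)
    loc = urlsplit(location)

    # Refuse locations that belong to a different filesystem.
    if (loc.scheme, loc.netloc) != (src.scheme, src.netloc):
        raise ValueError(f"{location} is not on {src.scheme}://{src.netloc}")

    # Refuse locations outside the source root (match on whole path segments).
    src_path = src.path.rstrip("/")
    if loc.path != src_path and not loc.path.startswith(src_path + "/"):
        raise ValueError(f"{location} is not under {source_root}")

    # Keep the part below the source root and graft it onto the target root.
    relative = loc.path[len(src_path):]
    tgt = urlsplit(target_root)
    return urlunsplit(
        (tgt.scheme, tgt.netloc, tgt.path.rstrip("/") + relative, "", "")
    )


if __name__ == "__main__":
    # Hypothetical source and target roots, for illustration only.
    print(rewrite_location(
        "hdfs://onprem-nn:8020/user/hive/warehouse/sales.db/orders/dt=2015-12-17",
        "hdfs://onprem-nn:8020/user/hive/warehouse",
        "s3a://example-bucket/warehouse",
    ))
```

A sync process would apply something like this to each table and partition location before registering it in the target metastore. Raising an error on locations outside the expected root, rather than passing them through, is a deliberate choice here so that unexpected external tables get noticed instead of silently pointing back at the source cluster.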