Re: Synchronizing Hive metastores across clusters

2016-01-21 Thread Elliot West
Following up on this: I've spent some time trying to evaluate the Hive replication features, but in truth it's been more an exercise in trying to get them working! I thought I'd share my findings: - Conceptually this feature can sync (nearly) all Hive metadata and data changes between two clu

Re: Synchronizing Hive metastores across clusters

2015-12-18 Thread Elliot West
Eugene/Sushanth, Thank you for pointing me in the direction of these features. I'll investigate them further to see if I can put them to good use. Cheers - Elliot. On 17 December 2015 at 20:03, Sushanth Sowmyan wrote: > Also, while I have not wiki-ized the documentation for the above, I > have

RE: Synchronizing Hive metastores across clusters

2015-12-17 Thread Mich Talebzadeh
or SAP replication server. HTH, Mich From: Mich Talebzadeh [mailto:m...@peridale.co.uk] Sent: 17 December 2015 16:47 To: user@hive.apache.org Subject: RE: Synchronizing Hive metastores across clusters Are

Re: Synchronizing Hive metastores across clusters

2015-12-17 Thread Sushanth Sowmyan
Also, while I have not wiki-ized the documentation for the above, I have uploaded slides from talks that I've given at the Hive user group meetup on the subject, and also a doc that describes the replication protocol followed for the EXIM replication that are attached over at https://issues.apache.org/

Re: Synchronizing Hive metastores across clusters

2015-12-17 Thread Sushanth Sowmyan
Hi, I think that the replication work added with https://issues.apache.org/jira/browse/HIVE-7973 is exactly up this alley. Per Eugene's suggestion of MetaStoreEventListener, this replication system plugs into that and gets you a stream of notification events from HCatClient for the exact purpose

Re: Synchronizing Hive metastores across clusters

2015-12-17 Thread Eugene Koifman
Metastore supports MetaStoreEventListener and MetaStorePreEventListener, which may be useful here. Eugene From: Elliot West <tea...@gmail.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Thursday, December 17, 2015 at 8:21 AM To: "user@

Re: Synchronizing Hive metastores across clusters

2015-12-17 Thread Jörn Franke
Hive has the export/import commands; alternatively, Falcon+Oozie > On 17 Dec 2015, at 17:21, Elliot West wrote: > > Hello, > > I'm thinking about the steps required to repeatedly push Hive datasets out > from a traditional Hadoop cluster into a parallel cloud based cluster. This > is not a one

Re: Synchronizing Hive metastores across clusters

2015-12-17 Thread Elliot West
> > *From:* Mich Talebzadeh [mailto:m...@peridale.co.uk] > *Sent:* 17 December 2015 16:47 > *To:* user@hive.apache.org > *Subject:* RE: Synchronizing Hive metastores across clusters > > > > Are both clusters in active/active mode or the cloud based cluster is > standby?

Re: Synchronizing Hive metastores across clusters

2015-12-17 Thread Elliot West
server. > > > > HTH, > > > > Mich > > > > *From:* Mich Talebzadeh [mailto:m...@peridale.co.uk] > *Sent:* 17 December 2015 16:47 > *To:* user@hive.apache.org > *Subject:* RE: Synchronizing Hive metastores across clusters > > > > Are both clusters

RE: Synchronizing Hive metastores across clusters

2015-12-17 Thread Mich Talebzadeh
, Mich From: Mich Talebzadeh [mailto:m...@peridale.co.uk] Sent: 17 December 2015 16:47 To: user@hive.apache.org Subject: RE: Synchronizing Hive metastores across clusters Are both clusters in active/active mode or the cloud based cluster is standby? From: Elliot West [mailto:tea

RE: Synchronizing Hive metastores across clusters

2015-12-17 Thread Mich Talebzadeh
Are both clusters in active/active mode or the cloud based cluster is standby? From: Elliot West [mailto:tea...@gmail.com] Sent: 17 December 2015 16:21 To: user@hive.apache.org Subject: Synchronizing Hive metastores across clusters Hello, I'm thinking about the steps required to repeat