Re: FYI - Apache ZooKeeper Backup, a Treatise

Andrew Purtell Fri, 17 Jun 2016 20:10:53 -0700

See HBASE-14379. The points on configuration, state management, and security 
apply.



> On Jun 17, 2016, at 7:25 PM, Martin Serrano <mar...@attivio.com> wrote:
> 
> Why are you seeking to undo it?
> 
>> On 06/17/2016 09:34 PM, Andrew Purtell wrote:
>> HBase stores replication peering configuration in ZK. We're working on
>> undoing that, but for now that information exists nowhere else.
>> 
>> 
>>> On Thu, Jun 16, 2016 at 2:47 PM, Ismael Juma <ism...@juma.me.uk> wrote:
>>> 
>>> Hi Jordan,
>>> 
>>> Kafka stores ACLs as well as client and topic configs in ZooKeeper so that
>>> lends credence to your argument, I think.
>>> 
>>> Ismael
>>> 
>>> On Thu, Jun 16, 2016 at 11:41 PM, Jordan Zimmerman <
>>> jor...@jordanzimmerman.com> wrote:
>>> 
>>>> Contrary to recommendations everywhere, my experience is that almost
>>>> everyone is storing source of truth data in ZooKeeper. It’s just too
>>>> tempting. You have a distributed file system just sitting there and it’s
>>>> too easy to use. You get a lot of great features like watches, etc. People
>>>> are using it to store configuration data, sequence numbers, etc. They are
>>>> storing these things without a good means of reproducing them in case of a
>>>> catastrophic outage. Further, I’ve heard of several orgs who just back up
>>>> the transaction logs and think they can restore them for DR. Anyway, that’s
>>>> the genesis of my blog post.
>>>> 
>>>> -Jordan
>>>> 
>>>>> On Jun 16, 2016, at 2:39 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote:
>>>>>
>>>>> Yes, thank you to Jordan for the article!
>>>>> 
>>>>> Like Flavio, I personally have never come across the requirement for
>>>>> ZooKeeper backups.  I've generally followed the pattern that data stored
>>>>> in ZooKeeper is truly transient, and applications are built either to
>>>>> tolerate loss of that data or reconstruct it from first principles if it
>>>>> goes missing.  Adding observers in a second data center would give a
>>>>> rudimentary approximation of off-site backup in the case of a data center
>>>>> disaster, with the usual caveats around propagation delays.
>>>>>
>>>>> Jordan, I'd be curious if you can share more specific details about the
>>>>> kind of data that you have that necessitates a backup/restore.  (If you're
>>>>> not at liberty to share this, then I can understand that.)  It might
>>>>> inform if we have a motivating use case for backup/restore features within
>>>>> ZooKeeper, such as some of the transaction log filtering that the article
>>>>> mentions.
>>>>> 
>>>>> --Chris Nauroth
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 6/16/16, 1:03 AM, "Flavio Junqueira" <f...@apache.org> wrote:
>>>>>> 
>>>>>> Great write-up, Jordan, thanks!
>>>>>> 
>>>>>> Whether to back up zk data or not is possibly an open topic for this
>>>>>> community, even though we have discussed it at times. My sense has been
>>>>>> that precisely because of the issues you mention in your post, it is
>>>>>> typically best to have a way to recreate its data upon a disaster rather
>>>>>> than back up the data. I think there could be three general scenarios in
>>>>>> which folks would prefer to back up data, but correct me if these
>>>>>> aren't accurate:
>>>>>> 
>>>>>> - The data in zk isn't elsewhere, so it can't be recreated: zk isn't a
>>>>>> regular database, so I'd think it is best not to store such data there
>>>>>> and to focus on cluster data or metadata.
>>>>>> - There is just a lot of data and I'd rather have a shorter time to
>>>>>> recover: zk in general shouldn't have that much data in db, but let's go
>>>>>> with the assumption that for the requirements of the application it is a
>>>>>> lot. For such a case, it probably depends on whether your application can
>>>>>> efficiently and effectively recover from a backup. Basically, as pointed
>>>>>> out in the post, the data could be inconsistent and cause trouble if you
>>>>>> don't think about the corner cases.
>>>>>> - The code to recreate the zk metadata for my application is super
>>>>>> complex: if you decide to code against zk, it is good to think about
>>>>>> whether reconstructing in the case of a disaster is doable and, if it is,
>>>>>> to design and implement the reconstruction of the state upon a disaster.
>>>>>> 
>>>>>> Also, we typically provision enough replicas, often replicating across
>>>>>> data centers, to make sure that the data isn't all gone. Having more
>>>>>> replicas does not rule out completely the possibility of a disaster, but
>>>>>> in such rare cases we resort to the expensive path.
>>>>>> 
>>>>>> I personally have never worked with an application that was taking
>>>>>> backups of zk data in prod, so I'm really interested in what others
>>>>>> think.
>>>>>> 
>>>>>> -Flavio
>>>>>> 
>>>>>> 
>>>>>>> On 16 Jun 2016, at 00:43, Jordan Zimmerman <jor...@jordanzimmerman.com> wrote:
>>>>>>> 
>>>>>>> FYI - I wrote a blog about backing up ZooKeeper:
>>>>>>> 
>>>>>>> https://www.elastic.co/blog/zookeeper-backup-a-treatise
>>>>>>> 
>>>>>>> -Jordan
>
