Re: [PATCH] Introduce per-instance filesystem UUIDs

Ben Reser Wed, 20 Aug 2014 09:14:35 -0700

On 8/19/14 12:07 AM, Branko Čibej wrote:
> I think it's not that simple.
> 
> Consider the case where an administrator decides to not use 'svnadmin hotcopy'
> to back up a repository, but instead creates a (LVM) snapshot of the volume and
> uses 'tar' (or 'cp -a') to create the backup.
> 
> When such a backup is restored and made active, everything will just work ...
> except that stale caches in svnserve or mod_dav_svn will not be automatically
> invalidated. In other words, the mere introduction of the instance ID does not
> solve "all" problems. There are several possible resolutions to this particular
> problem:
> 
>   * Tell the users "don't do that". That won't help; they'll do it anyway.
>   * Require a restart of all servers when restoring such backups; been there,
>     people forget.
>   * Require that the users run 'svnadmin recover' before bringing the
>     repository online; this might work if 'svnadmin recover' tweaks the
>     instance ID, since presumably they're already using it per our existing
>     recommendation.
>   * Invent 'svnadmin restore' or 'svnadmin activate' or whatnot to make such
>     backups viable; see above, people forget.
>   * Require 'svnadmin setuuid' on the restored backups; this breaks existing
>     working copies.
>
> So, even though the existence of the instance ID is an implementation detail,
> it does cause visible change in the behaviour of the repository: server
> restarts due to fiddling with the repository instance are needed far less
> often; but we still have to document when and why they are needed.


I think part of the problem here is that we (as in WANdisco folks) have
discussed the idea of an instance ID for repositories in the past as a way to
solve the range of issues caused by replacing a repository without clearing the
caches.  But this change is being added for a very different reason.

Evgeny has implemented the instance ID for the purpose of solving the problem
of two different repositories not being able to be locked if they happen to
have the same UUID.  This happens because we use a mutex to handle locking
between threads and that mutex can't distinguish between different repositories
with identical UUIDs.

Currently the code on trunk adds the instance ID to the cache keys.  I'm not
sure we should be doing that (though both brane and stefan2 requested that it
be done).  As per the discussion today at the SHF hackathon, the instance ID
can't resolve the issues caused by failing to clear the cache.  The best it can
do is narrow the window in which these issues can occur.  That would seem like
a good thing, but I think it creates a huge false sense of security.  We will
ultimately have someone come along with a corrupted repository; we're going to
say "you replaced the repo while the server was running" and the user is going
to say "But I've been doing this for years without any problem."

Without the instance ID in the cache keys, users are unlikely to actually
corrupt their repository (just as they are unlikely to with it; it's a pretty
hard race to hit).  But they are likely to get errors related to the cache
being stale.  This gives them a giant hint that what they're doing is wrong and
gives us an opportunity to educate them.
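
The cache-key trade-off can be sketched in Python.  This is a toy model, not
the real cache: the key layout, the function names and the repository
dictionaries are hypothetical, and it assumes that a hotcopy-style restore
gets a fresh instance ID while a verbatim 'tar' restore keeps the old one.

```python
def cache_key(repo, revision, include_instance_id):
    """Build a hypothetical cache key for one revision of a repository."""
    if include_instance_id:
        return (repo["uuid"], repo["instance_id"], revision)
    return (repo["uuid"], revision)


def read_revision(cache, repo, revision, include_instance_id):
    """Return revision data, serving it from `cache` when the key is present."""
    key = cache_key(repo, revision, include_instance_id)
    if key not in cache:
        cache[key] = repo["revisions"][revision]
    return cache[key]


# The repository as the running server first saw it.
live = {"uuid": "repo-uuid", "instance_id": "inst-1",
        "revisions": {5: "r5 before restore"}}

# Restored from a verbatim copy: same instance ID, different r5 on disk.
tar_restored = {"uuid": "repo-uuid", "instance_id": "inst-1",
                "revisions": {5: "r5 after restore"}}

# Restored in a way that assigns a new instance ID (assumed here).
hotcopy_restored = {"uuid": "repo-uuid", "instance_id": "inst-2",
                    "revisions": {5: "r5 after restore"}}

# Cache keyed with the instance ID; the server never restarts.
cache_with_id = {}
read_revision(cache_with_id, live, 5, True)
tar_result = read_revision(cache_with_id, tar_restored, 5, True)
hotcopy_result = read_revision(cache_with_id, hotcopy_restored, 5, True)

# Cache keyed by UUID only: any replacement of the repository hits stale data.
cache_without_id = {}
read_revision(cache_without_id, live, 5, False)
no_id_result = read_revision(cache_without_id, hotcopy_restored, 5, False)
```

In this model hotcopy_result is fresh, while tar_result and no_id_result are
both stale.  So the instance ID in the key narrows the window but does not
close it.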
