Julian Foad wrote on Tue, Dec 07, 2010 at 18:13:09 +0000:
> On Tue, 2010-12-07, Daniel Shahaf wrote:
> > Julian Foad wrote on Tue, Dec 07, 2010 at 15:27:05 +0000:
> > > (Quoting and replying to two emails at once.)
> > >
> > > First: I'm assuming that no process will ever do packing without getting
> > > the exclusive write lock. Can you confirm that for me? If that's
> > > false, all my reasoning would be bogus.
> >
> > Why should I confirm that? You have the code too. :-)
>
> Heh. I meant: "I am assuming that's how FSFS is *intended* to work; is
> that your understanding too?"
>
I'm not sure how it was "intended" to work. I know that at some point after
packing was added I pointed out that two concurrent 'svnadmin pack' operations
will cause data loss if not serialized, and in response glasser said they
should run under the write lock:
http://thread.gmane.org/gmane.comp.version-control.subversion.devel/108040/focus=108048

Daniel
(who, at the time, didn't fully appreciate the importance of using the same
write lock as other FSFS operations)

> - Julian
>
> (I have not read the rest of this email yet.)
>
> > Seriously: as of the last time I looked, svn_fs_pack() calls ... which
> > takes the write lock and calls pack_body(), which calls pack_shard(),
> > which does
> >     cat db/revs/$shard/* > db/revs/$shard.pack/pack
> >     echo '($shard + 1) * 1000' > db/min-unpacked-revprop
> >     rm -rf db/revs/$shard/
> > while still holding the write lock.
> >
> > > On Tue, 2010-12-07 at 12:13 +0200, Daniel Shahaf wrote:
> > > > Philip Martin wrote on Tue, Dec 07, 2010 at 09:49:40 +0000:
> > > > > Julian Foad <julian.f...@wandisco.com> writes:
> > > > >
> > > > > > On Mon, 2010-12-06 at 18:44 +0000, Philip Martin wrote:
> > > > > >> Julian Foad <julian.f...@wandisco.com> writes:
> > > > > >>
> > > > > >> > I'm going to drop this "Remove the re-try logic from
> > > > > >> > svn_fs_fs__path_rev_absolute()" follow-up patch, as I don't want
> > > > > >> > to get into checking or messing with the txn-correctness of FSFS
> > > > > >> > now. If Daniel or anyone else wants to pursue it, I'd be glad to
> > > > > >> > help.
> > > > > >>
> > > > > >> I thought the patch was an improvement. The existing retry in
> > > > > >> svn_fs_fs__path_rev_absolute doesn't fix the race, it simply makes
> > > > > >> the window smaller. As the patch makes the window larger we are
> > > > > >> more likely to see and fix any places where the caller doesn't
> > > > > >> handle the race.
> > > > > >
> > > > > > I *think* the problem is that a caller may use this function within a
> > > > > > transaction and depend on this internal re-try logic simply to update
> > > > > > a stale cached min-unpacked-foo value. "Stale" in the sense that
> > > > > > *this* transaction has changed the real value and hasn't updated the
> > > > > > cached value.
> > > >
> > > > Yes, it's conceivable that a caller might do:
> > > >
> > > >   for (int i = 0; i < 2; i++)
> > > >     {
> > > >       ...
> > > >       SVN_ERR(svn_fs_fs__path_rev_absolute());
> > > >       ...
> > > >       if (i == 0)
> > > >         /* other process runs svn_fs_pack(); */;
> > > > [...]
> > >
> > > No, that's not what I mean. I mean a caller might do:
> > >
> > >   with a write lock:
> > >   {
> > >     svn_fs_pack();
> > >     SVN_ERR(svn_fs_fs__path_rev_absolute(&path, ...));
> > >     foo(path);
> > >   }
> > >
> > > where neither this code nor foo() re-tries. With the current version of
> > > svn_fs_fs__path_rev_absolute(), foo() will get the correct path,
> >
> > Unless a commit occurs after svn_fs_fs__path_rev_absolute() returns but
> > before foo() is called. (This is the condition Stefan2 fixed.)
> >
> > > whereas if we remove the internal re-try then foo() will get the wrong
> > > path.
> > >
> > > > > If there are any
> > > > > such cases then your patch should expose them and we will fix them.
> > > > > Any such errors are easy to handle, compared to the difficulty of
> > > > > fixing races.
> > > > >
> > > > > The real problem with the existing code is that it partially hides
> > > > > races, by making them harder to trigger, but doesn't fix the race. If
> > > > > we want to fix the race properly making it easier to trigger is the
> > > > > right thing to do.
> > >
> > > Agreed.
> > >
> > > What I meant to indicate is that, before making this change, I would
> > > like to see clearly that a caller with the write lock cannot use an
> > > out-of-date value. One way to ensure that would be:
> > >
> > >   update the cached value whenever we get the write lock (which we do);
> > >   update the cached value whenever we update the value on disk (which we
> > >     don't);
> > >   anything not using a write lock must accept that the cached value may
> > >     be stale, and re-try if necessary.
> > >
> > > Another way would be:
> > >
> > >   don't update the cached value when we get the write lock;
> > >   don't update the cached value when we update the value on disk;
> > >   every time we use the cached value within a write lock, either update
> > >     before using it or try and re-try if necessary;
> > >   every time we use the cached value without a write lock, try and
> > >     re-try if necessary.
> > >
> > > I don't have a strong preference between those, but the current
> > > inconsistency between updating it when we get the write lock and not
> > > updating it when we update the value on disk seems bad.
> >
> > What is the cost of this inconsistency?
> >
> > The stale cache will result in a failed stat() call on the path where
> > a revision file existed before it was packed. Then, either we will
> > update the cache and retry (meaning the only cost was a stat() call), or
> > due to a bug we will forget to retry and, instead, send a bogus
> > error up the stack.
> >
> > So, we pay one stat(), but in exchange we hope that, by letting the
> > cache *actually* become stale, concurrency bugs that reproduce only
> > when the cache is stale will be unearthed earlier.
> >
> > In fact, how about this patch?
> >
> > [[[
> > Index: subversion/libsvn_fs_fs/fs_fs.c
> > ===================================================================
> > --- subversion/libsvn_fs_fs/fs_fs.c    (revision 1042961)
> > +++ subversion/libsvn_fs_fs/fs_fs.c    (working copy)
> > @@ -7553,7 +7553,7 @@ write_revnum_file(const char *fs_path,
> >     remove the pack file and start again. */
> >  static svn_error_t *
> >  pack_shard(const char *revs_dir,
> > -           const char *fs_path,
> > +           svn_fs_t *fs,
> >             apr_int64_t shard,
> >             int max_files_per_dir,
> >             svn_fs_pack_notify_t notify_func,
> > @@ -7562,6 +7562,7 @@ pack_shard(const char *revs_dir,
> >             void *cancel_baton,
> >             apr_pool_t *pool)
> >  {
> > +  const char *fs_path = fs->path;
> >    const char *pack_file_path, *manifest_file_path, *shard_path;
> >    const char *pack_file_dir;
> >    svn_stream_t *pack_stream, *manifest_stream;
> > @@ -7638,6 +7639,9 @@ pack_shard(const char *revs_dir,
> >    SVN_ERR(write_revnum_file(fs_path, PATH_MIN_UNPACKED_REV,
> >                              (svn_revnum_t)((shard + 1) * max_files_per_dir),
> >                              iterpool));
> > +#ifndef SVN_DEBUG
> > +  SVN_ERR(update_min_unpacked_rev(fs, iterpool));
> > +#endif
> >    svn_pool_destroy(iterpool);
> >
> >    /* Finally, remove the existing shard directory. */
> > @@ -7701,6 +7705,9 @@ pack_revprop_shard(svn_fs_t *fs,
> >    SVN_ERR(write_revnum_file(fs_path, PATH_MIN_UNPACKED_REVPROP,
> >                              (svn_revnum_t)((shard + 1) * max_files_per_dir),
> >                              iterpool));
> > +#ifndef SVN_DEBUG
> > +  SVN_ERR(update_min_unpacked_revprop(fs, iterpool));
> > +#endif
> >    svn_pool_destroy(iterpool);
> >
> >    /* Finally, remove the existing shard directory. */
> > @@ -7786,7 +7793,7 @@ pack_body(void *baton,
> >        if (pb->cancel_func)
> >          SVN_ERR(pb->cancel_func(pb->cancel_baton));
> >
> > -      SVN_ERR(pack_shard(data_path, pb->fs->path, i, max_files_per_dir,
> > +      SVN_ERR(pack_shard(data_path, pb->fs, i, max_files_per_dir,
> >                           pb->notify_func, pb->notify_baton,
> >                           pb->cancel_func, pb->cancel_baton, iterpool));
> >      }
> > ]]]
> > (still need to add comments explaining why the #ifndef guard)
> >
> > > This doesn't mean we should keep the re-try loop that we're talking
> > > about removing. I agree it should go. It's just that I'd like to
> > > resolve this inconsistency, or at least agree about resolving it, before
> > > feeling comfortable enough to commit this myself.
> > >
> > > > > > Daniel, you wrote:
> > > > > >
> > > > > >> But is it strictly *necessary* to do so? I think not: by not calling
> > > > > >> update_min_unpacked_rev(), we cause ffd->min_unpacked_rev to become
> > > > > >> stale --- a situation which the code has to account for anyway.
> > > > > >
> > > > > > That was true, then, and it meant that every place that used this
> > > > > > value had to account for it possibly being stale. But now we have
> > > > > > realized that the code shouldn't need to account for the possibility
> > > > > > of the value being stale if it is running within a transaction (with
> > > > > > the write lock).
> > > > > >
> > > > > > Now, I think it would be better to change this. Whenever the code
> > > > > > updates the min-foo on disk, it should also update the cached value.
> > > > > > That style of coding would be easier to reason about (to use an
> > > > > > in-vogue phrase), and then we would be able to remove the re-try in
> > > > > > svn_fs_fs__path_rev_absolute() without a concern.
> > > >
> > > > Making ffd->min_unpacked_rev stale less often does not mean that we will
> > > > have less places to account for it *possibly* being stale:
> > >
> > > I think it *does*, because every piece of code that knows it has a write
> > > lock would not have to bother about that possibility.
> >
> > Code that has the write lock does not have to bother about that
> > possibility anyway, because with_some_lock() updates the ffd->min_*
> > caches after acquiring the lock and before invoking the under-the-lock
> > callback.
> >
> > > > it could
> > > > become stale due to other processes who write to the repository.
> > > > (except for "pack runs under the write lock" considerations)
> > >
> > > It could become stale due to another process writing to the repository,
> > > except if *this* code is running under a write lock. If *this* code has
> > > a write lock, it doesn't matter whether it's doing packing or not;
> > > another process can't make this process's cache get stale because
> > > another process can't get the write lock.
> >
> > Right: if we have the write lock, then the caches are not stale.
> >
> > > > To borrow Philip's point: making ffd->min_unpacked_rev stale less often
> > > > might hide bugs.
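To make the invariant under discussion concrete, here is a simplified sketch. It is illustrative only, not the actual fs_fs.c code: acquire_write_lock() is an invented stand-in for FSFS's real write-lock helper, while write_revnum_file(), update_min_unpacked_rev() and PATH_MIN_UNPACKED_REV are used as in the patch quoted above. The point is that code holding the write lock always sees a fresh ffd->min_unpacked_rev, either because the lock helper refreshed it on acquisition or because the packing code refreshes it immediately after rewriting the value on disk.

[[[
/* Illustrative sketch only -- not the real fs_fs.c code, and assumed to
   live inside fs_fs.c.  acquire_write_lock() is an invented name; in FSFS
   the write lock is taken by its own helper and released by a pool
   cleanup handler. */
static svn_error_t *
with_write_lock_sketch(svn_fs_t *fs,
                       svn_error_t *(*body)(void *baton, apr_pool_t *pool),
                       void *baton,
                       apr_pool_t *pool)
{
  SVN_ERR(acquire_write_lock(fs, pool));

  /* Refresh the cached value right after taking the lock, so BODY never
     starts out with a stale ffd->min_unpacked_rev. */
  SVN_ERR(update_min_unpacked_rev(fs, pool));

  return body(baton, pool);
}

/* Sketch of the pack step, with the cache refresh the patch adds. */
static svn_error_t *
pack_shard_sketch(svn_fs_t *fs,
                  apr_int64_t shard,
                  int max_files_per_dir,
                  apr_pool_t *pool)
{
  /* ... concatenate db/revs/$shard/* into the pack file ... */

  /* First rewrite the value on disk ... */
  SVN_ERR(write_revnum_file(fs->path, PATH_MIN_UNPACKED_REV,
                            (svn_revnum_t)((shard + 1) * max_files_per_dir),
                            pool));

  /* ... then refresh the in-memory cache, so later code running under
     the same write lock (e.g. a caller of svn_fs_fs__path_rev_absolute())
     cannot compute a path into the shard directory that is about to be
     removed. */
  SVN_ERR(update_min_unpacked_rev(fs, pool));

  /* ... rm -rf db/revs/$shard/ ... */
  return SVN_NO_ERROR;
}
]]]

Whether the refresh belongs in the lock helper, next to the disk write, or in both places is exactly the inconsistency discussed above; the sketch shows the "both" variant, which is what the quoted with_some_lock() behaviour plus the patch would give.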
> >
> > > That is one way of looking at it, when thinking about the asynchronous
> > > (no write lock) cases, but another way of looking at the same thing is
> > > it can help to *totally avoid* potential bugs in synchronous cases.
> >
> > Agreed, that's the flip side of the coin. The #ifndef patch above tries
> > to find a middle ground between hiding and unearthing bugs here.
> >
> > > - Julian
> > >
> >
>
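For the lock-less case, here is a sketch of the re-try that a caller would need once the internal re-try is removed from svn_fs_fs__path_rev_absolute(). Again illustrative only: open_rev_file_with_retry() is an invented helper, assumed to live in fs_fs.c so that it can call the static update_min_unpacked_rev(); the pattern is the one described above (a failed open on a just-packed shard, refresh the cache, compute the path once more).

[[[
/* Illustrative sketch only -- an invented helper, assumed to be inside
   fs_fs.c so it can call the static update_min_unpacked_rev(). */
static svn_error_t *
open_rev_file_with_retry(apr_file_t **file,
                         svn_fs_t *fs,
                         svn_revnum_t rev,
                         apr_pool_t *pool)
{
  const char *path;
  svn_error_t *err;

  /* Compute the path from the cached min-unpacked-rev and try to open it. */
  SVN_ERR(svn_fs_fs__path_rev_absolute(&path, fs, rev, pool));
  err = svn_io_file_open(file, path, APR_READ | APR_BUFFERED,
                         APR_OS_DEFAULT, pool);

  if (err && APR_STATUS_IS_ENOENT(err->apr_err))
    {
      /* Another process may have packed this shard since we last read
         min-unpacked-rev: refresh the cache and re-try exactly once.
         A second failure is a real error and goes to the caller. */
      svn_error_clear(err);
      SVN_ERR(update_min_unpacked_rev(fs, pool));
      SVN_ERR(svn_fs_fs__path_rev_absolute(&path, fs, rev, pool));
      err = svn_io_file_open(file, path, APR_READ | APR_BUFFERED,
                             APR_OS_DEFAULT, pool);
    }

  return err;
}
]]]

The same refresh-and-retry idea is what the #ifndef SVN_DEBUG guard in the patch is meant to exercise: in a debug build the packing process deliberately keeps its own cache stale, so any code path that forgot to re-try (or to rely on the write lock) should fail early.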