Julian Foad wrote on Wed, Dec 01, 2010 at 12:32:45 +0000: > On Wed, 2010-12-01, stef...@apache.org wrote: > > Port (not merge) a fix for a FSFS packing race condition from the > > performance branch to /trunk: There is a slight time window > > between finding the name of a rev file and actually open it. If > > the revision in question gets packed just within this window, > > we will find the new (and final) file name with just one retry. > > > > * subversion/libsvn_fs_fs/fs_fs.c > > (open_pack_or_rev_file): retry once upon "missing file" error > > Hi Stefan. > > (1) I think the doc string of svn_fs_fs__path_rev_absolute() should note > that the returned path may no longer be valid by the time you come to > use it, and the reason for that. Does my patch below sound right? >
Well, yes, and Stefan already added such a comment at the definition at my suggestion. But +1 to move that to the declaration. > (2) Doesn't the exact same race exist in *all* uses of > svn_fs_fs__path_rev_absolute()? There are five other calls to it, As far as I can tell, apart from open_pack_or_rev_file(), all callers always[1] run under the write lock. Since pack operation take the write lock too, it's not possible for the result of svn_fs_fs__path_rev_absolute() to become outdated for those callers. [1] with the exception of set_revision_proplist(), which can be called either under the write lock or under write_revision_zero() --- but in the latter case it's called before the format file has been created. > all of them using the result as a permissions reference to set the > file permissions on some new file. It seems to me that those uses can > fail in the same way. > > The root problem is the existence of a "get the path of this file" API > in the presence of semantics whereby the file might be removed from that > path at any time. > > Perhaps your fix is the best fix for this caller, but for the other > callers I think creating and using a "copy file permissions from > rev-file of rev R" API would be a better solution than adding re-tries > there. That API can do the re-try internally. > > What do you think? > > > Patch for (1) above: > [[[ > Index: subversion/libsvn_fs_fs/fs_fs.h > =================================================================== > --- subversion/libsvn_fs_fs/fs_fs.h (revision 1040662) > +++ subversion/libsvn_fs_fs/fs_fs.h (working copy) > @@ -404,13 +404,16 @@ svn_error_t *svn_fs_fs__txn_changes_fetc > PERMS_REFERENCE. Temporary allocations are from POOL. */ > svn_error_t *svn_fs_fs__move_into_place(const char *old_filename, > const char *new_filename, > const char *perms_reference, > apr_pool_t *pool); > > -/* Sets *PATH to the path of REV in FS, whether in a pack file or not. > +/* Set *PATH to the path of REV in FS, whether in a pack file or not. > + The path is not guaranteed to remain correct after the function returns, > + because it is possible that the revision will become packed just after > + this call, so the caller should re-try once if the path is not found. > Allocate in POOL. */ Sounds right. Bordering on bikeshed, I would suggest some changes: * Separate describing what the function does ("Sets *PATH to FOO and allocates in POOL") from describing the conditions the caller should beware of / check for ("sometimes the return value will be out-of-date by the time you receive it"). * Mention that the non-guarantee is not in effect if the caller has the write lock. > svn_error_t * > svn_fs_fs__path_rev_absolute(const char **path, > svn_fs_t *fs, > svn_revnum_t rev, > apr_pool_t *pool); Also, svn_fs_fs__path_rev_absolute() also does, internally, a single repeat loop. Theoretically this isn't needed any more now (since all callers either run under the write lock or retry)... > ]]] > > - Julian > >