On Wed, 2010-04-14, C. Michael Pilato wrote:
> Paul Burba wrote:
> > After thinking about this some more I see three options:
> >
> > A) Keep the pre-927243 behavior as the default and thus support the
> > incremental dump use case by default.  Add a new option to svnadmin
> > load, --skip-missing-merge-sources (or maybe
> > --filter-missing-merge-sources) to activate the filtering of mergeinfo
> > sources outside of the dump stream (i.e. make the r927243 changes only
> > take effect with this option).  The obvious drawback is that admins
> > must know to use this option.  They would still be able to partially
> > dump a repos then load it into an empty (or unrelated) repos and have
> > bogus mergeinfo.
> >
> > B) If the load stream's mergeinfo contains a merge source-rev pair
> > that predates the start of the load stream, but exists in the target
> > repository, then allow it to be loaded, otherwise filter it.  The
> > drawbacks are two: First, performance; we'd need to check every
> > path/rev pair of incoming mergeinfo which certainly isn't going to
> > speed up a load*.  Second, it may be mere coincidence that the
> > path/rev exists in the target repository at the start of the load.
> >
> > C) Revert r927243 and move the fix into svnadmin dump: Give an error
> > when doing a partial dump that contains mergeinfo with revisions that
> > predate the starting rev of the dump, and require the use of a new
> > --missing-merge-source=[skip|allow] option to successfully complete
> > the dump (i.e. something analogous to svndumpfilter's
> > --skip-missing-merge-sources).
> >
> > Of course there is always:
> >
> > D) <Insert your brilliant idea here>
>
> [Thinking aloud here.]
>
> We have analogous behavior today with the handling of copied paths.  There
> are two aspects of this handling which might serve as models in this new
> scenario:
>
> 1. 'svnadmin dump' warns when it is in incremental mode and must generate a
> copy action from a source that predates the beginning of the dump window,
> but it's only a warning and the dump continues.
> 2. 'svnadmin load' does nothing smart, trusting that the dump it's being
> fed is a sensible one.  If it gets a request to copy some path/rev that
> doesn't exist, stuff errors out and the repository is left in whatever state
> it was in prior to the revision that failed to load.
>
> Assuming similar behavior for mergeinfo handling, we have, at a minimum:
>
> 1. 'svnadmin dump' warns when it is in incremental mode and must generate
> mergeinfo from a merge source that predates the beginning of the dump
> window, but it's only a warning and the dump continues.

+1 on that, both for consistency with copy-source info and because it's a
good policy anyway.  A detail: I haven't tried it, but I would want it to
print this warning (and the copy-source one) at the end of the dump, and not
just in the middle of the long list of "Dumping rX" messages.

> Now, there's some question here about how to handle part (2).  In the copied
> path case, 'svnadmin load' can be stupid because the failure case is a noisy
> one -- stuff errors out from deep within the FS layer.  That's not quite the
> same failure case for mergeinfo, where bad mergeinfo would be silently
> recorded without error.  So we *could* try to validate the mergeinfo somehow
> against the history recorded in the target repository.  But there follows
> another complication: it's really not so hard to get bad (if perhaps
> innocuous) mergeinfo into a repository by normal means, and you certainly
> wouldn't expect that to cost you the ability to dump and load your repository!
>
> So I'm really left with this:
>
> 2. 'svnadmin load' does nothing smart, trusting that the dump it's being
> fed is a sensible one.

Better: 'svnadmin load' tries to validate the mergeinfo, and issues a warning
if it refers to a non-existent source.  The admin then gets to figure out
whether the mergeinfo was bad in the first place or whether it is a result of
his/her partial-dump/partial-load scenario.  However, it depends on how
efficient the checking is.  If that would make the 'load' really slow, I can
see that not being wanted.

- Julian

> As a net change against our shipping code, that's really just, what, the
> addition of warnings in 'svnadmin dump'?
>
> The question then becomes, "How do users deal with legitimate partial dumps
> that will be loaded atop something other than loads from previous
> incremental windows?"  I think they do this the same way they handle the
> copy case: either hand-hacking the dumpstream or using 'svndumpfilter'.  So
> maybe 'svndumpfilter' grows the logic and options required to pare away
> mergeinfo that doesn't meet some criteria (such as "ranges outside the dump
> stream range")?
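
For concreteness, the "paring away" being discussed (in options A and C
above, and in the closing svndumpfilter suggestion) amounts to walking an
svn:mergeinfo value and discarding revision ranges that fall before the
oldest revision in the dump window.  The Python sketch below is illustrative
only: it is not Subversion code and not a description of what r927243 does.
The function name, the oldest_rev parameter, and the choice to trim (rather
than drop) a range that straddles the window are assumptions made for the
example.

[[[
def filter_mergeinfo(mergeinfo, oldest_rev):
    """Return (filtered_mergeinfo, dropped), where 'dropped' lists the
    source/range pairs that referred only to revisions older than
    oldest_rev (the first revision present in the dump stream)."""
    kept_lines = []
    dropped = []
    for line in mergeinfo.splitlines():
        if not line.strip():
            continue
        path, ranges = line.rsplit(':', 1)
        kept_ranges = []
        for rng in ranges.split(','):
            # A range is "N" or "M-N", optionally suffixed with "*"
            # (non-inheritable).
            inheritable = not rng.endswith('*')
            core = rng.rstrip('*')
            if '-' in core:
                start, end = (int(r) for r in core.split('-'))
            else:
                start = end = int(core)
            if end < oldest_rev:
                # Wholly before the dump window: filter it out.
                dropped.append((path, rng))
            elif start < oldest_rev:
                # Straddles the window: trim it (one possible policy).
                start = oldest_rev
                new = str(start) if start == end else '%d-%d' % (start, end)
                kept_ranges.append(new if inheritable else new + '*')
            else:
                kept_ranges.append(rng)
        if kept_ranges:
            kept_lines.append('%s:%s' % (path, ','.join(kept_ranges)))
    return '\n'.join(kept_lines), dropped
]]]

With a dump window starting at r50, '/trunk:5-80' comes back as
'/trunk:50-80', while '/branches/old:10-20' is dropped and reported; the
'dropped' list is the sort of thing a dump-time warning (or an end-of-dump
summary, as suggested above) could be built from.  Whether a straddling
range should be trimmed or dropped wholesale is exactly the kind of policy
question options A-C are weighing.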
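
As for the cost of validating mergeinfo against the target repository: the
check described above boils down to asking, for each merge source path and
revision mentioned, whether that path existed at that revision in the
repository being loaded into.  Below is a deliberately naive, out-of-band
sketch of that check using svnlook rather than anything inside
'svnadmin load'.  The helper name is made up, only the last revision of each
range is checked to keep the sketch short, and it assumes svnlook exits
non-zero when the path (or revision) does not exist.

[[[
import subprocess

def find_missing_sources(repos_path, mergeinfo):
    """Return (path, rev) pairs from 'mergeinfo' whose source does not
    exist in the repository at 'repos_path'."""
    missing = []
    for line in mergeinfo.splitlines():
        if not line.strip():
            continue
        path, ranges = line.rsplit(':', 1)
        for rng in ranges.split(','):
            # Check only the end revision of each range ("N" or "M-N",
            # possibly suffixed with "*").
            end_rev = int(rng.rstrip('*').split('-')[-1])
            # Assumption: 'svnlook tree' fails (non-zero exit) if the path
            # does not exist at that revision, or the revision is absent.
            result = subprocess.run(
                ['svnlook', 'tree', '-r', str(end_rev),
                 repos_path, path.lstrip('/')],
                stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
            if result.returncode != 0:
                missing.append((path, end_rev))
    return missing
]]]

A real implementation inside the FS layer would be far cheaper than one
subprocess per path/rev pair, but the per-pair nature of the check is the
same, which is the performance concern raised in option B and in the caveat
about making 'load' really slow.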