Re: applying hunks in svn patch

Stefan Sperling Tue, 17 Nov 2009 00:17:49 -0800

On Mon, Nov 16, 2009 at 11:31:12PM +0200, Daniel Shahaf wrote:
> Stefan Sperling wrote on Mon, 16 Nov 2009 at 14:39 +0100:
> > The goals of the hunk-application algorithm are:
> >  1) Memory usage is bound by the longest line occurring in the patch file
> 
> "bounded"


sbutler agrees (and he's american, so he probably doesn't know
english very well -- but at least we'll get things wrong consistently
when we follow his advice :)

> > Let C represent the current index into the lists of candidate offsets.
> > Set C = 1 (indexing the first candidate offset).
> > Iterate over the list of hunks.
> > If the hunk is already marked applied or rejected, ignore it.
> > If the hunk's Cth candidate offset is not contained in any range described
> > by the list of occupied lines, store its current candidate offset and the
> > length of its original text in the list of occupied line ranges and mark
> > the hunk as applied.
> > Else, if the hunk is not marked applied, and its current candidate offset
> > is within the range of occupied lines, defer handling of this hunk to the
> > next iteration.
> > If the hunk has no Cth candidate offset, mark the hunk as rejected.
> > Increment C and iterate.
> > 
> 
> i.e., iterate all 1st candidates, then all 2nd candidates, etc.  Why not
> iterate by hunks first?  (i.e., first all candidates of 1st hunk, then
> all candidates of 2nd hunk, etc.)

We already know what the best candidate offset for each hunk is,
so why iterate over the candidates without considering how placing
the current hunk at the current candidate offset affects other hunks?

The point of this step is to prevent hunks from occupying overlapping
line ranges. Any 2 hunks read from the patch cannot have occupied
overlapping space in the original file, otherwise they would be a single,
larger, hunk.
 
> > After this step, all hunks are marked either applied or rejected.
> > If any hunks were rejected and rejection is not allowed, raise an error.
> > 
> > Repeat the above process for the next patch target.
> > 
> 
> If the first target applies cleanly and the second contains rejected
> hunk, do we prompt the user before modifying the first target?

No, we default to not changing anything at all if any hunk does not
apply cleanly, for any target. Either all targets are modified, or
none at all -- unless rejects are allowed, in which case we produce
reject files for any rejected hunks, for any target.

> (if not, we might partially-apply a patch, only to some of the files it
> touches)
> 
> > Once all patch targets are processed as above, read each target file again
> > from the top, copying lines not contained in the list of lines occupied by
> > hunks verbatim. Replace any lines occupied by a hunk with lines of the 
> > hunk's
> > modified text.
> > (Note that the number of lines in the modified text of a hunk may differ
> > from the number of lines in the original text of a hunk, so the patched
> > file can have a different number of lines than the original file. All 
> > offsets
> > are relative to the original file, however.)
> > 
> > Write any rejected hunks for this target to a reject file.
> > 
> 
> In what format?

Rejected hunks are written out back to back, as they appear in the patch,
in plain unidiff format.
 
> > The patch is now applied.
> 
> Daniel

Thanks for your review!

> (btw... would be nice to use this code with non-svn patches.  Especially
> since it has interactive conflict resolution now...)

It does not anymore, since we removed use of svn_wc_merge4().
Julian and I determined that, in general, it's not possible for diff3
to create meaningful text conflicts if you don't really know what base
the merge left and merge right are derived from.

Stefan

Re: applying hunks in svn patch

Reply via email to