On Thu, Oct 19, 2017 at 01:04:59AM -0400, Jeff King wrote:

> So. That leaves me with:
> 
>   - I'm unclear on whether next_byte() is meant to return that trailing
>     NUL or not. I don't think it causes any bugs, but it certainly
>     confused me for a function to take a cp/endp pair of pointers, and
>     then dereference endp. It might be worth either fixing or clarifying
>     with a comment.
> 
>   - Those loops to eat trailing whitespace are doing nothing. I'm not
>     sure if that all works out because next_byte() eats whitespaces or
>     not (I think not, because it doesn't eat whitespace for the
>     IGNORE_WHITESPACE_AT_EOL case). But I'm not quite sure what a test
>     would look like.

I had trouble constructing a test at first, but I think my test lines
just weren't long enough to trigger the movement heuristics. If I switch
to something besides seq, I can do:

  # any input that has reasonably sized lines
  look e | head -50 >file
  git add file
  
  perl -i -ne '
    # pick up lines 20-25 to move to line 40, and
    # add some trailing whitespace to them
    if ($. >= 20 && $. <= 25) {
      s/$/     /;
      $hold .= $_;
    } else {
      print $hold if ($. == 40);
      print;
    }
  ' file

  git diff --color-moved --ignore-space-at-eol

I think that _should_ show the block as moved, but it doesn't. But if I
apply this patch:

diff --git a/diff.c b/diff.c
index 93dccd1817..375d9cf447 100644
--- a/diff.c
+++ b/diff.c
@@ -743,8 +743,8 @@ static int moved_entry_cmp(const struct diff_options 
*diffopt,
                           const struct moved_entry *b,
                           const void *keydata)
 {
-       const char *ap = a->es->line, *ae = a->es->line + a->es->len;
-       const char *bp = b->es->line, *be = b->es->line + b->es->len;
+       const char *ap = a->es->line, *ae = a->es->line + a->es->len - 1;
+       const char *bp = b->es->line, *be = b->es->line + b->es->len - 1;
 
        if (!(diffopt->xdl_opts & XDF_WHITESPACE_FLAGS))
                return a->es->len != b->es->len  || memcmp(ap, bp, a->es->len);
@@ -771,7 +771,7 @@ static unsigned get_string_hash(struct emitted_diff_symbol 
*es, struct diff_opti
 {
        if (o->xdl_opts & XDF_WHITESPACE_FLAGS) {
                static struct strbuf sb = STRBUF_INIT;
-               const char *ap = es->line, *ae = es->line + es->len;
+               const char *ap = es->line, *ae = es->line + es->len - 1;
                int c;
 
                strbuf_reset(&sb);

it does. It just adjusts our "end pointer" to point to the last valid
character in the string (rather than one past), which seems to be the
convention that those loops (and next_byte) expect.

-Peff

Reply via email to