Using Environment variable GIT_CONFIG does not work as expected

2018-07-30 Thread Mika Vesalainen
Hi,

I have a shared linux account which is used by multiple developers. But I would 
like to have git commit history configured so that I can see who made a change 
to common repository (so that author of the commit would be correct person, not 
shared user). There are reasons why developers cannot clone this environment to 
their own accounts.

So I don't have ~/.gitconfig in place for the shared user, and when developer 
log in I enforce them to configure git for them first. They must "export 
GIT_CONFIG=my_specific_gitconfig".
When this is done, "git config -l" will show correctly the user.name, 
user.email and other parameters which are set in "my_specific_gitconfig".

However, if user tries now to create a commit the git blames:
*** Please tell me who you are.
and so on...

But running "git config -l" shows that 'user.name' and 'user.email' are 
properly configured.
Do I need to configure something more in order to get this GIT_CONFIG 
environment variable working. I'm working in Debian Linux environment.

I have tested this with git versions: 2.1.4, 2.11.0 and 2.18.0.321.gffc6fa0

Thanks,
Mika




Re: Hash algorithm analysis

2018-07-30 Thread Johannes Schindelin
Hi Brian,

On Tue, 24 Jul 2018, brian m. carlson wrote:

> On Tue, Jul 24, 2018 at 02:13:07PM -0700, Junio C Hamano wrote:
> > Yup.  I actually was leaning toward saying "all of them are OK in
> > practice, so the person who is actually spear-heading the work gets to
> > choose", but if we picked SHA-256 now, that would not be a choice that
> > Brian has to later justify for choosing against everybody else's
> > wishes, which makes it the best choice ;-)
> 
> This looks like a rough consensus.

As I grew really uncomfortable with having a decision that seems to be
based on hunches by non-experts (we rejected the preference of the only
cryptographer who weighed in, after all, precisely like we did over a
decade ago), I asked whether I could loop in one of our in-house experts
with this public discussion.

Y'all should be quite familiar with his work: Dan Shumow.

Dan, thank you for agreeing to chime in publicly.

Ciao,
Dscho


[PATCH 1/2] sequencer: fix "rebase -i --root" corrupting author header

2018-07-30 Thread Eric Sunshine
When "git rebase -i --root" creates a new root commit (say, by swapping
in a different commit for the root), it corrupts the commit's "author"
header with trailing garbage:

author A U Thor  @1112912773 -0700...@example.com

This is a result of read_author_ident() neglecting to NUL-terminate the
buffer into which it composes the "author" header.

(Note that the extra "0" in the timezone is a separate bug which will be
fixed subsequently.)

Security considerations: Construction of the "author" header by
read_author_ident() happens in-place and in parallel with parsing the
content of "rebase-merge/author-script" which occupies the same buffer.
This is possible because the constructed "author" header is always
smaller than the content of "rebase-merge/author-script". Despite
neglecting to NUL-terminate the constructed "author" header, memory is
never accessed (either by read_author_ident() or its caller) beyond the
allocated buffer since a NUL-terminator is present at the end of the
loaded "rebase-merge/author-script" content, and additional NUL's are
inserted as part of the parsing process.

Signed-off-by: Eric Sunshine 
---
 sequencer.c   |  2 +-
 t/t3404-rebase-interactive.sh | 10 +-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 16c1411054..78864d9072 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -744,7 +744,7 @@ static const char *read_author_ident(struct strbuf *buf)
return NULL;
}
 
-   buf->len = out - buf->buf;
+   strbuf_setlen(buf, out - buf->buf);
return buf->buf;
 }
 
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 01616901bd..8509c89a26 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1238,7 +1238,7 @@ rebase_setup_and_clean () {
test_might_fail git branch -D $1 &&
test_might_fail git rebase --abort
" &&
-   git checkout -b $1 master
+   git checkout -b $1 ${2:-master}
 }
 
 test_expect_success 'drop' '
@@ -1415,4 +1415,12 @@ test_expect_success 'rebase -i --gpg-sign= 
overrides commit.gpgSign' '
test_i18ngrep "$SQ-S\"S I Gner\"$SQ" err
 '
 
+test_expect_success 'valid author header after --root swap' '
+   rebase_setup_and_clean author-header no-conflict-branch &&
+   set_fake_editor &&
+   FAKE_LINES="2 1" git rebase -i --root &&
+   git cat-file commit HEAD^ >out &&
+   grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
+'
+
 test_done
-- 
2.18.0.597.ga71716f1ad



[PATCH 2/2] sequencer: fix "rebase -i --root" corrupting author header timezone

2018-07-30 Thread Eric Sunshine
When "git rebase -i --root" creates a new root commit, it corrupts the
"author" header's timezone by repeating the last digit:

author A U Thor  @1112912773 -07000

This is due to two bugs.

First, write_author_script() neglects to add the closing quote to the
value of GIT_AUTHOR_DATE when generating "rebase-merge/author-script".

Second, although sq_dequote() correctly diagnoses the missing closing
quote, read_author_ident() ignores sq_dequote()'s return value and
blindly uses the result of the aborted dequote.

sq_dequote() performs dequoting in-place by removing quoting and
shifting content downward. When it detects misquoting (lack of closing
quote, in this case), it gives up and returns an error without inserting
a NUL-terminator at the end of the shifted content, which explains the
duplicated last digit in the timezone.

Signed-off-by: Eric Sunshine 
---
 sequencer.c   | 7 ++-
 t/t3404-rebase-interactive.sh | 2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 78864d9072..1008f6d71a 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -654,6 +654,7 @@ static int write_author_script(const char *message)
strbuf_addch(&buf, *(message++));
else
strbuf_addf(&buf, "'%c'", *(message++));
+   strbuf_addch(&buf, '\'');
res = write_message(buf.buf, buf.len, rebase_path_author_script(), 1);
strbuf_release(&buf);
return res;
@@ -724,7 +725,11 @@ static const char *read_author_ident(struct strbuf *buf)
 
eol = strchrnul(in, '\n');
*eol = '\0';
-   sq_dequote(in);
+   if (!sq_dequote(in)) {
+   warning(_("bad quoting on %s value in '%s'"),
+   keys[i], rebase_path_author_script());
+   return NULL;
+   }
len = strlen(in);
 
if (i > 0) /* separate values by spaces */
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 8509c89a26..37796bb4c1 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -1420,7 +1420,7 @@ test_expect_success 'valid author header after --root 
swap' '
set_fake_editor &&
FAKE_LINES="2 1" git rebase -i --root &&
git cat-file commit HEAD^ >out &&
-   grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
+   grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$" out
 '
 
 test_done
-- 
2.18.0.597.ga71716f1ad



[PATCH 0/2] fix "rebase -i --root" corrupting root commit

2018-07-30 Thread Eric Sunshine
This series fixes bugs causing corruption of the root commit when
"rebase -i --root" is used to swap in a new root commit. In particular,
the "author" header has trailing garbage. Some tools handle the
corruption somewhat gracefully by showing a bogus date, but others barf
on it (gitk, for instance). git-fsck correctly identifies the
corruption. I discovered this after git-rebase corrupted one of my own
projects.

Unfortunately, these bugs (from js/sequencer-and-root-commits) made it
into the v2.18.0 release. It's worrying that a released Git can be
creating corrupt commits, but fortunately "rebase -i --root" is not
likely used often (especially on well-established projects).
Nevertheless, it may be 'maint' worthy and applies cleanly there.

It was only after I diagnosed and fixed these bugs that I thought to
check 'pu' and discovered that Akinori MUSHA already made a stab[1] at
fixing one of the three bugs which this series fixes. Akinori's fix has
the somewhat undesirable property that it adds an extra blank line to
the end of the script, as Phillip correctly pointed out in review[2].
Patch 2/2 of this series has the more "correct" fix, in addition to
fixing another bug.

Moreover, patch 2/2 of this series provides a more thorough fix overall
than Akinori, so it may make sense to replace his patch with this
series, though perhaps keep the test his patch adds to augment the
strict test of the "author" header added by this series.

[1]: https://public-inbox.org/git/86a7qwpt9g@idaemons.org/
[2]: 
https://public-inbox.org/git/f5b56540-d26a-044e-5f46-1d975f889...@talktalk.net/

Eric Sunshine (2):
  sequencer: fix "rebase -i --root" corrupting author header
  sequencer: fix "rebase -i --root" corrupting author header timezone

 sequencer.c   |  9 +++--
 t/t3404-rebase-interactive.sh | 10 +-
 2 files changed, 16 insertions(+), 3 deletions(-)

-- 
2.18.0.597.ga71716f1ad



Re: [RFC PATCH] sequencer: fix quoting in write_author_script

2018-07-30 Thread Phillip Wood
On 27/07/18 13:37, Johannes Schindelin wrote:
> Hi Phillip, Junio and Akinori,
> 
> I just noticed that t3404 is broken without my patches (but with Junio's
> fixup), on Windows, macOS and Linux. (See log at the end.)
> 
> On Fri, 27 Jul 2018, Phillip Wood wrote:
> 
>> On 26/07/18 13:33, Johannes Schindelin wrote:
>>>
>>> On Wed, 18 Jul 2018, Phillip Wood wrote:
>>>
 Single quotes should be escaped as \' not \\'. Note that this only
 affects authors that contain a single quote and then only external
 scripts that read the author script and users whose git is upgraded from
 the shell version of rebase -i while rebase was stopped. This is because
 the parsing in read_env_script() expected the broken version and for
 some reason sq_dequote() called by read_author_ident() seems to handle
 the broken quoting correctly.

 Ideally write_author_script() would be rewritten to use
 split_ident_line() and sq_quote_buf() but this commit just fixes the
 immediate bug.
>>>
 This is untested, unfortuantely I don't have really have time to write a 
 test or
 follow this up at the moment, if someone else want to run with it then 
 please
 do.
>>>
>>> I modified the test that was added by Akinori. As it was added very early,
>>> and as there is still a test case *after* Akinori's that compares a
>>> hard-coded SHA-1, I refrained from using `test_commit` (which would change
>>> that SHA-1). See below.
>>
>> Thanks for adding a test, that sounds like sensible approach, however
>> having thought about it I wonder if we should just be writing a plain
>> text file (e.g rebase-merge/author-data) and fixing the reader to read
>> that if it exists and only then fall back to reading the legacy
>> rebase-merge/author-script with a fix to correctly handle the script
>> written by the shell version - what do you think? The author-script
>> really should be just an implementation detail. If anyone really wants
>> to read it they can still do 'read -r l' and split the lines with
>> ${l%%=*} and ${l#*=}
> 
> In contrast to `git am`, there *is* a use case where power users might
> have come to rely on the presence of the .git/rebase-merge/author-script
> file *and* its nature as a shell script snippet: we purposefully allow
> scripting `rebase -i`.
> 
> So I don't think that we can declare the file and its format as
> implementation detail, even if the idea is very, very tempting.
> 
 diff --git a/sequencer.c b/sequencer.c
 index 5354d4d51e..0b78d1f100 100644
 --- a/sequencer.c
 +++ b/sequencer.c
 @@ -638,21 +638,21 @@ static int write_author_script(const char *message)
else if (*message != '\'')
strbuf_addch(&buf, *(message++));
else
 -  strbuf_addf(&buf, "'%c'", *(message++));
 +  strbuf_addf(&buf, "'\\%c'", *(message++));
strbuf_addstr(&buf, "'\nGIT_AUTHOR_EMAIL='");
while (*message && *message != '\n' && *message != '\r')
if (skip_prefix(message, "> ", &message))
break;
else if (*message != '\'')
strbuf_addch(&buf, *(message++));
else
 -  strbuf_addf(&buf, "'%c'", *(message++));
 +  strbuf_addf(&buf, "'\\%c'", *(message++));
strbuf_addstr(&buf, "'\nGIT_AUTHOR_DATE='@");
while (*message && *message != '\n' && *message != '\r')
if (*message != '\'')
strbuf_addch(&buf, *(message++));
else
 -  strbuf_addf(&buf, "'%c'", *(message++));
 +  strbuf_addf(&buf, "'\\%c'", *(message++));
res = write_message(buf.buf, buf.len, rebase_path_author_script(), 1);
>>>
>>> I resolved the merge conflict with Akinori's patch. FWIW I pushed all of
>>> this, including the fixup to Junio's fixup to the
>>> `fix-t3404-author-script-test` branch at https://github.com/dscho/git.
>>>
strbuf_release(&buf);
return res;
 @@ -666,13 +666,21 @@ static int read_env_script(struct argv_array *env)
  {
struct strbuf script = STRBUF_INIT;
int i, count = 0;
 -  char *p, *p2;
 +  const char *p2;
 +  char *p;
  
if (strbuf_read_file(&script, rebase_path_author_script(), 256) <= 0)
return -1;
  
for (p = script.buf; *p; p++)
 -  if (skip_prefix(p, "'''", (const char **)&p2))
 +  /*
 +   * write_author_script() used to escape "'" incorrectly as
 +   * "'''" rather than "'\\''" so we check for the correct
 +   * version the incorrect version in case git was upgraded while
 +   * rebase was stopped.
 +   */
 +  if (skip_prefix(p, "'\\''", &p2) ||
 +  skip_prefix(p, "'''", &p2))
>>>
>>> I think in this form, it is 

Is detecting endianness at compile-time unworkable?

2018-07-30 Thread Ævar Arnfjörð Bjarmason


On Sun, Jul 29 2018, Michael wrote:

> On 29/07/2018 22:06, brian m. carlson wrote:
>> On Sun, Jul 29, 2018 at 09:48:43PM +0200, Michael wrote:
>>> On 29/07/2018 21:27, brian m. carlson wrote:
 Well, that explains it.  I would recommend submitting a patch to
 https://github.com/cr-marcstevens/sha1collisiondetection, and the we can
 pull in the updated submodule with that fix.
>>> Not sure I am smart enough to do that. I'll have to download, build, and see
>>> what it says.
>> The issue is that somewhere in lib/sha1.c, you need to cause
>> SHA1DC_BIGENDIAN to be set.  That means you need to figure out what
>> compiler macro might indicate that.
> I remember - roughly - a few decades back - having an assignment to
> write code to determine endianness. PDP and VAC were different iirc,
> and many other micro-processors besides the 8088/8086/z85/68k/etc..
>
> If you are looking for a compiler macro as a way to determine this -
> maybe you have one for gcc, but not for xlc. I do not know it - currently :)

I'm not familiar with AIX, but from searching around I found this
porting manual from IBM:
http://www.redbooks.ibm.com/redbooks/pdfs/sg246034.pdf

There they suggest either defining your own macros, or testing the
memory layout at runtime (see section "2.2.2.3 Technique 3: Testing
memory layout" and surrounding sections).

Perhaps it's worth taking a step back here and thinking about whether
this whole thing is unworkable. It was hard enough to get this to work
on the combination of Linux, *BSD and Solaris, but I suspect we'll run
into increasingly obscure platforms where this is hard or impossible
(AIX, HP/UX etc.)

The reason we're in this hole is because we use this
sha1collisiondetection library to do SHA-1, and the reason we have
issues with it specifically (not OpenSSL et al) is because its only
method of detecting endianness is at compile time.

This didn't use to be the case, it was changed in this commit:
https://github.com/cr-marcstevens/sha1collisiondetection/commit/d597672

Dan Shumow: Since the commit message doesn't say why, can you elaborate
a bit on why this was done, i.e. is determining this at runtime harmful
for performance? If not, perhaps it would be best to bring this back, at
least as an option.

And, as an aside, the reason we can't easily make it better ourselves is
because the build process for git.git doesn't have a facility to run
code to detect this type of stuff (the configure script is always
optional). So we can't just run this test ourselves.

Junio: I've barked up that particular tree before in
https://public-inbox.org/git/87a7x3kmh5@evledraar.gmail.com/ and I
won't bore you all by repeating myself, except to say that this is yet
another case where I wish we had a hard dependency on some way of doing
checks via compiled code in our build system.


Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit

2018-07-30 Thread Eric Sunshine
On Mon, Jul 30, 2018 at 5:30 AM Eric Sunshine  wrote:
> This series fixes bugs causing corruption of the root commit when
> "rebase -i --root" is used to swap in a new root commit. In particular,
> the "author" header has trailing garbage. Some tools handle the
> corruption somewhat gracefully by showing a bogus date, but others barf
> on it (gitk, for instance). git-fsck correctly identifies the
> corruption. I discovered this after git-rebase corrupted one of my own
> projects.

Unfortunately, after sending this series, I see that there is
additional corruption that needs to be addressed. In particular, the
"author" header time is incorrectly prefixed with "@", so this series
is going to need a re-roll to fix that, as well.


GOOD NEWS

2018-07-30 Thread REV. AUSTIN IBEH PAYMENT INSTRUCTOR
ATTENTION BENEFICIARY:

THIS IS TO INFORM YOU THAT A DELIVERY PERSONNEL IS PRESENTLY IN YOUR
HOME COUNTRY WITH YOUR CONSIGNMENT AS SEALED .THE UNITED NATION
MONETARY SETTLEMENT AGENCY ASSIGNED $15.5M IN YOUR FAVOR AS THE
BENEFICIARY,YOUR NAME APPEAR  TO BE ONE OF THE BENEFICIARY.YOU ARE
REQUIRED TO SEND THE BELOW INFORMATION TO AVOID HANDING OVER TO A
WRONG PERSON OR DEALING WITH IMPERSONATORS.

YOUR FULL NAME:
YOUR HOME ADDRESS:
YOUR DIRECT PHONE NUMBER:
YOUR  OCCUPATION:
YOUR IDENTIFICATION:
YOUR CERTIFY CERTIFICATE OF ORIGIN:

TO ENABLE THE DELIVERY PERSONNEL WHOM IS PRESENTLY IN YOUR COUNTRY
HAND THE CONSIGNMENT AS SEALED TO YOU WITH OUT ANY DELAY.

WAITING TO HEAR FROM YOU IMMEDIATELY.

REMAIN BLESS OF THE LORD.

YOURS IN SERVICES

REV. AUSTIN IBEH
PAYMENT INSTRUCTOR


Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit

2018-07-30 Thread Phillip Wood
On 30/07/18 10:29, Eric Sunshine wrote:
> This series fixes bugs causing corruption of the root commit when
> "rebase -i --root" is used to swap in a new root commit. In particular,
> the "author" header has trailing garbage. Some tools handle the
> corruption somewhat gracefully by showing a bogus date, but others barf
> on it (gitk, for instance). git-fsck correctly identifies the
> corruption. I discovered this after git-rebase corrupted one of my own
> projects.
> 
> Unfortunately, these bugs (from js/sequencer-and-root-commits) made it
> into the v2.18.0 release. It's worrying that a released Git can be
> creating corrupt commits, but fortunately "rebase -i --root" is not
> likely used often (especially on well-established projects).
> Nevertheless, it may be 'maint' worthy and applies cleanly there.
> 
> It was only after I diagnosed and fixed these bugs that I thought to
> check 'pu' and discovered that Akinori MUSHA already made a stab[1] at
> fixing one of the three bugs which this series fixes. Akinori's fix has
> the somewhat undesirable property that it adds an extra blank line to
> the end of the script, as Phillip correctly pointed out in review[2].
> Patch 2/2 of this series has the more "correct" fix, in addition to
> fixing another bug.
> 
> Moreover, patch 2/2 of this series provides a more thorough fix overall
> than Akinori, so it may make sense to replace his patch with this
> series, though perhaps keep the test his patch adds to augment the
> strict test of the "author" header added by this series.

Johannes and I have some fixups for Akinori's patch on the branch
fix-t3403-author-script-test at https://github.com/phillipwood/git

That branch also contains a fix for the bad quoting of names with "'" in
them. I think it would be good to somehow try and combine this series
with those patches.

I'd really like to see a single function to read and another to write
the author script that is shared by 'git am' and 'git rebase -i', rather
than the two writers and three readers we have at the moment. I was
thinking of doing that in the longer term, but given the extra bug
you've found in read_author_script() maybe we should do that sooner
rather than later.

> [1]: https://public-inbox.org/git/86a7qwpt9g@idaemons.org/
> [2]: 
> https://public-inbox.org/git/f5b56540-d26a-044e-5f46-1d975f889...@talktalk.net/
> 
> Eric Sunshine (2):
>   sequencer: fix "rebase -i --root" corrupting author header
>   sequencer: fix "rebase -i --root" corrupting author header timezone
> 
>  sequencer.c   |  9 +++--
>  t/t3404-rebase-interactive.sh | 10 +-
>  2 files changed, 16 insertions(+), 3 deletions(-)
> 



[PATCH 2/2] Highlight keywords in remote sideband output.

2018-07-30 Thread Han-Wen Nienhuys
The highlighting is done on the client-side. Supported keywords are
"error", "warning", "hint" and "success".

The colorization is controlled with the config setting "color.remote".

Signed-off-by: Han-Wen Nienhuys 
Change-Id: I090412a1288bc2caef0916447e28c2d0199da47d
---
 sideband.c  | 79 +
 t/t5409-colorize-remote-messages.sh | 34 +
 2 files changed, 104 insertions(+), 9 deletions(-)
 create mode 100644 t/t5409-colorize-remote-messages.sh

diff --git a/sideband.c b/sideband.c
index 325bf0e97..15213a7c1 100644
--- a/sideband.c
+++ b/sideband.c
@@ -1,6 +1,63 @@
 #include "cache.h"
 #include "pkt-line.h"
 #include "sideband.h"
+#include "color.h"
+
+static int sideband_use_color = -1;
+
+/*
+ * Optionally highlight some keywords in remote output if they appear at the
+ * start of the line.
+ */
+void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
+{
+   if (sideband_use_color < 0) {
+   const char *key = "color.remote";
+   char *value = NULL;
+   if (!git_config_get_string(key, &value))
+   sideband_use_color = git_config_colorbool(key, value);
+   }
+
+   int want_color = want_color_stderr(sideband_use_color);
+   int i;
+
+   if (!want_color) {
+   strbuf_add(dest, src, n);
+   return;
+   }
+
+   struct kwtable {
+   const char *keyword;
+   const char *color;
+   } keywords[] = {
+   {"hint", GIT_COLOR_YELLOW},
+   {"warning", GIT_COLOR_BOLD_YELLOW},
+   {"success", GIT_COLOR_BOLD_GREEN},
+   {"error", GIT_COLOR_BOLD_RED},
+   };
+
+   while (isspace(*src)) {
+   strbuf_addch(dest, *src);
+   src++;
+   n--;
+   }
+
+   for (i = 0; i < ARRAY_SIZE(keywords); i++) {
+   struct kwtable* p = keywords + i;
+   int len = strlen(p->keyword);
+   if (!strncasecmp(p->keyword, src, len) && !isalnum(src[len])) {
+   strbuf_addstr(dest, p->color);
+   strbuf_add(dest, src, len);
+   strbuf_addstr(dest, GIT_COLOR_RESET);
+   n -= len;
+   src += len;
+   break;
+   }
+   }
+
+   strbuf_add(dest, src, n);
+}
+
 
 /*
  * Receive multiplexed output stream over git native protocol.
@@ -48,8 +105,10 @@ int recv_sideband(const char *me, int in_stream, int out)
len--;
switch (band) {
case 3:
-   strbuf_addf(&outbuf, "%s%s%s", outbuf.len ? "\n" : "",
-   DISPLAY_PREFIX, buf + 1);
+   strbuf_addf(&outbuf, "%s%s", outbuf.len ? "\n" : "",
+   DISPLAY_PREFIX);
+   maybe_colorize_sideband(&outbuf, buf + 1, len);
+
retval = SIDEBAND_REMOTE_ERROR;
break;
case 2:
@@ -69,20 +128,22 @@ int recv_sideband(const char *me, int in_stream, int out)
if (!outbuf.len)
strbuf_addstr(&outbuf, DISPLAY_PREFIX);
if (linelen > 0) {
-   strbuf_addf(&outbuf, "%.*s%s%c",
-   linelen, b, suffix, *brk);
-   } else {
-   strbuf_addch(&outbuf, *brk);
+   maybe_colorize_sideband(&outbuf, b, 
linelen);
+   strbuf_addstr(&outbuf, suffix);
}
+
+   strbuf_addch(&outbuf, *brk);
xwrite(2, outbuf.buf, outbuf.len);
strbuf_reset(&outbuf);
 
b = brk + 1;
}
 
-   if (*b)
-   strbuf_addf(&outbuf, "%s%s", outbuf.len ?
-   "" : DISPLAY_PREFIX, b);
+   if (*b) {
+   strbuf_addstr(&outbuf, outbuf.len ?
+   "" : DISPLAY_PREFIX);
+   maybe_colorize_sideband(&outbuf, b, strlen(b));
+   }
break;
case 1:
write_or_die(out, buf + 1, len);
diff --git a/t/t5409-colorize-remote-messages.sh 
b/t/t5409-colorize-remote-messages.sh
new file mode 100644
index 0..1620cffbe
--- /dev/null
+++ b/t/t5409-colorize-remote-messages.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+
+test_description='remote messages are colorized on the client'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+   mkd

[PATCH 1/2] Document git config getter return value.

2018-07-30 Thread Han-Wen Nienhuys
---
 config.h | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/config.h b/config.h
index b95bb7649..d39256eb1 100644
--- a/config.h
+++ b/config.h
@@ -178,11 +178,16 @@ struct config_set {
 };

 extern void git_configset_init(struct config_set *cs);
-extern int git_configset_add_file(struct config_set *cs, const char *filename);
-extern int git_configset_get_value(struct config_set *cs, const char *key, 
const char **value);
+
 extern const struct string_list *git_configset_get_value_multi(struct 
config_set *cs, const char *key);
 extern void git_configset_clear(struct config_set *cs);
+
+/*
+ * The int return values in these functions is 1 if not found, 0 if found, 
leaving
+ * the found value in the 'dest' pointer.
+ */
+extern int git_configset_add_file(struct config_set *cs, const char *filename);
+extern int git_configset_get_value(struct config_set *cs, const char *key, 
const char **dest);
 extern int git_configset_get_string_const(struct config_set *cs, const char 
*key, const char **dest);
 extern int git_configset_get_string(struct config_set *cs, const char *key, 
char **dest);
 extern int git_configset_get_int(struct config_set *cs, const char *key, int 
*dest);
--
2.18.0.345.g5c9ce644c3-goog


Re: [PATCH 2/2] sequencer: fix "rebase -i --root" corrupting author header timezone

2018-07-30 Thread Phillip Wood
Hi Eric
On 30/07/18 10:29, Eric Sunshine wrote:
> When "git rebase -i --root" creates a new root commit, it corrupts the
> "author" header's timezone by repeating the last digit:
> 
> author A U Thor  @1112912773 -07000
> 
> This is due to two bugs.
> 
> First, write_author_script() neglects to add the closing quote to the
> value of GIT_AUTHOR_DATE when generating "rebase-merge/author-script".
> 
> Second, although sq_dequote() correctly diagnoses the missing closing
> quote, read_author_ident() ignores sq_dequote()'s return value and
> blindly uses the result of the aborted dequote.
> 
> sq_dequote() performs dequoting in-place by removing quoting and
> shifting content downward. When it detects misquoting (lack of closing
> quote, in this case), it gives up and returns an error without inserting
> a NUL-terminator at the end of the shifted content, which explains the
> duplicated last digit in the timezone.
> 
> Signed-off-by: Eric Sunshine 
> ---
>  sequencer.c   | 7 ++-
>  t/t3404-rebase-interactive.sh | 2 +-
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index 78864d9072..1008f6d71a 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -654,6 +654,7 @@ static int write_author_script(const char *message)
>   strbuf_addch(&buf, *(message++));
>   else
>   strbuf_addf(&buf, "'%c'", *(message++));
> + strbuf_addch(&buf, '\'');
>   res = write_message(buf.buf, buf.len, rebase_path_author_script(), 1);
>   strbuf_release(&buf);
>   return res;
> @@ -724,7 +725,11 @@ static const char *read_author_ident(struct strbuf *buf)
>  
>   eol = strchrnul(in, '\n');
>   *eol = '\0';
> - sq_dequote(in);
> + if (!sq_dequote(in)) {
> + warning(_("bad quoting on %s value in '%s'"),
> + keys[i], rebase_path_author_script());
> + return NULL;

I think we want to handle the broken author script properly rather than
returning NULL. If we had a single function
int read_author_script(const char **name, const char **author, const
char **date)
to read the author script that tried sq_dequote() and then fell back to
code based on read_env_script() that handled the missing "'" at the end
and also the bad quoting of "'" if sq_dequote() failed it would make it
easier to fix the existing bugs, rather than having to fix
read_author_ident() and read_env_script() separately. What do you think?

Best Wishes

Phillip

> + }
>   len = strlen(in);
>  
>   if (i > 0) /* separate values by spaces */
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index 8509c89a26..37796bb4c1 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh
> @@ -1420,7 +1420,7 @@ test_expect_success 'valid author header after --root 
> swap' '
>   set_fake_editor &&
>   FAKE_LINES="2 1" git rebase -i --root &&
>   git cat-file commit HEAD^ >out &&
> - grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
> + grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$" out
>  '
>  
>  test_done
> 



Re: [PATCH] RFC Highlight keywords in remote sideband output.

2018-07-30 Thread Han-Wen Nienhuys
On Thu, Jul 26, 2018 at 8:50 PM Junio C Hamano  wrote:
> Hold your objection a bit.  I'll come back to it soon ;-)
>
> It theoretically may make more sense to color on the sender side,
> but that is true only if done at a higher layer that prepares a
> string and calls into the sideband code to send.  That code must
> know what the bytes _mean_ a lot better than the code at the
> sideband layer, so we do not have to guess.
>
> Having written all the above, I think you are doing this at the
> receiving end, so this actually makes quite a lot of sense.  I was
> fooled greatly by "EMIT_sideband", which in reality does NOT emit at
> all.  That function is badly misnamed.

fixed.

> The function is more like "color sideband payload"; actual
> "emitting" is still done at the places the code originally "emitted"
> them to the receiving user.
>
> > Signed-off-by: Han-Wen Nienhuys 
> > Change-Id: I090412a1288bc2caef0916447e28c2d0199da47d
>
> That's an unusual trailer we do not use in this project.

Yes, I know. I forgot to strip it from v2 again, though :-(

> > +void emit_sideband(struct strbuf *dest, const char *src, int n) {
>
> Open brace on its own line.

Done.

> > +// NOSUBMIT - maybe use transport.color property?
>
> Avoid // comment.

Done

> In our codebase in C, asterisk sticks to the variable not the type.

Done.

> > +} keywords[] = {
> > +{"hint", GIT_COLOR_YELLOW},
> > +{"warning", GIT_COLOR_BOLD_YELLOW},
> > +{"success", GIT_COLOR_BOLD_GREEN},
> > +{"error", GIT_COLOR_BOLD_RED},
> > +{},
>
> Drop the last sentinel element, and instead stop iteration over the
> array using (i < ARRAY_SIZE(keywords)).

Done.

> > +for (struct kwtable* p = keywords; p->keyword; p++) {
>
> Does anybody know if we already use the variable decl inside the
> "for (...)" construct like this?  I know we discussed the idea of
> using it somewhere as a weather-balloon to see if people with exotic
> environment would mind, and I certainly do not mind making this
> patch serve as such a weather-baloon, but if that is what we are
> doing, I want the commit log message clearly marked as such, so that
> we can later "git log --grep=C99" to see how long ago such an
> experiment started.

I elided this. (I had expected for the compile to enforce restrictions
like these using --std=c99.)

> >   * Receive multiplexed output stream over git native protocol.
> > @@ -48,8 +95,10 @@ int recv_sideband(const char *me, int in_stream, int out)
> >   len--;
> >   switch (band) {
> >   case 3:
> > - strbuf_addf(&outbuf, "%s%s%s", outbuf.len ? "\n" : "",
> > - DISPLAY_PREFIX, buf + 1);
> > + strbuf_addf(&outbuf, "%s%s", outbuf.len ? "\n" : "",
> > + DISPLAY_PREFIX);
> > +emit_sideband(&outbuf, buf+1, len);
> > +
>
> Let's not lose SP around "+" on both sides.
>
> Also you seem to be indenting some lines with all SP and some with
> mixture of HT and SP.  We prefer to use as many 8-column HT and then
> fill the remainder with SP if needed to align with the opening
> parenthesis on line above it (imitate the way strbuf_addf() is split
> into two lines in the original in this hunk).

Fixed these, I think.

> Thanks.  While there are need for mostly minor fix-ups, the logic
> seems quite sane.  I think we can start without configuration and
> then "fix" it later.

I need the configuration to be able to test this, though.

> While I am OK with calling that variable "transport.", we
> should not define/explain it as "color output coming from the other
> end over the wire transport".  Those who want to see messages
> emitted remotely during "git fetch" in color would want to see the
> messages generated by "git fetch" locally painted in the same color
> scheme, so it makes sense to let "git fetch" pay attention and honor
> that variable even for its own locally generated messages.  The
> variable instead means "color any message, either generated locally
> or remotely, during an operation that has something to do with
> object transport", or something like that.

I used color.remote for the property,  but I'm happy to colorize the
bikeshed with another color.

--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado


[PATCH 2/2] Highlight keywords in remote sideband output.

2018-07-30 Thread Han-Wen Nienhuys
The highlighting is done on the client-side. Supported keywords are
"error", "warning", "hint" and "success".

The colorization is controlled with the config setting "color.remote".

Signed-off-by: Han-Wen Nienhuys 
---
 sideband.c  | 79 +
 t/t5409-colorize-remote-messages.sh | 34 +
 2 files changed, 104 insertions(+), 9 deletions(-)
 create mode 100644 t/t5409-colorize-remote-messages.sh

diff --git a/sideband.c b/sideband.c
index 325bf0e97..15213a7c1 100644
--- a/sideband.c
+++ b/sideband.c
@@ -1,6 +1,63 @@
 #include "cache.h"
 #include "pkt-line.h"
 #include "sideband.h"
+#include "color.h"
+
+static int sideband_use_color = -1;
+
+/*
+ * Optionally highlight some keywords in remote output if they appear at the
+ * start of the line.
+ */
+void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
+{
+   if (sideband_use_color < 0) {
+   const char *key = "color.remote";
+   char *value = NULL;
+   if (!git_config_get_string(key, &value))
+   sideband_use_color = git_config_colorbool(key, value);
+   }
+
+   int want_color = want_color_stderr(sideband_use_color);
+   int i;
+
+   if (!want_color) {
+   strbuf_add(dest, src, n);
+   return;
+   }
+
+   struct kwtable {
+   const char *keyword;
+   const char *color;
+   } keywords[] = {
+   {"hint", GIT_COLOR_YELLOW},
+   {"warning", GIT_COLOR_BOLD_YELLOW},
+   {"success", GIT_COLOR_BOLD_GREEN},
+   {"error", GIT_COLOR_BOLD_RED},
+   };
+
+   while (isspace(*src)) {
+   strbuf_addch(dest, *src);
+   src++;
+   n--;
+   }
+
+   for (i = 0; i < ARRAY_SIZE(keywords); i++) {
+   struct kwtable* p = keywords + i;
+   int len = strlen(p->keyword);
+   if (!strncasecmp(p->keyword, src, len) && !isalnum(src[len])) {
+   strbuf_addstr(dest, p->color);
+   strbuf_add(dest, src, len);
+   strbuf_addstr(dest, GIT_COLOR_RESET);
+   n -= len;
+   src += len;
+   break;
+   }
+   }
+
+   strbuf_add(dest, src, n);
+}
+

 /*
  * Receive multiplexed output stream over git native protocol.
@@ -48,8 +105,10 @@ int recv_sideband(const char *me, int in_stream, int out)
len--;
switch (band) {
case 3:
-   strbuf_addf(&outbuf, "%s%s%s", outbuf.len ? "\n" : "",
-   DISPLAY_PREFIX, buf + 1);
+   strbuf_addf(&outbuf, "%s%s", outbuf.len ? "\n" : "",
+   DISPLAY_PREFIX);
+   maybe_colorize_sideband(&outbuf, buf + 1, len);
+
retval = SIDEBAND_REMOTE_ERROR;
break;
case 2:
@@ -69,20 +128,22 @@ int recv_sideband(const char *me, int in_stream, int out)
if (!outbuf.len)
strbuf_addstr(&outbuf, DISPLAY_PREFIX);
if (linelen > 0) {
-   strbuf_addf(&outbuf, "%.*s%s%c",
-   linelen, b, suffix, *brk);
-   } else {
-   strbuf_addch(&outbuf, *brk);
+   maybe_colorize_sideband(&outbuf, b, 
linelen);
+   strbuf_addstr(&outbuf, suffix);
}
+
+   strbuf_addch(&outbuf, *brk);
xwrite(2, outbuf.buf, outbuf.len);
strbuf_reset(&outbuf);

b = brk + 1;
}

-   if (*b)
-   strbuf_addf(&outbuf, "%s%s", outbuf.len ?
-   "" : DISPLAY_PREFIX, b);
+   if (*b) {
+   strbuf_addstr(&outbuf, outbuf.len ?
+   "" : DISPLAY_PREFIX);
+   maybe_colorize_sideband(&outbuf, b, strlen(b));
+   }
break;
case 1:
write_or_die(out, buf + 1, len);
diff --git a/t/t5409-colorize-remote-messages.sh 
b/t/t5409-colorize-remote-messages.sh
new file mode 100644
index 0..1620cffbe
--- /dev/null
+++ b/t/t5409-colorize-remote-messages.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+
+test_description='remote messages are colorized on the client'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+   mkdir .git/hooks &&
+cat << EOF > .git/hooks/update

[PATCH 1/2] Document git config getter return value.

2018-07-30 Thread Han-Wen Nienhuys
---
 config.h | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/config.h b/config.h
index b95bb7649..d39256eb1 100644
--- a/config.h
+++ b/config.h
@@ -178,11 +178,16 @@ struct config_set {
 };

 extern void git_configset_init(struct config_set *cs);
-extern int git_configset_add_file(struct config_set *cs, const char *filename);
-extern int git_configset_get_value(struct config_set *cs, const char *key, 
const char **value);
+
 extern const struct string_list *git_configset_get_value_multi(struct 
config_set *cs, const char *key);
 extern void git_configset_clear(struct config_set *cs);
+
+/*
+ * The int return values in these functions is 1 if not found, 0 if found, 
leaving
+ * the found value in the 'dest' pointer.
+ */
+extern int git_configset_add_file(struct config_set *cs, const char *filename);
+extern int git_configset_get_value(struct config_set *cs, const char *key, 
const char **dest);
 extern int git_configset_get_string_const(struct config_set *cs, const char 
*key, const char **dest);
 extern int git_configset_get_string(struct config_set *cs, const char *key, 
char **dest);
 extern int git_configset_get_int(struct config_set *cs, const char *key, int 
*dest);
--
2.18.0.345.g5c9ce644c3-goog


[PATCH 1/1] Highlight keywords in remote sideband output.

2018-07-30 Thread Han-Wen Nienhuys
The highlighting is done on the client-side. Supported keywords are
"error", "warning", "hint" and "success".

The colorization is controlled with the config setting "color.remote".

Signed-off-by: Han-Wen Nienhuys 
---
 sideband.c  | 78 +
 t/t5409-colorize-remote-messages.sh | 34 +
 2 files changed, 103 insertions(+), 9 deletions(-)
 create mode 100644 t/t5409-colorize-remote-messages.sh

diff --git a/sideband.c b/sideband.c
index 325bf0e97..e939cd436 100644
--- a/sideband.c
+++ b/sideband.c
@@ -1,7 +1,63 @@
 #include "cache.h"
+#include "color.h"
+#include "config.h"
 #include "pkt-line.h"
 #include "sideband.h"

+
+/*
+ * Optionally highlight some keywords in remote output if they appear at the
+ * start of the line.
+ */
+void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
+{
+   static int sideband_use_color = -1;
+   int i;
+   if (sideband_use_color < 0) {
+   const char *key = "color.remote";
+   char *value = NULL;
+   if (!git_config_get_string(key, &value))
+   sideband_use_color = git_config_colorbool(key, value);
+   }
+
+   if (!want_color_stderr(sideband_use_color)) {
+   strbuf_add(dest, src, n);
+   return;
+   }
+
+   struct kwtable {
+   const char *keyword;
+   const char *color;
+   } keywords[] = {
+   {"hint", GIT_COLOR_YELLOW},
+   {"warning", GIT_COLOR_BOLD_YELLOW},
+   {"success", GIT_COLOR_BOLD_GREEN},
+   {"error", GIT_COLOR_BOLD_RED},
+   };
+
+   while (isspace(*src)) {
+   strbuf_addch(dest, *src);
+   src++;
+   n--;
+   }
+
+   for (i = 0; i < ARRAY_SIZE(keywords); i++) {
+   struct kwtable* p = keywords + i;
+   int len = strlen(p->keyword);
+   if (!strncasecmp(p->keyword, src, len) && !isalnum(src[len])) {
+   strbuf_addstr(dest, p->color);
+   strbuf_add(dest, src, len);
+   strbuf_addstr(dest, GIT_COLOR_RESET);
+   n -= len;
+   src += len;
+   break;
+   }
+   }
+
+   strbuf_add(dest, src, n);
+}
+
+
 /*
  * Receive multiplexed output stream over git native protocol.
  * in_stream is the input stream from the remote, which carries data
@@ -48,8 +104,10 @@ int recv_sideband(const char *me, int in_stream, int out)
len--;
switch (band) {
case 3:
-   strbuf_addf(&outbuf, "%s%s%s", outbuf.len ? "\n" : "",
-   DISPLAY_PREFIX, buf + 1);
+   strbuf_addf(&outbuf, "%s%s", outbuf.len ? "\n" : "",
+   DISPLAY_PREFIX);
+   maybe_colorize_sideband(&outbuf, buf + 1, len);
+
retval = SIDEBAND_REMOTE_ERROR;
break;
case 2:
@@ -69,20 +127,22 @@ int recv_sideband(const char *me, int in_stream, int out)
if (!outbuf.len)
strbuf_addstr(&outbuf, DISPLAY_PREFIX);
if (linelen > 0) {
-   strbuf_addf(&outbuf, "%.*s%s%c",
-   linelen, b, suffix, *brk);
-   } else {
-   strbuf_addch(&outbuf, *brk);
+   maybe_colorize_sideband(&outbuf, b, 
linelen);
+   strbuf_addstr(&outbuf, suffix);
}
+
+   strbuf_addch(&outbuf, *brk);
xwrite(2, outbuf.buf, outbuf.len);
strbuf_reset(&outbuf);

b = brk + 1;
}

-   if (*b)
-   strbuf_addf(&outbuf, "%s%s", outbuf.len ?
-   "" : DISPLAY_PREFIX, b);
+   if (*b) {
+   strbuf_addstr(&outbuf, outbuf.len ?
+   "" : DISPLAY_PREFIX);
+   maybe_colorize_sideband(&outbuf, b, strlen(b));
+   }
break;
case 1:
write_or_die(out, buf + 1, len);
diff --git a/t/t5409-colorize-remote-messages.sh 
b/t/t5409-colorize-remote-messages.sh
new file mode 100644
index 0..fd165cda0
--- /dev/null
+++ b/t/t5409-colorize-remote-messages.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+
+test_description='remote messages are colorized on the client'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+   

[PATCH 0/1] Highlight keywords in remote sideband output.

2018-07-30 Thread Han-Wen Nienhuys
Made tests compile and pass (oops). Remove Change-Id footer.

Han-Wen Nienhuys (1):
  Highlight keywords in remote sideband output.

 sideband.c  | 78 +
 t/t5409-colorize-remote-messages.sh | 34 +
 2 files changed, 103 insertions(+), 9 deletions(-)
 create mode 100644 t/t5409-colorize-remote-messages.sh

--
2.18.0.345.g5c9ce644c3-goog


Re: [PATCH] config: fix case sensitive subsection names on writing

2018-07-30 Thread Johannes Schindelin
Hi,

On Fri, 27 Jul 2018, Junio C Hamano wrote:

> Stefan Beller  writes:
>
> [...]

Thanks for the patch!

The only thing that was not clear to me from the patch and from the commit
message was: the first part *is* case insensitive, right? How does the
patch take care of that? Is it relying on `git_config_parse_key()` to do
that? If so, I don't see it...

> I would still hold the judgment on "all except only this one"
> myself.  That's a bit too early in my mind.

Agreed. I seem to remember that I had a funny problem "in the reverse",
where http..* is case-sensitive, but in an unexpected way: if the URL
contains upper-case characters, the  part of the config key needs to
be downcased, otherwise the setting won't be picked up.

Ciao,
Dscho


Strange bug with "git log" - pdftotext?

2018-07-30 Thread Jeremy Morton
I'm trying to search my git log history for a particular term - 
"unobtrusive" - so I run this command:


git log -S unobtrusive --oneline

When I do this, this is displayed and I'm in an interactive less 
terminal or something:


pdftotext version 4.00
Copyright 1996-2017 Glyph & Cog, LLC
Usage: pdftotext [options]  []
 -f  : first page to convert
 -l  : last page to convert
 -layout  : maintain original physical layout
 -simple  : simple one-column page layout
 -table   : similar to -layout, but optimized for tables
 -lineprinter : use strict fixed-pitch/height layout
 -raw : keep strings in content stream order
 -fixed   : assume fixed-pitch (or tabular) text
 -linespacing : fixed line spacing for LinePrinter mode
 -clip: separate clipped text
 -nodiag  : discard diagonal text
 -enc : output text encoding name
 -eol : output end-of-line convention (unix, dos, or mac)
 -nopgbrk : don't insert page breaks between pages
 -bom : insert a Unicode BOM at the start of the text 
file

 -opw : owner password (for encrypted files)
 -upw : user password (for encrypted files)
 -q   : don't print any messages or errors
 -cfg : configuration file to use in place of .xpdfrc
 -v   : print copyright and version info
:

When I hit the down arrow, it scrolls down, repeating this output 
infinitely until I hit 'q'.  What is going on here??


--
Best regards,
Jeremy Morton (Jez)


Hello

2018-07-30 Thread Zelei Mariann
Can you get back to me for a viable business we can do together ?


Re: Using Environment variable GIT_CONFIG does not work as expected

2018-07-30 Thread SZEDER Gábor
> I have a shared linux account which is used by multiple developers.
> But I would like to have git commit history configured so that I can
> see who made a change to common repository (so that author of the
> commit would be correct person, not shared user). There are reasons
> why developers cannot clone this environment to their own accounts.
> 
> So I don't have ~/.gitconfig in place for the shared user, and when
> developer log in I enforce them to configure git for them first. They
> must "export GIT_CONFIG=my_specific_gitconfig".
> When this is done, "git config -l" will show correctly the user.name,
> user.email and other parameters which are set in
> "my_specific_gitconfig".
> 
> However, if user tries now to create a commit the git blames:
> *** Please tell me who you are.
> and so on...
> 
> But running "git config -l" shows that 'user.name' and 'user.email'
> are properly configured.
> Do I need to configure something more in order to get this GIT_CONFIG
> environment variable working. I'm working in Debian Linux environment.

I think it's working as intended, because GIT_CONFIG is only supposed
to affect 'git config' and is only documented in the git-config(1) man
page.  Perhaps the wording could be improved to be more explicit about
this.

Note that more general environment variables affecting more git
commands are documented in git(1), and GIT_CONFIG is not mentioned
there.

Try setting and exporting the environment variables GIT_AUTHOR_NAME,
GIT_AUTHOR_EMAIL, GIT_COMMITTER_NAME, and GIT_COMMITTER_EMAIL instead.

> 
> I have tested this with git versions: 2.1.4, 2.11.0 and
> 2.18.0.321.gffc6fa0


range-diff, was Re: What's cooking in git.git (Jul 2018, #03; Wed, 25)

2018-07-30 Thread Johannes Schindelin
Hi Junio,

On Fri, 27 Jul 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  writes:
> 
> > Count me in the "this is useful" camp, but also I did look at the latest
> > submission this time around, but had nothing to say, so I didn't say
> > anything :)
> 
> Please make it a habit to do say something to show that you did
> carefully review the series especially in such a case, which helps
> to differentiate a change that is interesting to nobody, and one
> that is so obviously well done that there is nothing to add.  
> 
> What I have been doing from time to time, when I think there is
> nothing bad in particular to point out, is to quote a part that is
> especially tricky to get right and think it aloud to show how I
> would reason why the result of the change is correct.  This type of
> review helps in two and a half ways:
> 
>  (1) We can see a correction to what the reviewer said, which could
>  lead to further improvement ("no, the code does not quite work
>  that way, so it is still buggy", or "no, the code does not work
>  that way, but triggering that bug is impossible because the
>  caller high above will not call us in that case", etc.),
> 
>  (2) The reviewer can demonstrate to others that s/he understands
>  the area being touched well enough to explain how it works
>  correctly, and a "looks good to me" comment from the reviewer
>  on that change becomes more credible than a mere "looked good
>  and I have nothing else to add".
> 
> Thanks.

FWIW I picked up your Asciidoc-underline fix, and I also fixed a typo in a
commit message (you may want to pick that up, too, unless you want me to
send a full new iteration, I don't care either way):

-- snipsnap --
11:  bf0a5879361 ! 11:  0c1f5db5d01 range-diff: add tests
@@ -3,7 +3,7 @@
 range-diff: add tests
 
 These are essentially lifted from https://github.com/trast/tbdiff, with
-light touch-ups to account for the command now being names `git
+light touch-ups to account for the command now being named `git
 range-diff`.
 
 Apart from renaming `tbdiff` to `range-diff`, only one test case needed

Re: [PATCH 1/9] contrib: add a script to initialize VS Code configuration

2018-07-30 Thread Johannes Schindelin
Hi Jonathan,

On Mon, 23 Jul 2018, Jonathan Nieder wrote:

> Johannes Schindelin via GitGitGadget wrote:
> 
> > From: Johannes Schindelin 
> >
> > VS Code is a lightweight but powerful source code editor which runs on
> > your desktop and is available for Windows, macOS and Linux. Among other
> > languages, it has support for C/C++ via an extension.
> 
> This doesn't really seem relevant to the change.  The relevant bits
> are that (1) VS Code is a popular source code editor, and that (2)
> it's one worth recommending to newcomers.

To the contrary. This paragraph describes the motivation of the patch.

It is, if you want, the most important part of the entire patch series.

> > To start developing Git with VS Code, simply run the Unix shell script
> > contrib/vscode/init.sh, which creates the configuration files in
> > .vscode/ that VS Code consumes.
> >
> > Signed-off-by: Johannes Schindelin 
> 
> This doesn't tell me what the patch will actually do.

True. Will add a paragraph.

> VSCode is an editor.  Is the idea that this will help configure the
> editor to do the right thing with whitespace or something?  Or does it
> configure IDE features to build git?

It is a start of all of the above. Ideally, I will be eventually be able
to consistently use VS Code for Git development, whether I have to work on
a Linux, a macOS or a Windows machine.

And I am not happy until the same ease is also available to others. Hence
my contribution.

Ciao,
Dscho


Re: [PATCH] DO-NOT-MERGE: write and read commit-graph always

2018-07-30 Thread Jakub Narebski
Derrick Stolee  writes:

> This commit is not intended to be merged, but is only to create a test
> environment to see what works with the commit-graph feature and what
> does not. The tests that fail are primarily related to corrupting the
> object store to remove a commit from visibility and testing that
> rev-list fails -- except it doesn't when there is a commit-graph that
> prevents parsing from the object database. The following tests will fail
> with this commit, but are not "real" bugs:
>
> t0410-partial-clone.sh, Test 9
> t5307-pack-missing-commit.sh, Tests 3-4
> t6011-rev-list-with-bad-commit.sh, Test 4
>
> The following test fails because the repo has ambiguous merge-bases, and
> the commit-graph changes the walk order so we select a different one.
> This alters the resulting merge from the expected result.
>
> t6024-recursive-merge.sh, Test 4
>
> The tests above are made to pass by deleting the commit-graph file
> before the necessary steps.
>
> Signed-off-by: Derrick Stolee 
> ---
>  builtin/commit.c|  2 ++
>  builtin/gc.c|  3 +--
>  commit-graph.c  | 11 ---
>  t/t0410-partial-clone.sh|  1 +
>  t/t5307-pack-missing-commit.sh  |  2 ++
>  t/t6011-rev-list-with-bad-commit.sh |  1 +
>  t/t6024-recursive-merge.sh  |  1 +
>  7 files changed, 8 insertions(+), 13 deletions(-)
>
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 158e3f843a..acc31252a9 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -33,6 +33,7 @@
>  #include "sequencer.h"
>  #include "mailmap.h"
>  #include "help.h"
> +#include "commit-graph.h"
>  
>  static const char * const builtin_commit_usage[] = {
>   N_("git commit [] [--] ..."),
> @@ -1652,6 +1653,7 @@ int cmd_commit(int argc, const char **argv, const char 
> *prefix)
>"not exceeded, and then \"git reset HEAD\" to recover."));
>  
>   rerere(0);
> + write_commit_graph_reachable(get_object_directory(), 1);
>   run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
>   run_commit_hook(use_editor, get_index_file(), "post-commit", NULL);
>   if (amend && !no_post_rewrite) {

This is certainly not for merging.

> diff --git a/builtin/gc.c b/builtin/gc.c
> index e103f0f85d..60ab773087 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -41,7 +41,7 @@ static int aggressive_depth = 50;
>  static int aggressive_window = 250;
>  static int gc_auto_threshold = 6700;
>  static int gc_auto_pack_limit = 50;
> -static int gc_write_commit_graph;
> +static int gc_write_commit_graph = 1;
>  static int detach_auto = 1;
>  static timestamp_t gc_log_expire_time;
>  static const char *gc_log_expire = "1.day.ago";

This is switching from default to off to default to on.  I think with
this feature we would default to off for a long time (wishful thinking:
maybe automatically enabling it for large repositories, or if
commit-graph file exists already?).

> @@ -131,7 +131,6 @@ static void gc_config(void)
>   git_config_get_int("gc.aggressivedepth", &aggressive_depth);
>   git_config_get_int("gc.auto", &gc_auto_threshold);
>   git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit);
> - git_config_get_bool("gc.writecommitgraph", &gc_write_commit_graph);
>   git_config_get_bool("gc.autodetach", &detach_auto);
>   git_config_get_expiry("gc.pruneexpire", &prune_expire);
>   git_config_get_expiry("gc.worktreepruneexpire", 
> &prune_worktrees_expire);

This is certainly not for merging, as it disables gc.writeCommitGraph
config option entirely.

> diff --git a/commit-graph.c b/commit-graph.c
> index 237d4e7d1b..ed0d27c12e 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -227,22 +227,11 @@ static int prepare_commit_graph(struct repository *r)
>  {
>   struct alternate_object_database *alt;
>   char *obj_dir;
> - int config_value;
>  
>   if (r->objects->commit_graph_attempted)
>   return !!r->objects->commit_graph;
>   r->objects->commit_graph_attempted = 1;
>  
> - if (repo_config_get_bool(r, "core.commitgraph", &config_value) ||
> - !config_value)
> - /*
> -  * This repository is not configured to use commit graphs, so
> -  * do not load one. (But report commit_graph_attempted anyway
> -  * so that commit graph loading is not attempted again for this
> -  * repository.)
> -  */
> - return 0;
> -
>   if (!commit_graph_compatible(r))
>   return 0;
>

This is certainly not for merging, as it disables core.commitGraph
config option entirely.

> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index 4984ca583d..c235672b03 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -181,6 +181,7 @@ test_expect_success 'rev-list stops traversal at missing 
> and promised commit' '
>  
>   git -C repo config core.repositoryformatversion 1 &&
> 

Re: [PATCH 1/9] contrib: add a script to initialize VS Code configuration

2018-07-30 Thread Johannes Schindelin
Hi Junio,

On Mon, 23 Jul 2018, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" 
> writes:
> 
> > diff --git a/contrib/vscode/init.sh b/contrib/vscode/init.sh
> > new file mode 100755
> > index 0..3cc93243f
> > --- /dev/null
> > +++ b/contrib/vscode/init.sh
> > @@ -0,0 +1,165 @@
> > +#!/bin/sh
> 
> This is caued by our usual "git apply --whitespace=warn" as it
> contains lines indented by spaces not tabs (perhaps the json
> literals?)

A `git diff --check` does not show any red flag.

> Can we have contrib/vscode/.gitattributes to tweak the whitespace
> breakage checking rules so that this file is excempt?

I would have appreciated quite a bit more clear a message, including how I
can trigger this, and what is the appropriate setting.

As it is, I cannot reproduce your problem with a `git show`, so I assume
that the main problem is that this commit *does* add that .gitattributes,
but of course your `git apply` ran before that file existed.

For your own pleasure, here is the link to the .gitattributes file in the
commit that you applied yourself:

https://github.com/gitster/git/commit/44559f0388a4e256bea3eb2f57b92a2b9e3d06f9#diff-046ba41c5a21694c62b8dc5e93d7f0c2R1

I assume that this is good enough for you, and your comment is moot?

Ciao,
Dscho


Re: [PATCH 2/9] vscode: hard-code a couple defines

2018-07-30 Thread Johannes Schindelin
Hi Jonathan,

On Mon, 23 Jul 2018, Jonathan Nieder wrote:

> Johannes Schindelin via GitGitGadget wrote:
> 
> > From: Johannes Schindelin 
> >
> > Sadly, we do not get all of the definitions via ALL_CFLAGS. Some defines
> > are passed to GCC *only* when compiling specific files, such as git.o.
> >
> > Let's just hard-code them into the script for the time being.
> 
> Could we move these to ALL_CFLAGS?  Is there any downside other than
> increased rebuilds when options change?

I assumed that it gives us a bit more liberty at using names for constants
that might otherwise be too general.

I did not even consider the rebuild time, but that's a good point.

The idea to move those definitions to ALL_CFLAGS sounds good to me, yet I
am loathe to introduce this into a patch series that is not about
far-reaching design decisions at all, but merely about making a decent
tool available to the finger tips of all Git contributors.

Read: I think this would make for a pretty good micro-project, maybe in next
year's GSoC (assuming that it happens and that we're accepted again).

Ciao,
Dscho


Re: [PATCH v3 10/10] fsck: test and document unknown fsck. values

2018-07-30 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

>>> When fsck. is set to an unknown value it'll cause "fsck" to
>>> die, but the same is not rue of the "fetch" and "receive"
>>> variants. Document this and test for it.
> ...
> We could change it. This is just something I ran into and figured it
> should be tested / documented, and didn't feel any need to change it
> myself.
>
> The current behavior is probably more of an organically grown
> accident. Maybe we should make all of these warn.

Or die.  I do agree with your assessment that the discrepancy was
not designed, and my inclination for a correctness-centered feature
like fsck is to err on the side of caution, rather than letting a
misconfiguration pass unintended kinds of breakages through.

Thanks.



Re: [PATCH 7/9] vscode: use 8-space tabs, no trailing ws, etc for Git's source code

2018-07-30 Thread Johannes Schindelin
Hi Hannes,

On Wed, 25 Jul 2018, Johannes Sixt wrote:

> Am 23.07.2018 um 15:52 schrieb Johannes Schindelin via GitGitGadget:
> > From: Johannes Schindelin 
> > 
> > This adds a couple settings for the .c/.h files so that it is easier to
> > conform to Git's conventions while editing the source code.
> > 
> > Signed-off-by: Johannes Schindelin 
> > ---
> >   contrib/vscode/init.sh | 8 
> >   1 file changed, 8 insertions(+)
> > 
> > diff --git a/contrib/vscode/init.sh b/contrib/vscode/init.sh
> > index face115e8..29f2a729d 100755
> > --- a/contrib/vscode/init.sh
> > +++ b/contrib/vscode/init.sh
> > @@ -21,6 +21,14 @@ cat >.vscode/settings.json.new <<\EOF ||
> >   "editor.wordWrap": "wordWrapColumn",
> >   "editor.wordWrapColumn": 72
> >   },
> > +"[c]": {
> > +"editor.detectIndentation": false,
> > +"editor.insertSpaces": false,
> > +"editor.tabSize": 8,
> > +"editor.wordWrap": "wordWrapColumn",
> > +"editor.wordWrapColumn": 80,
> > +"files.trimTrailingWhitespace": true
> > +},
> 
> I am a VS Code user, but I haven't used these settings before.
> 
> With these settings, does the editor break lines while I am typing? Or does it
> just insert a visual cue that tells where I should insert a line break? If the
> former, it would basically make the editor unusable for my taste. I want to
> have total control over the code I write. The 80 column limit is just a
> recommendation, not a hard requirement.

Fear not. It is giving you a very clear visual cue, but that's all. It
does show the line wrapped, but the line number column quite clearly shows
that it did not insert a new-line.

Ciao,
Dscho

> 
> >   "files.associations": {
> >   "*.h": "c",
> >   "*.c": "c"
> > 
> 
> -- Hannes
> 
> 


Re: [PATCH 1/1] Highlight keywords in remote sideband output.

2018-07-30 Thread Duy Nguyen
On Mon, Jul 30, 2018 at 2:34 PM Han-Wen Nienhuys  wrote:
> +   struct kwtable {
> +   const char *keyword;
> +   const char *color;
> +   } keywords[] = {
> +   {"hint", GIT_COLOR_YELLOW},
> +   {"warning", GIT_COLOR_BOLD_YELLOW},
> +   {"success", GIT_COLOR_BOLD_GREEN},
> +   {"error", GIT_COLOR_BOLD_RED},
> +   };

Please let me customize these colors. "grep color.*slot
Documentation/config.txt help.c" could give you a few examples.

I think we also add a space after { and before } in most places (there
are a few places that don't do that, but personally I'd prefer spaces
to make it a bit easier to read).
-- 
Duy


Re: Is detecting endianness at compile-time unworkable?

2018-07-30 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> And, as an aside, the reason we can't easily make it better ourselves is
> because the build process for git.git doesn't have a facility to run
> code to detect this type of stuff (the configure script is always
> optional). So we can't just run this test ourselves.

It won't help those who cross-compile anyway.  I thought we declared
"we make a reasonable effort to guess the target endianness from the
system header by inspecting usual macros, but will not aim to cover
every system on the planet---instead there is a knob to tweak it for
those on exotic platforms" last time we discussed this?


Re: [PATCH v3 07/10] fetch: implement fetch.fsck.*

2018-07-30 Thread Duy Nguyen
On Fri, Jul 27, 2018 at 02:37:17PM +, Ævar Arnfjörð Bjarmason wrote:
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 7ff453c53b..8dace49daa 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1467,6 +1467,16 @@ fetch.fsckObjects::
>   checked. Defaults to false. If not set, the value of
>   `transfer.fsckObjects` is used instead.
>  
> +fetch.fsck.::
> + Acts like `fsck.`, but is used by
> + linkgit:git-fetch-pack[1] instead of linkgit:git-fsck[1]. See
> + the `fsck.` documentation for details.

Could you squash this patch in? It would let us auto-complete "git
config fetch.fsck" in bash/tcsh

-- 8< --
diff --git a/help.c b/help.c
index 9c0b5a8de9..32fe466dbf 100644
--- a/help.c
+++ b/help.c
@@ -427,6 +427,7 @@ void list_config_help(int for_human)
{ "color.interactive", "", 
list_config_color_interactive_slots },
{ "color.status", "", list_config_color_status_slots },
{ "fsck", "", list_config_fsck_msg_ids },
+   { "fetch.fsck", "", list_config_fsck_msg_ids },
{ "receive.fsck", "", list_config_fsck_msg_ids },
{ NULL, NULL, NULL }
};
-- 8< --
--
Duy


Re: [PATCH v3 07/10] fetch: implement fetch.fsck.*

2018-07-30 Thread Ævar Arnfjörð Bjarmason


On Mon, Jul 30 2018, Duy Nguyen wrote:

> On Fri, Jul 27, 2018 at 02:37:17PM +, Ævar Arnfjörð Bjarmason wrote:
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index 7ff453c53b..8dace49daa 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -1467,6 +1467,16 @@ fetch.fsckObjects::
>>  checked. Defaults to false. If not set, the value of
>>  `transfer.fsckObjects` is used instead.
>>  
>> +fetch.fsck.::
>> +Acts like `fsck.`, but is used by
>> +linkgit:git-fetch-pack[1] instead of linkgit:git-fsck[1]. See
>> +the `fsck.` documentation for details.
>
> Could you squash this patch in? It would let us auto-complete "git
> config fetch.fsck" in bash/tcsh
>
> -- 8< --
> diff --git a/help.c b/help.c
> index 9c0b5a8de9..32fe466dbf 100644
> --- a/help.c
> +++ b/help.c
> @@ -427,6 +427,7 @@ void list_config_help(int for_human)
>   { "color.interactive", "", 
> list_config_color_interactive_slots },
>   { "color.status", "", list_config_color_status_slots },
>   { "fsck", "", list_config_fsck_msg_ids },
> + { "fetch.fsck", "", list_config_fsck_msg_ids },
>   { "receive.fsck", "", list_config_fsck_msg_ids },
>   { NULL, NULL, NULL }
>   };
> -- 8< --

Thanks! I missed that. Willdo.


[PATCH/RFC] clone: report duplicate entries on case-insensitive filesystems

2018-07-30 Thread Nguyễn Thái Ngọc Duy
Paths that only differ in case work fine in a case-sensitive
filesystems, but if those repos are cloned in a case-insensitive one,
you'll get problems. The first thing to notice is "git status" will
never be clean with no indication what's exactly is "dirty".

This patch helps the situation a bit by pointing out the problem at
clone time. I have not suggested any way to work around or fix this
problem. But I guess we could probably have a section in
Documentation/ dedicated to this problem and point there instead of
a long advice in this warning.

Another thing we probably should do is catch in "git checkout" too,
not just "git clone" since your linux/unix colleage colleague may
accidentally add some files that your mac/windows machine is not very
happy with. But then there's another problem, once the problem is
known, we probably should stop spamming this warning at every
checkout, but how?

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/clone.c | 41 +
 1 file changed, 41 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index 5c439f1394..32738c2737 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -711,6 +711,33 @@ static void update_head(const struct ref *our, const 
struct ref *remote,
}
 }
 
+static void find_duplicate_icase_entries(struct index_state *istate,
+struct string_list *dup)
+{
+   struct string_list list = STRING_LIST_INIT_NODUP;
+   int i;
+
+   for (i = 0; i < istate->cache_nr; i++)
+   string_list_append(&list, istate->cache[i]->name);
+
+   list.cmp = fspathcmp;
+   string_list_sort(&list);
+
+   for (i = 1; i < list.nr; i++) {
+   const char *cur = list.items[i].string;
+   const char *prev = list.items[i - 1].string;
+
+   if (dup->nr &&
+   !fspathcmp(cur, dup->items[dup->nr - 1].string)) {
+   string_list_append(dup, cur);
+   } else if (!fspathcmp(cur, prev)) {
+   string_list_append(dup, prev);
+   string_list_append(dup, cur);
+   }
+   }
+   string_list_clear(&list, 0);
+}
+
 static int checkout(int submodule_progress)
 {
struct object_id oid;
@@ -761,6 +788,20 @@ static int checkout(int submodule_progress)
if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
die(_("unable to write new index file"));
 
+   if (ignore_case) {
+   struct string_list dup = STRING_LIST_INIT_DUP;
+   int i;
+
+   find_duplicate_icase_entries(&the_index, &dup);
+   if (dup.nr) {
+   warning(_("the following paths in this repository only 
differ in case and will\n"
+ "cause problems because you have cloned it on 
an case-insensitive filesytem:\n"));
+   for (i = 0; i < dup.nr; i++)
+   fprintf(stderr, "\t%s\n", dup.items[i].string);
+   }
+   string_list_clear(&dup, 0);
+   }
+
err |= run_hook_le(NULL, "post-checkout", sha1_to_hex(null_sha1),
   oid_to_hex(&oid), "1", NULL);
 
-- 
2.18.0.656.gda699b98b3



Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit

2018-07-30 Thread Junio C Hamano
Phillip Wood  writes:

>> Moreover, patch 2/2 of this series provides a more thorough fix overall
>> than Akinori, so it may make sense to replace his patch with this
>> series, though perhaps keep the test his patch adds to augment the
>> strict test of the "author" header added by this series.
>
> Johannes and I have some fixups for Akinori's patch on the branch
> fix-t3403-author-script-test at https://github.com/phillipwood/git
>
> That branch also contains a fix for the bad quoting of names with "'" in
> them. I think it would be good to somehow try and combine this series
> with those patches.

Thanks for working well together.  Always nice to see contributors
thinking beyond immediate band-aid and for longer term ;-)

> I'd really like to see a single function to read and another to write
> the author script that is shared by 'git am' and 'git rebase -i', rather
> than the two writers and three readers we have at the moment. I was
> thinking of doing that in the longer term, but given the extra bug
> you've found in read_author_script() maybe we should do that sooner
> rather than later.



Re: Strange bug with "git log" - pdftotext?

2018-07-30 Thread Jeff King
On Mon, Jul 30, 2018 at 01:49:46PM +0100, Jeremy Morton wrote:

> I'm trying to search my git log history for a particular term -
> "unobtrusive" - so I run this command:
> 
> git log -S unobtrusive --oneline
> 
> When I do this, this is displayed and I'm in an interactive less terminal or
> something:
> 
> pdftotext version 4.00
> [...]

That's definitely weird.

My guess is that the repository has some .gitattributes set up to diff
pdf files in a particular way, and you have some matching config that
tries to call pdftotext.

What does:

  git config --list | grep ^diff

say? I'd expect to see an external or textconv option there running
pdftotext.

Another option is that your pager is somehow set up to call pdftotext,
but that seems much more nonsensical to use the tool there (but try "git
var GIT_PAGER" and "git config pager.log" to check).

-Peff


Re: range-diff, was Re: What's cooking in git.git (Jul 2018, #03; Wed, 25)

2018-07-30 Thread Junio C Hamano
Johannes Schindelin  writes:

> FWIW I picked up your Asciidoc-underline fix, and I also fixed a typo in a
> commit message (you may want to pick that up, too, unless you want me to
> send a full new iteration, I don't care either way):

Meaning that if you send a full new iteration it would match what we
have on 'pu' plus the one-liner below?  I think we can do without
such a resend, because everybody has seen all there is to see if
that is the case.

> -- snipsnap --
> 11:  bf0a5879361 ! 11:  0c1f5db5d01 range-diff: add tests
> @@ -3,7 +3,7 @@
>  range-diff: add tests
>  
>  These are essentially lifted from https://github.com/trast/tbdiff, 
> with
> -light touch-ups to account for the command now being names `git
> +light touch-ups to account for the command now being named `git
>  range-diff`.
>  
>  Apart from renaming `tbdiff` to `range-diff`, only one test case 
> needed

I'll need to remember to rebuild es/format-patch-rangediff after
amending bf0a587936 with this, but I think I should be able to push
out the result in today's round.

If any other issue arises, I do not mind taking an update, either,
but I think at this point the topic is reaching the point of
diminishing returns and should switch to incremental.


[PATCH v2 1/9] contrib: add a script to initialize VS Code configuration

2018-07-30 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

VS Code is a lightweight but powerful source code editor which runs on
your desktop and is available for Windows, macOS and Linux. Among other
languages, it has support for C/C++ via an extension, which offers to
not only build and debug the code, but also Intellisense, i.e.
code-aware completion and similar niceties.

This patch adds a script that helps set up the environment to work
effectively with VS Code: simply run the Unix shell script
contrib/vscode/init.sh, which creates the relevant files, and open the
top level folder of Git's source code in VS Code.

Signed-off-by: Johannes Schindelin 
---
 .gitignore|   1 +
 contrib/vscode/.gitattributes |   1 +
 contrib/vscode/README.md  |  14 +++
 contrib/vscode/init.sh| 165 ++
 4 files changed, 181 insertions(+)
 create mode 100644 contrib/vscode/.gitattributes
 create mode 100644 contrib/vscode/README.md
 create mode 100755 contrib/vscode/init.sh

diff --git a/.gitignore b/.gitignore
index 388cc4bee..592e8f879 100644
--- a/.gitignore
+++ b/.gitignore
@@ -206,6 +206,7 @@
 /config.mak.autogen
 /config.mak.append
 /configure
+/.vscode/
 /tags
 /TAGS
 /cscope*
diff --git a/contrib/vscode/.gitattributes b/contrib/vscode/.gitattributes
new file mode 100644
index 0..e89f2236e
--- /dev/null
+++ b/contrib/vscode/.gitattributes
@@ -0,0 +1 @@
+init.sh whitespace=-indent-with-non-tab
diff --git a/contrib/vscode/README.md b/contrib/vscode/README.md
new file mode 100644
index 0..8202d6203
--- /dev/null
+++ b/contrib/vscode/README.md
@@ -0,0 +1,14 @@
+Configuration for VS Code
+=
+
+[VS Code](https://code.visualstudio.com/) is a lightweight but powerful source
+code editor which runs on your desktop and is available for
+[Windows](https://code.visualstudio.com/docs/setup/windows),
+[macOS](https://code.visualstudio.com/docs/setup/mac) and
+[Linux](https://code.visualstudio.com/docs/setup/linux). Among other languages,
+it has [support for C/C++ via an 
extension](https://github.com/Microsoft/vscode-cpptools).
+
+To start developing Git with VS Code, simply run the Unix shell script called
+`init.sh` in this directory, which creates the configuration files in
+`.vscode/` that VS Code consumes. `init.sh` needs access to `make` and `gcc`,
+so run the script in a Git SDK shell if you are using Windows.
diff --git a/contrib/vscode/init.sh b/contrib/vscode/init.sh
new file mode 100755
index 0..3cc93243f
--- /dev/null
+++ b/contrib/vscode/init.sh
@@ -0,0 +1,165 @@
+#!/bin/sh
+
+die () {
+   echo "$*" >&2
+   exit 1
+}
+
+cd "$(dirname "$0")"/../.. ||
+die "Could not cd to top-level directory"
+
+mkdir -p .vscode ||
+die "Could not create .vscode/"
+
+# General settings
+
+cat >.vscode/settings.json <<\EOF ||
+{
+"C_Cpp.intelliSenseEngine": "Default",
+"C_Cpp.intelliSenseEngineFallback": "Disabled",
+"files.associations": {
+"*.h": "c",
+"*.c": "c"
+}
+}
+EOF
+die "Could not write settings.json"
+
+# Infer some setup-specific locations/names
+
+GCCPATH="$(which gcc)"
+GDBPATH="$(which gdb)"
+MAKECOMMAND="make -j5 DEVELOPER=1"
+OSNAME=
+X=
+case "$(uname -s)" in
+MINGW*)
+   GCCPATH="$(cygpath -am "$GCCPATH")"
+   GDBPATH="$(cygpath -am "$GDBPATH")"
+   MAKE_BASH="$(cygpath -am /git-cmd.exe) --command=usrbinbash.exe"
+   MAKECOMMAND="$MAKE_BASH -lc \\\"$MAKECOMMAND\\\""
+   OSNAME=Win32
+   X=.exe
+   ;;
+Linux)
+   OSNAME=Linux
+   ;;
+Darwin)
+   OSNAME=macOS
+   ;;
+esac
+
+# Default build task
+
+cat >.vscode/tasks.json .vscode/launch.json 

[PATCH v2 3/9] cache.h: extract enum declaration from inside a struct declaration

2018-07-30 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

While it is technically possible, it is confusing. Not only the user,
but also VS Code's intellisense.

Signed-off-by: Johannes Schindelin 
---
 cache.h | 24 +---
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index 89a107a7f..2380136f6 100644
--- a/cache.h
+++ b/cache.h
@@ -1425,18 +1425,20 @@ extern void *read_object_with_reference(const struct 
object_id *oid,
 extern struct object *peel_to_type(const char *name, int namelen,
   struct object *o, enum object_type);
 
+enum date_mode_type {
+   DATE_NORMAL = 0,
+   DATE_RELATIVE,
+   DATE_SHORT,
+   DATE_ISO8601,
+   DATE_ISO8601_STRICT,
+   DATE_RFC2822,
+   DATE_STRFTIME,
+   DATE_RAW,
+   DATE_UNIX
+};
+
 struct date_mode {
-   enum date_mode_type {
-   DATE_NORMAL = 0,
-   DATE_RELATIVE,
-   DATE_SHORT,
-   DATE_ISO8601,
-   DATE_ISO8601_STRICT,
-   DATE_RFC2822,
-   DATE_STRFTIME,
-   DATE_RAW,
-   DATE_UNIX
-   } type;
+   enum date_mode_type type;
const char *strftime_fmt;
int local;
 };
-- 
gitgitgadget



[PATCH v2 5/9] vscode: only overwrite C/C++ settings

2018-07-30 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

The C/C++ settings are special, as they are the only generated VS Code
configurations that *will* change over the course of Git's development,
e.g. when a new constant is defined.

Therefore, let's only update the C/C++ settings, also to prevent user
modifications from being overwritten.

Ideally, we would keep user modifications in the C/C++ settings, but
that would require parsing JSON, a task for which a Unix shell script is
distinctly unsuited. So we write out .new files instead, and warn the
user if they may want to reconcile their changes.

Signed-off-by: Johannes Schindelin 
---
 contrib/vscode/init.sh | 23 ---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/contrib/vscode/init.sh b/contrib/vscode/init.sh
index 494a51ac5..ba9469226 100755
--- a/contrib/vscode/init.sh
+++ b/contrib/vscode/init.sh
@@ -13,7 +13,7 @@ die "Could not create .vscode/"
 
 # General settings
 
-cat >.vscode/settings.json <<\EOF ||
+cat >.vscode/settings.json.new <<\EOF ||
 {
 "C_Cpp.intelliSenseEngine": "Default",
 "C_Cpp.intelliSenseEngineFallback": "Disabled",
@@ -51,7 +51,7 @@ esac
 
 # Default build task
 
-cat >.vscode/tasks.json <.vscode/tasks.json.new .vscode/launch.json <.vscode/launch.json.new <

[PATCH v2 2/9] vscode: hard-code a couple defines

2018-07-30 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

Sadly, we do not get all of the definitions via ALL_CFLAGS. Some defines
are passed to GCC *only* when compiling specific files, such as git.o.

Let's just hard-code them into the script for the time being.

Signed-off-by: Johannes Schindelin 
---
 contrib/vscode/init.sh | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/contrib/vscode/init.sh b/contrib/vscode/init.sh
index 3cc93243f..494a51ac5 100755
--- a/contrib/vscode/init.sh
+++ b/contrib/vscode/init.sh
@@ -115,7 +115,19 @@ include Makefile
 vscode-init:
@mkdir -p .vscode && \
incs= && defs= && \
-   for e in $(ALL_CFLAGS); do \
+   for e in $(ALL_CFLAGS) \
+   '-DGIT_EXEC_PATH="$(gitexecdir_SQ)"' \
+   '-DGIT_LOCALE_PATH="$(localedir_relative_SQ)"' \
+   '-DBINDIR="$(bindir_relative_SQ)"' \
+   '-DFALLBACK_RUNTIME_PREFIX="$(prefix_SQ)"' \
+   '-DDEFAULT_GIT_TEMPLATE_DIR="$(template_dir_SQ)"' \
+   '-DETC_GITCONFIG="$(ETC_GITCONFIG_SQ)"' \
+   '-DETC_GITATTRIBUTES="$(ETC_GITATTRIBUTES_SQ)"' \
+   '-DGIT_LOCALE_PATH="$(localedir_relative_SQ)"' \
+   '-DCURL_DISABLE_TYPECHECK', \
+   '-DGIT_HTML_PATH="$(htmldir_relative_SQ)"' \
+   '-DGIT_MAN_PATH="$(mandir_relative_SQ)"' \
+   '-DGIT_INFO_PATH="$(infodir_relative_SQ)"'; do \
case "$$e" in \
-I.) \
incs="$$(printf '% 16s"$${workspaceRoot}",\n%s' \
-- 
gitgitgadget



[PATCH v2 0/9] Add support to develop Git in Visual Studio Code

2018-07-30 Thread Johannes Schindelin via GitGitGadget
[Visual Studio Code](https://code.visualstudio.com/) (nickname "VS Code") is a 
light-weight, cross-platform, Open Source development environment, with an 
increasingly powerful extension to support C/C++ development. In particular the 
intellisense support as well as all the other niceties developers might have 
come to expect from Integrated Development Environments will help accelerate 
development.

Due to the way Git's source code and build process is structured, it can be 
quite challenging to use VS Code effectively for developing Git. Which is a 
shame, as developing with vim and the command-line causes unnecessary churn, 
and it is quite understandable that most Git developers simply do not want to 
fight with a modern development environment just to try whether they like 
developing Git with it.

This topic branch makes it easy to get started using VS Code to develop Git.

Simply run the script `./contrib/vscode/init.sh`. This will initialize the 
`.vscode/` directory and some files in that directory. After that, just open 
Git's top-level directory as "folder" in VS Code.

The files have to be generated because of the curious way Git determines what 
flags to pass to the C compiler, in particular which constants are defined, 
because they change the compile flow in rather dramatic ways (determining, e.g. 
which SHA-1 backend to use).

Changes since v1:

- Clarified commit message of the first commit.

Johannes Schindelin (9):
  contrib: add a script to initialize VS Code configuration
  vscode: hard-code a couple defines
  cache.h: extract enum declaration from inside a struct declaration
  mingw: define WIN32 explicitly
  vscode: only overwrite C/C++ settings
  vscode: wrap commit messages at column 72 by default
  vscode: use 8-space tabs, no trailing ws, etc for Git's source code
  vscode: add a dictionary for cSpell
  vscode: let cSpell work on commit messages, too

 .gitignore|   1 +
 cache.h   |  24 ++-
 config.mak.uname  |   2 +-
 contrib/vscode/.gitattributes |   1 +
 contrib/vscode/README.md  |  14 ++
 contrib/vscode/init.sh| 375 ++
 6 files changed, 405 insertions(+), 12 deletions(-)
 create mode 100644 contrib/vscode/.gitattributes
 create mode 100644 contrib/vscode/README.md
 create mode 100755 contrib/vscode/init.sh


base-commit: 53f9a3e157dbbc901a02ac2c73346d375e24978c
Published-As: 
https://github.com/gitgitgadget/git/releases/tags/pr-2%2Fdscho%2Fvscode-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2/dscho/vscode-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/2

Range-diff vs v1:

  1:  e2e449a00 !  1:  bbf13e40a contrib: add a script to initialize VS Code 
configuration
 @@ -4,11 +4,14 @@
  
  VS Code is a lightweight but powerful source code editor which runs on
  your desktop and is available for Windows, macOS and Linux. Among 
other
 -languages, it has support for C/C++ via an extension.
 +languages, it has support for C/C++ via an extension, which offers to
 +not only build and debug the code, but also Intellisense, i.e.
 +code-aware completion and similar niceties.
  
 -To start developing Git with VS Code, simply run the Unix shell script
 -contrib/vscode/init.sh, which creates the configuration files in
 -.vscode/ that VS Code consumes.
 +This patch adds a script that helps set up the environment to work
 +effectively with VS Code: simply run the Unix shell script
 +contrib/vscode/init.sh, which creates the relevant files, and open the
 +top level folder of Git's source code in VS Code.
  
  Signed-off-by: Johannes Schindelin 
  
  2:  3770bd855 =  2:  6c8b5a4f2 vscode: hard-code a couple defines
  3:  de49c4bf2 =  3:  105b50c62 cache.h: extract enum declaration from inside 
a struct declaration
  4:  b3ce2ccf4 =  4:  4b95b1e2a mingw: define WIN32 explicitly
  5:  06dcf6d97 =  5:  3dd53c04c vscode: only overwrite C/C++ settings
  6:  08212c57e =  6:  384b08836 vscode: wrap commit messages at column 72 by 
default
  7:  2e880b6f1 =  7:  ba78af756 vscode: use 8-space tabs, no trailing ws, etc 
for Git's source code
  8:  ce216cf43 =  8:  358f38d3a vscode: add a dictionary for cSpell
  9:  4c2aa015a =  9:  b9c628b88 vscode: let cSpell work on commit messages, too

-- 
gitgitgadget


[PATCH v2 6/9] vscode: wrap commit messages at column 72 by default

2018-07-30 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

When configuring VS Code as core.editor (via `code --wait`), we really
want to adhere to the Git conventions of wrapping commit messages.

Signed-off-by: Johannes Schindelin 
---
 contrib/vscode/init.sh | 4 
 1 file changed, 4 insertions(+)

diff --git a/contrib/vscode/init.sh b/contrib/vscode/init.sh
index ba9469226..face115e8 100755
--- a/contrib/vscode/init.sh
+++ b/contrib/vscode/init.sh
@@ -17,6 +17,10 @@ cat >.vscode/settings.json.new <<\EOF ||
 {
 "C_Cpp.intelliSenseEngine": "Default",
 "C_Cpp.intelliSenseEngineFallback": "Disabled",
+"[git-commit]": {
+"editor.wordWrap": "wordWrapColumn",
+"editor.wordWrapColumn": 72
+},
 "files.associations": {
 "*.h": "c",
 "*.c": "c"
-- 
gitgitgadget



[PATCH v2 9/9] vscode: let cSpell work on commit messages, too

2018-07-30 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

By default, the cSpell extension ignores all files under .git/. That
includes, unfortunately, COMMIT_EDITMSG, i.e. commit messages. However,
spell checking is *quite* useful when writing commit messages... And
since the user hardly ever opens any file inside .git (apart from commit
messages, the config, and sometimes interactive rebase's todo lists),
there is really not much harm in *not* ignoring .git/.

The default also ignores `node_modules/`, but that does not apply to
Git, so let's skip ignoring that, too.

Signed-off-by: Johannes Schindelin 
---
 contrib/vscode/init.sh | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/contrib/vscode/init.sh b/contrib/vscode/init.sh
index a134cb4c5..27de94994 100755
--- a/contrib/vscode/init.sh
+++ b/contrib/vscode/init.sh
@@ -33,6 +33,8 @@ cat >.vscode/settings.json.new <<\EOF ||
 "*.h": "c",
 "*.c": "c"
 },
+"cSpell.ignorePaths": [
+],
 "cSpell.words": [
 "DATAW",
 "DBCACHED",
-- 
gitgitgadget


[PATCH v2 8/9] vscode: add a dictionary for cSpell

2018-07-30 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

The quite useful cSpell extension allows VS Code to have "squiggly"
lines under spelling mistakes. By default, this would add too much
clutter, though, because so much of Git's source code uses words that
would trigger cSpell.

Let's add a few words to make the spell checking more useful by reducing
the number of false positives.

Signed-off-by: Johannes Schindelin 
---
 contrib/vscode/init.sh | 169 -
 1 file changed, 168 insertions(+), 1 deletion(-)

diff --git a/contrib/vscode/init.sh b/contrib/vscode/init.sh
index 29f2a729d..a134cb4c5 100755
--- a/contrib/vscode/init.sh
+++ b/contrib/vscode/init.sh
@@ -32,7 +32,174 @@ cat >.vscode/settings.json.new <<\EOF ||
 "files.associations": {
 "*.h": "c",
 "*.c": "c"
-}
+},
+"cSpell.words": [
+"DATAW",
+"DBCACHED",
+"DFCHECK",
+"DTYPE",
+"Hamano",
+"HCAST",
+"HEXSZ",
+"HKEY",
+"HKLM",
+"IFGITLINK",
+"IFINVALID",
+"ISBROKEN",
+"ISGITLINK",
+"ISSYMREF",
+"Junio",
+"LPDWORD",
+"LPPROC",
+"LPWSTR",
+"MSVCRT",
+"NOARG",
+"NOCOMPLETE",
+"NOINHERIT",
+"RENORMALIZE",
+"STARTF",
+"STARTUPINFOEXW",
+"Schindelin",
+"UCRT",
+"YESNO",
+"argcp",
+"beginthreadex",
+"committish",
+"contentp",
+"cpath",
+"cpidx",
+"ctim",
+"dequote",
+"envw",
+"ewah",
+"fdata",
+"fherr",
+"fhin",
+"fhout",
+"fragp",
+"fsmonitor",
+"hnsec",
+"idents",
+"includeif",
+"interpr",
+"iprog",
+"isexe",
+"iskeychar",
+"kompare",
+"mksnpath",
+"mktag",
+"mktree",
+"mmblob",
+"mmbuffer",
+"mmfile",
+"noenv",
+"nparents",
+"ntpath",
+"ondisk",
+"ooid",
+"oplen",
+"osdl",
+"pnew",
+"pold",
+"ppinfo",
+"pushf",
+"pushv",
+"rawsz",
+"rebasing",
+"reencode",
+"repo",
+"rerere",
+"scld",
+"sharedrepo",
+"spawnv",
+"spawnve",
+"spawnvpe",
+"strdup'ing",
+"submodule",
+"submodules",
+"topath",
+"topo",
+"tpatch",
+"unexecutable",
+"unhide",
+"unkc",
+"unkv",
+"unmark",
+"unmatch",
+"unsets",
+"unshown",
+"untracked",
+"untrackedcache",
+"unuse",
+"upos",
+"uval",
+"vreportf",
+"wargs",
+"wargv",
+"wbuffer",
+"wcmd",
+"wcsnicmp",
+"wcstoutfdup",
+"wdeltaenv",
+"wdir",
+"wenv",
+"wenvblk",
+"wenvcmp",
+"wenviron",
+"wenvpos",
+"wenvsz",
+"wfile",
+"wfilename",
+"wfopen",
+"wfreopen",
+"wfullpath",
+"which'll",
+"wlink",
+"wmain",
+"wmkdir",
+"wmktemp",
+"wnewpath",
+"wotype",
+"wpath",
+"wpathname",
+"wpgmptr",
+"wpnew",
+"wpointer",
+"wpold",
+"wpos",
+"wputenv",
+"wrmdir",
+"wship",
+"wtarget",
+"wtemplate",
+"wunlink",
+"xcalloc",
+"xgetcwd",
+"xmallocz",
+"xmemdupz",
+"xmmap",
+"xopts",
+"xrealloc",
+"xsnprintf",
+"xutftowcs",
+"xutftowcsn",
+"xwcstoutf"
+],
+"cSpell.ignoreRegExpList": [
+"\\\"(DIRC|FSMN|REUC|UNTR)\\\"",
+"u[0-9a-fA-Fx]{4}\\b",
+"\\b(filfre|frotz|xyzzy)\\b",
+"\\bCMIT_FMT_DEFAULT\\b",
+"\\bde-munge\\b",
+"\\bGET_OID_DISAMBIGUATORS\\b",
+"\\bHASH_RENORMALIZE\\b",
+"\\bTREESAMEness\\b",
+"\\bUSE_STDEV\\b",
+"\\Wchar *\\*\\W*utfs\\W",
+"cURL's",
+"nedmalloc'ed",
+"ntifs\\.h",
+],
 }
 EOF
 die "Could not write settings.json"
-- 
gitgitgadget



[PATCH v2 7/9] vscode: use 8-space tabs, no trailing ws, etc for Git's source code

2018-07-30 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

This adds a couple settings for the .c/.h files so that it is easier to
conform to Git's conventions while editing the source code.

Signed-off-by: Johannes Schindelin 
---
 contrib/vscode/init.sh | 8 
 1 file changed, 8 insertions(+)

diff --git a/contrib/vscode/init.sh b/contrib/vscode/init.sh
index face115e8..29f2a729d 100755
--- a/contrib/vscode/init.sh
+++ b/contrib/vscode/init.sh
@@ -21,6 +21,14 @@ cat >.vscode/settings.json.new <<\EOF ||
 "editor.wordWrap": "wordWrapColumn",
 "editor.wordWrapColumn": 72
 },
+"[c]": {
+"editor.detectIndentation": false,
+"editor.insertSpaces": false,
+"editor.tabSize": 8,
+"editor.wordWrap": "wordWrapColumn",
+"editor.wordWrapColumn": 80,
+"files.trimTrailingWhitespace": true
+},
 "files.associations": {
 "*.h": "c",
 "*.c": "c"
-- 
gitgitgadget



[PATCH v2 4/9] mingw: define WIN32 explicitly

2018-07-30 Thread Johannes Schindelin via GitGitGadget
From: Johannes Schindelin 

This helps VS Code's intellisense to figure out that we want to include
windows.h, and that we want to define the minimum target Windows version
as Windows Vista/2008R2.

Signed-off-by: Johannes Schindelin 
---
 config.mak.uname | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config.mak.uname b/config.mak.uname
index 684fc5bf0..2be2f1981 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -528,7 +528,7 @@ ifneq (,$(findstring MINGW,$(uname_S)))
COMPAT_OBJS += compat/mingw.o compat/winansi.o \
compat/win32/pthread.o compat/win32/syslog.o \
compat/win32/dirent.o
-   BASIC_CFLAGS += -DPROTECT_NTFS_DEFAULT=1
+   BASIC_CFLAGS += -DWIN32 -DPROTECT_NTFS_DEFAULT=1
EXTLIBS += -lws2_32
GITLIBS += git.res
PTHREAD_LIBS =
-- 
gitgitgadget



Re: [PATCH 1/2] sequencer: fix "rebase -i --root" corrupting author header

2018-07-30 Thread Johannes Schindelin
Hi Eric,

On Mon, 30 Jul 2018, Eric Sunshine wrote:

> When "git rebase -i --root" creates a new root commit (say, by swapping
> in a different commit for the root), it corrupts the commit's "author"
> header with trailing garbage:
> 
> author A U Thor  @1112912773 -0700...@example.com
> 
> This is a result of read_author_ident() neglecting to NUL-terminate the
> buffer into which it composes the "author" header.
> 
> (Note that the extra "0" in the timezone is a separate bug which will be
> fixed subsequently.)
> 
> Security considerations: Construction of the "author" header by
> read_author_ident() happens in-place and in parallel with parsing the
> content of "rebase-merge/author-script" which occupies the same buffer.
> This is possible because the constructed "author" header is always
> smaller than the content of "rebase-merge/author-script". Despite
> neglecting to NUL-terminate the constructed "author" header, memory is
> never accessed (either by read_author_ident() or its caller) beyond the
> allocated buffer since a NUL-terminator is present at the end of the
> loaded "rebase-merge/author-script" content, and additional NUL's are
> inserted as part of the parsing process.
> 
> Signed-off-by: Eric Sunshine 

ACK!

Thanks,
Dscho

> ---
>  sequencer.c   |  2 +-
>  t/t3404-rebase-interactive.sh | 10 +-
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index 16c1411054..78864d9072 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -744,7 +744,7 @@ static const char *read_author_ident(struct strbuf *buf)
>   return NULL;
>   }
>  
> - buf->len = out - buf->buf;
> + strbuf_setlen(buf, out - buf->buf);
>   return buf->buf;
>  }
>  
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index 01616901bd..8509c89a26 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh
> @@ -1238,7 +1238,7 @@ rebase_setup_and_clean () {
>   test_might_fail git branch -D $1 &&
>   test_might_fail git rebase --abort
>   " &&
> - git checkout -b $1 master
> + git checkout -b $1 ${2:-master}
>  }
>  
>  test_expect_success 'drop' '
> @@ -1415,4 +1415,12 @@ test_expect_success 'rebase -i --gpg-sign= 
> overrides commit.gpgSign' '
>   test_i18ngrep "$SQ-S\"S I Gner\"$SQ" err
>  '
>  
> +test_expect_success 'valid author header after --root swap' '
> + rebase_setup_and_clean author-header no-conflict-branch &&
> + set_fake_editor &&
> + FAKE_LINES="2 1" git rebase -i --root &&
> + git cat-file commit HEAD^ >out &&
> + grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
> +'
> +
>  test_done
> -- 
> 2.18.0.597.ga71716f1ad
> 
> 


Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit

2018-07-30 Thread Johannes Schindelin
Hi Eric,

On Mon, 30 Jul 2018, Eric Sunshine wrote:

> On Mon, Jul 30, 2018 at 5:30 AM Eric Sunshine  wrote:
> > This series fixes bugs causing corruption of the root commit when
> > "rebase -i --root" is used to swap in a new root commit. In particular,
> > the "author" header has trailing garbage. Some tools handle the
> > corruption somewhat gracefully by showing a bogus date, but others barf
> > on it (gitk, for instance). git-fsck correctly identifies the
> > corruption. I discovered this after git-rebase corrupted one of my own
> > projects.
> 
> Unfortunately, after sending this series, I see that there is
> additional corruption that needs to be addressed. In particular, the
> "author" header time is incorrectly prefixed with "@", so this series
> is going to need a re-roll to fix that, as well.

AFAIR the `@` was not an oversight, but required so that we could pass in
the Unix epoch.

Ciao,
Dscho


Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit

2018-07-30 Thread Johannes Schindelin
Hi Phillip,

On Mon, 30 Jul 2018, Phillip Wood wrote:

> On 30/07/18 10:29, Eric Sunshine wrote:
> > This series fixes bugs causing corruption of the root commit when
> > "rebase -i --root" is used to swap in a new root commit. In particular,
> > the "author" header has trailing garbage. Some tools handle the
> > corruption somewhat gracefully by showing a bogus date, but others barf
> > on it (gitk, for instance). git-fsck correctly identifies the
> > corruption. I discovered this after git-rebase corrupted one of my own
> > projects.
> > 
> > Unfortunately, these bugs (from js/sequencer-and-root-commits) made it
> > into the v2.18.0 release. It's worrying that a released Git can be
> > creating corrupt commits, but fortunately "rebase -i --root" is not
> > likely used often (especially on well-established projects).
> > Nevertheless, it may be 'maint' worthy and applies cleanly there.
> > 
> > It was only after I diagnosed and fixed these bugs that I thought to
> > check 'pu' and discovered that Akinori MUSHA already made a stab[1] at
> > fixing one of the three bugs which this series fixes. Akinori's fix has
> > the somewhat undesirable property that it adds an extra blank line to
> > the end of the script, as Phillip correctly pointed out in review[2].
> > Patch 2/2 of this series has the more "correct" fix, in addition to
> > fixing another bug.
> > 
> > Moreover, patch 2/2 of this series provides a more thorough fix overall
> > than Akinori, so it may make sense to replace his patch with this
> > series, though perhaps keep the test his patch adds to augment the
> > strict test of the "author" header added by this series.
> 
> Johannes and I have some fixups for Akinori's patch on the branch
> fix-t3403-author-script-test at https://github.com/phillipwood/git
> 
> That branch also contains a fix for the bad quoting of names with "'" in
> them. I think it would be good to somehow try and combine this series
> with those patches.

I would like that, too.

> I'd really like to see a single function to read and another to write
> the author script that is shared by 'git am' and 'git rebase -i', rather
> than the two writers and three readers we have at the moment. I was
> thinking of doing that in the longer term, but given the extra bug
> you've found in read_author_script() maybe we should do that sooner
> rather than later.

You are at least the second person (after Junio) with that wish.

Please note, however, that the purpose of author-script reading/writing is
very different in git-am vs rebase -i, or at least it used to be:
read_env_script() in sequencer.c very specifically wants to construct an
env parameter for use in run_command().

Don't let me stop you from trying to consolidate the two different code
paths, though.

Ciao,
Dscho

> 
> > [1]: https://public-inbox.org/git/86a7qwpt9g@idaemons.org/
> > [2]: 
> > https://public-inbox.org/git/f5b56540-d26a-044e-5f46-1d975f889...@talktalk.net/
> > 
> > Eric Sunshine (2):
> >   sequencer: fix "rebase -i --root" corrupting author header
> >   sequencer: fix "rebase -i --root" corrupting author header timezone
> > 
> >  sequencer.c   |  9 +++--
> >  t/t3404-rebase-interactive.sh | 10 +-
> >  2 files changed, 16 insertions(+), 3 deletions(-)
> > 
> 
> 


Re: Strange bug with "git log" - pdftotext?

2018-07-30 Thread stefan.naewe
Am 30.07.2018 um 14:49 schrieb Jeremy Morton:
> I'm trying to search my git log history for a particular term - "unobtrusive" 
> - so I run this command:
> 
> git log -S unobtrusive --oneline
> 
> When I do this, this is displayed and I'm in an interactive less terminal or 
> something:
> 
> pdftotext version 4.00
> Copyright 1996-2017 Glyph & Cog, LLC
> Usage: pdftotext [options]  []
>  -f  : first page to convert
>  -l  : last page to convert
>  -layout  : maintain original physical layout
>  -simple  : simple one-column page layout
>  -table   : similar to -layout, but optimized for tables
>  -lineprinter : use strict fixed-pitch/height layout
>  -raw : keep strings in content stream order
>  -fixed   : assume fixed-pitch (or tabular) text
>  -linespacing : fixed line spacing for LinePrinter mode
>  -clip    : separate clipped text
>  -nodiag  : discard diagonal text
>  -enc     : output text encoding name
>  -eol     : output end-of-line convention (unix, dos, or mac)
>  -nopgbrk : don't insert page breaks between pages
>  -bom : insert a Unicode BOM at the start of the text file
>  -opw     : owner password (for encrypted files)
>  -upw     : user password (for encrypted files)
>  -q   : don't print any messages or errors
>  -cfg     : configuration file to use in place of .xpdfrc
>  -v   : print copyright and version info
> :
> 
> When I hit the down arrow, it scrolls down, repeating this output infinitely 
> until I hit 'q'.  What is going on here??
> 

Wild guess: You're using "Git for Windows" on a version less than 
"2.18.0.windows.1" ?

Try upgrading (at least) to that version. IIRC that's one of the issues it 
fixes.

HTH
Stefan
-- 

/dev/random says: Never underestimate the power of human stupidity.
python -c "print 
'73746566616e2e6e616577654061746c61732d656c656b74726f6e696b2e636f6d'.decode('hex')"
 
GPG Key fingerprint = 2DF5 E01B 09C3 7501 BCA9  9666 829B 49C5 9221 27AF

Re: Strange bug with "git log" - pdftotext?

2018-07-30 Thread Jeremy Morton
I discovered it was an issue with the version of Git for Windows I was 
using.  Upgraded to the latest version and it works now.


--
Best regards,
Jeremy Morton (Jez)

On 30/07/2018 16:37, Jeff King wrote:

On Mon, Jul 30, 2018 at 01:49:46PM +0100, Jeremy Morton wrote:


I'm trying to search my git log history for a particular term -
"unobtrusive" - so I run this command:

git log -S unobtrusive --oneline

When I do this, this is displayed and I'm in an interactive less terminal or
something:

pdftotext version 4.00
[...]


That's definitely weird.

My guess is that the repository has some .gitattributes set up to diff
pdf files in a particular way, and you have some matching config that
tries to call pdftotext.

What does:

   git config --list | grep ^diff

say? I'd expect to see an external or textconv option there running
pdftotext.

Another option is that your pager is somehow set up to call pdftotext,
but that seems much more nonsensical to use the tool there (but try "git
var GIT_PAGER" and "git config pager.log" to check).

-Peff



Re: [PATCH v4 01/21] linear-assignment: a function to solve least-cost assignment problems

2018-07-30 Thread Johannes Schindelin
Hi Thomas,

On Sat, 28 Jul 2018, Thomas Gummerer wrote:

> On 07/21, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin 
> > 
> > The problem solved by the code introduced in this commit goes like this:
> > given two sets of items, and a cost matrix which says how much it
> > "costs" to assign any given item of the first set to any given item of
> > the second, assign all items (except when the sets have different size)
> > in the cheapest way.
> > 
> > We use the Jonker-Volgenant algorithm to solve the assignment problem to
> > answer questions such as: given two different versions of a topic branch
> > (or iterations of a patch series), what is the best pairing of
> > commits/patches between the different versions?
> > 
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  Makefile|   1 +
> >  linear-assignment.c | 201 
> >  linear-assignment.h |  22 +
> >  3 files changed, 224 insertions(+)
> >  create mode 100644 linear-assignment.c
> >  create mode 100644 linear-assignment.h
> >
> > [...]
> > 
> > diff --git a/linear-assignment.h b/linear-assignment.h
> > new file mode 100644
> > index 0..fc4c502c8
> > --- /dev/null
> > +++ b/linear-assignment.h
> > @@ -0,0 +1,22 @@
> > +#ifndef HUNGARIAN_H
> > +#define HUNGARIAN_H
> 
> nit: maybe s/HUNGARIAN/LINEAR_ASSIGNMENT/ in the two lines above.

Makes sense.

Ciao,
Dscho

> 
> > +
> > +/*
> > + * Compute an assignment of columns -> rows (and vice versa) such that 
> > every
> > + * column is assigned to at most one row (and vice versa) minimizing the
> > + * overall cost.
> > + *
> > + * The parameter `cost` is the cost matrix: the cost to assign column j to 
> > row
> > + * i is `cost[j + column_count * i].
> > + *
> > + * The arrays column2row and row2column will be populated with the 
> > respective
> > + * assignments (-1 for unassigned, which can happen only if column_count !=
> > + * row_count).
> > + */
> > +void compute_assignment(int column_count, int row_count, int *cost,
> > +   int *column2row, int *row2column);
> > +
> > +/* The maximal cost in the cost matrix (to prevent integer overflows). */
> > +#define COST_MAX (1<<16)
> > +
> > +#endif
> > -- 
> > gitgitgadget
> > 
> 


[PATCH/RFC] Color merge conflicts

2018-07-30 Thread Nguyễn Thái Ngọc Duy
One of the things I notice when watching a normal git user face a
merge conflicts is the output is very verbose (especially when there
are multiple conflicts) and it's hard to spot the important parts to
start resolving conflicts unless you know what to look for.

This is the start to hopefully help that a bit. One thing I'm not sure
is what to color because that affects the config keys we use for
this. If we have to color different things, it's best to go
"color.merge." but if this is the only place to color, that slot
thing looks over-engineered.

Perhaps another piece to color is the conflicted path? Maybe. But on
the other hand, I don't think we want a chameleon output, just enough
visual cues to go from one conflict to the next...

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 merge-recursive.c | 133 +++---
 2 files changed, 79 insertions(+), 55 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 113c1d6962..b800101538 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -26,6 +26,7 @@
 #include "dir.h"
 #include "submodule.h"
 #include "revision.h"
+#include "color.h"
 
 struct path_hashmap_entry {
struct hashmap_entry e;
@@ -286,6 +287,28 @@ static void output(struct merge_options *o, int v, const 
char *fmt, ...)
flush_output(o);
 }
 
+__attribute__((format (printf, 3, 4)))
+static void conflict_output(struct merge_options *o, int v, const char *fmt, 
...)
+{
+   va_list ap;
+
+   if (!show(o, v))
+   return;
+
+   strbuf_addchars(&o->obuf, ' ', o->call_depth * 2);
+
+   strbuf_addf(&o->obuf, "%s%s%s ",
+   GIT_COLOR_RED, _("CONFLICT"), GIT_COLOR_RESET);
+
+   va_start(ap, fmt);
+   strbuf_vaddf(&o->obuf, fmt, ap);
+   va_end(ap);
+
+   strbuf_addch(&o->obuf, '\n');
+   if (!o->buffer_output)
+   flush_output(o);
+}
+
 static void output_commit_title(struct merge_options *o, struct commit *commit)
 {
struct merge_remote_desc *desc;
@@ -1497,27 +1520,27 @@ static int handle_change_delete(struct merge_options *o,
 */
if (!alt_path) {
if (!old_path) {
-   output(o, 1, _("CONFLICT (%s/delete): %s 
deleted in %s "
-  "and %s in %s. Version %s of %s left in 
tree."),
-  change, path, delete_branch, change_past,
-  change_branch, change_branch, path);
+   conflict_output(o, 1, _("(%s/delete): %s 
deleted in %s "
+   "and %s in %s. Version 
%s of %s left in tree."),
+   change, path, delete_branch, 
change_past,
+   change_branch, change_branch, 
path);
} else {
-   output(o, 1, _("CONFLICT (%s/delete): %s 
deleted in %s "
-  "and %s to %s in %s. Version %s of %s 
left in tree."),
-  change, old_path, delete_branch, 
change_past, path,
-  change_branch, change_branch, path);
+   conflict_output(o, 1, _("(%s/delete): %s 
deleted in %s "
+   "and %s to %s in %s. 
Version %s of %s left in tree."),
+   change, old_path, 
delete_branch, change_past, path,
+   change_branch, change_branch, 
path);
}
} else {
if (!old_path) {
-   output(o, 1, _("CONFLICT (%s/delete): %s 
deleted in %s "
-  "and %s in %s. Version %s of %s left in 
tree at %s."),
-  change, path, delete_branch, change_past,
-  change_branch, change_branch, path, 
alt_path);
+   conflict_output(o, 1, _("(%s/delete): %s 
deleted in %s "
+   "and %s in %s. Version 
%s of %s left in tree at %s."),
+   change, path, delete_branch, 
change_past,
+   change_branch, change_branch, 
path, alt_path);
} else {
-   output(o, 1, _("CONFLICT (%s/delete): %s 
deleted in %s "
-  "and %s to %s in %s. Version %s of %s 
left in tree at %s."),
-  change, old_path, delete_branch, 
change_past, path,
-  change_branch, change_branch, path, 
alt_path);
+   conflict_output(o, 1, _("(%s/de

Re: [PATCH v4 03/21] range-diff: first rudimentary implementation

2018-07-30 Thread Johannes Schindelin
Hi Thomas,

On Sun, 29 Jul 2018, Thomas Gummerer wrote:

> On 07/21, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin 
> > 
> > At this stage, `git range-diff` can determine corresponding commits
> > of two related commit ranges. This makes use of the recently introduced
> > implementation of the linear assignment algorithm.
> > 
> > The core of this patch is a straight port of the ideas of tbdiff, the
> > apparently dormant project at https://github.com/trast/tbdiff.
> > 
> > The output does not at all match `tbdiff`'s output yet, as this patch
> > really concentrates on getting the patch matching part right.
> > 
> > Note: due to differences in the diff algorithm (`tbdiff` uses the Python
> > module `difflib`, Git uses its xdiff fork), the cost matrix calculated
> > by `range-diff` is different (but very similar) to the one calculated
> > by `tbdiff`. Therefore, it is possible that they find different matching
> > commits in corner cases (e.g. when a patch was split into two patches of
> > roughly equal length).
> > 
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  Makefile |   1 +
> >  builtin/range-diff.c |  43 +-
> >  range-diff.c | 311 +++
> >  range-diff.h |   7 +
> >  4 files changed, 361 insertions(+), 1 deletion(-)
> >  create mode 100644 range-diff.c
> >  create mode 100644 range-diff.h
> >
> > [...]
> > 
> > diff --git a/range-diff.c b/range-diff.c
> > new file mode 100644
> > index 0..15d418afa
> > --- /dev/null
> > +++ b/range-diff.c
> > @@ -0,0 +1,311 @@
> > +#include "cache.h"
> > +#include "range-diff.h"
> > +#include "string-list.h"
> > +#include "run-command.h"
> > +#include "argv-array.h"
> > +#include "hashmap.h"
> > +#include "xdiff-interface.h"
> > +#include "linear-assignment.h"
> > +
> > +struct patch_util {
> > +   /* For the search for an exact match */
> > +   struct hashmap_entry e;
> > +   const char *diff, *patch;
> > +
> > +   int i;
> > +   int diffsize;
> > +   size_t diff_offset;
> > +   /* the index of the matching item in the other branch, or -1 */
> > +   int matching;
> > +   struct object_id oid;
> > +};
> > +
> > +/*
> > + * Reads the patches into a string list, with the `util` field being 
> > populated
> > + * as struct object_id (will need to be free()d).
> > + */
> > +static int read_patches(const char *range, struct string_list *list)
> > +{
> > +   struct child_process cp = CHILD_PROCESS_INIT;
> > +   FILE *in;
> > +   struct strbuf buf = STRBUF_INIT, line = STRBUF_INIT;
> > +   struct patch_util *util = NULL;
> > +   int in_header = 1;
> > +
> > +   argv_array_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
> > +   "--reverse", "--date-order", "--decorate=no",
> > +   "--no-abbrev-commit", range,
> > +   NULL);
> 
> Compared to tbdiff, add "--decorate=no", and "--no-abbrev-commit".  I
> see we're abbreviating the commit hashes later.  We don't want ref
> names here, so "--decorate=no" makes sense as well.

Indeed. Compare also to https://github.com/trast/tbdiff/pull/8

> > +   cp.out = -1;
> > +   cp.no_stdin = 1;
> > +   cp.git_cmd = 1;
> > +
> > +   if (start_command(&cp))
> > +   return error_errno(_("could not start `log`"));
> > +   in = fdopen(cp.out, "r");
> > +   if (!in) {
> > +   error_errno(_("could not read `log` output"));
> > +   finish_command(&cp);
> > +   return -1;
> > +   }
> > +
> > +   while (strbuf_getline(&line, in) != EOF) {
> > +   const char *p;
> > +
> > +   if (skip_prefix(line.buf, "commit ", &p)) {
> > +   if (util) {
> > +   string_list_append(list, buf.buf)->util = util;
> > +   strbuf_reset(&buf);
> > +   }
> > +   util = xcalloc(sizeof(*util), 1);
> > +   if (get_oid(p, &util->oid)) {
> > +   error(_("could not parse commit '%s'"), p);
> > +   free(util);
> > +   string_list_clear(list, 1);
> > +   strbuf_release(&buf);
> > +   strbuf_release(&line);
> > +   fclose(in);
> > +   finish_command(&cp);
> > +   return -1;
> > +   }
> > +   util->matching = -1;
> > +   in_header = 1;
> > +   continue;
> > +   }
> > +
> > +   if (starts_with(line.buf, "diff --git")) {
> > +   in_header = 0;
> > +   strbuf_addch(&buf, '\n');
> > +   if (!util->diff_offset)
> > +   util->diff_offset = buf.len;
> > +   strbuf_addbuf(&buf, &line);
> > +   } else if (in_header) {
> > +   if (starts_with(line.buf, "Author: ")) {
> > +

Re: [PATCH v4 05/21] range-diff: also show the diff between patches

2018-07-30 Thread Johannes Schindelin
Hi Thomas & Eric,

On Sun, 29 Jul 2018, Thomas Gummerer wrote:

> On 07/29, Eric Sunshine wrote:
> > On Sun, Jul 29, 2018 at 3:04 PM Thomas Gummerer  
> > wrote:
> > > On 07/21, Johannes Schindelin via GitGitGadget wrote:
> > > > Just like tbdiff, we now show the diff between matching patches. This is
> > > > a "diff of two diffs", so it can be a bit daunting to read for the
> > > > beginner.
> > > > [...]
> > > > Note also: while tbdiff accepts the `--no-patches` option to suppress
> > > > these diffs between patches, we prefer the `-s` option that is
> > > > automatically supported via our use of diff_opt_parse().
> > >
> > > One slightly unfortunate thing here is that we don't show these
> > > options in 'git range-diff -h', which would be nice to have.  I don't
> > > know if that's possible in git right now, if it's not easily possible,
> > > I definitely wouldn't want to delay this series for that, and we could
> > > just add it to the list of possible future enhancements that other
> > > people mentioned.
> > 
> > This issue is not specific to git-range-diff; it's shared by other
> > commands which inherit diff options via diff_opt_parse(). For
> > instance, "git log -h" doesn't show diff-related options either, yet
> > it accepts them.
> 
> Fair enough, that makes sense.  Thanks for the pointer!
> 
> There's one more thing that I noticed here:
> 
> git range-diff --no-patches
> fatal: single arg format requires a symmetric range
> 
> Which is a slightly confusing error message.  In contrast git log does
> the following on an unrecognized argument:
> 
> git log --no-patches
> fatal: unrecognized argument: --no-patches
> 
> which is a little better I think.  I do however also thing the "fatal:
> single arg format requires a symmetric range" is useful when someone
> genuinely tries to use the single argument version of the command.  So
> I don't know what a good solution for this would be.

I immediately thought of testing for a leading `-` of the remaining
argument, but I could imagine that somebody enterprisey uses

git range-diff -- -my-first-attempt...-my-second-attempt

and I do not really want to complexify the code... Ideas?

> > > > diff --git a/range-diff.c b/range-diff.c
> > > > @@ -300,6 +325,9 @@ static void output(struct string_list *a, struct 
> > > > string_list *b)
> > > >   printf("%d: %s ! %d: %s\n",
> > > >  b_util->matching + 1, short_oid(a_util),
> > > >  j + 1, short_oid(b_util));
> > > > + if (!(diffopt->output_format & 
> > > > DIFF_FORMAT_NO_OUTPUT))
> > >
> > > Looking at this line, it looks like it would be easy to support
> > > '--no-patches' as well, which may be slightly easier to understand that
> > > '-s' to someone new to the command.  But again that can be added later
> > > if someone actually cares about it.
> > 
> > What wasn't mentioned (but was implied) by the commit message is that
> > "-s" is short for "--no-patch", which also comes for free via
> > diff_opt_parse(). True, "--no-patch" isn't spelled exactly the same as
> > "--no-patches", but git-range-diff isn't exactly a perfect tbdiff
> > clone, so hopefully not a git problem. Moreover, "--no-patch" is
> > internally consistent within the Git builtin commands.
> 
> Makes sense, thanks!  "--no-patch" does make sense to me.  There's
> still a lot of command line flags in git to learn for me, even after
> all this time using it ;)  Might be nice to spell it out in the commit
> message for someone like me, especially as "--no-patches" is already
> mentioned.  Though I guess most regulars here would know about
> "--no-patch", so maybe it's not worth it.  Anyway that is definitely
> not worth another round here.

Sure, but not many users learn from reading the commit history...

:-)

Ciao,
Dscho


[PATCH v5 1/3] rebase: start implementing it as a builtin

2018-07-30 Thread Pratik Karki
This commit imitates the strategy that was used to convert the
difftool to a builtin. We start by renaming the shell script
`git-rebase.sh` to `git-legacy-rebase.sh` and introduce a
`builtin/rebase.c` that simply executes the shell script version,
unless the config setting `rebase.useBuiltin` is set to `true`.

The motivation behind this is to rewrite all the functionality of the
shell script version in the aforementioned `rebase.c`, one by one and
be able to conveniently test new features by configuring
`rebase.useBuiltin`.

In the original difftool conversion, if sane_execvp() that attempts to
run the legacy scripted version returned with non-negative status, the
command silently exited without doing anything with success, but
sane_execvp() should not return with non-negative status in the first
place, so we use die() to notice such an abnormal case.

We intentionally avoid reading the config directly to avoid
messing up the GIT_* environment variables when we need to fall back to
exec()ing the shell script. The test of builtin rebase can be done by
`git -c rebase.useBuiltin=true rebase ...`

Signed-off-by: Pratik Karki 
---
 .gitignore|  1 +
 Makefile  |  3 +-
 builtin.h |  1 +
 builtin/rebase.c  | 58 +++
 git-rebase.sh => git-legacy-rebase.sh |  0
 git.c |  6 +++
 6 files changed, 68 insertions(+), 1 deletion(-)
 create mode 100644 builtin/rebase.c
 rename git-rebase.sh => git-legacy-rebase.sh (100%)

diff --git a/.gitignore b/.gitignore
index 3284a1e9b1..ec23959014 100644
--- a/.gitignore
+++ b/.gitignore
@@ -78,6 +78,7 @@
 /git-init-db
 /git-interpret-trailers
 /git-instaweb
+/git-legacy-rebase
 /git-log
 /git-ls-files
 /git-ls-remote
diff --git a/Makefile b/Makefile
index 08e5c54549..a0c410b7d6 100644
--- a/Makefile
+++ b/Makefile
@@ -609,7 +609,7 @@ SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
-SCRIPT_SH += git-rebase.sh
+SCRIPT_SH += git-legacy-rebase.sh
 SCRIPT_SH += git-remote-testgit.sh
 SCRIPT_SH += git-request-pull.sh
 SCRIPT_SH += git-stash.sh
@@ -1059,6 +1059,7 @@ BUILTIN_OBJS += builtin/prune.o
 BUILTIN_OBJS += builtin/pull.o
 BUILTIN_OBJS += builtin/push.o
 BUILTIN_OBJS += builtin/read-tree.o
+BUILTIN_OBJS += builtin/rebase.o
 BUILTIN_OBJS += builtin/rebase--helper.o
 BUILTIN_OBJS += builtin/receive-pack.o
 BUILTIN_OBJS += builtin/reflog.o
diff --git a/builtin.h b/builtin.h
index 0362f1ce25..44651a447f 100644
--- a/builtin.h
+++ b/builtin.h
@@ -202,6 +202,7 @@ extern int cmd_prune_packed(int argc, const char **argv, 
const char *prefix);
 extern int cmd_pull(int argc, const char **argv, const char *prefix);
 extern int cmd_push(int argc, const char **argv, const char *prefix);
 extern int cmd_read_tree(int argc, const char **argv, const char *prefix);
+extern int cmd_rebase(int argc, const char **argv, const char *prefix);
 extern int cmd_rebase__helper(int argc, const char **argv, const char *prefix);
 extern int cmd_receive_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_reflog(int argc, const char **argv, const char *prefix);
diff --git a/builtin/rebase.c b/builtin/rebase.c
new file mode 100644
index 00..c44addb2a4
--- /dev/null
+++ b/builtin/rebase.c
@@ -0,0 +1,58 @@
+/*
+ * "git rebase" builtin command
+ *
+ * Copyright (c) 2018 Pratik Karki
+ */
+
+#include "builtin.h"
+#include "run-command.h"
+#include "exec-cmd.h"
+#include "argv-array.h"
+#include "dir.h"
+
+static int use_builtin_rebase(void)
+{
+   struct child_process cp = CHILD_PROCESS_INIT;
+   struct strbuf out = STRBUF_INIT;
+   int ret;
+
+   argv_array_pushl(&cp.args,
+"config", "--bool", "rebase.usebuiltin", NULL);
+   cp.git_cmd = 1;
+   if (capture_command(&cp, &out, 6)) {
+   strbuf_release(&out);
+   return 0;
+   }
+
+   strbuf_trim(&out);
+   ret = !strcmp("true", out.buf);
+   strbuf_release(&out);
+   return ret;
+}
+
+int cmd_rebase(int argc, const char **argv, const char *prefix)
+{
+   /*
+* NEEDSWORK: Once the builtin rebase has been tested enough
+* and git-legacy-rebase.sh is retired to contrib/, this preamble
+* can be removed.
+*/
+
+   if (!use_builtin_rebase()) {
+   const char *path = mkpath("%s/git-legacy-rebase",
+ git_exec_path());
+
+   if (sane_execvp(path, (char **)argv) < 0)
+   die_errno(_("could not exec %s"), path);
+   else
+   BUG("sane_execvp() returned???");
+   }
+
+   if (argc != 2)
+   die(_("Usage: %s "), argv[0]);
+   prefix = setup_git_directory();
+   trace_repo_setup(prefix);
+   setup_work_tree();
+
+   die("TODO");
+}
diff 

[GSoC] [PATCH v5 0/3] rebase: rewrite rebase in C

2018-07-30 Thread Pratik Karki
As a GSoC project, I have been working on the builtin rebase.

The motivation behind the rewrite of rebase i.e. from shell script to C
are for following reasons:

1.  Writing shell scripts and getting it to production is much faster
than doing the equivalent in C but lacks in performance and extra
workarounds are needed for non-POSIX platforms.

2.  Git for Windows is at loss as the installer size increases due to
addition of extra dependencies for the shell scripts which are usually
available in POSIX compliant platforms.

This series of patches serves to demonstrate a minimal builtin rebase
which supports running `git rebase ` and also serves to ask for
reviews.

Changes since v4:

  -  Remove the `do_reset()` refactored function from sequencer.
 In other words `sequencer: refactor the code to detach HEAD to checkout.c`
 patch was dropped to introduce a new function `reset_hard()` for `rebase.c`
 (as suggested by Johannes).

  -  Fix a case of leak in `rebase: start implementing it as a builtin`.
 (as pointed out by Andrei Rybak and Eric Sunshine).

  -  Wrap the user visible comments in `_()` and used `BUG()` depending on the
 scenarios (as pointed out by Duy Nguyen).

  -  Fix the macro `GIT_PATH_FUNC` which expands to function definition and
 doesn't require semicolons (as pointed out by Beat Bolli).

Pratik Karki (3):
  rebase: start implementing it as a builtin
  rebase: refactor common shell functions into their own file
  builtin/rebase: support running "git rebase "

 .gitignore|   2 +
 Makefile  |   4 +-
 builtin.h |   1 +
 builtin/rebase.c  | 406 ++
 git-rebase.sh => git-legacy-rebase.sh |  69 +
 git-rebase--common.sh |  68 +
 git.c |   6 +
 7 files changed, 488 insertions(+), 68 deletions(-)
 create mode 100644 builtin/rebase.c
 rename git-rebase.sh => git-legacy-rebase.sh (90%)
 create mode 100644 git-rebase--common.sh

-- 
2.18.0



[PATCH v5 2/3] rebase: refactor common shell functions into their own file

2018-07-30 Thread Pratik Karki
The functions present in `git-legacy-rebase.sh` are used by the rebase
backends as they are implemented as shell script functions in the
`git-rebase--` files.

To make the `builtin/rebase.c` work, we have to provide support via
a Unix shell script snippet that uses these functions and so, we
want to use the rebase backends *directly* from the builtin rebase
without going through `git-legacy-rebase.sh`.

This commit extracts the functions to a separate file,
`git-rebase--common`, that will be read by `git-legacy-rebase.sh` and
by the shell script snippets which will be used extensively in the
following commits.

Signed-off-by: Pratik Karki 
---
Unchanged since v4.

 .gitignore|  1 +
 Makefile  |  1 +
 git-legacy-rebase.sh  | 69 ++-
 git-rebase--common.sh | 68 ++
 4 files changed, 72 insertions(+), 67 deletions(-)
 create mode 100644 git-rebase--common.sh

diff --git a/.gitignore b/.gitignore
index ec23959014..824141cba1 100644
--- a/.gitignore
+++ b/.gitignore
@@ -117,6 +117,7 @@
 /git-read-tree
 /git-rebase
 /git-rebase--am
+/git-rebase--common
 /git-rebase--helper
 /git-rebase--interactive
 /git-rebase--merge
diff --git a/Makefile b/Makefile
index a0c410b7d6..c59e2f64a1 100644
--- a/Makefile
+++ b/Makefile
@@ -619,6 +619,7 @@ SCRIPT_SH += git-web--browse.sh
 SCRIPT_LIB += git-mergetool--lib
 SCRIPT_LIB += git-parse-remote
 SCRIPT_LIB += git-rebase--am
+SCRIPT_LIB += git-rebase--common
 SCRIPT_LIB += git-rebase--interactive
 SCRIPT_LIB += git-rebase--preserve-merges
 SCRIPT_LIB += git-rebase--merge
diff --git a/git-legacy-rebase.sh b/git-legacy-rebase.sh
index 7973447645..af2cdfef03 100755
--- a/git-legacy-rebase.sh
+++ b/git-legacy-rebase.sh
@@ -57,12 +57,7 @@ cd_to_toplevel
 LF='
 '
 ok_to_skip_pre_rebase=
-resolvemsg="
-$(gettext 'Resolve all conflicts manually, mark them as resolved with
-"git add/rm ", then run "git rebase --continue".
-You can instead skip this commit: run "git rebase --skip".
-To abort and get back to the state before "git rebase", run "git rebase 
--abort".')
-"
+
 squash_onto=
 unset onto
 unset restrict_revision
@@ -102,6 +97,7 @@ case "$(git config --bool commit.gpgsign)" in
 true)  gpg_sign_opt=-S ;;
 *) gpg_sign_opt= ;;
 esac
+. git-rebase--common
 
 read_basic_state () {
test -f "$state_dir/head-name" &&
@@ -132,67 +128,6 @@ read_basic_state () {
}
 }
 
-write_basic_state () {
-   echo "$head_name" > "$state_dir"/head-name &&
-   echo "$onto" > "$state_dir"/onto &&
-   echo "$orig_head" > "$state_dir"/orig-head &&
-   echo "$GIT_QUIET" > "$state_dir"/quiet &&
-   test t = "$verbose" && : > "$state_dir"/verbose
-   test -n "$strategy" && echo "$strategy" > "$state_dir"/strategy
-   test -n "$strategy_opts" && echo "$strategy_opts" > \
-   "$state_dir"/strategy_opts
-   test -n "$allow_rerere_autoupdate" && echo "$allow_rerere_autoupdate" > 
\
-   "$state_dir"/allow_rerere_autoupdate
-   test -n "$gpg_sign_opt" && echo "$gpg_sign_opt" > 
"$state_dir"/gpg_sign_opt
-   test -n "$signoff" && echo "$signoff" >"$state_dir"/signoff
-}
-
-output () {
-   case "$verbose" in
-   '')
-   output=$("$@" 2>&1 )
-   status=$?
-   test $status != 0 && printf "%s\n" "$output"
-   return $status
-   ;;
-   *)
-   "$@"
-   ;;
-   esac
-}
-
-move_to_original_branch () {
-   case "$head_name" in
-   refs/*)
-   message="rebase finished: $head_name onto $onto"
-   git update-ref -m "$message" \
-   $head_name $(git rev-parse HEAD) $orig_head &&
-   git symbolic-ref \
-   -m "rebase finished: returning to $head_name" \
-   HEAD $head_name ||
-   die "$(eval_gettext "Could not move back to \$head_name")"
-   ;;
-   esac
-}
-
-apply_autostash () {
-   if test -f "$state_dir/autostash"
-   then
-   stash_sha1=$(cat "$state_dir/autostash")
-   if git stash apply $stash_sha1 >/dev/null 2>&1
-   then
-   echo "$(gettext 'Applied autostash.')" >&2
-   else
-   git stash store -m "autostash" -q $stash_sha1 ||
-   die "$(eval_gettext "Cannot store \$stash_sha1")"
-   gettext 'Applying autostash resulted in conflicts.
-Your changes are safe in the stash.
-You can run "git stash pop" or "git stash drop" at any time.
-' >&2
-   fi
-   fi
-}
-
 finish_rebase () {
rm -f "$(git rev-parse --git-path REBASE_HEAD)"
apply_autostash &&
diff --git a/git-rebase--common.sh b/git-rebase--common.sh
new file mode 100644
index 00..7e39d22871
--- /dev/null
+++ b/git-rebase--common.sh
@@ -0,0 +1,68 @@
+
+resolvemsg="
+$(gettext 'Resolve all co

[PATCH v5 3/3] builtin/rebase: support running "git rebase "

2018-07-30 Thread Pratik Karki
This patch gives life to the skeleton added in the previous patches:
With this change, we can perform a elementary rebase (without any
options).

It can be tested thusly by:

git -c rebase.usebuiltin=true rebase HEAD~2

The rebase backends (i.e. the shell script functions defined in
`git-rebase--`) are still at work here and the "builtin
rebase"'s purpose is simply to parse the options and set
everything up so that those rebase backends can do their work.

Note: We take an alternative approach here which is *not* to set the
environment variables via `run_command_v_opt_cd_env()` because those
variables would then be visible by processes spawned from the rebase
backends. Instead, we work hard to set them in the shell script snippet.
On Windows, some of the tests fail merely due to core.fileMode not
being heeded that's why the core.*config variables is parsed here.

The next commits will handle and support all the wonderful rebase
options.

Signed-off-by: Pratik Karki 
---
 builtin/rebase.c | 350 ++-
 1 file changed, 349 insertions(+), 1 deletion(-)

diff --git a/builtin/rebase.c b/builtin/rebase.c
index c44addb2a4..5863139204 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -9,6 +9,24 @@
 #include "exec-cmd.h"
 #include "argv-array.h"
 #include "dir.h"
+#include "packfile.h"
+#include "refs.h"
+#include "quote.h"
+#include "config.h"
+#include "cache-tree.h"
+#include "unpack-trees.h"
+#include "lockfile.h"
+
+static GIT_PATH_FUNC(apply_dir, "rebase-apply")
+static GIT_PATH_FUNC(merge_dir, "rebase-merge")
+
+enum rebase_type {
+   REBASE_UNSPECIFIED = -1,
+   REBASE_AM,
+   REBASE_MERGE,
+   REBASE_INTERACTIVE,
+   REBASE_PRESERVE_MERGES
+};
 
 static int use_builtin_rebase(void)
 {
@@ -30,8 +48,243 @@ static int use_builtin_rebase(void)
return ret;
 }
 
+static int apply_autostash(void)
+{
+   warning("TODO");
+   return 0;
+}
+
+struct rebase_options {
+   enum rebase_type type;
+   const char *state_dir;
+   struct commit *upstream;
+   const char *upstream_name;
+   char *head_name;
+   struct object_id orig_head;
+   struct commit *onto;
+   const char *onto_name;
+   const char *revisions;
+   const char *root;
+   const char *restrict_revision;
+};
+
+static int finish_rebase(struct rebase_options *opts)
+{
+   struct strbuf dir = STRBUF_INIT;
+   const char *argv_gc_auto[] = { "gc", "--auto", NULL };
+
+   delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
+   apply_autostash();
+   close_all_packs(the_repository->objects);
+   /*
+* We ignore errors in 'gc --auto', since the
+* user should see them.
+*/
+   run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
+   strbuf_addstr(&dir, opts->state_dir);
+   remove_dir_recursively(&dir, 0);
+   strbuf_release(&dir);
+
+   return 0;
+}
+
+static struct commit *peel_committish(const char *name)
+{
+   struct object *obj;
+   struct object_id oid;
+
+   if (get_oid(name, &oid))
+   return NULL;
+   obj = parse_object(&oid);
+   return (struct commit *)peel_to_type(name, 0, obj, OBJ_COMMIT);
+}
+
+static void add_var(struct strbuf *buf, const char *name, const char *value)
+{
+   strbuf_addstr(buf, name);
+   if (value) {
+   strbuf_addstr(buf, "=");
+   sq_quote_buf(buf, value);
+   }
+   strbuf_addstr(buf, "; ");
+}
+
+static int run_specific_rebase(struct rebase_options *opts)
+{
+   const char *argv[] = { NULL, NULL };
+   struct strbuf script_snippet = STRBUF_INIT;
+   int status;
+   const char *backend, *backend_func;
+
+   add_var(&script_snippet, "GIT_DIR", absolute_path(get_git_dir()));
+   add_var(&script_snippet, "state_dir", opts->state_dir);
+
+   add_var(&script_snippet, "upstream_name", opts->upstream_name);
+   add_var(&script_snippet, "upstream",
+oid_to_hex(&opts->upstream->object.oid));
+   add_var(&script_snippet, "head_name", opts->head_name);
+   add_var(&script_snippet, "orig_head", oid_to_hex(&opts->orig_head));
+   add_var(&script_snippet, "onto", oid_to_hex(&opts->onto->object.oid));
+   add_var(&script_snippet, "onto_name", opts->onto_name);
+   add_var(&script_snippet, "revisions", opts->revisions);
+   if (opts->restrict_revision == NULL)
+   add_var(&script_snippet, "restrict_revision", "");
+   else
+   add_var(&script_snippet, "restrict_revision",
+   opts->restrict_revision);
+
+   switch (opts->type) {
+   case REBASE_AM:
+   backend = "git-rebase--am";
+   backend_func = "git_rebase__am";
+   break;
+   case REBASE_INTERACTIVE:
+   backend = "git-rebase--interactive";
+   backend_func = "git_rebase__interactive";
+   break;
+   case REBASE_M

Re: [PATCH v4 11/21] range-diff: add tests

2018-07-30 Thread Johannes Schindelin
Hi Eric,

On Sun, 22 Jul 2018, Eric Sunshine wrote:

> On Sat, Jul 21, 2018 at 6:05 PM Thomas Rast via GitGitGadget
>  wrote:
> > These are essentially lifted from https://github.com/trast/tbdiff, with
> > light touch-ups to account for the command now being names `git
> 
> s/names/named/

Thanks.

I already pushed an update to https://github.com/gitgitgadget/git/pull/1.

Ciao,
Dscho

> 
> > range-diff`.
> >
> > Apart from renaming `tbdiff` to `range-diff`, only one test case needed
> > to be adjusted: 11 - 'changed message'.
> >
> > The underlying reason it had to be adjusted is that diff generation is
> > sometimes ambiguous. In this case, a comment line and an empty line are
> > added, but it is ambiguous whether they were added after the existing
> > empty line, or whether an empty line and the comment line are added
> > *before* the existing empty line. And apparently xdiff picks a different
> > option here than Python's difflib.
> >
> > Signed-off-by: Johannes Schindelin 
> 


Re: [PATCH] DO-NOT-MERGE: write and read commit-graph always

2018-07-30 Thread Stefan Beller
> I wonder though if all those changes to the testsuite shouldn't be
> merged.

I think Stolee doesn't want this to be merged after rereading
subject and the commit message.

Thanks,
Stefan


Re: [PATCH/RFC] Color merge conflicts

2018-07-30 Thread Stefan Beller
On Mon, Jul 30, 2018 at 9:00 AM Nguyễn Thái Ngọc Duy  wrote:
>
> One of the things I notice when watching a normal git user face a
> merge conflicts is the output is very verbose (especially when there
> are multiple conflicts) and it's hard to spot the important parts to
> start resolving conflicts unless you know what to look for.

I usually go by git-status, but I am not the watched normal user,
hopefully. Maybe we want to run git-status after a failed merge
for the user, too, though?

> This is the start to hopefully help that a bit. One thing I'm not sure
> is what to color because that affects the config keys we use for
> this. If we have to color different things, it's best to go
> "color.merge." but if this is the only place to color, that slot
> thing looks over-engineered.
>
> Perhaps another piece to color is the conflicted path? Maybe. But on
> the other hand, I don't think we want a chameleon output, just enough
> visual cues to go from one conflict to the next...

I would think we would want to highlight the type of conflict as well,
e.g.

   CONFLICT>  \
   (rename/rename):  \
  Rename a -> b in branch foo \
  rename a ->c in bar

but then again, this patch is better than not doing any colors.

I wonder if we want to have certain colors associated with certain
types of merge conflicts, e.g. anything involving a rename would
be yellow, any conflict with deletion to be dark red, regular merge
conflicts blue (and at the same time we could introduce coloring
of conflict markers to also be blue)
(I am just making up colors, not seriously suggesting them)

I like the idea!

Thanks,
Stefan


Re: [PATCH v3 10/11] rerere: teach rerere to handle nested conflicts

2018-07-30 Thread Junio C Hamano
Thomas Gummerer  writes:

> Currently rerere can't handle nested conflicts and will error out when
> it encounters such conflicts.  Do that by recursively calling the
> 'handle_conflict' function to normalize the conflict.
>
> The conflict ID calculation here deserves some explanation:
>
> As we are using the same handle_conflict function, the nested conflict
> is normalized the same way as for non-nested conflicts, which means
> the ancestor in the diff3 case is stripped out, and the parts of the
> conflict are ordered alphabetically.
>
> The conflict ID is however is only calculated in the top level
> handle_conflict call, so it will include the markers that 'rerere'
> adds to the output.  e.g. say there's the following conflict:
>
> <<< HEAD
> 1
> ===
> <<< HEAD
> 3
> ===
> 2
> >>> branch-2
> >>> branch-3~

Hmph, I vaguely recall that I made inner merges to use the conflict
markers automatically lengthened (by two, if I recall correctly)
than its immediate outer merge.  Wouldn't the above look more like

 <<< HEAD
 1
 ===
 < HEAD
 3
 =
 2
 > branch-2
 >>> branch-3~

Perhaps I am not recalling it correctly.

> it would be recorde as follows in the preimage:
>
> <<<
> 1
> ===
> <<<
> 2
> ===
> 3
> >>>
> >>>
>
> and the conflict ID would be calculated as
>
> sha1(1<<<
> 2
> ===
> 3
> >>>)
>
> Stripping out vs. leaving the conflict markers in place in the inner
> conflict should have no practical impact, but it simplifies the
> implementation.
>
> Signed-off-by: Thomas Gummerer 
> ---
>  Documentation/technical/rerere.txt | 42 ++
>  rerere.c   | 10 +--
>  t/t4200-rerere.sh  | 37 ++
>  3 files changed, 87 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/technical/rerere.txt 
> b/Documentation/technical/rerere.txt
> index 4102cce7aa..60d48dc4fe 100644
> --- a/Documentation/technical/rerere.txt
> +++ b/Documentation/technical/rerere.txt
> @@ -138,3 +138,45 @@ SHA1('BC').
>  If there are multiple conflicts in one file, the sha1 is calculated
>  the same way with all hunks appended to each other, in the order in
>  which they appear in the file, separated by a  character.
> +
> +Nested conflicts
> +
> +
> +Nested conflicts are handled very similarly to "simple" conflicts.
> +Similar to simple conflicts, the conflict is first normalized by
> +stripping the labels from conflict markers, stripping the diff3
> +output, and the sorting the conflict hunks, both for the outer and the
> +inner conflict.  This is done recursively, so any number of nested
> +conflicts can be handled.
> +
> +The only difference is in how the conflict ID is calculated.  For the
> +inner conflict, the conflict markers themselves are not stripped out
> +before calculating the sha1.
> +
> +Say we have the following conflict for example:
> +
> +<<< HEAD
> +1
> +===
> +<<< HEAD
> +3
> +===
> +2
> +>>> branch-2
> +>>> branch-3~
> +
> +After stripping out the labels of the conflict markers, and sorting
> +the hunks, the conflict would look as follows:
> +
> +<<<
> +1
> +===
> +<<<
> +2
> +===
> +3
> +>>>
> +>>>
> +
> +and finally the conflict ID would be calculated as:
> +`sha1('1<<<\n3\n===\n2\n>>>')`
> diff --git a/rerere.c b/rerere.c
> index a35b88916c..f78bef80b1 100644
> --- a/rerere.c
> +++ b/rerere.c
> @@ -365,12 +365,18 @@ static int handle_conflict(struct strbuf *out, struct 
> rerere_io *io,
>   RR_SIDE_1 = 0, RR_SIDE_2, RR_ORIGINAL
>   } hunk = RR_SIDE_1;
>   struct strbuf one = STRBUF_INIT, two = STRBUF_INIT;
> - struct strbuf buf = STRBUF_INIT;
> + struct strbuf buf = STRBUF_INIT, conflict = STRBUF_INIT;
>   int has_conflicts = -1;
>  
>   while (!io->getline(&buf, io)) {
>   if (is_cmarker(buf.buf, '<', marker_size)) {
> - break;
> + if (handle_conflict(&conflict, io, marker_size, NULL) < 
> 0)
> + break;
> + if (hunk == RR_SIDE_1)
> + strbuf_addbuf(&one, &conflict);
> + else
> + strbuf_addbuf(&two, &conflict);

Hmph, do we ever see the inner conflict block while we are skipping
and ignoring the common ancestor version, or it is impossible that
we see '<' only while processing either our or their side?

> + strbuf_release(&conflict);
>   } else if (is_cmarker(buf.buf, '|', marker_size)) {
>   if (hunk != RR_SIDE_1)
>   break;
> diff --git a/t/t4200-rerere.sh b/t/t4200-rere

Re: [PATCH] refspec: allow @ on the left-hand side of refspecs

2018-07-30 Thread Brandon Williams
On 07/29, brian m. carlson wrote:
> The object ID parsing machinery is aware of "@" as a synonym for "HEAD"
> and this is documented accordingly in gitrevisions(7).  The push
> documentation describes the source portion of a refspec as "any
> arbitrary 'SHA-1 expression'"; however, "@" is not allowed on the
> left-hand side of a refspec, since we attempt to check for it being a
> valid ref name and fail (since it is not).
> 
> Teach the refspec machinery about this alias and silently substitute
> "HEAD" when we see "@".  This handles the fact that HEAD is a symref and
> preserves its special behavior.  We need not handle other arbitrary
> object ID expressions (such as "@^") when pushing because the revision
> machinery already handles that for us.

So this claims that using "@^" should work despite not accounting for it
explicitly or am I misreading?  Unless I'm mistaken, it looks like we
don't really support arbitrary rev syntax in refspecs since "HEAD^"
doesn't work either.

> 
> Signed-off-by: brian m. carlson 
> ---
> I probably type "git push upstream HEAD" from five to thirty times a
> day, many of those where I typo "HEAD", so I decided to implement the
> shorter form.  This design handles @ as HEAD in both fetch and push,
> whereas alternate solutions would not.

I'm always a fan of finding shortcuts and reducing how much I type, so
thank you :)

> 
> check_refname_format explicitly rejects "@"; I tried at first to simply
> ignore that with a flag, but we end up calling that from several other
> places in the codebase and rejecting it and all of those places would
> have needed updating.
> 
> I thought about putting the if/else logic in a function, but since it's
> just four lines, I decided not to.  However, if people think it would be
> tidier, I can do so.
> 
> Note that the test portion of the patch is best read with git diff -w;
> the current version is very noisy.
> 
>  refspec.c |   6 ++-
>  t/t5516-fetch-push.sh | 104 +-
>  2 files changed, 58 insertions(+), 52 deletions(-)
> 
> diff --git a/refspec.c b/refspec.c
> index e8010dce0c..57c2f65104 100644
> --- a/refspec.c
> +++ b/refspec.c
> @@ -62,8 +62,12 @@ static int parse_refspec(struct refspec_item *item, const 
> char *refspec, int fet
>   return 0;
>   }
>  
> + if (llen == 1 && lhs[0] == '@')
> + item->src = xstrdup("HEAD");
> + else
> + item->src = xstrndup(lhs, llen);
> +

This is probably the easiest place to put the aliasing logic so I don't
have any issue with including it here.

-- 
Brandon Williams


Re: [PATCH v3 05/11] rerere: add documentation for conflict normalization

2018-07-30 Thread Junio C Hamano
Thomas Gummerer  writes:

> +Different conflict styles and branch names are normalized by stripping
> +the labels from the conflict markers, and removing extraneous
> +information from the `diff3` conflict style. Branches that are merged

s/extraneous information/commmon ancestor version/ perhaps, to be
fact-based without passing value judgment?

We drop the common ancestor version only because we cannot normalize
from `merge` style to `diff3` style by adding one, and not because
it is extraneous.  It does help humans understand the conflict a lot
better to have that section.

> +By extension, this means that rerere should recognize that the above
> +conflicts are the same.  To do this, the labels on the conflict
> +markers are stripped, and the diff3 output is removed.  The above

s/diff3 output/common ancestor version/, as "diff3 output" would
mean the whole thing between <<< and >>> to readers.

> diff --git a/rerere.c b/rerere.c
> index be98c0afcb..da1ab54027 100644
> --- a/rerere.c
> +++ b/rerere.c
> @@ -394,10 +394,6 @@ static int is_cmarker(char *buf, int marker_char, int 
> marker_size)
>   * and NUL concatenated together.
>   *
>   * Return the number of conflict hunks found.
> - *
> - * NEEDSWORK: the logic and theory of operation behind this conflict
> - * normalization may deserve to be documented somewhere, perhaps in
> - * Documentation/technical/rerere.txt.
>   */
>  static int handle_path(unsigned char *sha1, struct rerere_io *io, int 
> marker_size)
>  {

Thanks for finally removing this age-old NEEDSWORK comment.


Re: [PATCH v3 06/11] rerere: fix crash when conflict goes unresolved

2018-07-30 Thread Junio C Hamano
Thomas Gummerer  writes:

> Currently when a user doesn't resolve a conflict in a file, but
> commits the file with the conflict markers, and later the file ends up
> in a state in which rerere can't handle it, subsequent rerere
> operations that are interested in that path, such as 'rerere clear' or
> 'rerere forget ' will fail, or even worse in the case of 'rerere
> clear' segfault.
>
> Such states include nested conflicts, or an extra conflict marker that
> doesn't have any match.
>
> This is because the first 'git rerere' when there was only one
> conflict in the file leaves an entry in the MERGE_RR file behind.  The

I find this sentence, especially the "only one conflict in the file"
part, a bit unclear.  What does the sentence count as one conflict?
One block of lines enclosed inside "<<<"...">>>" pair?  The command
behaves differently when there are two such blocks instead?

> next 'git rerere' will then pick the rerere ID for that file up, and
> not assign a new ID as it can't successfully calculate one.  It will
> however still try to do the rerere operation, because of the existing
> ID.  As the handle_file function fails, it will remove the 'preimage'
> for the ID in the process, while leaving the ID in the MERGE_RR file.
>
> Now when 'rerere clear' for example is run, it will segfault in
> 'has_rerere_resolution', because status is NULL.

I think this "status" refers to the collection->status[].  How do we
get into that state, though?

new_rerere_id() and new_rerere_id_hex() fills id->collection by
calling find_rerere_dir(), which either finds an existing rerere_dir
instance or manufactures one with .status==NULL.  The .status[]
array is later grown by calling fit_variant as we scan and find the
pre/post images, but because there is no pre/post image for a file
with unparseable conflicts, it is left NULL.

So another possible fix could be to make sure that .status[] is only
read when .status_nr says there is something worth reading.  I am
not saying that would be a better fix---I am just thinking out loud
to make sure I understand the issue correctly.

> To fix this, remove the rerere ID from the MERGE_RR file in the case
> when we can't handle it, and remove the corresponding variant from
> .git/rr-cache/.  Removing it unconditionally is fine here, because if
> the user would have resolved the conflict and ran rerere, the entry
> would no longer be in the MERGE_RR file, so we wouldn't have this
> problem in the first place, while if the conflict was not resolved,
> the only thing that's left in the folder is the 'preimage', which by
> itself will be regenerated by git if necessary, so the user won't
> loose any work.

s/loose/lose/

> Note that other variants that have the same conflict ID will not be
> touched.

Nice.  Thanks for a fix.

>
> Signed-off-by: Thomas Gummerer 
> ---
>  rerere.c  | 12 +++-
>  t/t4200-rerere.sh | 22 ++
>  2 files changed, 29 insertions(+), 5 deletions(-)
>
> diff --git a/rerere.c b/rerere.c
> index da1ab54027..895ad80c0c 100644
> --- a/rerere.c
> +++ b/rerere.c
> @@ -823,10 +823,7 @@ static int do_plain_rerere(struct string_list *rr, int 
> fd)
>   struct rerere_id *id;
>   unsigned char sha1[20];
>   const char *path = conflict.items[i].string;
> - int ret;
> -
> - if (string_list_has_string(rr, path))
> - continue;
> + int ret, has_string;
>  
>   /*
>* Ask handle_file() to scan and assign a
> @@ -834,7 +831,12 @@ static int do_plain_rerere(struct string_list *rr, int 
> fd)
>* yet.
>*/
>   ret = handle_file(path, sha1, NULL);
> - if (ret < 1)
> + has_string = string_list_has_string(rr, path);
> + if (ret < 0 && has_string) {
> + remove_variant(string_list_lookup(rr, path)->util);
> + string_list_remove(rr, path, 1);
> + }
> + if (ret < 1 || has_string)
>   continue;

We used to say "if we know about the path we do not do anything
here, if we do not see any conflict in the file we do nothing,
otherwise we assign a new id"; we now say "see if we can parse
and also see if we have conflict(s); if we know about the path and
we cannot parse, drop it from the rr database (because otherwise the
entry will cause us trouble elsewhere later).  Otherwise, if we do
not have any conflict or we already know about the path, no need to
do anything. Otherwise, i.e. a newly discovered path with conflicts
gets a new id".

Makes sense.  "A known path with unparseable conflict gets dropped"
is the important change in this hunk.

> diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
> index 8417e5a4b1..34f0518a5e 100755
> --- a/t/t4200-rerere.sh
> +++ b/t/t4200-rerere.sh
> @@ -580,4 +580,26 @@ test_expect_success 'multiple identical conflicts' '
>   count_pre_post 0 

Re: [PATCH v3 00/11] rerere: handle nested conflicts

2018-07-30 Thread Junio C Hamano
Thomas Gummerer  writes:

> Thomas Gummerer (11):
>   rerere: unify error messages when read_cache fails
>   rerere: lowercase error messages
>   rerere: wrap paths in output in sq
>   rerere: mark strings for translation
>   rerere: add documentation for conflict normalization
>   rerere: fix crash when conflict goes unresolved
>   rerere: only return whether a path has conflicts or not
>   rerere: factor out handle_conflict function
>   rerere: return strbuf from handle path
>   rerere: teach rerere to handle nested conflicts
>   rerere: recalculate conflict ID when unresolved conflict is committed

Even though I am not certain about the last two steps, everything
before them looked trivially correct and good changes (well, the
"strbuf" one's goodness obviously depends on the goodness of the
last two, which are helped by it).

Sorry for taking so long before getting to the series.


Re: [PATCH v3 08/11] rerere: factor out handle_conflict function

2018-07-30 Thread Junio C Hamano
Thomas Gummerer  writes:

> Factor out the handle_conflict function, which handles a single
> conflict in a path.  This is in preparation for a subsequent commit,
> where this function will be re-used.  No functional changes intended.
>
> Signed-off-by: Thomas Gummerer 
> ---
>  rerere.c | 87 ++--
>  1 file changed, 47 insertions(+), 40 deletions(-)

Renumbering of the enum made me raise my eyebrow a bit briefly but
it is merely to keep track of the state locally and invisible from
the outside, so it is perfectly fine.

> - git_SHA_CTX ctx;
> - int has_conflicts = 0;
>   enum {
> - RR_CONTEXT = 0, RR_SIDE_1, RR_SIDE_2, RR_ORIGINAL
> - } hunk = RR_CONTEXT;
> + RR_SIDE_1 = 0, RR_SIDE_2, RR_ORIGINAL
> + } hunk = RR_SIDE_1;


Re: [PATCH v3 07/11] rerere: only return whether a path has conflicts or not

2018-07-30 Thread Junio C Hamano
Thomas Gummerer  writes:

> We currently return the exact number of conflict hunks a certain path
> has from the 'handle_paths' function.  However all of its callers only
> care whether there are conflicts or not or if there is an error.
> Return only that information, and document that only that information
> is returned.  This will simplify the code in the subsequent steps.
>
> Signed-off-by: Thomas Gummerer 
> ---
>  rerere.c | 23 ---
>  1 file changed, 12 insertions(+), 11 deletions(-)

I do recall writing this code without knowing if the actual number
of conflicts would be useful by callers, but it is apparent that it
wasn't.  I won't mind losing that bit of info at all.  Besides, we
won't risk mistaking a file with 2 billion conflicts with a file
whose conflicts cannot be parsed ;-).

The patch looks good.  Thanks.


Re: [PATCH v3 09/11] rerere: return strbuf from handle path

2018-07-30 Thread Junio C Hamano
Thomas Gummerer  writes:

> Currently we write the conflict to disk directly in the handle_path
> function.  To make it re-usable for nested conflicts, instead of
> writing the conflict out directly, store it in a strbuf and let the
> caller write it out.
>
> This does mean some slight increase in memory usage, however that
> increase is limited to the size of the largest conflict we've
> currently processed.  We already keep one copy of the conflict in
> memory, and it shouldn't be too large, so the increase in memory usage
> seems acceptable.
>
> As a bonus this lets us get replace the rerere_io_putconflict function
> with a trivial two line function.

Makes sense.


Re: [BUG] fetching sometimes doesn't update refs

2018-07-30 Thread Brandon Williams
On 07/29, Jeff King wrote:
> I've noticed for the past couple of weeks that some of my fetches don't
> seem to actually update refs, but a follow-up fetch will. I finally
> managed to catch it in the act and track it down. It bisects to your
> 989b8c4452 (fetch-pack: put shallow info in output parameter,
> 2018-06-27). 
> 
> A reproduction recipe is below. I can't imagine why this repo in
> particular triggers it, but it was the one where I initially saw the
> problem (and doing a tiny reproduction does not seem to work). I'm
> guessing it has something to do with the refs, since the main change in
> the offending commit is that we recompute the refmap.

I've noticed this behavior sporadically as well, though I've never been
able to reliably reproduce it, so thanks for creating a reproduction
recipe.  I suspected that it had to do with the ref-in-want series so
thanks for tracking that down too.  We'll take a look.

> 
> -- >8 --
> # clone the repo as it is today
> git clone https://github.com/cmcaine/tridactyl.git
> cd tridactyl
> 
> # roll back the refs so that there is something to fetch
> for i in refs/heads/master refs/remotes/origin/master; do
>   git update-ref $i $i^
> done
> 
> # and delete the now-unreferenced objects, pretending we are an earlier
> # clone that had not yet fetched
> rm -rf .git/logs
> git repack -ad
> 
> # now fetch; this will get the objects but fail to update refs
> git fetch
> 
> # and fetching again will actually update the refs
> git fetch
> -- 8< --
> 
> -Peff

-- 
Brandon Williams


Re: [PATCH 1/1] Add the `p4-pre-submit` hook

2018-07-30 Thread Junio C Hamano
Chen Bin  writes:

> The `p4-pre-submit` hook is executed before git-p4 submits code.
> If the hook exits with non-zero value, submit process not start.
>
> Signed-off-by: Chen Bin 
> ---

Luke, does this version look good to you?

I still think the addition to self.description is a bit too much but
other than that the incremental since the last one looked sensible
to my untrained eyes ;-)

Thanks, both.

>  Documentation/git-p4.txt   |  8 
>  Documentation/githooks.txt |  7 +++
>  git-p4.py  | 16 +++-
>  t/t9800-git-p4-basic.sh| 29 +
>  4 files changed, 59 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
> index f0de3b891..a7aac1b92 100644
> --- a/Documentation/git-p4.txt
> +++ b/Documentation/git-p4.txt
> @@ -374,6 +374,14 @@ These options can be used to modify 'git p4 submit' 
> behavior.
>  been submitted. Implies --disable-rebase. Can also be set with
>  git-p4.disableP4Sync. Sync with origin/master still goes ahead if 
> possible.
>  
> +Hook for submit
> +~~~
> +The `p4-pre-submit` hook is executed if it exists and is executable.
> +The hook takes no parameter and nothing from standard input. Exiting with
> +non-zero status from this script prevents `git-p4 submit` from launching.
> +
> +One usage scenario is to run unit tests in the hook.
> +
>  Rebase options
>  ~~
>  These options can be used to modify 'git p4 rebase' behavior.
> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
> index e3c283a17..22fcabbe2 100644
> --- a/Documentation/githooks.txt
> +++ b/Documentation/githooks.txt
> @@ -485,6 +485,13 @@ The exit status determines whether git will use the data 
> from the
>  hook to limit its search.  On error, it will fall back to verifying
>  all files and folders.
>  
> +p4-pre-submit
> +~
> +
> +This hook is invoked by `git-p4 submit`. It takes no parameter and nothing
> +from standard input. Exiting with non-zero status from this script prevent
> +`git-p4 submit` from launching. Run `git-p4 submit --help` for details.
> +
>  GIT
>  ---
>  Part of the linkgit:git[1] suite
> diff --git a/git-p4.py b/git-p4.py
> index b449db1cc..879abfd2b 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -1494,7 +1494,13 @@ def __init__(self):
>  optparse.make_option("--disable-p4sync", 
> dest="disable_p4sync", action="store_true",
>   help="Skip Perforce sync of p4/master 
> after submit or shelve"),
>  ]
> -self.description = "Submit changes from git to the perforce depot."
> +self.description = """Submit changes from git to the perforce 
> depot.\n
> +The `p4-pre-submit` hook is executed if it exists and is executable.
> +The hook takes no parameter and nothing from standard input. Exiting with
> +non-zero status from this script prevents `git-p4 submit` from launching.
> +
> +One usage scenario is to run unit tests in the hook."""
> +
>  self.usage += " [name of git branch to submit into perforce depot]"
>  self.origin = ""
>  self.detectRenames = False
> @@ -2303,6 +2309,14 @@ def run(self, args):
>  sys.exit("number of commits (%d) must match number of shelved 
> changelist (%d)" %
>   (len(commits), num_shelves))
>  
> +hooks_path = gitConfig("core.hooksPath")
> +if len(hooks_path) <= 0:
> +hooks_path = os.path.join(os.environ.get("GIT_DIR", ".git"), 
> "hooks")
> +
> +hook_file = os.path.join(hooks_path, "p4-pre-submit")
> +if os.path.isfile(hook_file) and os.access(hook_file, os.X_OK) and 
> subprocess.call([hook_file]) != 0:
> +sys.exit(1)
> +
>  #
>  # Apply the commits, one at a time.  On failure, ask if should
>  # continue to try the rest of the patches, or quit.
> diff --git a/t/t9800-git-p4-basic.sh b/t/t9800-git-p4-basic.sh
> index 4849edc4e..2b7baa95d 100755
> --- a/t/t9800-git-p4-basic.sh
> +++ b/t/t9800-git-p4-basic.sh
> @@ -261,6 +261,35 @@ test_expect_success 'unresolvable host in P4PORT should 
> display error' '
>   )
>  '
>  
> +# Test following scenarios:
> +#   - Without ".git/hooks/p4-pre-submit" , submit should continue
> +#   - With the hook returning 0, submit should continue
> +#   - With the hook returning 1, submit should abort
> +test_expect_success 'run hook p4-pre-submit before submit' '
> + test_when_finished cleanup_git &&
> + git p4 clone --dest="$git" //depot &&
> + (
> + cd "$git" &&
> + echo "hello world" >hello.txt &&
> + git add hello.txt &&
> + git commit -m "add hello.txt" &&
> + git config git-p4.skipSubmitEdit true &&
> + git-p4 submit --dry-run >out &&
> + grep "Would apply" out &&
> + mkdir -p .git/hooks &&
> + write_script .git/hooks/

Re: [PATCH v2 0/4] Speed up unpack_trees()

2018-07-30 Thread Ben Peart




On 7/29/2018 6:33 AM, Nguyễn Thái Ngọc Duy wrote:

This series speeds up unpack_trees() a bit by using cache-tree.
unpack-trees could bit split in three big parts

- the actual tree unpacking and running n-way merging
- update worktree, which could be expensive depending on how much I/O
   is involved
- repair cache-tree

This series focuses on the first part alone and could give 700%
speedup (best case possible scenario, real life ones probably not that
impressive).

It also shows that the reparing cache-tree is kinda expensive. I have
an idea of reusing cache-tree from the original index, but I'll leave
that to Ben or others to try out and see if it helps at all.

v2 fixes the comments from Junio, adds more performance tracing and
reduces the cost of adding index entries.

Nguyễn Thái Ngọc Duy (4):
   unpack-trees.c: add performance tracing
   unpack-trees: optimize walking same trees with cache-tree
   unpack-trees: reduce malloc in cache-tree walk
   unpack-trees: cheaper index update when walking by cache-tree

  cache-tree.c   |   2 +
  cache.h|   1 +
  read-cache.c   |   3 +-
  unpack-trees.c | 161 -
  unpack-trees.h |   1 +
  5 files changed, 166 insertions(+), 2 deletions(-)



I ran "git checkout" on a large repo and averaged the results of 3 runs. 
 This clearly demonstrates the benefit of the optimized unpack_trees() 
as even the final "diff-index" is essentially a 3rd call to unpack_trees().


baselinenew 
--
0.535510167 0.556558733 s: read cache .git/index
0.3057373   0.3147105   s: initialize name hash
0.0184082   0.023558433 s: preload index
0.086910967 0.089085967 s: refresh index
7.889590767 2.191554433 s: unpack trees
0.120760833 0.131941267 s: update worktree after a merge
2.2583504   2.572663167 s: repair cache-tree
0.8916137   0.959495233 s: write index, changed mask = 28
3.405199233 0.2710663   s: unpack trees
0.000999667 0.0021554   s: update worktree after a merge
3.4063306   0.273318333 s: diff-index
16.9524923	9.462943133	s: git command: 
'c:\git-sdk-64\usr\src\git\git.exe' checkout


The first call to unpack_trees() saves 72%
The 2nd and 3rd call save 92%
Total time savings for the entire command was 44%

In the performance game of whack-a-mole, that call to repair cache-tree 
is now looking quite expensive...


Ben


Re: [PATCH v2 01/10] t/test-lib: teach --chain-lint to detect broken &&-chains in subshells

2018-07-30 Thread Jonathan Nieder
Hi,

Eric Sunshine wrote:

> The --chain-lint option detects broken &&-chains by forcing the test to
> exit early (as the very first step) with a sentinel value. If that
> sentinel is the test's overall exit code, then the &&-chain is intact;
> if not, then the chain is broken. Unfortunately, this detection does not
> extend to &&-chains within subshells even when the subshell itself is
> properly linked into the outer &&-chain.
>
> Address this shortcoming by feeding the body of the test to a
> lightweight "linter" which can peer inside subshells and identify broken
> &&-chains by pure textual inspection.

Interesting.

>Although the linter does not
> actually parse shell scripts, it has enough knowledge of shell syntax to
> reliably deal with formatting style variations (as evolved over the
> years) and to avoid being fooled by non-shell content (such as inside
> here-docs and multi-line strings).

This is causing contrib/subtree tests to fail for me: running "make -C
contrib/subtree test" produces

[...]
*** t7900-subtree.sh ***
ok 1 - no merge from non-existent subtree
ok 2 - no pull from non-existent subtree
ok 3 - add subproj as subtree into sub dir/ with --prefix
ok 4 - add subproj as subtree into sub dir/ with --prefix and --message
ok 5 - add subproj as subtree into sub dir/ with --prefix as -P and 
--message as -m
ok 6 - add subproj as subtree into sub dir/ with --squash and --prefix 
and --message
ok 7 - merge new subproj history into sub dir/ with --prefix
ok 8 - merge new subproj history into sub dir/ with --prefix and 
--message
ok 9 - merge new subproj history into sub dir/ with --squash and 
--prefix and --message
ok 10 - merge the added subproj again, should do nothing
ok 11 - merge new subproj history into subdir/ with a slash appended to 
the argument of --prefix
ok 12 - split requires option --prefix
ok 13 - split requires path given by option --prefix must exist
ok 14 - split sub dir/ with --rejoin
ok 15 - split sub dir/ with --rejoin from scratch
ok 16 - split sub dir/ with --rejoin and --message
ok 17 - split "sub dir"/ with --branch
ok 18 - check hash of split
ok 19 - split "sub dir"/ with --branch for an existing branch
ok 20 - split "sub dir"/ with --branch for an incompatible branch
error: bug in the test script: broken &&-chain or run-away HERE-DOC: 
subtree_test_create_repo "$subtree_test_count" &&
[...]
)

Makefile:44: recipe for target 't7900-subtree.sh' failed

The problematic test code looks like this:

(
cd "$subtree_test_count/sub proj" &&
git fetch .. subproj-br &&
git merge FETCH_HEAD &&

chks="sub1
sub2
sub3
sub4" &&
chks_sub=$(cat <

Re: [GSoC][PATCH v4 15/20] rebase -i: rewrite write_basic_state() in C

2018-07-30 Thread SZEDER Gábor
> diff --git a/sequencer.c b/sequencer.c
> index 1c035ceec7..d257903db0 100644
> --- a/sequencer.c
> +++ b/sequencer.c

> +int write_basic_state(struct replay_opts *opts, const char *head_name,
> +   const char *onto, const char *orig_head)
> +{
> + const char *quiet = getenv("GIT_QUIET");
> +
> + if (head_name)
> + write_file(rebase_path_head_name(), "%s\n", head_name);
> + if (onto)
> + write_file(rebase_path_onto(), "%s\n", onto);
> + if (orig_head)
> + write_file(rebase_path_orig_head(), "%s\n", orig_head);
> +
> + if (quiet)
> + write_file(rebase_path_quiet(), "%s\n", quiet);
> + else
> + write_file(rebase_path_quiet(), "");

This is not a faithful conversion of the original.  git-rebase.sh writes
this 'quiet' file with:

  echo "$GIT_QUIET" > "$state_dir"/quiet

which means that a single newline character was written even when
$GIT_QUIET was unset/empty.

I seem to recall a case in the past, when a shell-to-C conversion
accidentally dropped a newline from a similar state-file, which then
caused some issues later on.  But I don't remember the specifics and a
quick search didn't turn up anything relevant either...

> +
> + if (opts->verbose)
> + write_file(rebase_path_verbose(), "");
> + if (opts->strategy)
> + write_file(rebase_path_strategy(), "%s\n", opts->strategy);
> + if (opts->xopts_nr > 0)
> + write_strategy_opts(opts);
> +
> + if (opts->allow_rerere_auto == RERERE_AUTOUPDATE)
> + write_file(rebase_path_allow_rerere_autoupdate(), 
> "--rerere-autoupdate\n");
> + else if (opts->allow_rerere_auto == RERERE_NOAUTOUPDATE)
> + write_file(rebase_path_allow_rerere_autoupdate(), 
> "--no-rerere-autoupdate\n");
> +
> + if (opts->gpg_sign)
> + write_file(rebase_path_gpg_sign_opt(), "-S%s\n", 
> opts->gpg_sign);
> + if (opts->signoff)
> + write_file(rebase_path_signoff(), "--signoff\n");
> +
> + return 0;
> +}
> +
>  static int walk_revs_populate_todo(struct todo_list *todo_list,
>   struct replay_opts *opts)
>  {


Re: [PATCH] doc: fix want-capability separator

2018-07-30 Thread Stefan Beller
On Sat, Jul 28, 2018 at 2:16 PM Masaya Suzuki  wrote:
> Signed-off-by: Masaya Suzuki 

The email addresses mismatch?

> Unlike ref advertisement, client capabilities and the first want are
> separated by SP, not NUL, in the implementation. Fix the documentation
> to align with the implementation.

Makes sense! Thanks for the fix!

> pack-protocol.txt is already fixed.

which has

  capability-list  =  capability *(SP capability)

since b31222cfb7f (Update packfile transfer protocol
documentation, 2009-11-03), which is the first to mention
the capability line, so I'd claim it was always correct?

>
> ---
>  Documentation/technical/http-protocol.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/technical/http-protocol.txt 
> b/Documentation/technical/http-protocol.txt
> index 64f49d0bb..9c5b6f0fa 100644
> --- a/Documentation/technical/http-protocol.txt
> +++ b/Documentation/technical/http-protocol.txt
> @@ -338,11 +338,11 @@ server advertises capability `allow-tip-sha1-in-want` or
>request_end
>request_end   =  "" / "done"
>
> -  want_list =  PKT-LINE(want NUL cap_list LF)
> +  want_list =  PKT-LINE(want SP cap_list LF)
>*(want_pkt)
>want_pkt  =  PKT-LINE(want LF)
>want  =  "want" SP id
> -  cap_list  =  *(SP capability) SP
> +  cap_list  =  capability *(SP capability)
>
>have_list =  *PKT-LINE("have" SP id LF)

Just after these context lines we have

  TODO: Document this further.

which is a good hint that the existing documentation
can benefit from patches like these.

Thanks,
Stefan


Re: Is detecting endianness at compile-time unworkable?

2018-07-30 Thread Junio C Hamano
Junio C Hamano  writes:

> Ævar Arnfjörð Bjarmason  writes:
>
>> And, as an aside, the reason we can't easily make it better ourselves is
>> because the build process for git.git doesn't have a facility to run
>> code to detect this type of stuff (the configure script is always
>> optional). So we can't just run this test ourselves.
>
> It won't help those who cross-compile anyway.  I thought we declared
> "we make a reasonable effort to guess the target endianness from the
> system header by inspecting usual macros, but will not aim to cover
> every system on the planet---instead there is a knob to tweak it for
> those on exotic platforms" last time we discussed this?

Well, having said all that, I do not think I personally mind if
./configure learned to include a "compile small program and run it
to determine byte order on the build machine" as part of "we make a
reasonable effort" as long as it cleanly excludes cross building
case (and the result is made overridable just in case we misdetect
the "cross-ness" of the build).



Re: Is detecting endianness at compile-time unworkable?

2018-07-30 Thread Daniel Shumow
The change was definitely made for performance.  Removing the if
statements, conditioned upon endianess was an approx 10% improvement, which
was very important to getting this library accepted into git.

Thanks,
Dan


On Mon, Jul 30, 2018 at 11:32 AM, Junio C Hamano  wrote:

> Junio C Hamano  writes:
>
> > Ævar Arnfjörð Bjarmason  writes:
> >
> >> And, as an aside, the reason we can't easily make it better ourselves is
> >> because the build process for git.git doesn't have a facility to run
> >> code to detect this type of stuff (the configure script is always
> >> optional). So we can't just run this test ourselves.
> >
> > It won't help those who cross-compile anyway.  I thought we declared
> > "we make a reasonable effort to guess the target endianness from the
> > system header by inspecting usual macros, but will not aim to cover
> > every system on the planet---instead there is a knob to tweak it for
> > those on exotic platforms" last time we discussed this?
>
> Well, having said all that, I do not think I personally mind if
> ./configure learned to include a "compile small program and run it
> to determine byte order on the build machine" as part of "we make a
> reasonable effort" as long as it cleanly excludes cross building
> case (and the result is made overridable just in case we misdetect
> the "cross-ness" of the build).
>
>


Re: [PATCH 2/2] sequencer: fix "rebase -i --root" corrupting author header timezone

2018-07-30 Thread Eric Sunshine
On Mon, Jul 30, 2018 at 8:20 AM Phillip Wood  wrote:
> On 30/07/18 10:29, Eric Sunshine wrote:
> > When "git rebase -i --root" creates a new root commit, it corrupts the
> > "author" header's timezone by repeating the last digit:
> > [...]
> > Signed-off-by: Eric Sunshine 
> > ---
> > diff --git a/sequencer.c b/sequencer.c
> > @@ -654,6 +654,7 @@ static int write_author_script(const char *message)
> > + strbuf_addch(&buf, '\'');
> > @@ -724,7 +725,11 @@ static const char *read_author_ident(struct strbuf 
> > *buf)
> > - sq_dequote(in);
> > + if (!sq_dequote(in)) {
> > + warning(_("bad quoting on %s value in '%s'"),
> > + keys[i], rebase_path_author_script());
> > + return NULL;
>
> I think we want to handle the broken author script properly rather than
> returning NULL. If we had a single function
> int read_author_script(const char **name, const char **author, const
> char **date)
> to read the author script that tried sq_dequote() and then fell back to
> code based on read_env_script() that handled the missing "'" at the end
> and also the bad quoting of "'" if sq_dequote() failed it would make it
> easier to fix the existing bugs, rather than having to fix
> read_author_ident() and read_env_script() separately. What do you think?

That makes sense as a long-term plan, however, I'm concerned with the
immediate problem that a released version of Git can (and did, in my
case) corrupt commit objects. So, in the short term, I think it makes
sense to get this minimal fix landed, and build the more "correctly
engineered" solution on top of it, without the pressure of worrying
about corruption spreading.


Re: [PATCH 1/2] Document git config getter return value.

2018-07-30 Thread Eric Sunshine
On Mon, Jul 30, 2018 at 8:26 AM Han-Wen Nienhuys  wrote:
> ---
>  config.h | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)

Missing sign-off.


Re: [PATCH 1/1] Add the `p4-pre-submit` hook

2018-07-30 Thread Luke Diamand
On 30 July 2018 at 20:07, Junio C Hamano  wrote:
> Chen Bin  writes:
>
>> The `p4-pre-submit` hook is executed before git-p4 submits code.
>> If the hook exits with non-zero value, submit process not start.
>>
>> Signed-off-by: Chen Bin 
>> ---
>
> Luke, does this version look good to you?
>

Yes, I think so, We might find in the future that we do need an
additional hook *after* the change has been applied, but as per Chen's
comment, it sounds like that's not what is needed here; it can easily
be added in the future if necessary.

And there's no directly equivalent existing git-hook which could be
used instead, so this seems like a useful addition.

Possibly it should say "it takes no parameters" rather than "it takes
no parameter"; it is confusing that in English, zero takes the plural
rather than the singular. There's a PhD in linguistics there for
someone!

Luke


> I still think the addition to self.description is a bit too much but
> other than that the incremental since the last one looked sensible
> to my untrained eyes ;-)
>
> Thanks, both.
>
>>  Documentation/git-p4.txt   |  8 
>>  Documentation/githooks.txt |  7 +++
>>  git-p4.py  | 16 +++-
>>  t/t9800-git-p4-basic.sh| 29 +
>>  4 files changed, 59 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
>> index f0de3b891..a7aac1b92 100644
>> --- a/Documentation/git-p4.txt
>> +++ b/Documentation/git-p4.txt
>> @@ -374,6 +374,14 @@ These options can be used to modify 'git p4 submit' 
>> behavior.
>>  been submitted. Implies --disable-rebase. Can also be set with
>>  git-p4.disableP4Sync. Sync with origin/master still goes ahead if 
>> possible.
>>
>> +Hook for submit
>> +~~~
>> +The `p4-pre-submit` hook is executed if it exists and is executable.
>> +The hook takes no parameter and nothing from standard input. Exiting with
>> +non-zero status from this script prevents `git-p4 submit` from launching.
>> +
>> +One usage scenario is to run unit tests in the hook.
>> +
>>  Rebase options
>>  ~~
>>  These options can be used to modify 'git p4 rebase' behavior.
>> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
>> index e3c283a17..22fcabbe2 100644
>> --- a/Documentation/githooks.txt
>> +++ b/Documentation/githooks.txt
>> @@ -485,6 +485,13 @@ The exit status determines whether git will use the 
>> data from the
>>  hook to limit its search.  On error, it will fall back to verifying
>>  all files and folders.
>>
>> +p4-pre-submit
>> +~
>> +
>> +This hook is invoked by `git-p4 submit`. It takes no parameter and nothing
>> +from standard input. Exiting with non-zero status from this script prevent
>> +`git-p4 submit` from launching. Run `git-p4 submit --help` for details.
>> +
>>  GIT
>>  ---
>>  Part of the linkgit:git[1] suite
>> diff --git a/git-p4.py b/git-p4.py
>> index b449db1cc..879abfd2b 100755
>> --- a/git-p4.py
>> +++ b/git-p4.py
>> @@ -1494,7 +1494,13 @@ def __init__(self):
>>  optparse.make_option("--disable-p4sync", 
>> dest="disable_p4sync", action="store_true",
>>   help="Skip Perforce sync of p4/master 
>> after submit or shelve"),
>>  ]
>> -self.description = "Submit changes from git to the perforce depot."
>> +self.description = """Submit changes from git to the perforce 
>> depot.\n
>> +The `p4-pre-submit` hook is executed if it exists and is executable.
>> +The hook takes no parameter and nothing from standard input. Exiting 
>> with
>> +non-zero status from this script prevents `git-p4 submit` from 
>> launching.
>> +
>> +One usage scenario is to run unit tests in the hook."""
>> +
>>  self.usage += " [name of git branch to submit into perforce depot]"
>>  self.origin = ""
>>  self.detectRenames = False
>> @@ -2303,6 +2309,14 @@ def run(self, args):
>>  sys.exit("number of commits (%d) must match number of shelved 
>> changelist (%d)" %
>>   (len(commits), num_shelves))
>>
>> +hooks_path = gitConfig("core.hooksPath")
>> +if len(hooks_path) <= 0:
>> +hooks_path = os.path.join(os.environ.get("GIT_DIR", ".git"), 
>> "hooks")
>> +
>> +hook_file = os.path.join(hooks_path, "p4-pre-submit")
>> +if os.path.isfile(hook_file) and os.access(hook_file, os.X_OK) and 
>> subprocess.call([hook_file]) != 0:
>> +sys.exit(1)
>> +
>>  #
>>  # Apply the commits, one at a time.  On failure, ask if should
>>  # continue to try the rest of the patches, or quit.
>> diff --git a/t/t9800-git-p4-basic.sh b/t/t9800-git-p4-basic.sh
>> index 4849edc4e..2b7baa95d 100755
>> --- a/t/t9800-git-p4-basic.sh
>> +++ b/t/t9800-git-p4-basic.sh
>> @@ -261,6 +261,35 @@ test_expect_success 'unresolvable host in P4PORT should 
>> display error' '
>>   )
>>  '
>>
>> +# Test fol

[PATCH 0/2] subtree: fix &&-chain and simplify tests (Re: [PATCH v2 01/10] t/test-lib: teach --chain-lint to detect broken &&-chains in subshells)

2018-07-30 Thread Jonathan Nieder
(resetting cc list)
Jonathan Nieder wrote:

> This is causing contrib/subtree tests to fail for me: running "make -C
> contrib/subtree test" produces
[...]
>   error: bug in the test script: broken &&-chain or run-away HERE-DOC:
[...]
> Ugly quoting, useless use of "cat", etc, aside, I don't think it's
> missing any &&.  Hints?

Turns out it was missing a && too. :)

These patches are against "master".  Ideally this would have come
before es/chain-lint-in-subshell.  Since this is contrib/, I'm okay
with losing bisectability and having it come after, though.

Thoughts of all kinds welcome.

Jonathan Nieder (2):
  subtree: add missing && to &&-chain
  subtree: simplify preparation of expected results

 contrib/subtree/t/t7900-subtree.sh | 121 -
 1 file changed, 31 insertions(+), 90 deletions(-)


git@vger.kernel.org

2018-07-30 Thread Jonathan Nieder
Detected using t/chainlint.

Signed-off-by: Jonathan Nieder 
---
 contrib/subtree/t/t7900-subtree.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/subtree/t/t7900-subtree.sh 
b/contrib/subtree/t/t7900-subtree.sh
index d05c613c97..e6a28f2c3e 100755
--- a/contrib/subtree/t/t7900-subtree.sh
+++ b/contrib/subtree/t/t7900-subtree.sh
@@ -708,7 +708,7 @@ test_expect_success 'make sure each filename changed 
exactly once in the entire
test_create_commit "$subtree_test_count/sub proj" sub1 &&
(
cd "$subtree_test_count" &&
-   git config log.date relative
+   git config log.date relative &&
git fetch ./"sub proj" master &&
git subtree add --prefix="sub dir" FETCH_HEAD
) &&
-- 
2.18.0.345.g5c9ce644c3-goog



[PATCH 2/2] subtree test: simplify preparation of expected results

2018-07-30 Thread Jonathan Nieder
This mixture of quoting, pipes, and here-docs to produce expected
results in shell variables is difficult to follow.  Simplify by using
simpler constructs that write output to files instead.

Noticed because without this patch, t/chainlint is not able to
understand the script in order to validate that its subshells use an
unbroken &&-chain, causing "make -C contrib/subtree test" to fail with

error: bug in the test script: broken &&-chain or run-away HERE-DOC:

in t7900.21.

Signed-off-by: Jonathan Nieder 
---
That's the end of the series.  Thanks for reading.

Thanks,
Jonathan

 contrib/subtree/t/t7900-subtree.sh | 119 -
 1 file changed, 30 insertions(+), 89 deletions(-)

diff --git a/contrib/subtree/t/t7900-subtree.sh 
b/contrib/subtree/t/t7900-subtree.sh
index e6a28f2c3e..57ff4b25c1 100755
--- a/contrib/subtree/t/t7900-subtree.sh
+++ b/contrib/subtree/t/t7900-subtree.sh
@@ -540,26 +540,10 @@ test_expect_success 'make sure exactly the right set of 
files ends up in the sub
git fetch .. subproj-br &&
git merge FETCH_HEAD &&
 
-   chks="sub1
-sub2
-sub3
-sub4" &&
-   chks_sub=$(cat actual &&
+   test_cmp expect actual
)
 '
 
@@ -606,25 +590,11 @@ test_expect_success 'make sure the subproj *only* 
contains commits that affect t
git fetch .. subproj-br &&
git merge FETCH_HEAD &&
 
-   chks="sub1
-sub2
-sub3
-sub4" &&
-   chks_sub=$(cat log &&
+   sort actual &&
+   test_cmp expect actual
)
 '
 
@@ -675,29 +645,16 @@ test_expect_success 'make sure exactly the right set of 
files ends up in the mai
cd "$subtree_test_count" &&
git subtree pull --prefix="sub dir" ./"sub proj" master &&
 
-   chkm="main1
-main2" &&
-   chks="sub1
-sub2
-sub3
-sub4" &&
-   chks_sub=$(cat chkms 
&&
+   sed "s,^,sub dir/," chkms >chkms_sub &&
+   test_write_lines sub1 sub2 sub3 sub4 >chks &&
+   sed "s,^,sub dir/," chks >chks_sub &&
+
+   cat chkm chkms_sub chks_sub >expect &&
+   git ls-files >actual &&
+   test_cmp expect actual
+   )
 '
 
 next_test
@@ -748,37 +705,21 @@ test_expect_success 'make sure each filename changed 
exactly once in the entire
cd "$subtree_test_count" &&
git subtree pull --prefix="sub dir" ./"sub proj" master &&
 
-   chkm="main1
-main2" &&
-   chks="sub1
-sub2
-sub3
-sub4" &&
-   chks_sub=$(cat chks &&
+   test_write_lines main-sub1 main-sub2 main-sub3 main-sub4 >chkms 
&&
+   sed "s,^,sub dir/," chkms >chkms_sub &&
 
# main-sub?? and /"sub dir"/main-sub?? both change, because 
those are the
# changes that were split into their own history.  And "sub 
dir"/sub?? never
# change, since they were *only* changed in the subtree branch.
-   allchanges=$(git log --name-only --pretty=format:"" | sort | 
sed "/^$/d") &&
-   expected=''"$(cat actual &&
+
+   cat chkms chkm chks chkms_sub >expect-unsorted &&
+   sort expect-unsorted >expect &&
+   test_cmp expect actual
)
 '
 
-- 
2.18.0.345.g5c9ce644c3



Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit

2018-07-30 Thread Eric Sunshine
On Mon, Jul 30, 2018 at 8:14 AM Phillip Wood  wrote:
> On 30/07/18 10:29, Eric Sunshine wrote:
> > It was only after I diagnosed and fixed these bugs that I thought to
> > check 'pu' and discovered that Akinori MUSHA already made a stab[1] at
> > fixing one of the three bugs which this series fixes. Akinori's fix has
> > the somewhat undesirable property that it adds an extra blank line to
> > the end of the script, as Phillip correctly pointed out in review[2].
> > Patch 2/2 of this series has the more "correct" fix, in addition to
> > fixing another bug.
> >
> > Moreover, patch 2/2 of this series provides a more thorough fix overall
> > than Akinori, so it may make sense to replace his patch with this
> > series, though perhaps keep the test his patch adds to augment the
> > strict test of the "author" header added by this series.
>
> Johannes and I have some fixups for Akinori's patch on the branch
> fix-t3403-author-script-test at https://github.com/phillipwood/git

I don't see a branch with that name there. There are a couple "wip"
branches, however, named wip/fix-t3403-author-script-test and
wip/fix-t3404-author-script-test. I'm guessing you wanted me to look
at the former.

> That branch also contains a fix for the bad quoting of names with "'" in
> them. I think it would be good to somehow try and combine this series
> with those patches.

It appears that your patches are fixing issues and a test outside the
issues fixed by my series (aside from the one line inserting the
missing closing quote). As such, I think your patches can be built
atop this series without worrying about conflicts. That would allow
this commit-corruption-bug-fixing series to land without being tied to
those "wip" patches which address lower-priority problems.

> I'd really like to see a single function to read and another to write
> the author script that is shared by 'git am' and 'git rebase -i', rather
> than the two writers and three readers we have at the moment. I was
> thinking of doing that in the longer term, but given the extra bug
> you've found in read_author_script() maybe we should do that sooner
> rather than later.

Agreed. That seems a reasonable long-term goal but needn't hold up
this series which addresses very real bugs leading to object
corruption.


Re: [PATCH 0/2] fix "rebase -i --root" corrupting root commit

2018-07-30 Thread Eric Sunshine
On Mon, Jul 30, 2018 at 11:47 AM Johannes Schindelin
 wrote:
> On Mon, 30 Jul 2018, Eric Sunshine wrote:
> > Unfortunately, after sending this series, I see that there is
> > additional corruption that needs to be addressed. In particular, the
> > "author" header time is incorrectly prefixed with "@", so this series
> > is going to need a re-roll to fix that, as well.
>
> AFAIR the `@` was not an oversight, but required so that we could pass in
> the Unix epoch.

I don't think it was an oversight either, but it is nevertheless
incorrect to use the "@" in the commit object's "author" header.

As I understand it, you do "need" the "@" to distinguish a Unix epoch
value assigned to GIT_AUTHOR_DATE, but the commit object format is
very exacting in the datestamp format it accepts, and it does not
allow "@". So, the date from GIT_AUTHOR_DATE needs to be converted to
a format acceptable to the commit object, otherwise the commit is
considered corrupt.


Re: [PATCH 8/8] commit-graph: close_commit_graph before shallow walk

2018-07-30 Thread Jakub Narebski
"Derrick Stolee via GitGitGadget"  writes:

> From: Derrick Stolee 
>
> Make close_commit_graph() work for arbitrary repositories.

The first line (above) does not match the rest of the commit message.

> Call close_commit_graph() when about to start a rev-list walk that
> includes shallow commits. This is necessary in code paths that "fake"
> shallow commits for the sake of fetch. Specifically, test 351 in
> t5500-fetch-pack.sh runs
>
>   git fetch --shallow-exclude one origin
>
> with a file-based transfer. When the "remote" has a commit-graph, we do
> not prevent the commit-graph from being loaded, but then the commits are
> intended to be dynamically transferred into shallow commits during
> get_shallow_commits_by_rev_list(). By closing the commit-graph before
> this call, we prevent this interaction.
>
> Signed-off-by: Derrick Stolee 
> ---
>  commit-graph.c | 8 
>  commit-graph.h | 1 +
>  upload-pack.c  | 2 ++
>  3 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 233958e10..237d4e7d1 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -256,10 +256,10 @@ static int prepare_commit_graph(struct repository *r)
>   return !!r->objects->commit_graph;
>  }
>  
> -static void close_commit_graph(void)
> +void close_commit_graph(struct repository *r)
>  {
> - free_commit_graph(the_repository->objects->commit_graph);
> - the_repository->objects->commit_graph = NULL;
> + free_commit_graph(r->objects->commit_graph);
> + r->objects->commit_graph = NULL;
>  }
>  
>  static int bsearch_graph(struct commit_graph *g, struct object_id *oid, 
> uint32_t *pos)
> @@ -871,7 +871,7 @@ void write_commit_graph(const char *obj_dir,
>   write_graph_chunk_data(f, GRAPH_OID_LEN, commits.list, commits.nr);
>   write_graph_chunk_large_edges(f, commits.list, commits.nr);
>  
> - close_commit_graph();
> + close_commit_graph(the_repository);
>   finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_FSYNC);
>   commit_lock_file(&lk);
>  
> diff --git a/commit-graph.h b/commit-graph.h
> index 76e098934..13d736cdd 100644
> --- a/commit-graph.h
> +++ b/commit-graph.h
> @@ -59,6 +59,7 @@ void write_commit_graph(const char *obj_dir,
>  
>  int verify_commit_graph(struct repository *r, struct commit_graph *g);
>  
> +void close_commit_graph(struct repository *);
>  void free_commit_graph(struct commit_graph *);
>  
>  #endif
> diff --git a/upload-pack.c b/upload-pack.c
> index 4ca052d0b..52ad6c8e5 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -24,6 +24,7 @@
>  #include "quote.h"
>  #include "upload-pack.h"
>  #include "serve.h"
> +#include "commit-graph.h"
>  
>  /* Remember to update object flag allocation in object.h */
>  #define THEY_HAVE(1u << 11)
> @@ -739,6 +740,7 @@ static void deepen_by_rev_list(int ac, const char **av,
>  {
>   struct commit_list *result;
>  
> + close_commit_graph(the_repository);
>   result = get_shallow_commits_by_rev_list(ac, av, SHALLOW, NOT_SHALLOW);
>   send_shallow(result);
>   free_commit_list(result);


[PATCH 1/2] replace-objects: use arbitrary repositories

2018-07-30 Thread Stefan Beller
From: Derrick Stolee 

This is the smallest possible change that makes prepare_replace_objects
work properly with arbitrary repositories. By supplying the repository
as the cb_data, we do not need to modify any code in the ref iterator
logic. We will likely want to do a full replacement of the ref iterator
logic to provide a repository struct as a concrete parameter.

[sb: original commit message left as-is. I disagree with it.
We want to keep the ref store API clean and focussed on struct
ref_store. There is no need to treat a repository any special
for pass-through by the callback cookie. So instead let's just
pass the repository as a cb cookie and cleanup the API in follow
up patches]

Signed-off-by: Derrick Stolee 
Signed-off-by: Stefan Beller 
---
 replace-object.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/replace-object.c b/replace-object.c
index 801b5c16789..e99fcd1ff6e 100644
--- a/replace-object.c
+++ b/replace-object.c
@@ -14,6 +14,7 @@ static int register_replace_ref(const char *refname,
const char *slash = strrchr(refname, '/');
const char *hash = slash ? slash + 1 : refname;
struct replace_object *repl_obj = xmalloc(sizeof(*repl_obj));
+   struct repository *r = (struct repository *)cb_data;
 
if (get_oid_hex(hash, &repl_obj->original.oid)) {
free(repl_obj);
@@ -25,7 +26,7 @@ static int register_replace_ref(const char *refname,
oidcpy(&repl_obj->replacement, oid);
 
/* Register new object */
-   if (oidmap_put(the_repository->objects->replace_map, repl_obj))
+   if (oidmap_put(r->objects->replace_map, repl_obj))
die("duplicate replace ref: %s", refname);
 
return 0;
@@ -40,7 +41,7 @@ static void prepare_replace_object(struct repository *r)
xmalloc(sizeof(*r->objects->replace_map));
oidmap_init(r->objects->replace_map, 0);
 
-   for_each_replace_ref(r, register_replace_ref, NULL);
+   for_each_replace_ref(r, register_replace_ref, r);
 }
 
 /* We allow "recursive" replacement. Only within reason, though */
-- 
2.18.0.132.g195c49a2227



[PATCH 0/2] Cleanup refs API [WAS: Re: [PATCH 2/3] refs: introduce new API, wrap old API shallowly around new API]

2018-07-30 Thread Stefan Beller
> > I anticipate that we need to have a lot of back pointers to the repository
> > in question, hence I think we should have the repository pointer promoted
> > to not just a back pointer.
>
> I will probably need more time to study that commit and maybe the mail
> archive for the history of this series. But if I remember correctly
> some of these for_each_ api is quite a pain (perhaps it's the for_each
> version of reflog?) and it's probably better to redesign it (again
> talking without real understanding of the problem).

I stepped back a bit and reconsidered the point made above, and I do not
think that the repository argument is any special. If you need a repository
(for e.g. lookup_commit or friends), you'll have to pass it through the
callback cookie, whether directly or as part of a struct tailored to
your purpose.

Instead we should strive to make the refs API smaller and cleaner,
omitting the repository argument at all, and instead should be focussing
on a ref_store argument instead.

This series applies on master; when we decide to go this direction
we can drop origin/sb/refs-in-repo.

Thanks,
Stefan

Derrick Stolee (1):
  replace-objects: use arbitrary repositories

Stefan Beller (1):
  refs: switch for_each_replace_ref back to use a ref_store

 builtin/replace.c | 4 +---
 refs.c| 4 ++--
 refs.h| 2 +-
 replace-object.c  | 5 +++--
 4 files changed, 7 insertions(+), 8 deletions(-)

-- 
2.18.0.132.g195c49a2227



Re: [PATCH 2/8] t3206: add color test for range-diff --dual-color

2018-07-30 Thread Stefan Beller
On Fri, Jul 27, 2018 at 11:28 PM Eric Sunshine  wrote:
>
> On Fri, Jul 27, 2018 at 11:05 PM Stefan Beller  wrote:
> > The 'expect'ed outcome is taken by running the 'range-diff |decode';
> > it is not meant as guidance, rather as a documentation of the current
> > situation.
>
> I'm not really sure what this is trying to say. It seems _too_ brief.
>
> Did you want a space after the vertical bar before "decode"?

I am trying to say that this patch was generated by inserting
a "true && test_pause" first and then inside that paused test,
I just run

  source /t/test-lib-functions.sh
  git range-diff changed...changed-message --color --dual-color \
| test_decode_color

and saved that as the expected outcome, I was prepared to
massage it gently by s//Q/ but that was not needed,
but I forgot the q_to_tab in place.

By adding the test this way, I just want to state "I observed
the functionality as produced in this patch". I do not want
to endorse the coloring scheme or claim it is good (it is,
but I also still have nits to pick). So I tried to briefly say
that the test is essentially "autogenerated" by just observing
output at that point in time.

>
> > Signed-off-by: Stefan Beller 
> > ---
> > diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> > +test_expect_success 'simple coloring' '
> > +   q_to_tab >expect <<-EOF &&
>
> Why 'q_to_tab'? I don't see any "q"'s in the body.
>
> I also don't see any variable interpolation in the body, so maybe you
> want -\EOF instead?

All good suggestions! Thanks for the review, I'll incorporate them.

Thanks,
Stefan


[PATCH 2/2] refs: switch for_each_replace_ref back to use a ref_store

2018-07-30 Thread Stefan Beller
This effectively reverts commit 0d296c57ae (refs: allow for_each_replace_ref
to handle arbitrary repositories, 2018-04-11) and 60ce76d3581 (refs: add
repository argument to for_each_replace_ref, 2018-04-11).

The repository argument is not any special from the ref-store's point
of life.  If you need a repository (for e.g. lookup_commit or friends),
you'll have to pass it through the callback cookie, whether directly or
as part of a struct tailored to your purpose.

So let's go back to the clean API, just requiring a ref_store as an
argument.

Signed-off-by: Stefan Beller 
---
 builtin/replace.c | 2 +-
 refs.c| 4 ++--
 refs.h| 2 +-
 replace-object.c  | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin/replace.c b/builtin/replace.c
index deabda21012..52dc371eafc 100644
--- a/builtin/replace.c
+++ b/builtin/replace.c
@@ -87,7 +87,7 @@ static int list_replace_refs(const char *pattern, const char 
*format)
 "valid formats are 'short', 'medium' and 'long'\n",
 format);
 
-   for_each_replace_ref(the_repository, show_reference, (void *)&data);
+   for_each_replace_ref(show_reference, (void *)&data);
 
return 0;
 }
diff --git a/refs.c b/refs.c
index 08fb5a99148..2d713499125 100644
--- a/refs.c
+++ b/refs.c
@@ -1441,9 +1441,9 @@ int refs_for_each_fullref_in(struct ref_store *refs, 
const char *prefix,
return do_for_each_ref(refs, prefix, fn, 0, flag, cb_data);
 }
 
-int for_each_replace_ref(struct repository *r, each_ref_fn fn, void *cb_data)
+int for_each_replace_ref(each_ref_fn fn, void *cb_data)
 {
-   return do_for_each_ref(get_main_ref_store(r),
+   return do_for_each_ref(get_main_ref_store(the_repository),
   git_replace_ref_base, fn,
   strlen(git_replace_ref_base),
   DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
diff --git a/refs.h b/refs.h
index cc2fb4c68c0..48d5ffd2082 100644
--- a/refs.h
+++ b/refs.h
@@ -307,7 +307,7 @@ int for_each_fullref_in(const char *prefix, each_ref_fn fn, 
void *cb_data,
 int for_each_tag_ref(each_ref_fn fn, void *cb_data);
 int for_each_branch_ref(each_ref_fn fn, void *cb_data);
 int for_each_remote_ref(each_ref_fn fn, void *cb_data);
-int for_each_replace_ref(struct repository *r, each_ref_fn fn, void *cb_data);
+int for_each_replace_ref(each_ref_fn fn, void *cb_data);
 int for_each_glob_ref(each_ref_fn fn, const char *pattern, void *cb_data);
 int for_each_glob_ref_in(each_ref_fn fn, const char *pattern,
 const char *prefix, void *cb_data);
diff --git a/replace-object.c b/replace-object.c
index e99fcd1ff6e..ee3374ab59b 100644
--- a/replace-object.c
+++ b/replace-object.c
@@ -41,7 +41,7 @@ static void prepare_replace_object(struct repository *r)
xmalloc(sizeof(*r->objects->replace_map));
oidmap_init(r->objects->replace_map, 0);
 
-   for_each_replace_ref(r, register_replace_ref, r);
+   for_each_replace_ref(register_replace_ref, r);
 }
 
 /* We allow "recursive" replacement. Only within reason, though */
-- 
2.18.0.132.g195c49a2227



Re: [PATCH 1/2] Document git config getter return value.

2018-07-30 Thread Junio C Hamano
Eric Sunshine  writes:

> On Mon, Jul 30, 2018 at 8:26 AM Han-Wen Nienhuys  wrote:
>> ---
>>  config.h | 10 --
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> Missing sign-off.

Besides, the patch is corrupt in that it miscounts both preimage and
postimage lines and claims it applies to a 11-line block even though
there are only 10 lines there.


RE: Hash algorithm analysis

2018-07-30 Thread Dan Shumow
Hello all.   Johannes, thanks for adding me to this discussion.

So, as one of the coauthors of the SHA-1 collision detection code, I just 
wanted to chime in and say I'm glad to see the move to a longer hash function.  
Though, as a cryptographer, I have a few thoughts on the matter that I thought 
I would share.

I think that moving to SHA256 is a fine change, and I support it.

I'm not anywhere near the expert in this that Joan Daeman is.  I am someone who 
has worked in this space more or less peripherally.  However, I agree with Adam 
Langley that basically all of the finalists for a hash function replacement are 
about the same for the security needs of Git.  I think that, for this 
community, other software engineering considerations should be more important 
to the selection process.

I think Joan's survey of cryptanalysis papers and the numbers that he gives are 
interesting, and I had never seen the comparison laid out like that.  So, I 
think that there is a good argument to be made that SHA3 has had more 
cryptanalysis than SHA2.  Though, Joan, are the papers that you surveyed only 
focused on SHA2?  I'm curious if you think that the design/construction of 
SHA2, as it can be seen as an iteration of MD5/SHA1, means that the 
cryptanalysis papers on those constructions can be considered to apply to SHA2? 
 Again, I'm not an expert in this, but I do know that Marc Steven's techniques 
for constructing collisions also provided some small cryptanalytic improvements 
against the SHA2 family as well.  I also think that while the paper survey is a 
good way to look over all of this, the more time in the position of high 
profile visibility that SHA2 has can give us some confidence as well.

Also something worth pointing out is that the connection SHA2 has to SHA1 means 
that if Marc Steven's cryptanalysis of MD5/SHA-1 were ever successfully applied 
to SHA2, the SHA1 collision detection approach could be applied there as well, 
thus providing a drop in replacement in that situation.  That said, we don't 
know that there is not a similar way of addressing issues with the SHA3/Sponge 
design.  It's just that because we haven't seen any weaknesses of this sort in 
similar designs, we just don't know what a similar approach would be there yet. 
 I don't want to put too much stock in this argument, it's just saying "Well, 
we already know how SHA2 is likely to break, and we've had fixes for similar 
things in the past."  This is pragmatic but not inspiring or confidence 
building.

So, I also want to state my biases in favor of SHA2 as an employee of 
Microsoft.  Microsoft, being a corporation headquartered in a America, with the 
US Gov't as a major customer definitely prefers to defer to the US Gov't NIST 
standardization process.  And from that perspective SHA2 or SHA3 would be good 
choices.  I, personally, think that the NIST process is the best we have.  It 
is relatively transparent, and NIST employs a fair number of very competent 
cryptographers.  Also, I am encouraged by the widespread international 
participation that the NIST competitions and selection processes attract.

As such, and reflecting this bias, in the internal discussions that Johannes 
alluded to, SHA2 and SHA3 were the primary suggestions.  There was a slight 
preference for SHA2 because SHA3 is not exposed through the windows 
cryptographic APIs (though Git does not use those, so this is a nonissue for 
this discussion.)

I also wanted to thank Johannes for keeping the cryptographers that he 
discussed this with anonymous.  After all, cryptographers are known for being 
private.  And I wanted to say that Johannes did, in fact, accurately represent 
our internal discussions on the matter.

I also wanted to comment on the discussion of the "internal state having the 
same size as the output."  Linus referred to this several times.  This is known 
as narrow-pipe vs wide-pipe in the hash function design literature.  Linus is 
correct that wide-pipe designs are more in favor currently, and IIRC, all of 
the serious SHA3 candidates employed this.  That said, it did seem that in the 
discussion this was being equated with "length extension attacks."  And that 
connection is just not accurate.  Length extension attacks are primarily a 
motivation of the HMAC liked nested hashing design for MACs, because of a 
potential forgery attack.  Again, this doesn't really matter because the 
decision has been made despite this discussion.  I just wanted to set the 
record straight about this, as to avoid doing the right thing for the wrong 
reason (T.S. Elliot's "greatest treason.")

One other thing that I wanted to throw out there for the future is that in the 
crypto community there is currently a very large push to post quantum 
cryptography.  Whether the threat of quantum computers is real or imagined this 
is a hot area of research, with a NIST competition to select post quantum 
asymmetric cryptographic algorithms.  That is not directly of c

Re: [PATCH v2 1/4] unpack-trees.c: add performance tracing

2018-07-30 Thread Ben Peart




On 7/29/2018 6:33 AM, Nguyễn Thái Ngọc Duy wrote:

We're going to optimize unpack_trees() a bit in the following
patches. Let's add some tracing to measure how long it takes before
and after. This is the baseline ("git checkout -" on gcc.git, 80k
files on worktree)

 0.018239226 s: read cache .git/index
 0.052541655 s: preload index
 0.001537598 s: refresh index
 0.168167768 s: unpack trees
 0.002897186 s: update worktree after a merge
 0.131661745 s: repair cache-tree
 0.075389117 s: write index, changed mask = 2a
 0.111702023 s: unpack trees
 0.23245 s: update worktree after a merge
 0.111793866 s: diff-index
 0.587933288 s: git command: /home/pclouds/w/git/git checkout -

Signed-off-by: Nguyễn Thái Ngọc Duy 


I've reviewed this patch and it looks good to me.  Nice to see the 
additional breakdown on where time is being spent.




Re: [PATCH v4 11/21] range-diff: add tests

2018-07-30 Thread Junio C Hamano
Johannes Schindelin  writes:

> On Sun, 22 Jul 2018, Eric Sunshine wrote:
>
>> On Sat, Jul 21, 2018 at 6:05 PM Thomas Rast via GitGitGadget
>>  wrote:
>> > These are essentially lifted from https://github.com/trast/tbdiff, with
>> > light touch-ups to account for the command now being names `git
>> 
>> s/names/named/
>
> Thanks.
>
> I already pushed an update to https://github.com/gitgitgadget/git/pull/1.

Should I take "pushed to ... GGG" to mean "do not merge what you
have to 'next' yet, as there will be an updated series (not
incremental) being prepared"?


Re: [PATCH v3 10/11] rerere: teach rerere to handle nested conflicts

2018-07-30 Thread Thomas Gummerer
On 07/30, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> > Currently rerere can't handle nested conflicts and will error out when
> > it encounters such conflicts.  Do that by recursively calling the
> > 'handle_conflict' function to normalize the conflict.
> >
> > The conflict ID calculation here deserves some explanation:
> >
> > As we are using the same handle_conflict function, the nested conflict
> > is normalized the same way as for non-nested conflicts, which means
> > the ancestor in the diff3 case is stripped out, and the parts of the
> > conflict are ordered alphabetically.
> >
> > The conflict ID is however is only calculated in the top level
> > handle_conflict call, so it will include the markers that 'rerere'
> > adds to the output.  e.g. say there's the following conflict:
> >
> > <<< HEAD
> > 1
> > ===
> > <<< HEAD
> > 3
> > ===
> > 2
> > >>> branch-2
> > >>> branch-3~
> 
> Hmph, I vaguely recall that I made inner merges to use the conflict
> markers automatically lengthened (by two, if I recall correctly)
> than its immediate outer merge.  Wouldn't the above look more like
> 
>  <<< HEAD
>  1
>  ===
>  < HEAD
>  3
>  =
>  2
>  > branch-2
>  >>> branch-3~
> 
> Perhaps I am not recalling it correctly.

The only way I could reproduce this is by not resolving a conflict
(just leaving the conflict markers in place, but running 'git add
conflicted'), and then merging something else, which produces another
conflict, where one of the sides was the one with conflict markers
already in the file, same as what I did in the test.

So in that case, the conflict markers of the already existing conflict
would just be treated as normal text during the merge I believe, and
thus the new conflict markers would be the same length.

The usage of git is really a bit wrong here, so I don't know if it's
actually worth helping the users at this point.  But trying to
understand how rerere exactly works, I had this written up already, so
I thought I would include it in this series anyway in case it helps
somebody :)

> > it would be recorde as follows in the preimage:
> >
> > <<<
> > 1
> > ===
> > <<<
> > 2
> > ===
> > 3
> > >>>
> > >>>
> >
> > and the conflict ID would be calculated as
> >
> > sha1(1<<<
> > 2
> > ===
> > 3
> > >>>)
> >
> > Stripping out vs. leaving the conflict markers in place in the inner
> > conflict should have no practical impact, but it simplifies the
> > implementation.
> >
> > Signed-off-by: Thomas Gummerer 
> > ---
> >  Documentation/technical/rerere.txt | 42 ++
> >  rerere.c   | 10 +--
> >  t/t4200-rerere.sh  | 37 ++
> >  3 files changed, 87 insertions(+), 2 deletions(-)
> >
> > [..snip..]
> > 
> > diff --git a/rerere.c b/rerere.c
> > index a35b88916c..f78bef80b1 100644
> > --- a/rerere.c
> > +++ b/rerere.c
> > @@ -365,12 +365,18 @@ static int handle_conflict(struct strbuf *out, struct 
> > rerere_io *io,
> > RR_SIDE_1 = 0, RR_SIDE_2, RR_ORIGINAL
> > } hunk = RR_SIDE_1;
> > struct strbuf one = STRBUF_INIT, two = STRBUF_INIT;
> > -   struct strbuf buf = STRBUF_INIT;
> > +   struct strbuf buf = STRBUF_INIT, conflict = STRBUF_INIT;
> > int has_conflicts = -1;
> >  
> > while (!io->getline(&buf, io)) {
> > if (is_cmarker(buf.buf, '<', marker_size)) {
> > -   break;
> > +   if (handle_conflict(&conflict, io, marker_size, NULL) < 
> > 0)
> > +   break;
> > +   if (hunk == RR_SIDE_1)
> > +   strbuf_addbuf(&one, &conflict);
> > +   else
> > +   strbuf_addbuf(&two, &conflict);
> 
> Hmph, do we ever see the inner conflict block while we are skipping
> and ignoring the common ancestor version, or it is impossible that
> we see '<' only while processing either our or their side?

As mentioned above, I haven't been able to reproduce creating an inner
conflict block outside of the case mentioned above, where the user
committed conflict markers, and then did another merge.

I don't think it can appear outside of that case in "normal"
operation.

> > +   strbuf_release(&conflict);
> > } else if (is_cmarker(buf.buf, '|', marker_size)) {
> > if (hunk != RR_SIDE_1)
> > break;
> > diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh
> > index 34f0518a5e..d63fe2b33b 100755
> > --- a/t/t4200-rerere.sh
> > +++ b/t/t4200-rerere.sh
> > @@ -602,4 +602,41 @@ test_expect_success 'rerere with unexpected conflict 
> > markers does not crash' '
> > git rerere clear
> >  '
> >  
> > +test_expect_success 'rerere with inner conflict markers' '
> > +

Re: [PATCH v3 05/11] rerere: add documentation for conflict normalization

2018-07-30 Thread Thomas Gummerer
On 07/30, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> > +Different conflict styles and branch names are normalized by stripping
> > +the labels from the conflict markers, and removing extraneous
> > +information from the `diff3` conflict style. Branches that are merged
> 
> s/extraneous information/commmon ancestor version/ perhaps, to be
> fact-based without passing value judgment?

Yeah I meant "extraneous information for rerere", but common ancester
version is better.

> We drop the common ancestor version only because we cannot normalize
> from `merge` style to `diff3` style by adding one, and not because
> it is extraneous.  It does help humans understand the conflict a lot
> better to have that section.
> 
> > +By extension, this means that rerere should recognize that the above
> > +conflicts are the same.  To do this, the labels on the conflict
> > +markers are stripped, and the diff3 output is removed.  The above
> 
> s/diff3 output/common ancestor version/, as "diff3 output" would
> mean the whole thing between <<< and >>> to readers.

Makes sense, will fix in the re-roll, thanks!

> > diff --git a/rerere.c b/rerere.c
> > index be98c0afcb..da1ab54027 100644
> > --- a/rerere.c
> > +++ b/rerere.c
> > @@ -394,10 +394,6 @@ static int is_cmarker(char *buf, int marker_char, int 
> > marker_size)
> >   * and NUL concatenated together.
> >   *
> >   * Return the number of conflict hunks found.
> > - *
> > - * NEEDSWORK: the logic and theory of operation behind this conflict
> > - * normalization may deserve to be documented somewhere, perhaps in
> > - * Documentation/technical/rerere.txt.
> >   */
> >  static int handle_path(unsigned char *sha1, struct rerere_io *io, int 
> > marker_size)
> >  {
> 
> Thanks for finally removing this age-old NEEDSWORK comment.


Re: [PATCH v2 01/10] t/test-lib: teach --chain-lint to detect broken &&-chains in subshells

2018-07-30 Thread Eric Sunshine
On Mon, Jul 30, 2018 at 2:14 PM Jonathan Nieder  wrote:
> Eric Sunshine wrote:
> > Address this shortcoming by feeding the body of the test to a
> > lightweight "linter" which can peer inside subshells and identify broken
> > &&-chains by pure textual inspection.
>
> This is causing contrib/subtree tests to fail for me: running "make -C
> contrib/subtree test" produces

Thanks, I forgot that some of 'contrib' had bundled tests. (In fact, I
just checked the other 'contrib' tests and found that a MediaWiki test
has a broken top-level &&-chain.)

> The problematic test code looks like this:
>
> (
> chks_sub=$(cat < $chks
> TXT
> ) &&
>
> Ugly quoting, useless use of "cat", etc, aside, I don't think it's
> missing any &&.  Hints?

Yes, it's a false positive.

The subshell linter would normally fold out the here-doc content, but
'sed' isn't a proper programming language, so the linter can't
recognize arbitrary here-doc tags. Instead it has hard-coded knowledge
of the tags commonly used in the Git tests, specifically EOF, EOT, and
INPUT_END.

The linter also deals with multi-line $(...) expressions, however, it
currently only recognizes them when the $( is on its own line.

Had this test used one of the common here-doc tags _or_ had it
formatted the $(...) as described, then it wouldn't have misfired.

I could try to update the linter to not trip over this sort of input,
however, this test code is indeed ugly and difficult to understand,
and your rewrite[1] of it makes it far easier to grok, so I'm not sure
the effort would be worthwhile. What do you think?

[1]: 
https://public-inbox.org/git/20180730190738.gd156...@aiede.svl.corp.google.com/


Re: [PATCH v3 06/11] rerere: fix crash when conflict goes unresolved

2018-07-30 Thread Thomas Gummerer
On 07/30, Junio C Hamano wrote:
> Thomas Gummerer  writes:
> 
> > Currently when a user doesn't resolve a conflict in a file, but
> > commits the file with the conflict markers, and later the file ends up
> > in a state in which rerere can't handle it, subsequent rerere
> > operations that are interested in that path, such as 'rerere clear' or
> > 'rerere forget ' will fail, or even worse in the case of 'rerere
> > clear' segfault.
> >
> > Such states include nested conflicts, or an extra conflict marker that
> > doesn't have any match.
> >
> > This is because the first 'git rerere' when there was only one
> > conflict in the file leaves an entry in the MERGE_RR file behind.  The
> 
> I find this sentence, especially the "only one conflict in the file"
> part, a bit unclear.  What does the sentence count as one conflict?
> One block of lines enclosed inside "<<<"...">>>" pair?  The command
> behaves differently when there are two such blocks instead?

Yeah as you mentioned below, conflict marker(s) that cannot be parsed
here would make more sense.  Will adjust the commit message.

> > next 'git rerere' will then pick the rerere ID for that file up, and
> > not assign a new ID as it can't successfully calculate one.  It will
> > however still try to do the rerere operation, because of the existing
> > ID.  As the handle_file function fails, it will remove the 'preimage'
> > for the ID in the process, while leaving the ID in the MERGE_RR file.
> >
> > Now when 'rerere clear' for example is run, it will segfault in
> > 'has_rerere_resolution', because status is NULL.
> 
> I think this "status" refers to the collection->status[].  How do we
> get into that state, though?
> 
> new_rerere_id() and new_rerere_id_hex() fills id->collection by
> calling find_rerere_dir(), which either finds an existing rerere_dir
> instance or manufactures one with .status==NULL.  The .status[]
> array is later grown by calling fit_variant as we scan and find the
> pre/post images, but because there is no pre/post image for a file
> with unparseable conflicts, it is left NULL.
> 
> So another possible fix could be to make sure that .status[] is only
> read when .status_nr says there is something worth reading.  I am
> not saying that would be a better fix---I am just thinking out loud
> to make sure I understand the issue correctly.

Yeah what you are writing above matches my understanding, and that
should fix the issue as well.  I haven't actually tried what you're
proposing above, but I think I find it nicer to just remove the entry
we can't do anything with anyway.

> > To fix this, remove the rerere ID from the MERGE_RR file in the case
> > when we can't handle it, and remove the corresponding variant from
> > .git/rr-cache/.  Removing it unconditionally is fine here, because if
> > the user would have resolved the conflict and ran rerere, the entry
> > would no longer be in the MERGE_RR file, so we wouldn't have this
> > problem in the first place, while if the conflict was not resolved,
> > the only thing that's left in the folder is the 'preimage', which by
> > itself will be regenerated by git if necessary, so the user won't
> > loose any work.
> 
> s/loose/lose/
> 
> > Note that other variants that have the same conflict ID will not be
> > touched.
> 
> Nice.  Thanks for a fix.
> 
> >
> > Signed-off-by: Thomas Gummerer 
> > ---
> >  rerere.c  | 12 +++-
> >  t/t4200-rerere.sh | 22 ++
> >  2 files changed, 29 insertions(+), 5 deletions(-)
> >
> > diff --git a/rerere.c b/rerere.c
> > index da1ab54027..895ad80c0c 100644
> > --- a/rerere.c
> > +++ b/rerere.c
> > @@ -823,10 +823,7 @@ static int do_plain_rerere(struct string_list *rr, int 
> > fd)
> > struct rerere_id *id;
> > unsigned char sha1[20];
> > const char *path = conflict.items[i].string;
> > -   int ret;
> > -
> > -   if (string_list_has_string(rr, path))
> > -   continue;
> > +   int ret, has_string;
> >  
> > /*
> >  * Ask handle_file() to scan and assign a
> > @@ -834,7 +831,12 @@ static int do_plain_rerere(struct string_list *rr, int 
> > fd)
> >  * yet.
> >  */
> > ret = handle_file(path, sha1, NULL);
> > -   if (ret < 1)
> > +   has_string = string_list_has_string(rr, path);
> > +   if (ret < 0 && has_string) {
> > +   remove_variant(string_list_lookup(rr, path)->util);
> > +   string_list_remove(rr, path, 1);
> > +   }
> > +   if (ret < 1 || has_string)
> > continue;
> 
> We used to say "if we know about the path we do not do anything
> here, if we do not see any conflict in the file we do nothing,
> otherwise we assign a new id"; we now say "see if we can parse
> and also see if we have conflict(s); if we know about the path and
> we cannot parse, drop it from the rr database (because otherwise the
> e

  1   2   >