date:20160722

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Jens Lehmann


Am 21.07.2016 um 01:22 schrieb Stefan Beller:

So maybe we want to drop that series and first talk about a migration plan from
the current state to a world where we have the existence depending not
on the url
parameter, but a boolean variable submodule...
Depending on  a submodule would be ignored or tried to checkout
in e.g. `submodule update`


Whoa, that's a very intrusive change with a ton of compatibility
problems waiting to happen. Maybe its simpler to make "git submodule
sync" aware of worktrees and error out with an "you cannot use
submodules with different URLs in a worktree scenario" error when
the URL is going to change? That should make most use cases work
while avoiding the problematic ones.


If we want to move the current behavior of submodules forward, we
would want to have
anything but the url as shared variables and then use the url variable
as a per-worktree
existence flag.


Without having though deeply about all submodule variables, I see
them as worktree specific. E.g. "update=none" is used on our CI-
Server to avoid the disk space cost on some checkouts of a certain
superproject while using "update=checkout" on others where their
content is needed.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] git-subtree.sh: Use --allow-unrelated-histories when splitting with --rejoin

2016-07-22 Thread Johannes Schindelin

Hi Brett,

On Thu, 21 Jul 2016, Brett Cundal wrote:

> [re-sending as plain-text so the list software doesn't reject as
> spam... what decade is this?]

I know it feels good to get frustration off of your chest by ranting. I am
very sympathetic to that. Please note, however, that it puts the people
who are ready to help you immediately into a defensive corner. Probably
unintentional?

> Sorry about the empty commit message... there was one originally (albeit
> not as detailed as the pull request), but I guess it got stripped? As
> you have probably guessed, I have no idea how you guys do patch
> submission.

It is okay. For historical reasons, the patch submission is a bit more
involved than simply opening a Pull Request. We also frown upon
top-posting, among other rules of netiquette. You may dislike it, but you
would have to build more of a "street cred" if you truly wanted to try to
convince the Git developers to change it.

Back to the patch you wanted to submit: since this is an important bug
fix, I spent the time to clean it up. The only missing bit that requires
further effort from your side is that we need your Sign-off. See
https://github.com/git/git/blob/v2.9.2/Documentation/SubmittingPatches#L239-L291
for an explanation. Essentially, you are stating explicitly that you are
not legally prohibited from contributing this patch. If you can say that,
simply reply with a

Signed-off-by: Brett Cundal 

We can take it from there by inserting it into the following patch:

-- snipsnap --
From: Brett Cundal 
Subject: [PATCH] subtree: allow unrelated histories when splitting with --rejoin

Git 2.9 added a check against merging unrelated histories, which is
exactly what git subtree with --rejoin does. Adding the
--allow-unrelated-histories flag to merge will override this check.
---
 contrib/subtree/git-subtree.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index 7a39b30..556cd92 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -661,7 +661,7 @@ cmd_split()
if [ -n "$rejoin" ]; then
debug "Merging split branch into HEAD..."
latest_old=$(cache_get latest_old)
-   git merge -s ours \
+   git merge -s ours --allow-unrelated-histories \
-m "$(rejoin_msg "$dir" $latest_old $latest_new)" \
$latest_new >&2 || exit $?
fi
-- 
2.9.0.281.g286a8d9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH ew/daemon-socket-keepalive] Windows: add missing definition of ENOTSOCK

2016-07-22 Thread Johannes Schindelin

Hi Hannes,

On Thu, 21 Jul 2016, Johannes Sixt wrote:

> The previous commit introduced the first use of ENOTSOCK. This macro is
> not available on Windows. Define it as WSAENOTSOCK because that is the
> corresponding error value reported by the Windows versions of socket
> functions.

Thanks for catching this early.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Investment Funds

2016-07-22 Thread Miss Evelyn Kouassi

Good Day,
I am Miss Evelyn Kouassi. I will like to invest in your country. I
have a Mutual business proposal which will benefit both of us. I will
like to seek your consent first.
This project involves a huge specific amount which I can't mention
here for security reasons. If you feel you can handle this project,
please let me know, so that I give you comprehensive details about
me/the fund and explanations on my investment plan.
Sincerely, Miss Evelyn
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 00/16] Use merge_recursive() directly in the builtin am

2016-07-22 Thread Johannes Schindelin

This is the fourth iteration of the long-awaited re-roll of the attempt to
avoid spawning merge-recursive from the builtin am and use merge_recursive()
directly instead.

The *real* reason for the reroll is that I need a libified recursive
merge to accelerate the interactive rebase by teaching the sequencer to
do rebase -i's grunt work. Coming with a very nice 3x-5x speedup of
`rebase -i`.

In this endeavor, we need to be extra careful to retain backwards
compatibility. The test script t6022-merge-rename.sh, for example, verifies
that `git pull` exits with status 128 in case of a fatal error. To that end,
we need to make sure that fatal errors are handled by existing (builtin)
users via exit(128) (or die(), which calls exit(128) at the end).  New users
(such as a builtin helper doing rebase -i's grunt work) may want to print
some helpful advice what happened and how to get out of this mess before
erroring out.

The changes relative to the second iteration of this patch series (I
have a feeling that nobody reviewed the 3rd iteration because it was
based on `pu`):

- the was_tracked() function was adjusted as per Junio's suggestions

- the "counter gender bias" patch was submitted, and accepted,
  separately, even if the version we settled on sends a much weaker
  message than I would have preferred

- this patch series is on top of 'master' again

- a test was introduced to verify that we do not reintroduce the bug
  which required our hot fix to spawn the recursive merge again

- while at it, I fixed a ton of incorrect indentations that I missed in
  the first three iterations

- as the bc/cocci branch was merged, a couple of fixups were necessary
  to the patch that avoids spawning the recursive merge

- the interdiff is relative to v2 rebased onto master. That means that
  I had to simulate the "her_tree" change for the sake of the interdiff.

This patch series touches rather important code. Now that I addressed
concerns such as fixing translated bug reports, I would appreciate thorough
reviews with a focus on the critical parts of the code, those that could
result in regressions.


Johannes Schindelin (16):
  Verify that `git pull --rebase` shows the helpful advice when failing
  Report bugs consistently
  Avoid translating bug messages
  merge-recursive: clarify code in was_tracked()
  Prepare the builtins for a libified merge_recursive()
  merge_recursive: abort properly upon errors
  merge-recursive: avoid returning a wholesale struct
  merge-recursive: allow write_tree_from_memory() to error out
  merge-recursive: handle return values indicating errors
  merge-recursive: switch to returning errors instead of dying
  am -3: use merge_recursive() directly again
  merge-recursive: flush output buffer before printing error messages
  merge-recursive: write the commit title in one go
  merge-recursive: offer an option to retain the output in 'obuf'
  Ensure that the output buffer is released after calling merge_trees()
  merge-recursive: flush output buffer even when erroring out

 builtin/am.c   |  62 ++
 builtin/checkout.c |   5 +-
 builtin/ls-files.c |   3 +-
 builtin/merge.c|   2 +
 builtin/update-index.c |   2 +-
 grep.c |   8 +-
 imap-send.c|   4 +-
 merge-recursive.c  | 571 +
 merge-recursive.h  |   2 +-
 sequencer.c|   5 +
 sha1_file.c|   4 +-
 t/t5520-pull.sh|  30 +++
 trailer.c  |   2 +-
 transport.c|   2 +-
 wt-status.c|   4 +-
 15 files changed, 413 insertions(+), 293 deletions(-)

Published-As: 
https://github.com/dscho/git/releases/tag/am-3-merge-recursive-direct-v4
Interdiff vs v3:

 diff --git a/builtin/am.c b/builtin/am.c
 index bf4f79b..cfb79ea 100644
 --- a/builtin/am.c
 +++ b/builtin/am.c
 @@ -1583,15 +1583,14 @@ static int build_fake_ancestor(const struct am_state 
*state, const char *index_f
   */
  static int fall_back_threeway(const struct am_state *state, const char 
*index_path)
  {
 -  unsigned char orig_tree[GIT_SHA1_RAWSZ], her_tree[GIT_SHA1_RAWSZ],
 -our_tree[GIT_SHA1_RAWSZ];
 -  const unsigned char *bases[1] = {orig_tree};
 +  struct object_id orig_tree, their_tree, our_tree;
 +  const struct object_id *bases[1] = { &orig_tree };
struct merge_options o;
struct commit *result;
 -  char *her_tree_name;
 +  char *their_tree_name;
  
 -  if (get_sha1("HEAD", our_tree) < 0)
 -  hashcpy(our_tree, EMPTY_TREE_SHA1_BIN);
 +  if (get_oid("HEAD", &our_tree) < 0)
 +  hashcpy(our_tree.hash, EMPTY_TREE_SHA1_BIN);
  
if (build_fake_ancestor(state, index_path))
return error("could not build fake ancestor");
 @@ -1599,7 +1598,7 @@ static int fall_back_threeway(const struct am_state 
*state, const char *index_pa
discard_cache();
read_cache_from(index_path);
  
 -  if (write_index_a

[PATCH v4 07/16] merge-recursive: avoid returning a wholesale struct

2016-07-22 Thread Johannes Schindelin

It is technically allowed, as per C89, for functions' return type to
be complete structs (i.e. *not* just pointers to structs).

However, it was just an oversight of this developer when converting
Python code to C code in 6d297f8 (Status update on merge-recursive in
C, 2006-07-08) which introduced such a return type.

Besides, by converting this construct to pass in the struct, we can now
start returning a value that can indicate errors in future patches. This
will help the current effort to libify merge-recursive.c.

Signed-off-by: Johannes Schindelin 
---
 merge-recursive.c | 106 --
 1 file changed, 56 insertions(+), 50 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 2d4cb80..e1cea96 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -885,47 +885,47 @@ static int merge_3way(struct merge_options *o,
return merge_status;
 }
 
-static struct merge_file_info merge_file_1(struct merge_options *o,
+static int merge_file_1(struct merge_options *o,
   const struct diff_filespec *one,
   const struct diff_filespec *a,
   const struct diff_filespec *b,
   const char *branch1,
-  const char *branch2)
+  const char *branch2,
+  struct merge_file_info *result)
 {
-   struct merge_file_info result;
-   result.merge = 0;
-   result.clean = 1;
+   result->merge = 0;
+   result->clean = 1;
 
if ((S_IFMT & a->mode) != (S_IFMT & b->mode)) {
-   result.clean = 0;
+   result->clean = 0;
if (S_ISREG(a->mode)) {
-   result.mode = a->mode;
-   oidcpy(&result.oid, &a->oid);
+   result->mode = a->mode;
+   oidcpy(&result->oid, &a->oid);
} else {
-   result.mode = b->mode;
-   oidcpy(&result.oid, &b->oid);
+   result->mode = b->mode;
+   oidcpy(&result->oid, &b->oid);
}
} else {
if (!oid_eq(&a->oid, &one->oid) && !oid_eq(&b->oid, &one->oid))
-   result.merge = 1;
+   result->merge = 1;
 
/*
 * Merge modes
 */
if (a->mode == b->mode || a->mode == one->mode)
-   result.mode = b->mode;
+   result->mode = b->mode;
else {
-   result.mode = a->mode;
+   result->mode = a->mode;
if (b->mode != one->mode) {
-   result.clean = 0;
-   result.merge = 1;
+   result->clean = 0;
+   result->merge = 1;
}
}
 
if (oid_eq(&a->oid, &b->oid) || oid_eq(&a->oid, &one->oid))
-   oidcpy(&result.oid, &b->oid);
+   oidcpy(&result->oid, &b->oid);
else if (oid_eq(&b->oid, &one->oid))
-   oidcpy(&result.oid, &a->oid);
+   oidcpy(&result->oid, &a->oid);
else if (S_ISREG(a->mode)) {
mmbuffer_t result_buf;
int merge_status;
@@ -937,64 +937,66 @@ static struct merge_file_info merge_file_1(struct 
merge_options *o,
die(_("Failed to execute internal merge"));
 
if (write_sha1_file(result_buf.ptr, result_buf.size,
-   blob_type, result.oid.hash))
+   blob_type, result->oid.hash))
die(_("Unable to add %s to database"),
a->path);
 
free(result_buf.ptr);
-   result.clean = (merge_status == 0);
+   result->clean = (merge_status == 0);
} else if (S_ISGITLINK(a->mode)) {
-   result.clean = merge_submodule(result.oid.hash,
+   result->clean = merge_submodule(result->oid.hash,
   one->path,
   one->oid.hash,
   a->oid.hash,
   b->oid.hash,
   !o->call_depth);
} else if (S_ISLNK(a->mode)) {
-   oidcpy(&result.oid, &a->oid);
+   oidcpy(&result->oid, &a->oid);

[PATCH v4 08/16] merge-recursive: allow write_tree_from_memory() to error out

2016-07-22 Thread Johannes Schindelin

It is possible that a tree cannot be written (think: disk full). We
will want to give the caller a chance to clean up instead of letting
the program die() in such a case.

Signed-off-by: Johannes Schindelin 
---
 merge-recursive.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index e1cea96..d5b9b1c 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1879,8 +1879,8 @@ int merge_trees(struct merge_options *o,
else
clean = 1;
 
-   if (o->call_depth)
-   *result = write_tree_from_memory(o);
+   if (o->call_depth && !(*result = write_tree_from_memory(o)))
+   return -1;
 
return clean;
 }
-- 
2.9.0.281.g286a8d9


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 05/16] Prepare the builtins for a libified merge_recursive()

2016-07-22 Thread Johannes Schindelin

Previously, callers of merge_trees() or merge_recursive() expected that
code to die() with an error message. This used to be okay because we
called those commands from scripts, and had a chance to print out a
message in case the command failed fatally (read: with exit code 128).

As scripting incurs its own set of problems (portability, speed,
idiosynchracies of different shells, limited data structures leading to
inefficient code), we are converting more and more of these scripts into
builtins, using library functions directly.

We already tried to use merge_recursive() directly in the builtin
git-am, for example. Unfortunately, we had to roll it back temporarily
because some of the code in merge-recursive.c still deemed it okay to
call die(), when the builtin am code really wanted to print out a useful
advice after the merge failed fatally. In the next commits, we want to
fix that.

The code touched by this commit expected merge_trees() to die() with
some useful message when there is an error condition, but merge_trees()
is going to be improved by converting all die() calls to return error()
instead (i.e. return value -1 after printing out the message as before),
so that the caller can react more flexibly.

This is a step to prepare for the version of merge_trees() that no
longer dies,  even if we just imitate the previous behavior by calling
exit(128): this is what callers of e.g. `git merge` have come to expect.

Note that the callers of the sequencer (revert and cherry-pick) already
fail fast even for the return value -1; The only difference is that they
now get a chance to say " failed".

A caller of merge_trees() might want handle error messages themselves
(or even suppress them). As this patch is already complex enough, we
leave that change for a later patch.

Signed-off-by: Johannes Schindelin 
---
 builtin/checkout.c | 4 +++-
 builtin/merge.c| 2 ++
 sequencer.c| 4 
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 27c1a05..07dea3b 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -567,8 +567,10 @@ static int merge_working_tree(const struct checkout_opts 
*opts,
o.ancestor = old->name;
o.branch1 = new->name;
o.branch2 = "local";
-   merge_trees(&o, new->commit->tree, work,
+   ret = merge_trees(&o, new->commit->tree, work,
old->commit->tree, &result);
+   if (ret < 0)
+   exit(128);
ret = reset_tree(new->commit->tree, opts, 0,
 writeout_error);
if (ret)
diff --git a/builtin/merge.c b/builtin/merge.c
index 19b3bc2..148a9a5 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -673,6 +673,8 @@ static int try_merge_strategy(const char *strategy, struct 
commit_list *common,
hold_locked_index(&lock, 1);
clean = merge_recursive(&o, head,
remoteheads->item, reversed, &result);
+   if (clean < 0)
+   exit(128);
if (active_cache_changed &&
write_locked_index(&the_index, &lock, COMMIT_LOCK))
die (_("unable to write %s"), get_index_file());
diff --git a/sequencer.c b/sequencer.c
index cdfac82..286a435 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -293,6 +293,8 @@ static int do_recursive_merge(struct commit *base, struct 
commit *next,
clean = merge_trees(&o,
head_tree,
next_tree, base_tree, &result);
+   if (clean < 0)
+   return clean;
 
if (active_cache_changed &&
write_locked_index(&the_index, &index_lock, COMMIT_LOCK))
@@ -559,6 +561,8 @@ static int do_pick_commit(struct commit *commit, struct 
replay_opts *opts)
if (!opts->strategy || !strcmp(opts->strategy, "recursive") || 
opts->action == REPLAY_REVERT) {
res = do_recursive_merge(base, next, base_label, next_label,
 head, &msgbuf, opts);
+   if (res < 0)
+   return res;
write_message(&msgbuf, git_path_merge_msg());
} else {
struct commit_list *common = NULL;
-- 
2.9.0.281.g286a8d9


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 04/16] merge-recursive: clarify code in was_tracked()

2016-07-22 Thread Johannes Schindelin

It can be puzzling to see that was_tracked() asks to get an index entry
by name, but does not take a negative return value for an answer.

The reason we have to do this is that cache_name_pos() only looks for
entries in stage 0, even if nobody asked for any stage in particular.

Let's rewrite the logic a little bit, to handle the easy case early: if
cache_name_pos() returned a non-negative position, we know it is a match,
and we do not even have to compare the name again (cache_name_pos() did
that for us already). We can say right away: yes, this file was tracked.

Only if there was no exact match do we need to look harder for any
matching entry in stage 2.

Signed-off-by: Johannes Schindelin 
---
 merge-recursive.c | 30 ++
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index e0d5a57..dc3182b 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -658,23 +658,21 @@ static int was_tracked(const char *path)
 {
int pos = cache_name_pos(path, strlen(path));
 
-   if (pos < 0)
-   pos = -1 - pos;
-   while (pos < active_nr &&
-  !strcmp(path, active_cache[pos]->name)) {
-   /*
-* If stage #0, it is definitely tracked.
-* If it has stage #2 then it was tracked
-* before this merge started.  All other
-* cases the path was not tracked.
-*/
-   switch (ce_stage(active_cache[pos])) {
-   case 0:
-   case 2:
+   if (0 <= pos)
+   /* we have been tracking this path */
+   return 1;
+
+   /*
+* Look for an unmerged entry for the path,
+* specifically stage #2, which would indicate
+* that "our" side before the merge started
+* had the path tracked (and resulted in a conflict).
+*/
+   for (pos = -1 - pos;
+pos < active_nr && !strcmp(path, active_cache[pos]->name);
+pos++)
+   if (ce_stage(active_cache[pos]) == 2)
return 1;
-   }
-   pos++;
-   }
return 0;
 }
 
-- 
2.9.0.281.g286a8d9


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 02/16] Report bugs consistently

2016-07-22 Thread Johannes Schindelin

The vast majority of error messages in Git's source code which report a
bug use the convention to prefix the message with "BUG:".

As part of cleaning up merge-recursive to stop die()ing except in case of
detected bugs, let's just make the remainder of the bug reports consistent
with the de facto rule.

Signed-off-by: Johannes Schindelin 
---
 builtin/ls-files.c |  3 ++-
 builtin/update-index.c |  2 +-
 grep.c |  8 
 imap-send.c|  4 ++--
 merge-recursive.c  | 15 +++
 sha1_file.c|  4 ++--
 trailer.c  |  2 +-
 transport.c|  2 +-
 wt-status.c|  4 ++--
 9 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index f02e3d2..00ea91a 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -118,7 +118,8 @@ static void show_killed_files(struct dir_struct *dir)
 */
pos = cache_name_pos(ent->name, ent->len);
if (0 <= pos)
-   die("bug in show-killed-files");
+   die("BUG: killed-file %.*s not found",
+   ent->len, ent->name);
pos = -pos - 1;
while (pos < active_nr &&
   ce_stage(active_cache[pos]))
diff --git a/builtin/update-index.c b/builtin/update-index.c
index 6cdfd5f..ba04b19 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1146,7 +1146,7 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
report(_("Untracked cache enabled for '%s'"), 
get_git_work_tree());
break;
default:
-   die("Bug: bad untracked_cache value: %d", untracked_cache);
+   die("BUG: bad untracked_cache value: %d", untracked_cache);
}
 
if (active_cache_changed) {
diff --git a/grep.c b/grep.c
index 394c856..22cbb73 100644
--- a/grep.c
+++ b/grep.c
@@ -693,10 +693,10 @@ static struct grep_expr *prep_header_patterns(struct 
grep_opt *opt)
 
for (p = opt->header_list; p; p = p->next) {
if (p->token != GREP_PATTERN_HEAD)
-   die("bug: a non-header pattern in grep header list.");
+   die("BUG: a non-header pattern in grep header list.");
if (p->field < GREP_HEADER_FIELD_MIN ||
GREP_HEADER_FIELD_MAX <= p->field)
-   die("bug: unknown header field %d", p->field);
+   die("BUG: unknown header field %d", p->field);
compile_regexp(p, opt);
}
 
@@ -709,7 +709,7 @@ static struct grep_expr *prep_header_patterns(struct 
grep_opt *opt)
 
h = compile_pattern_atom(&pp);
if (!h || pp != p->next)
-   die("bug: malformed header expr");
+   die("BUG: malformed header expr");
if (!header_group[p->field]) {
header_group[p->field] = h;
continue;
@@ -1514,7 +1514,7 @@ static int grep_source_1(struct grep_opt *opt, struct 
grep_source *gs, int colle
case GREP_BINARY_TEXT:
break;
default:
-   die("bug: unknown binary handling mode");
+   die("BUG: unknown binary handling mode");
}
}
 
diff --git a/imap-send.c b/imap-send.c
index db0fafe..67d67f8 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -506,12 +506,12 @@ static char *next_arg(char **s)
 
 static int nfsnprintf(char *buf, int blen, const char *fmt, ...)
 {
-   int ret;
+   int ret = -1;
va_list va;
 
va_start(va, fmt);
if (blen <= 0 || (unsigned)(ret = vsnprintf(buf, blen, fmt, va)) >= 
(unsigned)blen)
-   die("Fatal: buffer too small. Please report a bug.");
+   die("BUG: buffer too small (%d < %d)", ret, blen);
va_end(va);
return ret;
 }
diff --git a/merge-recursive.c b/merge-recursive.c
index 48fe7e7..7400034 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -259,7 +259,7 @@ struct tree *write_tree_from_memory(struct merge_options *o)
fprintf(stderr, "BUG: %d %.*s\n", ce_stage(ce),
(int)ce_namelen(ce), ce->name);
}
-   die("Bug in merge-recursive.c");
+   die("BUG: unmerged index entries in merge-recursive.c");
}
 
if (!active_cache_tree)
@@ -957,9 +957,8 @@ static struct merge_file_info merge_file_1(struct 
merge_options *o,
 
if (!oid_eq(&a->oid, &b->oid))
result.clean = 0;
-   } else {
-   die(_("unsupported object type in the tree

[PATCH v4 10/16] merge-recursive: switch to returning errors instead of dying

2016-07-22 Thread Johannes Schindelin

The recursive merge machinery is supposed to be a library function, i.e.
it should return an error when it fails. Originally the functions were
part of the builtin "merge-recursive", though, where it was simpler to
call die() and be done with error handling.

The existing callers were already prepared to detect negative return
values to indicate errors and to behave as previously: exit with code 128
(which is the same thing that die() does, after printing the message).

Signed-off-by: Johannes Schindelin 
---
 merge-recursive.c | 62 +++
 1 file changed, 35 insertions(+), 27 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 697ba03..24b42d6 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -266,8 +266,10 @@ struct tree *write_tree_from_memory(struct merge_options 
*o)
active_cache_tree = cache_tree();
 
if (!cache_tree_fully_valid(active_cache_tree) &&
-   cache_tree_update(&the_index, 0) < 0)
-   die(_("error building trees"));
+   cache_tree_update(&the_index, 0) < 0) {
+   error(_("error building trees"));
+   return NULL;
+   }
 
result = lookup_tree(active_cache_tree->sha1);
 
@@ -707,12 +709,10 @@ static int make_room_for_path(struct merge_options *o, 
const char *path)
/* Make sure leading directories are created */
status = safe_create_leading_directories_const(path);
if (status) {
-   if (status == SCLD_EXISTS) {
+   if (status == SCLD_EXISTS)
/* something else exists */
-   error(msg, path, _(": perhaps a D/F conflict?"));
-   return -1;
-   }
-   die(msg, path, "");
+   return error(msg, path, _(": perhaps a D/F conflict?"));
+   return error(msg, path, "");
}
 
/*
@@ -740,6 +740,8 @@ static int update_file_flags(struct merge_options *o,
 int update_cache,
 int update_wd)
 {
+   int ret = 0;
+
if (o->call_depth)
update_wd = 0;
 
@@ -760,9 +762,11 @@ static int update_file_flags(struct merge_options *o,
 
buf = read_sha1_file(oid->hash, &type, &size);
if (!buf)
-   die(_("cannot read object %s '%s'"), oid_to_hex(oid), 
path);
-   if (type != OBJ_BLOB)
-   die(_("blob expected for %s '%s'"), oid_to_hex(oid), 
path);
+   return error(_("cannot read object %s '%s'"), 
oid_to_hex(oid), path);
+   if (type != OBJ_BLOB) {
+   ret = error(_("blob expected for %s '%s'"), 
oid_to_hex(oid), path);
+   goto free_buf;
+   }
if (S_ISREG(mode)) {
struct strbuf strbuf = STRBUF_INIT;
if (convert_to_working_tree(path, buf, size, &strbuf)) {
@@ -783,8 +787,11 @@ static int update_file_flags(struct merge_options *o,
else
mode = 0666;
fd = open(path, O_WRONLY | O_TRUNC | O_CREAT, mode);
-   if (fd < 0)
-   die_errno(_("failed to open '%s'"), path);
+   if (fd < 0) {
+   ret = error_errno(_("failed to open '%s'"),
+ path);
+   goto free_buf;
+   }
write_in_full(fd, buf, size);
close(fd);
} else if (S_ISLNK(mode)) {
@@ -792,18 +799,18 @@ static int update_file_flags(struct merge_options *o,
safe_create_leading_directories_const(path);
unlink(path);
if (symlink(lnk, path))
-   die_errno(_("failed to symlink '%s'"), path);
+   ret = error_errno(_("failed to symlink '%s'"), 
path);
free(lnk);
} else
-   die(_("do not know what to do with %06o %s '%s'"),
-   mode, oid_to_hex(oid), path);
+   ret = error(_("do not know what to do with %06o %s 
'%s'"),
+   mode, oid_to_hex(oid), path);
  free_buf:
free(buf);
}
  update_index:
-   if (update_cache)
+   if (!ret && update_cache)
add_cacheinfo(mode, oid, path, 0, update_wd, 
ADD_CACHE_OK_TO_ADD);
-   return 0;
+   return ret;
 }
 
 static int update_file(struct merge_options *o,
@@ -929,20 +936,22 @@ static int merge_file_1(struct merge_options *o,
oidcpy(&result->oid, &a->oid);
else if (S_ISREG(a->mode)) {
mmbuffer

[PATCH v4 01/16] Verify that `git pull --rebase` shows the helpful advice when failing

2016-07-22 Thread Johannes Schindelin

It was noticed by Brendan Forster last October that the builtin `git am`
regressed on that. Our hot fix reverted to spawning the recursive merge
instead of using it as a library function.

As we are about to revert that hot fix, after making the recursive merge a
true library function (i.e. a function that does not die() in case of
"normal" errors), let's add a test that verifies that we do not regress on
the same problem which made the hot fix necessary in the first place.

Signed-off-by: Johannes Schindelin 
---
 t/t5520-pull.sh | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/t/t5520-pull.sh b/t/t5520-pull.sh
index 37ebbcf..d289056 100755
--- a/t/t5520-pull.sh
+++ b/t/t5520-pull.sh
@@ -255,6 +255,36 @@ test_expect_success '--rebase' '
test new = "$(git show HEAD:file2)"
 '
 
+test_expect_success '--rebase with conflicts shows advice' '
+   test_when_finished "git rebase --abort; git checkout -f to-rebase" &&
+   git checkout -b seq &&
+   printf "1\\n2\\n3\\n4\\n5\\n" >seq.txt &&
+   git add seq.txt &&
+   test_tick &&
+   git commit -m "Add seq.txt" &&
+   printf "6\\n" >>seq.txt &&
+   test_tick &&
+   git commit -m "Append to seq.txt" seq.txt &&
+   git checkout -b with-conflicts HEAD^ &&
+   printf "conflicting\\n" >>seq.txt &&
+   test_tick &&
+   git commit -m "Create conflict" seq.txt &&
+   test_must_fail git pull --rebase . seq 2>err >out &&
+   grep "When you have resolved this problem" out
+'
+test_expect_success 'failed --rebase shows advice' '
+   test_when_finished "git rebase --abort; git checkout -f to-rebase" &&
+   git checkout -b diverging &&
+   test_commit attributes .gitattributes "* text=auto" attrs &&
+   sha1="$(printf "1\\r\\n" | git hash-object -w --stdin)" &&
+   git update-index --cacheinfo 0644 $sha1 file &&
+   git commit -m v1-with-cr &&
+   git checkout -f -b fails-to-rebase HEAD^ &&
+   test_commit v2-without-cr file "2" file2-lf &&
+   test_must_fail git pull --rebase . diverging 2>err >out &&
+   grep "When you have resolved this problem" out
+'
+
 test_expect_success '--rebase fails with multiple branches' '
git reset --hard before-rebase &&
test_must_fail git pull --rebase . copy master 2>err &&
-- 
2.9.0.281.g286a8d9


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 09/16] merge-recursive: handle return values indicating errors

2016-07-22 Thread Johannes Schindelin

We are about to libify the recursive merge machinery, where we only
die() in case of a bug or memory contention. To that end, we must heed
negative return values as indicating errors.

This requires our functions to be careful to pass through error
conditions in call chains, and for quite a few functions this means
that they have to return values to begin with.

The next step will be to convert the places where we currently die() to
return negative values (read: -1) instead.

Note that we ignore errors reported by make_room_for_path(), consistent
with the previous behavior (update_file_flags() used the return value of
make_room_for_path() only to indicate an early return, but not a fatal
error): if the error is really a fatal error, we will notice later; If
not, it was not that serious a problem to begin with. (Witnesses in
favor of this reasoning are t4151-am-abort and t7610-mergetool, which
would start failing if we stopped on errors reported by
make_room_for_path()).

Also note: while this patch makes the code slightly less readable in
update_file_flags() (we introduce a new "goto free_buf;" instead of
an explicit "free(buf); return;"), it is a preparatory change for
the next patch where we will convert all of the die() calls in the same
function to go through the free_buf return path instead.

Signed-off-by: Johannes Schindelin 
---
 merge-recursive.c | 252 --
 1 file changed, 150 insertions(+), 102 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index d5b9b1c..697ba03 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -733,12 +733,12 @@ static int make_room_for_path(struct merge_options *o, 
const char *path)
return error(msg, path, _(": perhaps a D/F conflict?"));
 }
 
-static void update_file_flags(struct merge_options *o,
- const struct object_id *oid,
- unsigned mode,
- const char *path,
- int update_cache,
- int update_wd)
+static int update_file_flags(struct merge_options *o,
+const struct object_id *oid,
+unsigned mode,
+const char *path,
+int update_cache,
+int update_wd)
 {
if (o->call_depth)
update_wd = 0;
@@ -774,8 +774,7 @@ static void update_file_flags(struct merge_options *o,
 
if (make_room_for_path(o, path) < 0) {
update_wd = 0;
-   free(buf);
-   goto update_index;
+   goto free_buf;
}
if (S_ISREG(mode) || (!has_symlinks && S_ISLNK(mode))) {
int fd;
@@ -798,20 +797,22 @@ static void update_file_flags(struct merge_options *o,
} else
die(_("do not know what to do with %06o %s '%s'"),
mode, oid_to_hex(oid), path);
+ free_buf:
free(buf);
}
  update_index:
if (update_cache)
add_cacheinfo(mode, oid, path, 0, update_wd, 
ADD_CACHE_OK_TO_ADD);
+   return 0;
 }
 
-static void update_file(struct merge_options *o,
-   int clean,
-   const struct object_id *oid,
-   unsigned mode,
-   const char *path)
+static int update_file(struct merge_options *o,
+  int clean,
+  const struct object_id *oid,
+  unsigned mode,
+  const char *path)
 {
-   update_file_flags(o, oid, mode, path, o->call_depth || clean, 
!o->call_depth);
+   return update_file_flags(o, oid, mode, path, o->call_depth || clean, 
!o->call_depth);
 }
 
 /* Low level file merging, update and removal */
@@ -1010,7 +1011,7 @@ static int merge_file_one(struct merge_options *o,
return merge_file_1(o, &one, &a, &b, branch1, branch2, mfi);
 }
 
-static void handle_change_delete(struct merge_options *o,
+static int handle_change_delete(struct merge_options *o,
 const char *path,
 const struct object_id *o_oid, int o_mode,
 const struct object_id *a_oid, int a_mode,
@@ -1018,6 +1019,7 @@ static void handle_change_delete(struct merge_options *o,
 const char *change, const char *change_past)
 {
char *renamed = NULL;
+   int ret = 0;
if (dir_in_way(path, !o->call_depth)) {
renamed = unique_path(o, path, a_oid ? o->branch1 : o->branch2);
}
@@ -1028,21 +1030,23 @@ static void handle_change_delete(struct merge_options 
*o,
 * correct; since there is no true "middle point" between
 * them, simply reuse the base version for virt

[PATCH v4 06/16] merge_recursive: abort properly upon errors

2016-07-22 Thread Johannes Schindelin

There are a couple of places where return values indicating errors
are ignored. Let's teach them manners.

Signed-off-by: Johannes Schindelin 
---
 merge-recursive.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index dc3182b..2d4cb80 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1949,8 +1949,9 @@ int merge_recursive(struct merge_options *o,
saved_b2 = o->branch2;
o->branch1 = "Temporary merge branch 1";
o->branch2 = "Temporary merge branch 2";
-   merge_recursive(o, merged_common_ancestors, iter->item,
-   NULL, &merged_common_ancestors);
+   if (merge_recursive(o, merged_common_ancestors, iter->item,
+   NULL, &merged_common_ancestors) < 0)
+   return -1;
o->branch1 = saved_b1;
o->branch2 = saved_b2;
o->call_depth--;
@@ -1966,6 +1967,8 @@ int merge_recursive(struct merge_options *o,
o->ancestor = "merged common ancestors";
clean = merge_trees(o, h1->tree, h2->tree, 
merged_common_ancestors->tree,
&mrtree);
+   if (clean < 0)
+   return clean;
 
if (o->call_depth) {
*result = make_virtual_commit(mrtree, "merged tree");
@@ -2022,6 +2025,9 @@ int merge_recursive_generic(struct merge_options *o,
hold_locked_index(lock, 1);
clean = merge_recursive(o, head_commit, next_commit, ca,
result);
+   if (clean < 0)
+   return clean;
+
if (active_cache_changed &&
write_locked_index(&the_index, lock, COMMIT_LOCK))
return error(_("Unable to write index."));
-- 
2.9.0.281.g286a8d9


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 03/16] Avoid translating bug messages

2016-07-22 Thread Johannes Schindelin

While working on the patch series that avoids die()ing in recursive
merges, the issue came up that bug reports (i.e. die("BUG: ...")
constructs) should never be translated, as the target audience is the
Git developer community, not necessarily the current user, and hence
a translated message would make it *harder* to address the problem.

So let's stop translating the obvious ones. As it is really, really
outside the purview of this patch series to see whether there are more
die() statements that report bugs and are currently translated, that
task is left for another day and patch.

Signed-off-by: Johannes Schindelin 
---
 merge-recursive.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 7400034..e0d5a57 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -958,7 +958,7 @@ static struct merge_file_info merge_file_1(struct 
merge_options *o,
if (!oid_eq(&a->oid, &b->oid))
result.clean = 0;
} else
-   die(_("BUG: unsupported object type in the tree"));
+   die("BUG: unsupported object type in the tree");
}
 
return result;
@@ -1802,7 +1802,7 @@ static int process_entry(struct merge_options *o,
 */
remove_file(o, 1, path, !a_mode);
} else
-   die(_("BUG: fatal merge failure, shouldn't happen."));
+   die("BUG: fatal merge failure, shouldn't happen.");
 
return clean_merge;
 }
@@ -1860,7 +1860,7 @@ int merge_trees(struct merge_options *o,
for (i = 0; i < entries->nr; i++) {
struct stage_data *e = entries->items[i].util;
if (!e->processed)
-   die(_("BUG: unprocessed path??? %s"),
+   die("BUG: unprocessed path??? %s",
entries->items[i].string);
}
 
-- 
2.9.0.281.g286a8d9


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 14/16] merge-recursive: offer an option to retain the output in 'obuf'

2016-07-22 Thread Johannes Schindelin

Since 66a155b (Enable output buffering in merge-recursive., 2007-01-14),
we already accumulate the output in a buffer. The idea was to avoid
interfering with the progress output that goes to stderr, which is
unbuffered, when we write to stdout, which is buffered.

We extend that buffering to allow the caller to handle the output
(possibly suppressing it). This will help us when extending the
sequencer to do rebase -i's brunt work: it does not want the picks to
print anything by default but instead determine itself whether to print
the output or not.

Note that we also redirect the error messages into the output buffer
when the caller asked not to flush the output buffer, for two reasons:
1) to retain the correct output order, and 2) to allow the caller to
suppress *all* output.

Signed-off-by: Johannes Schindelin 
---
 merge-recursive.c | 17 +
 merge-recursive.h |  2 +-
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 2bc41fe..1746c38 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -25,7 +25,7 @@
 
 static void flush_output(struct merge_options *o)
 {
-   if (o->obuf.len) {
+   if (o->buffer_output < 2 && o->obuf.len) {
fputs(o->obuf.buf, stdout);
strbuf_reset(&o->obuf);
}
@@ -36,10 +36,19 @@ static int err(struct merge_options *o, const char *err, 
...)
va_list params;
 
va_start(params, err);
-   flush_output(o);
+   if (o->buffer_output < 2)
+   flush_output(o);
+   else {
+   strbuf_complete(&o->obuf, '\n');
+   strbuf_addstr(&o->obuf, "error: ");
+   }
strbuf_vaddf(&o->obuf, err, params);
-   error("%s", o->obuf.buf);
-   strbuf_reset(&o->obuf);
+   if (o->buffer_output > 1)
+   strbuf_addch(&o->obuf, '\n');
+   else {
+   error("%s", o->obuf.buf);
+   strbuf_reset(&o->obuf);
+   }
va_end(params);
 
return -1;
diff --git a/merge-recursive.h b/merge-recursive.h
index d415724..340704c 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -13,7 +13,7 @@ struct merge_options {
MERGE_RECURSIVE_THEIRS
} recursive_variant;
const char *subtree_shift;
-   unsigned buffer_output : 1;
+   unsigned buffer_output : 2; /* 1: output at end, 2: keep buffered */
unsigned renormalize : 1;
long xdl_opts;
int verbosity;
-- 
2.9.0.281.g286a8d9


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 11/16] am -3: use merge_recursive() directly again

2016-07-22 Thread Johannes Schindelin

Last October, we had to change this code to run `git merge-recursive`
in a child process: git-am wants to print some helpful advice when the
merge failed, but the code in question was not prepared to return, it
die()d instead.

We are finally at a point when the code *is* prepared to return errors,
and can avoid the child process again.

This reverts commit c63d4b2 (am -3: do not let failed merge from
completing the error codepath, 2015-10-09), with the necessary changes
to adjust for the fact that Git's source code changed in the meantime
(such as: using OIDs instead of hashes in the recursive merge, and a
removed gender bias).

Note: the code now calls merge_recursive_generic() again. Unlike
merge_trees() and merge_recursive(), this function returns 0 upon success,
as most of Git's functions. Therefore, the error value -1 naturally is
handled correctly, and we do not have to take care of it specifically.

Signed-off-by: Johannes Schindelin 
---
 builtin/am.c | 62 +---
 1 file changed, 22 insertions(+), 40 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index b77bf11..cfb79ea 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -1579,47 +1579,18 @@ static int build_fake_ancestor(const struct am_state 
*state, const char *index_f
 }
 
 /**
- * Do the three-way merge using fake ancestor, their tree constructed
- * from the fake ancestor and the postimage of the patch, and our
- * state.
- */
-static int run_fallback_merge_recursive(const struct am_state *state,
-   unsigned char *orig_tree,
-   unsigned char *our_tree,
-   unsigned char *their_tree)
-{
-   struct child_process cp = CHILD_PROCESS_INIT;
-   int status;
-
-   cp.git_cmd = 1;
-
-   argv_array_pushf(&cp.env_array, "GITHEAD_%s=%.*s",
-sha1_to_hex(their_tree), linelen(state->msg), 
state->msg);
-   if (state->quiet)
-   argv_array_push(&cp.env_array, "GIT_MERGE_VERBOSITY=0");
-
-   argv_array_push(&cp.args, "merge-recursive");
-   argv_array_push(&cp.args, sha1_to_hex(orig_tree));
-   argv_array_push(&cp.args, "--");
-   argv_array_push(&cp.args, sha1_to_hex(our_tree));
-   argv_array_push(&cp.args, sha1_to_hex(their_tree));
-
-   status = run_command(&cp) ? (-1) : 0;
-   discard_cache();
-   read_cache();
-   return status;
-}
-
-/**
  * Attempt a threeway merge, using index_path as the temporary index.
  */
 static int fall_back_threeway(const struct am_state *state, const char 
*index_path)
 {
-   unsigned char orig_tree[GIT_SHA1_RAWSZ], their_tree[GIT_SHA1_RAWSZ],
- our_tree[GIT_SHA1_RAWSZ];
+   struct object_id orig_tree, their_tree, our_tree;
+   const struct object_id *bases[1] = { &orig_tree };
+   struct merge_options o;
+   struct commit *result;
+   char *their_tree_name;
 
-   if (get_sha1("HEAD", our_tree) < 0)
-   hashcpy(our_tree, EMPTY_TREE_SHA1_BIN);
+   if (get_oid("HEAD", &our_tree) < 0)
+   hashcpy(our_tree.hash, EMPTY_TREE_SHA1_BIN);
 
if (build_fake_ancestor(state, index_path))
return error("could not build fake ancestor");
@@ -1627,7 +1598,7 @@ static int fall_back_threeway(const struct am_state 
*state, const char *index_pa
discard_cache();
read_cache_from(index_path);
 
-   if (write_index_as_tree(orig_tree, &the_index, index_path, 0, NULL))
+   if (write_index_as_tree(orig_tree.hash, &the_index, index_path, 0, 
NULL))
return error(_("Repository lacks necessary blobs to fall back 
on 3-way merge."));
 
say(state, stdout, _("Using index info to reconstruct a base tree..."));
@@ -1643,7 +1614,7 @@ static int fall_back_threeway(const struct am_state 
*state, const char *index_pa
init_revisions(&rev_info, NULL);
rev_info.diffopt.output_format = DIFF_FORMAT_NAME_STATUS;
diff_opt_parse(&rev_info.diffopt, &diff_filter_str, 1, 
rev_info.prefix);
-   add_pending_sha1(&rev_info, "HEAD", our_tree, 0);
+   add_pending_sha1(&rev_info, "HEAD", our_tree.hash, 0);
diff_setup_done(&rev_info.diffopt);
run_diff_index(&rev_info, 1);
}
@@ -1652,7 +1623,7 @@ static int fall_back_threeway(const struct am_state 
*state, const char *index_pa
return error(_("Did you hand edit your patch?\n"
"It does not apply to blobs recorded in its 
index."));
 
-   if (write_index_as_tree(their_tree, &the_index, index_path, 0, NULL))
+   if (write_index_as_tree(their_tree.hash, &the_index, index_path, 0, 
NULL))
return error("could not write tree");
 
say(state, stdout, _("Falling back to patching base and 3-way 
merge..."));
@@ -1668,11 +1639,22 @@ static int fall_back_three

[PATCH v4 12/16] merge-recursive: flush output buffer before printing error messages

2016-07-22 Thread Johannes Schindelin

The data structure passed to the recursive merge machinery has a feature
where the caller can ask for the output to be buffered into a strbuf, by
setting the field 'buffer_output'.

Previously, we simply swallowed the buffered output when showing error
messages. With this patch, we show the output first, and only then print
the error message.

Currently, the only user of that buffering is merge_recursive() itself,
to avoid the progress output to interfere.

In the next patches, we will introduce a new buffer_output mode that
forces merge_recursive() to retain the output buffer for further
processing by the caller. If the caller asked for that, we will then
also write the error messages into the output buffer. This is necessary
to give the caller more control not only how to react in case of errors
but also control how/if to display the error messages.

Signed-off-by: Johannes Schindelin 
---
 merge-recursive.c | 116 --
 1 file changed, 68 insertions(+), 48 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 24b42d6..3ef1e2f 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -23,6 +23,28 @@
 #include "dir.h"
 #include "submodule.h"
 
+static void flush_output(struct merge_options *o)
+{
+   if (o->obuf.len) {
+   fputs(o->obuf.buf, stdout);
+   strbuf_reset(&o->obuf);
+   }
+}
+
+static int err(struct merge_options *o, const char *err, ...)
+{
+   va_list params;
+
+   va_start(params, err);
+   flush_output(o);
+   strbuf_vaddf(&o->obuf, err, params);
+   error("%s", o->obuf.buf);
+   strbuf_reset(&o->obuf);
+   va_end(params);
+
+   return -1;
+}
+
 static struct tree *shift_tree_object(struct tree *one, struct tree *two,
  const char *subtree_shift)
 {
@@ -148,14 +170,6 @@ static int show(struct merge_options *o, int v)
return (!o->call_depth && o->verbosity >= v) || o->verbosity >= 5;
 }
 
-static void flush_output(struct merge_options *o)
-{
-   if (o->obuf.len) {
-   fputs(o->obuf.buf, stdout);
-   strbuf_reset(&o->obuf);
-   }
-}
-
 __attribute__((format (printf, 3, 4)))
 static void output(struct merge_options *o, int v, const char *fmt, ...)
 {
@@ -198,7 +212,8 @@ static void output_commit_title(struct merge_options *o, 
struct commit *commit)
}
 }
 
-static int add_cacheinfo(unsigned int mode, const struct object_id *oid,
+static int add_cacheinfo(struct merge_options *o,
+   unsigned int mode, const struct object_id *oid,
const char *path, int stage, int refresh, int options)
 {
struct cache_entry *ce;
@@ -206,7 +221,7 @@ static int add_cacheinfo(unsigned int mode, const struct 
object_id *oid,
  (refresh ? (CE_MATCH_REFRESH |
  CE_MATCH_IGNORE_MISSING) : 0 ));
if (!ce)
-   return error(_("addinfo_cache failed for path '%s'"), path);
+   return err(o, _("addinfo_cache failed for path '%s'"), path);
return add_cache_entry(ce, options);
 }
 
@@ -267,7 +282,7 @@ struct tree *write_tree_from_memory(struct merge_options *o)
 
if (!cache_tree_fully_valid(active_cache_tree) &&
cache_tree_update(&the_index, 0) < 0) {
-   error(_("error building trees"));
+   err(o, _("error building trees"));
return NULL;
}
 
@@ -535,7 +550,8 @@ static struct string_list *get_renames(struct merge_options 
*o,
return renames;
 }
 
-static int update_stages(const char *path, const struct diff_filespec *o,
+static int update_stages(struct merge_options *opt, const char *path,
+const struct diff_filespec *o,
 const struct diff_filespec *a,
 const struct diff_filespec *b)
 {
@@ -554,13 +570,13 @@ static int update_stages(const char *path, const struct 
diff_filespec *o,
if (remove_file_from_cache(path))
return -1;
if (o)
-   if (add_cacheinfo(o->mode, &o->oid, path, 1, 0, options))
+   if (add_cacheinfo(opt, o->mode, &o->oid, path, 1, 0, options))
return -1;
if (a)
-   if (add_cacheinfo(a->mode, &a->oid, path, 2, 0, options))
+   if (add_cacheinfo(opt, a->mode, &a->oid, path, 2, 0, options))
return -1;
if (b)
-   if (add_cacheinfo(b->mode, &b->oid, path, 3, 0, options))
+   if (add_cacheinfo(opt, b->mode, &b->oid, path, 3, 0, options))
return -1;
return 0;
 }
@@ -711,8 +727,8 @@ static int make_room_for_path(struct merge_options *o, 
const char *path)
if (status) {
if (status == SCLD_EXISTS)
/* something else exists */
-   return error(msg

[PATCH v4 15/16] Ensure that the output buffer is released after calling merge_trees()

2016-07-22 Thread Johannes Schindelin

The recursive merge machinery accumulates its output in an output
buffer, to be flushed at the end of merge_recursive(). At this point,
we forgot to release the output buffer.

When calling merge_trees() (i.e. the non-recursive part of the recursive
merge) directly, the output buffer is never flushed because the caller
may be merge_recursive() which wants to flush the output itself.

For the same reason, merge_trees() cannot release the output buffer: it
may still be needed.

Forgetting to release the output buffer did not matter much when running
git-checkout, or git-merge-recursive, because we exited after the
operation anyway. Ever since cherry-pick learned to pick a commit range,
however, this memory leak had the potential of becoming a problem.

Signed-off-by: Johannes Schindelin 
---
 builtin/checkout.c | 1 +
 merge-recursive.c  | 2 ++
 sequencer.c| 1 +
 3 files changed, 4 insertions(+)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 07dea3b..8d852d4 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -573,6 +573,7 @@ static int merge_working_tree(const struct checkout_opts 
*opts,
exit(128);
ret = reset_tree(new->commit->tree, opts, 0,
 writeout_error);
+   strbuf_release(&o.obuf);
if (ret)
return ret;
}
diff --git a/merge-recursive.c b/merge-recursive.c
index 1746c38..dcf1535 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2068,6 +2068,8 @@ int merge_recursive(struct merge_options *o,
commit_list_insert(h2, &(*result)->parents->next);
}
flush_output(o);
+   if (o->buffer_output < 2)
+   strbuf_release(&o->obuf);
if (show(o, 2))
diff_warn_rename_limit("merge.renamelimit",
   o->needed_rename_limit, 0);
diff --git a/sequencer.c b/sequencer.c
index 286a435..ec50519 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -293,6 +293,7 @@ static int do_recursive_merge(struct commit *base, struct 
commit *next,
clean = merge_trees(&o,
head_tree,
next_tree, base_tree, &result);
+   strbuf_release(&o.obuf);
if (clean < 0)
return clean;
 
-- 
2.9.0.281.g286a8d9


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 13/16] merge-recursive: write the commit title in one go

2016-07-22 Thread Johannes Schindelin

In 66a155b (Enable output buffering in merge-recursive., 2007-01-14), we
changed the code such that it prints the output in one go, to avoid
interfering with the progress output.

Let's make sure that the same holds true when outputting the commit
title: previously, we used several printf() statements to stdout and
speculated that stdout's buffer is large enough to hold the entire
commit title.

Apart from making that speculation unnecessary, we change the code to
add the message to the output buffer before flushing for another reason:
the next commit will introduce a new level of output buffering, where
the caller can request the output not to be flushed, but to be retained
for further processing.

This latter feature will be needed when teaching the sequencer to do
rebase -i's brunt work: it wants to control the output of the
cherry-picks (i.e. recursive merges).

Signed-off-by: Johannes Schindelin 
---
 merge-recursive.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 3ef1e2f..2bc41fe 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -191,25 +191,26 @@ static void output(struct merge_options *o, int v, const 
char *fmt, ...)
 
 static void output_commit_title(struct merge_options *o, struct commit *commit)
 {
-   int i;
-   flush_output(o);
-   for (i = o->call_depth; i--;)
-   fputs("  ", stdout);
+   strbuf_addchars(&o->obuf, ' ', o->call_depth * 2);
if (commit->util)
-   printf("virtual %s\n", merge_remote_util(commit)->name);
+   strbuf_addf(&o->obuf, "virtual %s\n",
+   merge_remote_util(commit)->name);
else {
-   printf("%s ", find_unique_abbrev(commit->object.oid.hash, 
DEFAULT_ABBREV));
+   strbuf_addf(&o->obuf, "%s ",
+   find_unique_abbrev(commit->object.oid.hash,
+   DEFAULT_ABBREV));
if (parse_commit(commit) != 0)
-   printf(_("(bad commit)\n"));
+   strbuf_addf(&o->obuf, _("(bad commit)\n"));
else {
const char *title;
const char *msg = get_commit_buffer(commit, NULL);
int len = find_commit_subject(msg, &title);
if (len)
-   printf("%.*s\n", len, title);
+   strbuf_addf(&o->obuf, "%.*s\n", len, title);
unuse_commit_buffer(commit, msg);
}
}
+   flush_output(o);
 }
 
 static int add_cacheinfo(struct merge_options *o,
-- 
2.9.0.281.g286a8d9


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v4 16/16] merge-recursive: flush output buffer even when erroring out

2016-07-22 Thread Johannes Schindelin

Ever since 66a155b (Enable output buffering in merge-recursive.,
2007-01-14), we had a problem: When the merge failed in a fatal way, all
regular output was swallowed because we called die() and did not get a
chance to drain the output buffers.

To fix this, several modifications were necessary:

- we needed to stop die()ing, to give callers a chance to do something
  when an error occurred (in this case, flush the output buffers),

- we needed to delay printing the error message so that the caller can
  print the buffered output before that, and

- we needed to make sure that the output buffers are flushed even when
  the return value indicates an error.

The first two changes were introduced through earlier commits in this
patch series, and this commit addresses the third one.

Signed-off-by: Johannes Schindelin 
---
 merge-recursive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index dcf1535..501cfb5 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2059,6 +2059,7 @@ int merge_recursive(struct merge_options *o,
o->ancestor = "merged common ancestors";
clean = merge_trees(o, h1->tree, h2->tree, 
merged_common_ancestors->tree,
&mrtree);
+   flush_output(o);
if (clean < 0)
return clean;
 
@@ -2067,7 +2068,6 @@ int merge_recursive(struct merge_options *o,
commit_list_insert(h1, &(*result)->parents);
commit_list_insert(h2, &(*result)->parents->next);
}
-   flush_output(o);
if (o->buffer_output < 2)
strbuf_release(&o->obuf);
if (show(o, 2))
-- 
2.9.0.281.g286a8d9
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v10 12/12] bisect--helper: `get_terms` & `bisect_terms` shell function in C

2016-07-22 Thread Pranit Bauva

Hey Torsten,

On Fri, Jul 22, 2016 at 7:59 AM, Torsten Bögershausen  wrote:
>
>
> On 07/20/2016 11:47 PM, Pranit Bauva wrote:
>>
>> Reimplement the `get_terms` and `bisect_terms` shell function in C and
>> add `bisect-terms` subcommand to `git bisect--helper` to call it from
>> git-bisect.sh .
>>
>> In the shell version, the terms were identified by strings but in C
>> version its done by bit manipulation and passing the integer value to
>> the function.
>>
>> Using `--bisect-terms` subcommand is a temporary measure to port shell
>> function in C so as to use the existing test suite. As more functions
>> are ported, this subcommand will be retired and will be called by some
>> other methods.
>>
>> Mentored-by: Lars Schneider 
>> Mentored-by: Christian Couder 
>> Signed-off-by: Pranit Bauva 
>> ---
>>  builtin/bisect--helper.c | 74
>> +++-
>>  git-bisect.sh| 35 ++-
>>  2 files changed, 75 insertions(+), 34 deletions(-)
>>
>> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
>> index 001096a..185a8ad 100644
>> --- a/builtin/bisect--helper.c
>> +++ b/builtin/bisect--helper.c
>> @@ -8,6 +8,13 @@
>>  #include "run-command.h"
>>  #include "prompt.h"
>>
>> +enum terms_defined {
>> +   TERM_BAD = 1,
>> +   TERM_GOOD = 2,
>> +   TERM_NEW = 4,
>> +   TERM_OLD = 8
>> +};
>> +
>
> What does TERM stand for  ?
> It could be TERMinal, TERMinator or just TERM.
> Something like BIS_TERM_DEF_BAD .. may be more intuitive,
> and may avoid name clashes in the long run.
>
> And why are the defines 1,2,4,8 ?
> It looks as if a #define bitmap may be a better choice here ?
> How do we do these kind of bit-wise opions otherwise ?

I am not sure as why bitmaps would be a better choice except for git's
source code. I saw the source code (especially config.c) and it uses
"#defines" bitmap style. I haven't been able to find this method
before. Also it uses "(1<<2)" instead of "4".

>>  static GIT_PATH_FUNC(git_path_bisect_terms, "BISECT_TERMS")
>>  static GIT_PATH_FUNC(git_path_bisect_expected_rev, "BISECT_EXPECTED_REV")
>>  static GIT_PATH_FUNC(git_path_bisect_ancestors_ok, "BISECT_ANCESTORS_OK")
>> @@ -26,6 +33,7 @@ static const char * const git_bisect_helper_usage[] = {
>> N_("git bisect--helper --bisect-write  
>>   []"),
>> N_("git bisect--helper --bisect-check-and-set-terms 
>>  "),
>> N_("git bisect--helper --bisect-next-check [] 
>> > +   N_("git bisect--helper --bisect-terms [--term-good | --term-old |
>> --term-bad | --term-new]"),
>> NULL
>>  };
>>
>> @@ -359,6 +367,43 @@ static int bisect_next_check(const struct
>> bisect_terms *terms,
>> return 0;
>>  }
>>
>> +static int get_terms(struct bisect_terms *terms)
>> +{
>> +   FILE *fp;
>> +   int res;
>> +   fp = fopen(git_path_bisect_terms(), "r");
>> +   if (!fp)
>> +   return -1;
>> +
>> +   bisect_terms_reset(terms);
>> +   res = strbuf_getline(&terms->term_bad, fp) == EOF ||
>> + strbuf_getline(&terms->term_good, fp) == EOF;
>> +
>> +   fclose(fp);
>> +   return res ? -1 : 0;
>> +}
>> +
>> +static int bisect_terms(struct bisect_terms *terms, int term_defined)
>> +{
>> +   if (get_terms(terms)) {
>> +   fprintf(stderr, "no terms defined\n");
>> +   return -1;
>> +   }
>> +   if (!term_defined) {
>> +   printf("Your current terms are %s for the old state\nand "
>> +  "%s for the new state.\n", terms->term_good.buf,
>> +  terms->term_bad.buf);
>> +   return 0;
>> +   }
>> +
>> +   if (term_defined == TERM_GOOD || term_defined == TERM_OLD)
>> +   printf("%s\n", terms->term_good.buf);
>> +   if (term_defined == TERM_BAD || term_defined == TERM_NEW)
>> +   printf("%s\n", terms->term_bad.buf);
>
> May be a switch-case ?
> Or at least "else if" ?

Yes. I will use a "else if". Thanks!

>> +
>> +   return 0;
>> +}
>> +
>>  int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
>>  {
>> enum {
>> @@ -369,9 +414,11 @@ int cmd_bisect__helper(int argc, const char **argv,
>> const char *prefix)
>> CHECK_EXPECTED_REVS,
>> BISECT_WRITE,
>> CHECK_AND_SET_TERMS,
>> -   BISECT_NEXT_CHECK
>> +   BISECT_NEXT_CHECK,
>> +   BISECT_TERMS
>> } cmdmode = 0;
>> int no_checkout = 0, res = 0;
>> +   enum terms_defined term_defined = 0;
>> struct option options[] = {
>> OPT_CMDMODE(0, "next-all", &cmdmode,
>>  N_("perform 'git bisect next'"), NEXT_ALL),
>> @@ -389,6 +436,16 @@ int cmd_bisect__helper(int argc, const char **argv,
>> const char *prefix)
>>  N_("check and set terms in a bisection state"),
>> CHECK_AND_SET_TERMS),
>> OPT_CMDMODE(0, "bisec

[PATCH] diff: do not reuse worktree files that need "clean" conversion

2016-07-22 Thread Jeff King

On Thu, Jul 21, 2016 at 05:37:40PM -0400, Jeff King wrote:

> The next three are to show the final diffstat after the commit
> completes. It looks like the "should we reuse the worktree file?"
> optimization in diff_populate_filespec() does not take into account
> whether we will have to convert the contents, and it probably should.
> The point is that sometimes mmap-ing the file contents is more efficient
> than zlib inflating the object from disk. But if we have to exec an
> extra process and read the whole object contents into a strbuf, that is
> probably not a win.
> 
> Adding a "return 0" at the top of reuse_worktree_file() drops our 6 to
> 3 (but obviously it should actually check the attributes).

Here's a patch for that. I put it in t0021, as well, since I couldn't
find a more appropriate place. This means it conflicts with your earlier
patch, but I think it does a better job of addressing the specific area
it fixes.

We may want further tests on top as we fix more areas (though as I said
earlier, I think by avoiding the racy timestamps, your "commit" call
drops to just a single invocation per file. That still seems like one
more than we should need, but it at least matches your original
expectation :) ).

-- >8 --
Subject: diff: do not reuse worktree files that need "clean" conversion

When accessing a blob for a diff, we may try to reuse file
contents in the working tree, under the theory that it is
faster to mmap those file contents than it would be to
extract the content from the object database.

When we have to filter those contents, though, that
assumption does not hold. Even for our internal conversions
like CRLF, we have to allocate and fill a new buffer anyway.
But much worse, for external clean filters we have to exec
an arbitrary script, and we have no idea how expensive it
may be to run.

So let's skip this optimization when conversion into git's
"clean" form is required. This applies whenever the
"want_file" flag is false. When it's true, the caller
actually wants the smudged worktree contents, which the
reused file by definition already has (in fact, this is a
key optimization going the other direction, since reusing
the worktree file there lets us skip smudge filters).

Signed-off-by: Jeff King 
---
 diff.c|  7 +++
 t/t0021-conversion.sh | 11 +++
 2 files changed, 18 insertions(+)

diff --git a/diff.c b/diff.c
index 7d03419..b43d3dd 100644
--- a/diff.c
+++ b/diff.c
@@ -2683,6 +2683,13 @@ static int reuse_worktree_file(const char *name, const 
unsigned char *sha1, int
if (!FAST_WORKING_DIRECTORY && !want_file && has_sha1_pack(sha1))
return 0;

+   /*
+* Similarly, if we'd have to convert the file contents anyway, that
+* makes the optimization not worthwhile.
+*/
+   if (!want_file && would_convert_to_git(name))
+   return 0;
+
len = strlen(name);
pos = cache_name_pos(name, len);
if (pos < 0)
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 7bac2bc..e799e59 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -268,4 +268,15 @@ test_expect_success 'disable filter with empty override' '
test_must_be_empty err
 '

+test_expect_success 'diff does not reuse worktree files that need cleaning' '
+   test_config filter.counter.clean "echo . >>count; sed s/^/clean:/" &&
+   echo "file filter=counter" >.gitattributes &&
+   test_commit one file &&
+   test_commit two file &&
+
+   >count &&
+   git diff-tree -p HEAD &&
+   test_line_count = 0 count
+'
+
 test_done
-- 
2.9.2.506.g8452fe7

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v1 1/3] convert: quote filter names in error messages

2016-07-22 Thread larsxschneider

From: Lars Schneider 

Git filter with spaces (e.g. `filter.sh foo`) are hard to read in
error messages. Quote them to improve the readability.

Signed-off-by: Lars Schneider 
---
 convert.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/convert.c b/convert.c
index b1614bf..522e2c5 100644
--- a/convert.c
+++ b/convert.c
@@ -397,7 +397,7 @@ static int filter_buffer_or_fd(int in, int out, void *data)
child_process.out = out;
 
if (start_command(&child_process))
-   return error("cannot fork to run external filter %s", 
params->cmd);
+   return error("cannot fork to run external filter '%s'", 
params->cmd);
 
sigchain_push(SIGPIPE, SIG_IGN);
 
@@ -415,13 +415,13 @@ static int filter_buffer_or_fd(int in, int out, void 
*data)
if (close(child_process.in))
write_err = 1;
if (write_err)
-   error("cannot feed the input to external filter %s", 
params->cmd);
+   error("cannot feed the input to external filter '%s'", 
params->cmd);
 
sigchain_pop(SIGPIPE);
 
status = finish_command(&child_process);
if (status)
-   error("external filter %s failed %d", params->cmd, status);
+   error("external filter '%s' failed %d", params->cmd, status);
 
strbuf_release(&cmd);
return (write_err || status);
@@ -462,15 +462,15 @@ static int apply_filter(const char *path, const char 
*src, size_t len, int fd,
return 0;   /* error was already reported */
 
if (strbuf_read(&nbuf, async.out, len) < 0) {
-   error("read from external filter %s failed", cmd);
+   error("read from external filter '%s' failed", cmd);
ret = 0;
}
if (close(async.out)) {
-   error("read from external filter %s failed", cmd);
+   error("read from external filter '%s' failed", cmd);
ret = 0;
}
if (finish_async(&async)) {
-   error("external filter %s failed", cmd);
+   error("external filter '%s' failed", cmd);
ret = 0;
}
 
-- 
2.9.0

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v1 2/3] convert: modernize tests

2016-07-22 Thread larsxschneider

From: Lars Schneider 

Use `test_config` to set the config, check that files are empty with
`test_must_be_empty`, compare files with `test_cmp`, and remove spaces
after ">".

Signed-off-by: Lars Schneider 
---
 t/t0021-conversion.sh | 62 +--
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 7bac2bc..a05a8d2 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -13,8 +13,8 @@ EOF
 chmod +x rot13.sh
 
 test_expect_success setup '
-   git config filter.rot13.smudge ./rot13.sh &&
-   git config filter.rot13.clean ./rot13.sh &&
+   test_config filter.rot13.smudge ./rot13.sh &&
+   test_config filter.rot13.clean ./rot13.sh &&
 
{
echo "*.t filter=rot13"
@@ -38,8 +38,8 @@ script='s/^\$Id: \([0-9a-f]*\) \$/\1/p'
 
 test_expect_success check '
 
-   cmp test.o test &&
-   cmp test.o test.t &&
+   test_cmp test.o test &&
+   test_cmp test.o test.t &&
 
# ident should be stripped in the repository
git diff --raw --exit-code :test :test.i &&
@@ -47,10 +47,10 @@ test_expect_success check '
embedded=$(sed -ne "$script" test.i) &&
test "z$id" = "z$embedded" &&
 
-   git cat-file blob :test.t > test.r &&
+   git cat-file blob :test.t >test.r &&
 
-   ./rot13.sh < test.o > test.t &&
-   cmp test.r test.t
+   ./rot13.sh < test.o >test.t &&
+   test_cmp test.r test.t
 '
 
 # If an expanded ident ever gets into the repository, we want to make sure that
@@ -130,7 +130,7 @@ test_expect_success 'filter shell-escaped filenames' '
 
# delete the files and check them out again, using a smudge filter
# that will count the args and echo the command-line back to us
-   git config filter.argc.smudge "sh ./argc.sh %f" &&
+   test_config filter.argc.smudge "sh ./argc.sh %f" &&
rm "$normal" "$special" &&
git checkout -- "$normal" "$special" &&
 
@@ -141,7 +141,7 @@ test_expect_success 'filter shell-escaped filenames' '
test_cmp expect "$special" &&
 
# do the same thing, but with more args in the filter expression
-   git config filter.argc.smudge "sh ./argc.sh %f --my-extra-arg" &&
+   test_config filter.argc.smudge "sh ./argc.sh %f --my-extra-arg" &&
rm "$normal" "$special" &&
git checkout -- "$normal" "$special" &&
 
@@ -154,9 +154,9 @@ test_expect_success 'filter shell-escaped filenames' '
 '
 
 test_expect_success 'required filter should filter data' '
-   git config filter.required.smudge ./rot13.sh &&
-   git config filter.required.clean ./rot13.sh &&
-   git config filter.required.required true &&
+   test_config filter.required.smudge ./rot13.sh &&
+   test_config filter.required.clean ./rot13.sh &&
+   test_config filter.required.required true &&
 
echo "*.r filter=required" >.gitattributes &&
 
@@ -165,17 +165,17 @@ test_expect_success 'required filter should filter data' '
 
rm -f test.r &&
git checkout -- test.r &&
-   cmp test.o test.r &&
+   test_cmp test.o test.r &&
 
./rot13.sh expected &&
git cat-file blob :test.r >actual &&
-   cmp expected actual
+   test_cmp expected actual
 '
 
 test_expect_success 'required filter smudge failure' '
-   git config filter.failsmudge.smudge false &&
-   git config filter.failsmudge.clean cat &&
-   git config filter.failsmudge.required true &&
+   test_config filter.failsmudge.smudge false &&
+   test_config filter.failsmudge.clean cat &&
+   test_config filter.failsmudge.required true &&
 
echo "*.fs filter=failsmudge" >.gitattributes &&
 
@@ -186,9 +186,9 @@ test_expect_success 'required filter smudge failure' '
 '
 
 test_expect_success 'required filter clean failure' '
-   git config filter.failclean.smudge cat &&
-   git config filter.failclean.clean false &&
-   git config filter.failclean.required true &&
+   test_config filter.failclean.smudge cat &&
+   test_config filter.failclean.clean false &&
+   test_config filter.failclean.required true &&
 
echo "*.fc filter=failclean" >.gitattributes &&
 
@@ -197,8 +197,8 @@ test_expect_success 'required filter clean failure' '
 '
 
 test_expect_success 'filtering large input to small output should use little 
memory' '
-   git config filter.devnull.clean "cat >/dev/null" &&
-   git config filter.devnull.required true &&
+   test_config filter.devnull.clean "cat >/dev/null" &&
+   test_config filter.devnull.required true &&
for i in $(test_seq 1 30); do printf "%1048576d" 1; done >30MB &&
echo "30MB filter=devnull" >.gitattributes &&
GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB
@@ -207,7 +207,7 @@ test_expect_success 'filtering large input to small output 
should use little mem
 test_expect_success 'filter that does not read is fine' '

[PATCH v1 0/3] Git filter protocol

2016-07-22 Thread larsxschneider

From: Lars Schneider 

Hi,

Git's clean/smudge mechanism invokes an external filter process for every
single blob that is affected by a filter. If Git filters a lot of blobs
then the startup time of the external filter processes can become a
significant part of the overall Git execution time. This patch series
addresses this problem.

The first two patches are cleanup patches which are not really necessary
for the feature. The third patch is the relevant one :-)

You will notice that I do not check the exact number of "clean" filter
invocations in the tests. The reason is that Git calls "clean" multiple
times (up to 4 times for the same blob!). I posted a patch to demonstrate
the problem and I would prefer to tackle the problem in a seperate patch
series: http://thread.gmane.org/gmane.comp.version-control.git/300028/

The main reason for this Git core patch is to speed up Git large file
extensions such as git-annex or Git LFS. I proposed an according Git LFS
patch here: https://github.com/github/git-lfs/pull/1382

In addition to the Git core tests, all Git LFS integration tests run clean
on my machine.

Thanks,
Lars


Lars Schneider (3):
  convert: quote filter names in error messages
  convert: modernize tests
  convert: add filter..useProtocol option

 Documentation/gitattributes.txt |  41 ++-
 convert.c   | 222 +++---
 t/t0021-conversion.sh   | 232 ++--
 t/t0021/rot13.pl|  80 ++
 4 files changed, 531 insertions(+), 44 deletions(-)
 create mode 100755 t/t0021/rot13.pl

--
2.9.0

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v1 3/3] convert: add filter..useProtocol option

2016-07-22 Thread larsxschneider

From: Lars Schneider 

Git's clean/smudge mechanism invokes an external filter process for every
single blob that is affected by a filter. If Git filters a lot of blobs
then the startup time of the external filter processes can become a
significant part of the overall Git execution time.

This patch adds the filter..useProtocol option which, if enabled,
keeps the external filter process running and processes all blobs with
the following protocol over stdin/stdout.

1. Git starts the filter on first usage and expects a welcome message
with protocol version number:
Git <-- Filter: "git-filter-protocol\n"
Git <-- Filter: "version 1"

2. Git sends the command (either "smudge" or "clean"), the filename, the
content size in bytes, and the content separated by a newline character:
Git --> Filter: "smudge\n"
Git --> Filter: "testfile.dat\n"
Git --> Filter: "7\n"
Git --> Filter: "CONTENT"

3. The filter is expected to answer with the result content size in
bytes and the result content separated by a newline character:
Git <-- Filter: "15\n"
Git <-- Filter: "SMUDGED_CONTENT"

4. The filter is expected to wait for the next file in step 2, again.

Please note that the protocol filters do not support stream processing
with this implemenatation because the filter needs to know the length of
the result in advance. A protocol version 2 could address this in a
future patch.

Signed-off-by: Lars Schneider 
Helped-by: Martin-Louis Bright 
---
 Documentation/gitattributes.txt |  41 +++-
 convert.c   | 210 ++--
 t/t0021-conversion.sh   | 170 
 t/t0021/rot13.pl|  80 +++
 4 files changed, 494 insertions(+), 7 deletions(-)
 create mode 100755 t/t0021/rot13.pl

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 8882a3e..7026d62 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -300,7 +300,10 @@ checkout, when the `smudge` command is specified, the 
command is
 fed the blob object from its standard input, and its standard
 output is used to update the worktree file.  Similarly, the
 `clean` command is used to convert the contents of worktree file
-upon checkin.
+upon checkin. By default these commands process only a single
+blob and terminate. If the setting filter..useProtocol is
+enabled then Git can process all blobs with a single filter command
+invocation (see filter protocol below).
 
 One use of the content filtering is to massage the content into a shape
 that is more convenient for the platform, filesystem, and the user to use.
@@ -375,6 +378,42 @@ substitution.  For example:
 
 
 
+Filter Protocol
+^^^
+
+If the setting filter..useProtocol is enabled then Git
+can process all blobs with a single filter command invocation
+by talking with the following protocol over stdin/stdout.
+
+Git starts the filter on first usage and expects a welcome
+message with protocol version number:
+
+Git < Filter: "git-filter-protocol\n"
+Git < Filter: "version 1"
+
+
+Afterwards Git sends a blob command (either "smudge" or "clean"),
+the filename, the content size in bytes, and the content separated
+by a newline character:
+
+Git > Filter: "smudge\n"
+Git > Filter: "testfile.dat\n"
+Git > Filter: "7\n"
+Git > Filter: "CONTENT"
+
+
+The filter is expected to answer with the result content size in
+bytes and the result content separated by a newline character:
+
+Git < Filter: "15\n"
+Git < Filter: "SMUDGED_CONTENT"
+
+
+Afterwards the filter is expected to wait for the next blob process
+command. A demo implementation can be found in `t/t0021/rot13.pl`
+located in the Git core repository.
+
+
 Interaction between checkin/checkout attributes
 ^^^
 
diff --git a/convert.c b/convert.c
index 522e2c5..91ce86f 100644
--- a/convert.c
+++ b/convert.c
@@ -481,12 +481,188 @@ static int apply_filter(const char *path, const char 
*src, size_t len, int fd,
return ret;
 }
 
+static int cmd_process_map_init = 0;
+static struct hashmap cmd_process_map;
+
+struct cmd2process {
+   struct hashmap_entry ent; /* must be the first member! */
+   const char *cmd;
+   long protocol;
+   struct child_process process;
+};
+
+static int cmd2process_cmp(const struct cmd2process *e1,
+   const struct 
cmd2process *e2,
+   const void *unused)
+{
+   return strcmp(e1->cmd, e2->cmd);
+}
+
+static struct cmd2process *find_protocol_filter_entry(const char *cmd)
+{
+   struct cmd2process k;
+   hashmap_entry_init(&k, strhash(cmd));
+   k.cmd = cmd;
+

Re: What's cooking in git.git (Jul 2016, #06; Tue, 19)

2016-07-22 Thread Junio C Hamano

Pranit Bauva  writes:

> The current branch contains 30-40 % of the work. I am going to send 3
> more functions check_and_set_terms(), bisect_next_check() and
> bisect_terms() positively by today. After that the work will be around
> 50-60% complete. After that bisect_start() (I have ported this but it
> has a bug right now) then bisect_next(), bisect_replay(),
> bisect_state(), and a few small more.

Thanks for an interim report of the current state within the overall
plan.  IIRC, bisect_next (which gets called from bisect_auto_next)
and the functions it calls comprise the bulk of the logic of the
program, so when you are ready to tackle bisect_next, which would
mean that you have done with all the functions it calls, you would
roughly be halfway there.  Nice.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Git and SHA-1 security (again)

2016-07-22 Thread Junio C Hamano

Johannes Schindelin  writes:

> Hi Junio,
>
> On Mon, 18 Jul 2016, Junio C Hamano wrote:
>
>> "brian m. carlson"  writes:
>> 
>> > I will say that the pack format will likely require some changes,
>> > because it assumes ...  The reason is that we can't have an
>> > unambiguous parse of the current objects if two hash algorithms are in
>> > use  So when we look at a new hash, we need to provide an
>> > unambiguous way to know what hash is in use.  The two choices are to
>> > either require all object use the new hash, or to extend the objects
>> > to include the hash.  Until a couple days ago, I had planned to do the
>> > former.  I had not even considered using a multihash approach due to
>> > the complexity.
>> 
>> Objects in Git identify themselves, but once you introduce the second
>> hash function (as opposed to replacing the hash function to a new one),
>> you would allow people to call the same object by two names.  That has
>> interesting implications.
>> 
>> [...]
>
> So essentially you are saying that the multi-hash approach has too many
> negative implications, right? At least that is what I understand.
>
> Looks more and more like we do need to convert repositories wholesale, and
> keep a two-way mapping for talking to remote repositories.
>
> Would you concur?

Not necessarily.

That was me thinking aloud, listing some issues that I would imagine
to be tricky to solve, without even attempting to be exhaustive,
that I expect to see solved in a good end-result implementation.
For example, "I do not see a nice way to solve X myself without
doing Y" in the message you are responding to does not necessarily
mean there is no good solution to X (just "I do not think of any
offhand"), and it does not mean I think it is terrible that we have
to do Y to solve X.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Stefan Beller

On Fri, Jul 22, 2016 at 12:32 AM, Jens Lehmann  wrote:
> Am 21.07.2016 um 01:22 schrieb Stefan Beller:
>>
>> So maybe we want to drop that series and first talk about a migration plan
>> from
>> the current state to a world where we have the existence depending not
>> on the url
>> parameter, but a boolean variable submodule...
>> Depending on  a submodule would be ignored or tried to checkout
>> in e.g. `submodule update`
>
>
> Whoa, that's a very intrusive change with a ton of compatibility
> problems waiting to happen. Maybe its simpler to make "git submodule
> sync" aware of worktrees and error out with an "you cannot use
> submodules with different URLs in a worktree scenario" error when
> the URL is going to change? That should make most use cases work
> while avoiding the problematic ones.

I think fixing sync alone is just a drop of water on the oven.
Actually I can think of scenarios that have different URLs for different
worktrees (think of the automatic CI thing that should only fetch from
an internal server, whereas the dev-checkout fetches from upstream)
Actually each config variable (including the update strategy as you
mention below, but also the depth, branch, path) may be different in
one work tree.

I do not want to forbid the existence of different settings (URLs)
per worktree. Rather I think a different setting is a user decision,
hence they will want to run "git config --worktree ..."

And one of the unfortunate things is the coupling of existence of a
submodule and the URL. If that were to be decoupled, you could do
a "git config --worktree submodule..exists true" (or it is wrapped
fancily in "git submodule init") and the URL would not have to be copied
from the .gitmodules file.

I agree that this is a breaking change, which is why I'd guard it with
a config option such that the user can make the choice if they want
to go with the old behavior or the new behavior.

>
>> If we want to move the current behavior of submodules forward, we
>> would want to have
>> anything but the url as shared variables and then use the url variable
>> as a per-worktree
>> existence flag.
>
>
> Without having though deeply about all submodule variables, I see
> them as worktree specific. E.g. "update=none" is used on our CI-
> Server to avoid the disk space cost on some checkouts of a certain
> superproject while using "update=checkout" on others where their
> content is needed.

But this is a conscious user choice, so you would have configured
that on a per-worktree basis anyway?
i.e. it seems to me as if "update=checkout" is a default that is good
for all but one worktree, so why would you want to configure that n times
instead of just once as default?
The non default behavior is then overwritten in the specific worktree.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] use strbuf_addbuf() for appending a strbuf to another

2016-07-22 Thread Junio C Hamano

Jeff King  writes:

> I don't think we do. Going back to the original discussion:
>
>   http://thread.gmane.org/gmane.comp.version-control.git/136141/focus=136774
>
> it was mostly just "hey, this would fail really confusingly if we ever
> did, so let's make it safe".
>
> The second strbuf_grow() is by definition a noop (which is why 81d2cae
> works at all), but we do pay the size-computation cost.

Both true.  If the second one is not a noop, nothing is fixed by
81d2cae at all, but it is subtle.

René's update makes it far easier to understand what is going on.

>> -- >8 --
>> Subject: strbuf: avoid calling strbuf_grow() twice in strbuf_addbuf()
>> 
>> Implement strbuf_addbuf() as a normal function in order to avoid calling
>> strbuf_grow() twice, with the second callinside strbud_add() being a
>> no-op.  This is slightly faster and also reduces the text size a bit.
>
> Seems reasonable.  IMHO the main advantage is that one does not have to
> reason about the double strbuf_grow() (i.e., that the strbuf_add() is
> safe because we know its grow is a noop).
>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 0/3] minor git-jump improvements

2016-07-22 Thread Jeff King

Here are a few quick fixes and features for git-jump. The first is a bug
I noticed and fixed recently. And that reminded me of the second one,
which I'd been carrying in my local copy for a long time.

  [1/3]: contrib/git-jump: fix greedy regex when matching hunks
  [2/3]: contrib/git-jump: add whitespace-checking mode
  [3/3]: contrib/git-jump: fix typo in README

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/3] contrib/git-jump: fix greedy regex when matching hunks

2016-07-22 Thread Jeff King

The hunk-header regex looks for "\+\d+" to find the
post-image line numbers, but it skips the pre-image line
numbers with a simple ".*". That means we may greedily eat
the post-image numbers and match a "\+\d" further on, in the
funcname text.

For example, commit 6b9c38e has this hunk header:

  diff --git a/t/t0006-date.sh b/t/t0006-date.sh
  [...]
  @@ -50,8 +50,8 @@ check_show iso-local "$TIME" '2016-06-15 14:13:20 +'

If you run:

  git checkout 6b9c38e
  git jump diff HEAD^ t/

it will erroneously match "+" as the starting line
number and jump there, rather than line 50.

We can fix it by just making the "skip" regex non-greedy,
taking the first "+" we see, which should be the post-image
line information.

Signed-off-by: Jeff King 
---
 contrib/git-jump/git-jump | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/git-jump/git-jump b/contrib/git-jump/git-jump
index dc90cd6..1f1b996 100755
--- a/contrib/git-jump/git-jump
+++ b/contrib/git-jump/git-jump
@@ -25,7 +25,7 @@ mode_diff() {
perl -ne '
if (m{^\+\+\+ (.*)}) { $file = $1; next }
defined($file) or next;
-   if (m/^@@ .*\+(\d+)/) { $line = $1; next }
+   if (m/^@@ .*?\+(\d+)/) { $line = $1; next }
defined($line) or next;
if (/^ /) { $line++; next }
if (/^[-+]\s*(.*)/) {
-- 
2.9.2.506.g8452fe7

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/3] contrib/git-jump: add whitespace-checking mode

2016-07-22 Thread Jeff King

If you have whitespace errors in lines you've introduced, it
can be convenient to be able to jump directly to them for
fixing.  You can't quite use "git jump diff" for this,
because though it passes arbitrary options to "git diff", it
expects to see an actual unified diff in the output.

Whereas "git diff --check" actually produces lines that look
like compiler quickfix lines already, meaning we just need
to run it and feed the output directly to the editor.

Signed-off-by: Jeff King 
---
 contrib/git-jump/git-jump | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/contrib/git-jump/git-jump b/contrib/git-jump/git-jump
index 1f1b996..427f206 100755
--- a/contrib/git-jump/git-jump
+++ b/contrib/git-jump/git-jump
@@ -12,6 +12,8 @@ diff: elements are diff hunks. Arguments are given to diff.
 merge: elements are merge conflicts. Arguments are ignored.
 
 grep: elements are grep hits. Arguments are given to grep.
+
+ws: elements are whitespace errors. Arguments are given to diff --check.
 EOF
 }
 
@@ -55,6 +57,10 @@ mode_grep() {
'
 }
 
+mode_ws() {
+   git diff --check "$@"
+}
+
 if test $# -lt 1; then
usage >&2
exit 1
-- 
2.9.2.506.g8452fe7

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 3/3] contrib/git-jump: fix typo in README

2016-07-22 Thread Jeff King

Signed-off-by: Jeff King 
---
 contrib/git-jump/README | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/git-jump/README b/contrib/git-jump/README
index 1cebc32..202b499 100644
--- a/contrib/git-jump/README
+++ b/contrib/git-jump/README
@@ -83,7 +83,7 @@ complete list of files and line numbers for each match.
 Limitations
 ---
 
-This scripts was written and tested with vim. Given that the quickfix
+This script was written and tested with vim. Given that the quickfix
 format is the same as what gcc produces, I expect emacs users have a
 similar feature for iterating through the list, but I know nothing about
 how to activate it.
-- 
2.9.2.506.g8452fe7
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 2/3] contrib/git-jump: add whitespace-checking mode

2016-07-22 Thread Jeff King

On Fri, Jul 22, 2016 at 12:29:45PM -0400, Jeff King wrote:

> If you have whitespace errors in lines you've introduced, it
> can be convenient to be able to jump directly to them for
> fixing.  You can't quite use "git jump diff" for this,
> because though it passes arbitrary options to "git diff", it
> expects to see an actual unified diff in the output.
> 
> Whereas "git diff --check" actually produces lines that look
> like compiler quickfix lines already, meaning we just need
> to run it and feed the output directly to the editor.
> 
> Signed-off-by: Jeff King 
> ---
>  contrib/git-jump/git-jump | 6 ++
>  1 file changed, 6 insertions(+)

Whoops. I updated the README, too, but forgot to actually commit it.
Here's an updated patch.

-- >8 --
Subject: contrib/git-jump: add whitespace-checking mode

If you have whitespace errors in lines you've introduced, it
can be convenient to be able to jump directly to them for
fixing.  You can't quite use "git jump diff" for this,
because though it passes arbitrary options to "git diff", it
expects to see an actual unified diff in the output.

Whereas "git diff --check" actually produces lines that look
like compiler quickfix lines already, meaning we just need
to run it and feed the output directly to the editor.

Signed-off-by: Jeff King 
---
 contrib/git-jump/README   | 4 +++-
 contrib/git-jump/git-jump | 6 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/contrib/git-jump/README b/contrib/git-jump/README
index 1cebc32..3e0b65b 100644
--- a/contrib/git-jump/README
+++ b/contrib/git-jump/README
@@ -29,7 +29,7 @@ Obviously this trivial case isn't that interesting; you could 
just open
 `foo.c` yourself. But when you have many changes scattered across a
 project, you can use the editor's support to "jump" from point to point.
 
-Git-jump can generate three types of interesting lists:
+Git-jump can generate four types of interesting lists:
 
   1. The beginning of any diff hunks.
 
@@ -37,6 +37,8 @@ Git-jump can generate three types of interesting lists:
 
   3. Any grep matches.
 
+  4. Any whitespace errors detected by `git diff --check`.
+
 
 Using git-jump
 --
diff --git a/contrib/git-jump/git-jump b/contrib/git-jump/git-jump
index 1f1b996..427f206 100755
--- a/contrib/git-jump/git-jump
+++ b/contrib/git-jump/git-jump
@@ -12,6 +12,8 @@ diff: elements are diff hunks. Arguments are given to diff.
 merge: elements are merge conflicts. Arguments are ignored.
 
 grep: elements are grep hits. Arguments are given to grep.
+
+ws: elements are whitespace errors. Arguments are given to diff --check.
 EOF
 }
 
@@ -55,6 +57,10 @@ mode_grep() {
'
 }
 
+mode_ws() {
+   git diff --check "$@"
+}
+
 if test $# -lt 1; then
usage >&2
exit 1
-- 
2.9.2.506.g8452fe7

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Junio C Hamano

Stefan Beller  writes:

> From a users POV there are:
> * non existent submodules (no gitlink recorded, no config set,
>   no repo in place)
> * not initialized submodules (gitlink is recorded, no config set,
>   and an empty repo is put in the working tree as a place holder).

This is no different from what you later call "embedded".  The only
difference is that embedded thing hasn't seen its initial commit.

> * initialized submodules (gitlink is recorded, the config
>   submodule ..url is copied from the .gitmodules file to .git/config.
>   an empty dir in the working tree as a place holder)
>   A user may change the configuration before the next step as the url in
>   the .gitmodules file may be wrong and the user doesn't want to
>   rewrite history

i.e. what "submodule init" gives you.

> * existing submodules (gitlink is recorded, the config option is set
>   and instead of an empty placeholder dir, we actually have a git
>   repo there.)

i.e. what "submodule update" after "submodule init" gives you.

> * matching submodules (the recorded git link matches
>   the actual checked out state of the repo!, config option and repo exist)

Is this any different from "existing" case for the purpose of
discussing the interaction between a submodule (and its checkout)
and having possibly multiple worktrees of its superproject?

I agree that when a top-level superproject has multiple worktrees
these multiple worktrees may want to have the same submodule in
different states, but I'd imagine that they want to share the same
physical repository (i.e. $GIT_DIR/modules/$name of the primary
worktree of the superproject)---is everybody involved in the
discussion share this assumption?

Assuming that everybody is on the same page, that means "do we have
the repository for that submodule, and if so where in our local
filesystem?" is a bit of information shared across the worktrees of
the superproject.  And the "name" used to identify the submodule is
also shared across these worktrees of the superproject, as it is
meant to be a unique (within the superproject) identifier for that
"other" project it uses, no matter where in the superproject's
working tree (note: this is "working tree", not "worktree") it would
be checked out, and where the upstream URL to get further updates to
the submodule is (i.e. that URL may change over time if they relocate,
or it may even change when the user who works on the superproject
decides to use a different mirror).

What can be different between the instantiation of the same
submodule in these multiple worktrees, and how they should be
recorded?

 * submodule.$name.URL?  I am not sure if we want to have different
   "upstreams" depending on the worktree of the superproject.  While
   there is no fundamental reason to forbid it, having to maintain a
   single local repository for a submodule would mean they would
   need to be treated as separate "remotes" in the submodule
   repository.

 * submodule.$name.path of course can be different depending on
   which commit of the superproject is checked out in the worktree,
   as the superproject may move the submodule binding site across
   its versions.

 * submodule.$name.update, submodule.$name.ignore,
   submodule.$name.branch, etc. would need to be all different among
   worktrees of the superproject, as that is the whole point of
   being able to work on separate branches of the superproject in
   separate worktrees.

Somewhere in this discussion thread, you present the conclusion of
your discussion with Jonathan Nieder that there needs a separate
"should the submodule directory be populated?" bit, which currently
is tied to submodule.$name.URL in $GIT_DIR/config.  I tend to agree
that knowing where you get other people's update of that submodule
repository should come from and wanting to have/keep a checkout of
that submodule in the working tree of a particular worktree are two
different things, so such a separate bit would be needed, and that
would belong to per-worktree configuration.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v1 3/6] Per-file output for Porcelain Status V2

2016-07-22 Thread Junio C Hamano

Jeff Hostetler  writes:

>>> +   case DIFF_STATUS_DELETED:
>>> +   d->porcelain_v2.mode_index = p->one->mode;
>>> +   hashcpy(d->porcelain_v2.sha1_index, p->one->sha1);
>>> +   break;
>>> +
>>> +   case DIFF_STATUS_COPIED:
>>> +   case DIFF_STATUS_MODIFIED:
>>> +   case DIFF_STATUS_RENAMED:
>>
>> Can RENAMED or COPIED happen here?
>
> If we don't report renamed or copied for unstaged changes, then no.

The question was "do we report such cases"?

>>> +   if (!d->index_status) {
>>> +   if (d->worktree_status == DIFF_STATUS_MODIFIED ||
>>> +   d->worktree_status == DIFF_STATUS_DELETED) {
>>> +   /* X=' ' Y=[MD]
>>> +* The item did not change in head-vs-index scan so the 
>>> head
>>> +* column was never set. (The index column was set 
>>> during the
>>> +* index-vs-worktree scan.)
>>> +* Force set the head column to make the output 
>>> complete.
>>> +*/
>>> +   d->porcelain_v2.mode_head = d->porcelain_v2.mode_index;
>>> +   hashcpy(d->porcelain_v2.sha1_head, 
>>> d->porcelain_v2.sha1_index);
>>> +   }
>>
>> Can there be "else" clause for this inner "if"?  When
>> d->index_status is not set, but worktree_status is not one of these
>> two, what does it indicate?  The same for the next one.
>
> For a normal entry (not unmerged) when X is ' ', Y can only
> be 'M' or 'D'.  The only other value for Y is ' ', but then if
> both were ' ' the entry would not be in the list.  So I think
> we're OK, but I'll document that here.  And below.

Good.

>>> +   memset(stages, 0, sizeof(stages));
>>> +   pos = cache_name_pos(it->string, strlen(it->string));
>>> +   assert(pos < 0);
>>> +   pos = -pos-1;
>>> +   while (pos < active_nr) {
>>> +   ce = active_cache[pos++];
>>> +   stage = ce_stage(ce);
>>> +   if (strcmp(ce->name, it->string) || !stage)
>>> +   break;
>>> +   stages[stage - 1].mode = ce->ce_mode;
>>> +   hashcpy(stages[stage - 1].sha1, ce->sha1);
>>> +   }
>>
>> You did assert(pos < 0) to make sure that the path the caller told
>> you is unmerged indeed is unmerged, which is a very good check.  In
>> the same spirit, you would want to make sure that you got at least
>> one result from the above loop, to diagnose a bug where the path did
>> not even exist at all in the index.
>
> Would that be possible for a conflict/unmerged entry?

It is possible to exactly the same degree as it is possible you
would get !(pos < 0) answer from cache_name_pos() here, I would
think.  The assert() you have up above is protecting us from either
a broken caller to this function that gives an it that points at a
merged path (in which case the assert is violated) or a breakage in
cache_name_pos().  Making sure the loop finds at least one unmerged
entry protects us from similar kind of breakage that violates the
expectation this code has from the other parts of the code.


Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Duy Nguyen

On Thu, Jul 21, 2016 at 1:22 AM, Stefan Beller  wrote:
> On Wed, Jul 20, 2016 at 10:24 AM, Nguyễn Thái Ngọc Duy
>  wrote:
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>>  Documentation/git-worktree.txt | 8 
>>  git-submodule.sh   | 8 
>>  2 files changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
>> index 41350db..2a5661d 100644
>> --- a/Documentation/git-worktree.txt
>> +++ b/Documentation/git-worktree.txt
>> @@ -142,6 +142,14 @@ to share to all working directories:
>> you are sure you always use sparse checkout for all working
>> directories.
>>
>> + - `submodule.*` in current state should not be shared because the
>> +   information is tied to a particular version of .gitmodules in a
>> +   working directory.
>
> While the submodule.* settings are copied from the .gitmodules file initially,
> they can be changed in the config later. (That was actually the whole
> point of it,
> so you can change the submodule remotes URL without having to change history.)
>
> And I would think that most submodule related settings (such as remote URL,
> name, path, even depth recommendation) should be the same for all worktrees,
> and a different value for one worktree is a carefully crafted
> exception by the user.
>
> So while the .gitmodules file can diverge in the work trees I do not
> think that the
> actual remotes for the submodules in the different worktrees differ
> though. The change
> of the .gitmodule files may be because you checked out an old commit, that
> has outdated information on where to get the submodule from.

I just quickly glanced through the rest of this mail because, as a
submodule ignorant, it's just mumbo jumbo to me. But what I see here
is, there may be problems if we choose to share some submodule info,
but I haven't seen any good thing from sharing any submodule info at
all.

I can imagine long term you may want to just clone a submodule repo
once and share it across worktrees that use it, so maybe it's just me
not seeing things and this may be a step towards that.

Anyway, I assume some people will be working on the submodule side.
And because I have not heard any bad thing about the new config
design, I'm going to drop submodule patches from this series and focus
on polishing config stuff.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC] git-svn: allow --version to work anywhere

2016-07-22 Thread Junio C Hamano

Eric Wong  writes:

> Checking the version of the installed SVN libraries should not
> require a git repository at all.  This matches the behavior of
> "git --version".
>
> Add a test for "git svn help" for the same behavior while we're
> at it, too.
>
> Signed-off-by: Eric Wong 
> ---
>   I'm hoping "cd /" in the test will always succeed;
>   but I suppose non-*nix systems might fail, here.

How about digging a few levels of directory hierarchy, exporting
GIT_CEILING_DIRECTORIES so that we won't find any repository and
going there to run these tests?

>   And maybe a BOFH did "chmod 700 /"  :(
>
>   Anyways this is sitting in master of git://bogomips.org/git-svn.git
>
>  git-svn.perl | 4 ++--
>  t/t9100-git-svn-basic.sh | 8 
>  2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/git-svn.perl b/git-svn.perl
> index f609e54..4d41d22 100755
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -339,7 +339,7 @@ if ($cmd && $cmd =~ /(?:clone|init|multi-init)$/) {
>   die "failed to open $ENV{GIT_DIR}: $!\n";
>   $ENV{GIT_DIR} = $1 if <$fh> =~ /^gitdir: (.+)$/;
>   }
> -} else {
> +} elsif ($cmd) {
>   my ($git_dir, $cdup);
>   git_cmd_try {
>   $git_dir = command_oneline([qw/rev-parse --git-dir/]);
> @@ -356,7 +356,7 @@ if ($cmd && $cmd =~ /(?:clone|init|multi-init)$/) {
>  
>  my %opts = %{$cmd{$cmd}->[2]} if (defined $cmd);
>  
> -read_git_config(\%opts);
> +read_git_config(\%opts) if $ENV{GIT_DIR};
>  if ($cmd && ($cmd eq 'log' || $cmd eq 'blame')) {
>   Getopt::Long::Configure('pass_through');
>  }
> diff --git a/t/t9100-git-svn-basic.sh b/t/t9100-git-svn-basic.sh
> index 28082b1..10408d0 100755
> --- a/t/t9100-git-svn-basic.sh
> +++ b/t/t9100-git-svn-basic.sh
> @@ -19,6 +19,14 @@ case "$GIT_SVN_LC_ALL" in
>   ;;
>  esac
>  
> +test_expect_success 'git svn --version works anywhere' '
> + ( cd / || exit 0; git svn --version )
> +'
> +
> +test_expect_success 'git svn help works anywhere' '
> + ( cd / || exit 0; git svn help )
> +'
> +
>  test_expect_success \
>  'initialize git svn' '
>   mkdir import &&
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 2/4] submodule: update core.worktree using git-config

2016-07-22 Thread Duy Nguyen

On Thu, Jul 21, 2016 at 12:04 AM, Stefan Beller  wrote:
>> diff --git a/submodule.c b/submodule.c
>> index abc2ac2..b912871 100644
>> --- a/submodule.c
>> +++ b/submodule.c
>> @@ -1128,7 +1128,9 @@ void connect_work_tree_and_git_dir(const char 
>> *work_tree, const char *git_dir)
>>  {
>> struct strbuf file_name = STRBUF_INIT;
>> struct strbuf rel_path = STRBUF_INIT;
>> +   struct strbuf path = STRBUF_INIT;
>> const char *real_work_tree = xstrdup(real_path(work_tree));
>> +   struct child_process cp = CHILD_PROCESS_INIT;
>>
>> /* Update gitfile */
>> strbuf_addf(&file_name, "%s/.git", work_tree);
>> @@ -1136,13 +1138,17 @@ void connect_work_tree_and_git_dir(const char 
>> *work_tree, const char *git_dir)
>>relative_path(git_dir, real_work_tree, &rel_path));
>>
>> /* Update core.worktree setting */
>> -   strbuf_reset(&file_name);
>> -   strbuf_addf(&file_name, "%s/config", git_dir);
>> -   git_config_set_in_file(file_name.buf, "core.worktree",
>> -  relative_path(real_work_tree, git_dir,
>> -&rel_path));
>> +   strbuf_addstr(&path, relative_path(real_work_tree, git_dir,
>> +  &rel_path));
>> +   cp.git_cmd = 1;
>> +   argv_array_pushl(&cp.args, "-C", work_tree, NULL);
>> +   argv_array_pushl(&cp.args, "--work-tree", ".", NULL);
>> +   argv_array_pushl(&cp.args, "config", "core.worktree", path.buf, 
>> NULL);
>> +   if (run_command(&cp) < 0)
>> +   die(_("failed to update core.worktree for %s"), git_dir);
>
> Do we need to make this conditional on the extensions.worktreeConfig
> variable, though? When I just run
>
> git config --worktree . foo bar
> fatal: Per-worktree configuration requires extensions.worktreeConfig
> Please read section CONFIGURATION in `git help worktree` before
> enabling it.
>
> which would trigger the failure here?

It was intended, but I was probably just paranoid. The thinking back
then was, you are switching from "share whole config" to "not share
something". This should not be taken lightly and you should examine
your config file and decide what to share, before making the switch.
It's dangerous!

But then, if everything has been shared before (assuming there are
more than one worktree) and you are probably happy with it (or you
would have made done something to unshare), so it's probably good to
keep on sharing. Which means we can set extensions.worktreeConfig
automatically here (when "git config --worktree" is used) instead of
dying. We would need to move core.bare and core.worktree to main
worktree, but that's manageable.

So in short, you would not see this message in this context in future again.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] git-subtree.sh: Use --allow-unrelated-histories when splitting with --rejoin

2016-07-22 Thread Brett Cundal

On 22 July 2016 at 01:11, Johannes Schindelin
 wrote:

> I know it feels good to get frustration off of your chest by ranting. I am
> very sympathetic to that. Please note, however, that it puts the people
> who are ready to help you immediately into a defensive corner. Probably
> unintentional?

Shortest rant ever. :) Sorry if I caused any offense. It just seemed very odd to
block multi-part MIME that happened to contain a text/html part in addition to
text/plain, given that (AFAIK) the vast majority of email clients in
recent years
produce such a thing. Anyhow, off-topic... your list policy is your business.

> Back to the patch you wanted to submit: since this is an important bug
> fix, I spent the time to clean it up. The only missing bit that requires
> further effort from your side is that we need your Sign-off. See
> https://github.com/git/git/blob/v2.9.2/Documentation/SubmittingPatches#L239-L291
> for an explanation. Essentially, you are stating explicitly that you are
> not legally prohibited from contributing this patch. If you can say that,
> simply reply with a
>
> Signed-off-by: Brett Cundal 
>
> We can take it from there by inserting it into the following patch:

So, again I'll have to apologise... I should have submitted this as a bug
report rather than a patch, since the ownership is not legally clear. Didn't
even occur to me for such a trivial fix. If you can just treat this as
a bug report
at this point and 'reimplement' it, that would probably be the
simplest thing for
everyone... I guess this may be tricky since there are limited ways to possibly
implement this, but for the same reason it would be impossible IMO for me or
anyone else to legally claim authorship of it (but IANAL). If this is
going to cause
this fix to be jammed in limbo for all eternity, then let me know and
I'll see what
I can do to get the proper authorization.

-- Brett
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC/PATCH] status: suggest 'git merge --abort' when appropriate

2016-07-22 Thread Junio C Hamano

Matthieu Moy  writes:

> We already suggest 'git rebase --abort' during a conflicted rebase.
> Similarly, suggest 'git merge --abort' during conflict resolution on
> 'git merge'.
>
> Signed-off-by: Matthieu Moy 
> ---

It wasn't immediately obvious without the context that the changed
code will trigger only during a merge, and not just any other case
where we have unmerged index entries.  show_merge_in_progress() is
only called when s.merge_in_progress is set, which in turn is set
only when MERGE_HEAD is there, so we are good here.

Will apply; thanks.

>  t/t7060-wtstatus.sh| 4 
>  t/t7512-status-help.sh | 1 +
>  wt-status.c| 7 +--
>  3 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/t/t7060-wtstatus.sh b/t/t7060-wtstatus.sh
> index 44bf1d8..4d17363 100755
> --- a/t/t7060-wtstatus.sh
> +++ b/t/t7060-wtstatus.sh
> @@ -34,6 +34,7 @@ test_expect_success 'M/D conflict does not segfault' '
>  On branch side
>  You have unmerged paths.
>(fix conflicts and run "git commit")
> +  (use "git merge --abort" to abort the merge)
>  
>  Unmerged paths:
>(use "git add/rm ..." as appropriate to mark resolution)
> @@ -138,6 +139,7 @@ test_expect_success 'status when conflicts with add and 
> rm advice (deleted by th
>  On branch master
>  You have unmerged paths.
>(fix conflicts and run "git commit")
> +  (use "git merge --abort" to abort the merge)
>  
>  Unmerged paths:
>(use "git add/rm ..." as appropriate to mark resolution)
> @@ -171,6 +173,7 @@ test_expect_success 'status when conflicts with add and 
> rm advice (both deleted)
>  On branch conflict_second
>  You have unmerged paths.
>(fix conflicts and run "git commit")
> +  (use "git merge --abort" to abort the merge)
>  
>  Unmerged paths:
>(use "git add/rm ..." as appropriate to mark resolution)
> @@ -195,6 +198,7 @@ test_expect_success 'status when conflicts with only rm 
> advice (both deleted)' '
>  On branch conflict_second
>  You have unmerged paths.
>(fix conflicts and run "git commit")
> +  (use "git merge --abort" to abort the merge)
>  
>  Changes to be committed:
>  
> diff --git a/t/t7512-status-help.sh b/t/t7512-status-help.sh
> index 49d19a3..5c3db65 100755
> --- a/t/t7512-status-help.sh
> +++ b/t/t7512-status-help.sh
> @@ -29,6 +29,7 @@ test_expect_success 'status when conflicts unresolved' '
>  On branch conflicts
>  You have unmerged paths.
>(fix conflicts and run "git commit")
> +  (use "git merge --abort" to abort the merge)
>  
>  Unmerged paths:
>(use "git add ..." to mark resolution)
> diff --git a/wt-status.c b/wt-status.c
> index de62ab2..1f21b6a 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -948,9 +948,12 @@ static void show_merge_in_progress(struct wt_status *s,
>  {
>   if (has_unmerged(s)) {
>   status_printf_ln(s, color, _("You have unmerged paths."));
> - if (s->hints)
> + if (s->hints) {
>   status_printf_ln(s, color,
> - _("  (fix conflicts and run \"git commit\")"));
> +  _("  (fix conflicts and run \"git 
> commit\")"));
> + status_printf_ln(s, color,
> +  _("  (use \"git merge --abort\" to 
> abort the merge)"));
> + }
>   } else {
>   s-> commitable = 1;
>   status_printf_ln(s, color,
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Stefan Beller

On Fri, Jul 22, 2016 at 10:09 AM, Duy Nguyen  wrote:
>
> I just quickly glanced through the rest of this mail because, as a
> submodule ignorant, it's just mumbo jumbo to me. But what I see here
> is, there may be problems if we choose to share some submodule info,
> but I haven't seen any good thing from sharing any submodule info at
> all.

Okay. :(

I assume the sharing is beneficial. (As a work-tree ignorant) I thought
we had this main work tree, which also holds the repository, whereas
the other working trees have a light weight implementation (e.g. just
a .git file pointing back to the main working tree/git dir).

So in a way my mental model is more like the config sharing here:
You can configure things in ~/.gitconfig for example that have effects
on more than one repo. Similarly you would want to configure things
in one repo, that has effect on more than one working tree?

And my assumption was to have the repository specific parts be shared,
whereas the working tree specific things should not be shared.

By working tree specific I strongly mean:

* existence in the working tree
* the checked out sha1
* submodule.$name.path

By repository specific I strongly mean:

* the submodule URL

I am not sure about:

* submodule.$name.update, submodule.$name.ignore,
   submodule.$name.branch,
  These have to be able to be different across working trees, but do we
  require them to be set for each working tree individually?  I thought a
  repo wide setup with defaults may be ok?

>
> I can imagine long term you may want to just clone a submodule repo
> once and share it across worktrees that use it, so maybe it's just me
> not seeing things and this may be a step towards that.

Just as Junio put it:
> I agree that when a top-level superproject has multiple worktrees
> these multiple worktrees may want to have the same submodule in
> different states, but I'd imagine that they want to share the same
> physical repository (i.e. $GIT_DIR/modules/$name of the primary
> worktree of the superproject)---is everybody involved in the
> discussion share this assumption?

I agree with that as well.

>
> Anyway, I assume some people will be working on the submodule side.

Once the discussion comes to a rough agreement, I'll give it a shot.

> And because I have not heard any bad thing about the new config
> design, I'm going to drop submodule patches from this series and focus
> on polishing config stuff.

Oh, sorry for not focusing on that part. The design of git config --worktree
is sound IMO.

> --
> Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Stefan Beller

On Fri, Jul 22, 2016 at 9:55 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> From a users POV there are:
>> * non existent submodules (no gitlink recorded, no config set,
>>   no repo in place)
>> * not initialized submodules (gitlink is recorded, no config set,
>>   and an empty repo is put in the working tree as a place holder).

I meant empty directory, not empty repo.

>
> This is no different from what you later call "embedded".  The only
> difference is that embedded thing hasn't seen its initial commit.

That did not occur to me.
The "not initialized" is what you'd get via

git clone --no-recurse repo-with-submodules

whereas the "embedded" could come from

   git clone  tmp
   cd tmp && git clone 

>
>> * initialized submodules (gitlink is recorded, the config
>>   submodule ..url is copied from the .gitmodules file to .git/config.
>>   an empty dir in the working tree as a place holder)
>>   A user may change the configuration before the next step as the url in
>>   the .gitmodules file may be wrong and the user doesn't want to
>>   rewrite history
>
> i.e. what "submodule init" gives you.

Right.

>
>> * existing submodules (gitlink is recorded, the config option is set
>>   and instead of an empty placeholder dir, we actually have a git
>>   repo there.)
>
> i.e. what "submodule update" after "submodule init" gives you.

Right.

>
>> * matching submodules (the recorded git link matches
>>   the actual checked out state of the repo!, config option and repo exist)
>
> Is this any different from "existing" case for the purpose of
> discussing the interaction between a submodule (and its checkout)
> and having possibly multiple worktrees of its superproject?

I don't think so.

>
> I agree that when a top-level superproject has multiple worktrees
> these multiple worktrees may want to have the same submodule in
> different states, but I'd imagine that they want to share the same
> physical repository (i.e. $GIT_DIR/modules/$name of the primary
> worktree of the superproject)---is everybody involved in the
> discussion share this assumption?

At least me agrees.

>
> Assuming that everybody is on the same page, that means "do we have
> the repository for that submodule, and if so where in our local
> filesystem?" is a bit of information shared across the worktrees of
> the superproject.  And the "name" used to identify the submodule is
> also shared across these worktrees of the superproject, as it is
> meant to be a unique (within the superproject) identifier for that
> "other" project it uses, no matter where in the superproject's
> working tree (note: this is "working tree", not "worktree") it would
> be checked out, and where the upstream URL to get further updates to
> the submodule is (i.e. that URL may change over time if they relocate,
> or it may even change when the user who works on the superproject
> decides to use a different mirror).

I agree.

>
> What can be different between the instantiation of the same
> submodule in these multiple worktrees, and how they should be
> recorded?
>
>  * submodule.$name.URL?  I am not sure if we want to have different
>"upstreams" depending on the worktree of the superproject.  While
>there is no fundamental reason to forbid it, having to maintain a
>single local repository for a submodule would mean they would
>need to be treated as separate "remotes" in the submodule
>repository.

You can only have a remote if the the submodule repo exists already.
I guess that can be made a requirement.

So setting up the worktrees and submodule URLs in the config and
then doing the clone of said submodule is maybe not encouraged.

>
>  * submodule.$name.path of course can be different depending on
>which commit of the superproject is checked out in the worktree,
>as the superproject may move the submodule binding site across
>its versions.

Right.

>
>  * submodule.$name.update, submodule.$name.ignore,
>submodule.$name.branch, etc. would need to be all different among
>worktrees of the superproject, as that is the whole point of
>being able to work on separate branches of the superproject in
>separate worktrees.

What do you mean by "would need". The ability to be different or rather
the veto of an 'inheritance' of defaults from the repository configuration?

>
> Somewhere in this discussion thread, you present the conclusion of
> your discussion with Jonathan Nieder that there needs a separate
> "should the submodule directory be populated?" bit, which currently
> is tied to submodule.$name.URL in $GIT_DIR/config.

I'll try to get the discussion back on list and whenever Jonathan starts talking
off list, I'll poke him with a stick.

>  I tend to agree
> that knowing where you get other people's update of that submodule
> repository should come from and wanting to have/keep a checkout of
> that submodule in the working tree of a particular worktree are two
> different things, so such a separate bit would

Re: [PATCH v4 3/4] submodule: support running in multiple worktree setup

2016-07-22 Thread Duy Nguyen

On Fri, Jul 22, 2016 at 7:25 PM, Stefan Beller  wrote:
> On Fri, Jul 22, 2016 at 10:09 AM, Duy Nguyen  wrote:
>>
>> I just quickly glanced through the rest of this mail because, as a
>> submodule ignorant, it's just mumbo jumbo to me. But what I see here
>> is, there may be problems if we choose to share some submodule info,
>> but I haven't seen any good thing from sharing any submodule info at
>> all.
>
> Okay. :(

Didn't mean to make you feel sad :)


> I assume the sharing is beneficial. (As a work-tree ignorant) I thought
> we had this main work tree, which also holds the repository, whereas
> the other working trees have a light weight implementation (e.g. just
> a .git file pointing back to the main working tree/git dir).

The main worktree is special for historical reason. But from the user
point of view (and even developer's at a certain level) they should be
treated equally. Think of it like cloning the same repo multiple
times. Only now you save disk space because there's only one object
database.

> So in a way my mental model is more like the config sharing here
> You can configure things in ~/.gitconfig for example that have effects
> on more than one repo. Similarly you would want to configure things
> in one repo, that has effect on more than one working tree?
>
> And my assumption was to have the repository specific parts be shared,
> whereas the working tree specific things should not be shared.

I think that's a good assumption. Although I would rather be not
sharing by default and let the user initiate it when they want to
share something. Like ~/..gitconfig, we never write anything there
unless the user asks us to explicitly (with git config --user).
Accidental share could have negative effect.

>> I can imagine long term you may want to just clone a submodule repo
>> once and share it across worktrees that use it, so maybe it's just me
>> not seeing things and this may be a step towards that.
>
> Just as Junio put it:
>> I agree that when a top-level superproject has multiple worktrees
>> these multiple worktrees may want to have the same submodule in
>> different states, but I'd imagine that they want to share the same
>> physical repository (i.e. $GIT_DIR/modules/$name of the primary
>> worktree of the superproject)---is everybody involved in the
>> discussion share this assumption?
>
> I agree with that as well.

Yeah. We have a long way to go though. As I see it, you may need ref
namespace as well (so they look like separate clones), which has never
been used on the client side before. Either that or odb alternates...

>> And because I have not heard any bad thing about the new config
>> design, I'm going to drop submodule patches from this series and focus
>> on polishing config stuff.
>
> Oh, sorry for not focusing on that part. The design of git config --worktree
> is sound IMO.

This makes me happy (I know other people can still find flaws in it,
and I'm ok with that). This config split thing has been wrecking my
brain for a long time, find the the "right" way to do with minimum
impacts :)
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] diff: do not reuse worktree files that need "clean" conversion

2016-07-22 Thread Junio C Hamano

Jeff King  writes:

> Subject: diff: do not reuse worktree files that need "clean" conversion
>
> When accessing a blob for a diff, we may try to reuse file
> contents in the working tree, under the theory that it is
> faster to mmap those file contents than it would be to
> extract the content from the object database.
>
> When we have to filter those contents, though, that
> assumption does not hold. Even for our internal conversions
> like CRLF, we have to allocate and fill a new buffer anyway.
> But much worse, for external clean filters we have to exec
> an arbitrary script, and we have no idea how expensive it
> may be to run.
>
> So let's skip this optimization when conversion into git's
> "clean" form is required. This applies whenever the
> "want_file" flag is false. When it's true, the caller
> actually wants the smudged worktree contents, which the
> reused file by definition already has (in fact, this is a
> key optimization going the other direction, since reusing
> the worktree file there lets us skip smudge filters).
>
> Signed-off-by: Jeff King 
> ---
>  diff.c|  7 +++
>  t/t0021-conversion.sh | 11 +++
>  2 files changed, 18 insertions(+)
>
> diff --git a/diff.c b/diff.c
> index 7d03419..b43d3dd 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -2683,6 +2683,13 @@ static int reuse_worktree_file(const char *name, const 
> unsigned char *sha1, int
>   if (!FAST_WORKING_DIRECTORY && !want_file && has_sha1_pack(sha1))
>   return 0;
>  
> + /*
> +  * Similarly, if we'd have to convert the file contents anyway, that
> +  * makes the optimization not worthwhile.
> +  */
> + if (!want_file && would_convert_to_git(name))
> + return 0;

The would_convert_to_git() function is not a free operation.  It
needs to prime the attribute stack, so it needs to open/read/parse a
few files ($GIT_DIR/info/attributes and .gitattributes at least, and
more if your directory hierarchy is deep) on the filesystem.  The
cost is amortized across paths, but we do not even enable the
optimization if we have to pay the cost of reading the index
ourselves.

I suspect that we may be better off disabling this optimization if
we need to always call the helper.

>   len = strlen(name);
>   pos = cache_name_pos(name, len);
>   if (pos < 0)
> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> index 7bac2bc..e799e59 100755
> --- a/t/t0021-conversion.sh
> +++ b/t/t0021-conversion.sh
> @@ -268,4 +268,15 @@ test_expect_success 'disable filter with empty override' 
> '
>   test_must_be_empty err
>  '
>  
> +test_expect_success 'diff does not reuse worktree files that need cleaning' '
> + test_config filter.counter.clean "echo . >>count; sed s/^/clean:/" &&
> + echo "file filter=counter" >.gitattributes &&
> + test_commit one file &&
> + test_commit two file &&
> +
> + >count &&
> + git diff-tree -p HEAD &&
> + test_line_count = 0 count
> +'
> +
>  test_done
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] contrib/persistent-https: update ldflags syntax for Go 1.7+

2016-07-22 Thread Junio C Hamano

Parker Moore  writes:

> From: Parker Moore 
>
> This fixes contrib/persistent-https builds for Go v1.7+ and is
> compatible with Go v1.0+.

Thanks, applied.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] contrib/persistent-https: use Git version for build label

2016-07-22 Thread Junio C Hamano

Parker Moore  writes:

> diff --git a/contrib/persistent-https/Makefile
> b/contrib/persistent-https/Makefile
> index 92baa3be..8248269 100644
> --- a/contrib/persistent-https/Makefile
> +++ b/contrib/persistent-https/Makefile
> @@ -12,7 +12,7 @@
>  # See the License for the specific language governing permissions and
>  # limitations under the License.
>
> -BUILD_LABEL=$(shell date +"%s")
> +BUILD_LABEL=$(shell cat ../../GIT-VERSION-FILE | cut -d" " -f3)

We tend to avoid catting a single file only to pipe the result into
a different command, so I'd rewrite this like so:

BUILD_LABEL=$(shell cut -d" " -f3 ../../GIT-VERSION-FILE)

while queuing.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] diff: do not reuse worktree files that need "clean" conversion

2016-07-22 Thread Jeff King

On Fri, Jul 22, 2016 at 10:44:08AM -0700, Junio C Hamano wrote:

> > diff --git a/diff.c b/diff.c
> > index 7d03419..b43d3dd 100644
> > --- a/diff.c
> > +++ b/diff.c
> > @@ -2683,6 +2683,13 @@ static int reuse_worktree_file(const char *name, 
> > const unsigned char *sha1, int
> > if (!FAST_WORKING_DIRECTORY && !want_file && has_sha1_pack(sha1))
> > return 0;
> >  
> > +   /*
> > +* Similarly, if we'd have to convert the file contents anyway, that
> > +* makes the optimization not worthwhile.
> > +*/
> > +   if (!want_file && would_convert_to_git(name))
> > +   return 0;
> 
> The would_convert_to_git() function is not a free operation.  It
> needs to prime the attribute stack, so it needs to open/read/parse a
> few files ($GIT_DIR/info/attributes and .gitattributes at least, and
> more if your directory hierarchy is deep) on the filesystem.  The
> cost is amortized across paths, but we do not even enable the
> optimization if we have to pay the cost of reading the index
> ourselves.

Yeah, I almost commented on that, and its position in the function, but
forgot to.

The only code path which will trigger this is diff_populate_filespec.
After reuse_worktree_file() says "yes, we can reuse", we drop into a
conditional that will end in us calling convert_to_git() anyway, which
will do the same lookup. We don't need to cache, because the expensive
parts of the attribute-lookup are already cached for us by the attribute
code.

So my initial thought was to put it at the end of reuse_worktree_file(),
where it would have the least impact. If we find we cannot reuse the
file, then we would skip both this new attr lookup and the one in
diff_populate_filespec().

But in practice, I think we'll already have cached those attrs before we
even hit this function, because we'll hit the userdiff_find_by_path()
code earlier in the diff process (e.g., to see if it's binary, if we
need to textconv, etc). Those look for different attributes, but I think
the expensive bits (finding, opening, reading attribute files) are
cached across all lookups.

So I think we actually _can_ think of would_convert_to_git() as
basically free. Or as free as other efficient-lookup functions we call
like cache_name_pos(). And so I moved it further up in the function,
where it lets us avoid doing more out-of-process work (like calling
lstat() so we can ce_match_stat() on the result).

Possibly it should go after the cache_name_pos() call, though. That's
likely to be less expensive than the actual walk of the attr tree.

> I suspect that we may be better off disabling this optimization if
> we need to always call the helper.

The thought "does this tree reuse even speed things up enough to justify
its complexity" definitely crossed my mind. It's basically swapping
open/mmap for zlib inflating the content.

But I do think it helps in the "want_file" case (i.e., when we are
writing out a tempfile for an external command via prepare_temp_file()).
There it helps us omit writing a tempfile to disk entirely, including
any possible conversion.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/2] git-submodule: forward exit code of git-submodule--helper more faithfully

2016-07-22 Thread Johannes Sixt

git-submodule--helper is invoked as the upstream of a pipe in several
places. Usually, the failure of a program in this position is not
detected by the shell. For this reason, the code inserts a token in the
output stream when git-submodule--helper fails that is detected
downstream, where the shell script is quit with exit code 1.

There happens to be a bug in git-submodule--helper that leads to a
segmentation fault. The test suite triggers the crash in several places,
all of which are protected by 'test_must_fail'. But due to the inspecific
exit code 1, the crash remains undiagnosed.

Extend the failure protocol such that git-submodule--helper's exit code
is passed downstream (only in the case of failure). This enables the
downstream to use it as its own exit code, and 'test_must_fail' to
identify the segmentation fault as an unexpected failure.

The bug itself is fixed in the next commit.

Signed-off-by: Johannes Sixt 
---
When you run ./t7400-submodule-basic.sh -v, you will notice this output:

fatal: destination path '/home/jsixt/Src/git/git/t/trash 
directory.t7400-submodule-basic/init' already exists and is not an empty 
directory.
fatal: clone of './.subrepo' into submodule path 
'/home/jsixt/Src/git/git/t/trash directory.t7400-submodule-basic/init' failed
Failed to clone 'init'. Retry scheduled
fatal: destination path '/home/jsixt/Src/git/git/t/trash 
directory.t7400-submodule-basic/init' already exists and is not an empty 
directory.
fatal: clone of './.subrepo' into submodule path 
'/home/jsixt/Src/git/git/t/trash directory.t7400-submodule-basic/init' failed
/home/jsixt/Src/git/git/git-submodule: line 494: 21757 Segmentation fault  
git submodule--helper update-clone ${GIT_QUIET:+--quiet} ${wt_prefix:+--prefix 
"$wt_prefix"} ${prefix:+--recursive-prefix "$prefix"} ${update:+--update 
"$update"} ${reference:+--reference "$reference"} ${depth:+--depth "$depth"} 
${recommend_shallow:+"$recommend_shallow"} ${jobs:+$jobs} "$@"
ok 32 - update should fail when path is used by a file

Note the segmentation fault. This mini-series addresses the issue.

Noticed on Windows because it "visualizes" segfaults even for
command line programs.

 git-submodule.sh| 22 +++---
 t/t5815-submodule-protos.sh |  4 ++--
 t/t7400-submodule-basic.sh  |  4 ++--
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 4ec7546..0a0e12d 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -49,7 +49,7 @@ die_if_unmatched ()
 {
if test "$1" = "#unmatched"
then
-   exit 1
+   exit ${2:-1}
fi
 }
 
@@ -312,11 +312,11 @@ cmd_foreach()
 
{
git submodule--helper list --prefix "$wt_prefix" ||
-   echo "#unmatched"
+   echo "#unmatched" $?
} |
while read mode sha1 stage sm_path
do
-   die_if_unmatched "$mode"
+   die_if_unmatched "$mode" "$sha1"
if test -e "$sm_path"/.git
then
displaypath=$(git submodule--helper relative-path 
"$prefix$sm_path" "$wt_prefix")
@@ -423,11 +423,11 @@ cmd_deinit()
 
{
git submodule--helper list --prefix "$wt_prefix" "$@" ||
-   echo "#unmatched"
+   echo "#unmatched" $?
} |
while read mode sha1 stage sm_path
do
-   die_if_unmatched "$mode"
+   die_if_unmatched "$mode" "$sha1"
name=$(git submodule--helper name "$sm_path") || exit
 
displaypath=$(git submodule--helper relative-path "$sm_path" 
"$wt_prefix")
@@ -581,12 +581,12 @@ cmd_update()
${depth:+--depth "$depth"} \
${recommend_shallow:+"$recommend_shallow"} \
${jobs:+$jobs} \
-   "$@" || echo "#unmatched"
+   "$@" || echo "#unmatched" $?
} | {
err=
while read mode sha1 stage just_cloned sm_path
do
-   die_if_unmatched "$mode"
+   die_if_unmatched "$mode" "$sha1"
 
name=$(git submodule--helper name "$sm_path") || exit
url=$(git config submodule."$name".url)
@@ -994,11 +994,11 @@ cmd_status()
 
{
git submodule--helper list --prefix "$wt_prefix" "$@" ||
-   echo "#unmatched"
+   echo "#unmatched" $?
} |
while read mode sha1 stage sm_path
do
-   die_if_unmatched "$mode"
+   die_if_unmatched "$mode" "$sha1"
name=$(git submodule--helper name "$sm_path") || exit
url=$(git config submodule."$name".url)
displaypath=$(git submodule--helper relative-path 
"$prefix$sm_path" "$wt_prefix")
@@ -1075,11 +1075,11 @@ cmd_sync()
cd_to_toplevel
{
git submodule--helper list --prefix "$wt_prefix" "$@" ||
-   echo "#unmatched"
+

[PATCH 2/2] submodule-helper: fix indexing in clone retry error reporting path

2016-07-22 Thread Johannes Sixt

'git submodule--helper update-clone' has logic to retry failed clones
a second time. For this purpose, there is a list of submodules to clone,
and a second list that is filled with the submodules to retry. Within
these lists, the submodules are identified by an index as if both lists
were just appended.

This works nicely except when the second clone attempt fails as well. To
report an error, the identifying index must be adjusted by an offset so
that it can be used as an index into the second list. However, the
calculation uses the logical total length of the lists so that the result
always points one past the end of the second list.

Pick the correct index.

Signed-off-by: Johannes Sixt 
---
 builtin/submodule--helper.c | 2 +-
 t/t5815-submodule-protos.sh | 4 ++--
 t/t7400-submodule-basic.sh  | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index b22352b..6f6d67a 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -795,7 +795,7 @@ static int update_clone_task_finished(int result,
suc->failed_clones[suc->failed_clones_nr++] = ce;
return 0;
} else {
-   idx = suc->current - suc->list.nr;
+   idx -= suc->list.nr;
ce  = suc->failed_clones[idx];
strbuf_addf(err, _("Failed to clone '%s' a second time, 
aborting"),
ce->name);
diff --git a/t/t5815-submodule-protos.sh b/t/t5815-submodule-protos.sh
index 112cf40..06f55a1 100755
--- a/t/t5815-submodule-protos.sh
+++ b/t/t5815-submodule-protos.sh
@@ -18,7 +18,7 @@ test_expect_success 'setup repository with submodules' '
git commit -m "add submodules"
 '
 
-test_expect_failure 'clone with recurse-submodules fails' '
+test_expect_success 'clone with recurse-submodules fails' '
test_must_fail git clone --recurse-submodules . dst
 '
 
@@ -32,7 +32,7 @@ test_expect_success 'update of ssh allowed' '
git -C dst submodule update ssh-module
 '
 
-test_expect_failure 'update of ext not allowed' '
+test_expect_success 'update of ext not allowed' '
test_must_fail git -C dst submodule update ext-module
 '
 
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 7c8b90b..b77cce8 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -352,7 +352,7 @@ test_expect_success 'sync should fail with unknown 
submodule' '
test_failure_with_unknown_submodule sync
 '
 
-test_expect_failure 'update should fail when path is used by a file' '
+test_expect_success 'update should fail when path is used by a file' '
echo hello >expect &&
 
echo "hello" >init &&
@@ -361,7 +361,7 @@ test_expect_failure 'update should fail when path is used 
by a file' '
test_cmp expect init
 '
 
-test_expect_failure 'update should fail when path is used by a nonempty 
directory' '
+test_expect_success 'update should fail when path is used by a nonempty 
directory' '
echo hello >expect &&
 
rm -fr init &&
-- 
2.9.0.443.ga8520ad

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git-prompt.sh incompatible with non-basic global grep.patternType

2016-07-22 Thread Junio C Hamano

Jeff King  writes:

> I gave a very cursory look when I wrote the other email, and your
> alternative solution is what looked like the most elegant fix to me.
>
> I suspect this bug has been there quite a while, so no rush. :)

Here is what I am going to queue.

One thing that I noticed is that there is this strange field
in grep_opt called .extended_regexp_option; it only is set from the
boolean configuration grep.extendedregexp and worse yet it takes
precedence over the command line option or grep.patternType, e.g.
t7810-grep expects crazy things like these:

 * "git -c grep.extendedregexp=true -c grep.patterntype=basic grep"
   wants to be like "git grep -E"

 * "git -c grep.extendedregexp=false -c grep.patterntype=extended grep"
   wants to be like "git grep -G"

This comes from b22520a3 (grep: allow -E and -n to be turned on by
default via configuration, 2011-03-30) back when we didn't have a
more generic grep.patternType configuration mechanism in v1.7.5
days, and it probably need to be deprecated to maintain our sanity.
I.e. when we see the configuration used, first we warn the user and
set grep.patternType to extended instead, and then eventually error
out in a backward-compatibility breaking release of Git we will make
in some future date, together with things like other compatibility
breaking topics like ex/deprecate-empty-pathspec-as-match-all.

But that is a separate topic after this fix goes in anyway.

-- >8 --
From: Junio C Hamano 
Date: Fri, 22 Jul 2016 11:43:14 -0700
Subject: [PATCH] grep: further simplify setting the pattern type

When c5c31d33 (grep: move pattern-type bits support to top-level
grep.[ch], 2012-10-03) introduced grep_commit_pattern_type() helper
function, the intention was to allow the users of grep API to having
to fiddle only with .pattern_type_option (which can be set to "fixed",
"basic", "extended", and "pcre"), and then immediately before compiling
the pattern strings for use, call grep_commit_pattern_type() to have
it prepare various bits in the grep_opt structure (like .fixed,
.regflags, etc.).

However, grep_set_pattern_type_option() helper function the grep API
internally uses were left as an external function by mistake.  This
function shouldn't have been made callable by the users of the API.

Later when the grep API was used in revision graversal machinery,
the caller then mistakenly started calling the function around
34a4ae55 (log --grep: use the same helper to set -E/-F options as
"git grep", 2012-10-03), instead of setting the .pattern_type_option
field and letting the grep_commit_pattern_type() to take care of the
details.

This caused an unnecessary bug that made a configured
grep.patternType take precedence over the command line options
(e.g. --basic-regexp, --fixed-strings) in "git log" family of
commands.

Signed-off-by: Junio C Hamano 
---
 grep.c | 22 +++---
 grep.h |  1 -
 revision.c |  8 
 t/t4202-log.sh | 14 ++
 4 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/grep.c b/grep.c
index b58c7c6..0bc5cc1 100644
--- a/grep.c
+++ b/grep.c
@@ -161,17 +161,7 @@ void grep_init(struct grep_opt *opt, const char *prefix)
strcpy(opt->color_sep, def->color_sep);
 }

-void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct 
grep_opt *opt)
-{
-   if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
-   grep_set_pattern_type_option(pattern_type, opt);
-   else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
-   grep_set_pattern_type_option(opt->pattern_type_option, opt);
-   else if (opt->extended_regexp_option)
-   grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
-}
-
-void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, struct 
grep_opt *opt)
+static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, 
struct grep_opt *opt)
 {
switch (pattern_type) {
case GREP_PATTERN_TYPE_UNSPECIFIED:
@@ -203,6 +193,16 @@ void grep_set_pattern_type_option(enum grep_pattern_type 
pattern_type, struct gr
}
 }

+void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct 
grep_opt *opt)
+{
+   if (pattern_type != GREP_PATTERN_TYPE_UNSPECIFIED)
+   grep_set_pattern_type_option(pattern_type, opt);
+   else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
+   grep_set_pattern_type_option(opt->pattern_type_option, opt);
+   else if (opt->extended_regexp_option)
+   grep_set_pattern_type_option(GREP_PATTERN_TYPE_ERE, opt);
+}
+
 static struct grep_pat *create_grep_pat(const char *pat, size_t patlen,
const char *origin, int no,
enum grep_pat_token t,
diff --git a/grep.h b/grep.h
index 95f197a..9e9ec5f 100644
--- a/grep.h
+++ b/grep.h
@@ -144,7 +144,6 @@ struct grep_opt {
 extern void init_grep_defaults(void);

Re: [PATCH 1/2] git-submodule: forward exit code of git-submodule--helper more faithfully

2016-07-22 Thread Stefan Beller

On Fri, Jul 22, 2016 at 12:14 PM, Johannes Sixt  wrote:
> git-submodule--helper is invoked as the upstream of a pipe in several
> places. Usually, the failure of a program in this position is not
> detected by the shell. For this reason, the code inserts a token in the
> output stream when git-submodule--helper fails that is detected
> downstream, where the shell script is quit with exit code 1.
>
> There happens to be a bug in git-submodule--helper that leads to a
> segmentation fault. The test suite triggers the crash in several places,
> all of which are protected by 'test_must_fail'. But due to the inspecific
> exit code 1, the crash remains undiagnosed.
>
> Extend the failure protocol such that git-submodule--helper's exit code
> is passed downstream (only in the case of failure). This enables the
> downstream to use it as its own exit code, and 'test_must_fail' to
> identify the segmentation fault as an unexpected failure.
>
> The bug itself is fixed in the next commit.
>
> Signed-off-by: Johannes Sixt 
> ---
> When you run ./t7400-submodule-basic.sh -v, you will notice this output:
>
> fatal: destination path '/home/jsixt/Src/git/git/t/trash 
> directory.t7400-submodule-basic/init' already exists and is not an empty 
> directory.
> fatal: clone of './.subrepo' into submodule path 
> '/home/jsixt/Src/git/git/t/trash directory.t7400-submodule-basic/init' failed
> Failed to clone 'init'. Retry scheduled
> fatal: destination path '/home/jsixt/Src/git/git/t/trash 
> directory.t7400-submodule-basic/init' already exists and is not an empty 
> directory.
> fatal: clone of './.subrepo' into submodule path 
> '/home/jsixt/Src/git/git/t/trash directory.t7400-submodule-basic/init' failed
> /home/jsixt/Src/git/git/git-submodule: line 494: 21757 Segmentation fault 
>  git submodule--helper update-clone ${GIT_QUIET:+--quiet} 
> ${wt_prefix:+--prefix "$wt_prefix"} ${prefix:+--recursive-prefix "$prefix"} 
> ${update:+--update "$update"} ${reference:+--reference "$reference"} 
> ${depth:+--depth "$depth"} ${recommend_shallow:+"$recommend_shallow"} 
> ${jobs:+$jobs} "$@"
> ok 32 - update should fail when path is used by a file
>
> Note the segmentation fault. This mini-series addresses the issue.

The segfault has been addressed in
http://thread.gmane.org/gmane.comp.version-control.git/25
but received no attention yet.
The propagation of the exit code makes sense nevertheless.

Thanks!


>
> Noticed on Windows because it "visualizes" segfaults even for
> command line programs.
>
>  git-submodule.sh| 22 +++---
>  t/t5815-submodule-protos.sh |  4 ++--
>  t/t7400-submodule-basic.sh  |  4 ++--
>  3 files changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/git-submodule.sh b/git-submodule.sh
> index 4ec7546..0a0e12d 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -49,7 +49,7 @@ die_if_unmatched ()
>  {
> if test "$1" = "#unmatched"
> then
> -   exit 1
> +   exit ${2:-1}
> fi
>  }
>
> @@ -312,11 +312,11 @@ cmd_foreach()
>
> {
> git submodule--helper list --prefix "$wt_prefix" ||
> -   echo "#unmatched"
> +   echo "#unmatched" $?
> } |
> while read mode sha1 stage sm_path
> do
> -   die_if_unmatched "$mode"
> +   die_if_unmatched "$mode" "$sha1"
> if test -e "$sm_path"/.git
> then
> displaypath=$(git submodule--helper relative-path 
> "$prefix$sm_path" "$wt_prefix")
> @@ -423,11 +423,11 @@ cmd_deinit()
>
> {
> git submodule--helper list --prefix "$wt_prefix" "$@" ||
> -   echo "#unmatched"
> +   echo "#unmatched" $?
> } |
> while read mode sha1 stage sm_path
> do
> -   die_if_unmatched "$mode"
> +   die_if_unmatched "$mode" "$sha1"
> name=$(git submodule--helper name "$sm_path") || exit
>
> displaypath=$(git submodule--helper relative-path "$sm_path" 
> "$wt_prefix")
> @@ -581,12 +581,12 @@ cmd_update()
> ${depth:+--depth "$depth"} \
> ${recommend_shallow:+"$recommend_shallow"} \
> ${jobs:+$jobs} \
> -   "$@" || echo "#unmatched"
> +   "$@" || echo "#unmatched" $?
> } | {
> err=
> while read mode sha1 stage just_cloned sm_path
> do
> -   die_if_unmatched "$mode"
> +   die_if_unmatched "$mode" "$sha1"
>
> name=$(git submodule--helper name "$sm_path") || exit
> url=$(git config submodule."$name".url)
> @@ -994,11 +994,11 @@ cmd_status()
>
> {
> git submodule--helper list --prefix "$wt_prefix" "$@" ||
> -   echo "#unmatched"
> +   echo "#unmatched" $?
> } |
> while read mode sha1 stage sm_path
> do
> -   die_i

Re: [PATCH ew/daemon-socket-keepalive] Windows: add missing definition of ENOTSOCK

2016-07-22 Thread Junio C Hamano

Johannes Schindelin  writes:

> Hi Hannes,
>
> On Thu, 21 Jul 2016, Johannes Sixt wrote:
>
>> The previous commit introduced the first use of ENOTSOCK. This macro is
>> not available on Windows. Define it as WSAENOTSOCK because that is the
>> corresponding error value reported by the Windows versions of socket
>> functions.
>
> Thanks for catching this early.

(is that an acked/reviewed-by?  it is OK if it is not).

Yeah, thanks, both.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git-prompt.sh incompatible with non-basic global grep.patternType

2016-07-22 Thread Jeff King

On Fri, Jul 22, 2016 at 12:21:31PM -0700, Junio C Hamano wrote:

> One thing that I noticed is that there is this strange field
> in grep_opt called .extended_regexp_option; it only is set from the
> boolean configuration grep.extendedregexp and worse yet it takes
> precedence over the command line option or grep.patternType, e.g.
> t7810-grep expects crazy things like these:
> 
>  * "git -c grep.extendedregexp=true -c grep.patterntype=basic grep"
>wants to be like "git grep -E"
> 
>  * "git -c grep.extendedregexp=false -c grep.patterntype=extended grep"
>wants to be like "git grep -G"
> 
> This comes from b22520a3 (grep: allow -E and -n to be turned on by
> default via configuration, 2011-03-30) back when we didn't have a
> more generic grep.patternType configuration mechanism in v1.7.5
> days, and it probably need to be deprecated to maintain our sanity.
> I.e. when we see the configuration used, first we warn the user and
> set grep.patternType to extended instead, and then eventually error
> out in a backward-compatibility breaking release of Git we will make
> in some future date, together with things like other compatibility
> breaking topics like ex/deprecate-empty-pathspec-as-match-all.
> 
> But that is a separate topic after this fix goes in anyway.

I am not even sure we need to deprecate it. Once it becomes merely a
historical synonym for "grep.patternType=extended" we can live with it
indefinitely (and I do not think we need a deprecation period to go
there; the existing behavior is simply buggy).

Not that I mind eventually removing it, if you want to go through the
steps.

> -- >8 --
> From: Junio C Hamano 
> Date: Fri, 22 Jul 2016 11:43:14 -0700
> Subject: [PATCH] grep: further simplify setting the pattern type
> [...]

Thanks. This matches the cursory analysis I had done earlier, and the
patch looks exactly as I had expected.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] submodule-helper: fix indexing in clone retry error reporting path

2016-07-22 Thread Stefan Beller

On Fri, Jul 22, 2016 at 12:15 PM, Johannes Sixt  wrote:
> 'git submodule--helper update-clone' has logic to retry failed clones
> a second time. For this purpose, there is a list of submodules to clone,
> and a second list that is filled with the submodules to retry. Within
> these lists, the submodules are identified by an index as if both lists
> were just appended.
>
> This works nicely except when the second clone attempt fails as well. To
> report an error, the identifying index must be adjusted by an offset so
> that it can be used as an index into the second list. However, the
> calculation uses the logical total length of the lists so that the result
> always points one past the end of the second list.
>
> Pick the correct index.
>
> Signed-off-by: Johannes Sixt 
> ---
>  builtin/submodule--helper.c | 2 +-
>  t/t5815-submodule-protos.sh | 4 ++--
>  t/t7400-submodule-basic.sh  | 4 ++--
>  3 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index b22352b..6f6d67a 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -795,7 +795,7 @@ static int update_clone_task_finished(int result,
> suc->failed_clones[suc->failed_clones_nr++] = ce;
> return 0;
> } else {
> -   idx = suc->current - suc->list.nr;
> +   idx -= suc->list.nr;

The fix is the same as in
http://thread.gmane.org/gmane.comp.version-control.git/25
There we have an additional check, which may make sense to use here as well,
specifically when having the patch 1 which propagates the exit code.

The approach to tests is different though. I like yours better than mine,
as it doesn't add more tests, but strengthens existing tests.

> ce  = suc->failed_clones[idx];
> strbuf_addf(err, _("Failed to clone '%s' a second time, 
> aborting"),
> ce->name);
> diff --git a/t/t5815-submodule-protos.sh b/t/t5815-submodule-protos.sh
> index 112cf40..06f55a1 100755
> --- a/t/t5815-submodule-protos.sh
> +++ b/t/t5815-submodule-protos.sh
> @@ -18,7 +18,7 @@ test_expect_success 'setup repository with submodules' '
> git commit -m "add submodules"
>  '
>
> -test_expect_failure 'clone with recurse-submodules fails' '
> +test_expect_success 'clone with recurse-submodules fails' '
> test_must_fail git clone --recurse-submodules . dst
>  '
>
> @@ -32,7 +32,7 @@ test_expect_success 'update of ssh allowed' '
> git -C dst submodule update ssh-module
>  '
>
> -test_expect_failure 'update of ext not allowed' '
> +test_expect_success 'update of ext not allowed' '
> test_must_fail git -C dst submodule update ext-module
>  '
>
> diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
> index 7c8b90b..b77cce8 100755
> --- a/t/t7400-submodule-basic.sh
> +++ b/t/t7400-submodule-basic.sh
> @@ -352,7 +352,7 @@ test_expect_success 'sync should fail with unknown 
> submodule' '
> test_failure_with_unknown_submodule sync
>  '
>
> -test_expect_failure 'update should fail when path is used by a file' '
> +test_expect_success 'update should fail when path is used by a file' '
> echo hello >expect &&
>
> echo "hello" >init &&
> @@ -361,7 +361,7 @@ test_expect_failure 'update should fail when path is used 
> by a file' '
> test_cmp expect init
>  '
>
> -test_expect_failure 'update should fail when path is used by a nonempty 
> directory' '
> +test_expect_success 'update should fail when path is used by a nonempty 
> directory' '
> echo hello >expect &&
>
> rm -fr init &&
> --
> 2.9.0.443.ga8520ad
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] diff: do not reuse worktree files that need "clean" conversion

2016-07-22 Thread Junio C Hamano

Jeff King  writes:

> So I think we actually _can_ think of would_convert_to_git() as
> basically free. Or as free as other efficient-lookup functions we call
> like cache_name_pos().

OK.

> The thought "does this tree reuse even speed things up enough to justify
> its complexity" definitely crossed my mind. It's basically swapping
> open/mmap for zlib inflating the content.
>
> But I do think it helps in the "want_file" case (i.e., when we are
> writing out a tempfile for an external command via prepare_temp_file()).
> There it helps us omit writing a tempfile to disk entirely, including
> any possible conversion.

OK.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH ew/daemon-socket-keepalive] Windows: add missing definition of ENOTSOCK

2016-07-22 Thread Eric Wong

Johannes Sixt  wrote:
> The previous commit introduced the first use of ENOTSOCK. This macro is
> not available on Windows. Define it as WSAENOTSOCK because that is the
> corresponding error value reported by the Windows versions of socket
> functions.

Thanks.   I thought compat/poll/poll.c already used ENOTSOCK
but I was mislead by the #ifdefs :x
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] submodule-helper: fix indexing in clone retry error reporting path

2016-07-22 Thread Junio C Hamano

Johannes Sixt  writes:

> 'git submodule--helper update-clone' has logic to retry failed clones
> a second time. For this purpose, there is a list of submodules to clone,
> and a second list that is filled with the submodules to retry. Within
> these lists, the submodules are identified by an index as if both lists
> were just appended.
>
> This works nicely except when the second clone attempt fails as well. To
> report an error, the identifying index must be adjusted by an offset so
> that it can be used as an index into the second list. However, the
> calculation uses the logical total length of the lists so that the result
> always points one past the end of the second list.
>
> Pick the correct index.
>
> Signed-off-by: Johannes Sixt 
> ---

Looks very similar to

  http://public-inbox.org/git/20160721013923.17435-1-sbeller%40google.com/

but these two patch series looks more thorough.

I expect I'd queue these two instead, after seeing Acks from Stefan.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] submodule-helper: fix indexing in clone retry error reporting path

2016-07-22 Thread Stefan Beller

On Fri, Jul 22, 2016 at 12:39 PM, Junio C Hamano  wrote:
> Johannes Sixt  writes:
>
>> 'git submodule--helper update-clone' has logic to retry failed clones
>> a second time. For this purpose, there is a list of submodules to clone,
>> and a second list that is filled with the submodules to retry. Within
>> these lists, the submodules are identified by an index as if both lists
>> were just appended.
>>
>> This works nicely except when the second clone attempt fails as well. To
>> report an error, the identifying index must be adjusted by an offset so
>> that it can be used as an index into the second list. However, the
>> calculation uses the logical total length of the lists so that the result
>> always points one past the end of the second list.
>>
>> Pick the correct index.
>>
>> Signed-off-by: Johannes Sixt 
>> ---
>
> Looks very similar to
>
>   http://public-inbox.org/git/20160721013923.17435-1-sbeller%40google.com/
>
> but these two patch series looks more thorough.
>
> I expect I'd queue these two instead, after seeing Acks from Stefan.
>
> Thanks.

Sure. Please take these 2 patches instead of mine.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 1/6] doc/rev-list-options: clarify "commit@{Nth}" for "-g" option

2016-07-22 Thread Jeff King

When "log -g" shows "HEAD@{1}", "HEAD@{2}", etc, calling
that "commit@{Nth}" is not really accurate. The "HEAD" part
is really the refname. By saying "commit", a reader may
misunderstand that to mean something related to the specific
commit we are showing, not the ref whose reflog we are
traversing.

While we're here, let's also switch these instances to use
literal backticks, as our style guide recommends. As a
bonus, that lets us drop some asciidoc quoting.

Signed-off-by: Jeff King 
---
 Documentation/rev-list-options.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index c5bd218..1b56253 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -252,9 +252,9 @@ list.
 +
 With `--pretty` format other than `oneline` (for obvious reasons),
 this causes the output to have two extra lines of information
-taken from the reflog.  By default, 'commit@\{Nth}' notation is
+taken from the reflog.  By default, `ref@{Nth}` notation is
 used in the output.  When the starting commit is specified as
-'commit@\{now}', output also uses 'commit@\{timestamp}' notation
+`ref@{now}`, output also uses `ref@{timestamp}` notation
 instead.  Under `--pretty=oneline`, the commit message is
 prefixed with this information on the same line.
 This option cannot be combined with `--reverse`.
-- 
2.9.2.512.gc1ef750

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 0/6] reflog docs and date-formatting

2016-07-22 Thread Jeff King

This is a repost of the patches in:

  http://thread.gmane.org/gmane.comp.version-control.git/299201/focus=299236

since I don't think they got picked up at all.

The contents are the same, but with one extra patch (the 4th) that was
posted mid-discussion.

The only other review comment is that patch 5 might want to editorialize
the weirdness of "raw-local" more. I think I'm comfortable leaving it
as-is based on my response in:

  http://article.gmane.org/gmane.comp.version-control.git/299312

but I could be persuaded otherwise.

The final patch, "--date=unix" doesn't _really_ solve Ted's problem,
because there's still no way to give multiple date formats in the same
line, nor use "%gd" as both the reflog index and as the date. So I'm
assuming he'll still build "%gt" or similar on top of this, and only in
the long run will we get to "%(reflog-date:unix)" or whatever.

So I think it's still valuable as a minor feature for other formatting
needs, and as a potential building block for later. The earlier patches
are all immediately useful as clarifications.

  [1/6]: doc/rev-list-options: clarify "commit@{Nth}" for "-g" option
  [2/6]: doc/rev-list-options: explain "-g" output formats
  [3/6]: doc/pretty-formats: describe index/time formats for %gd
  [4/6]: doc/pretty-formats: explain shortening of %gd
  [5/6]: date: document and test "raw-local" mode
  [6/6]: date: add "unix" format

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git-prompt.sh incompatible with non-basic global grep.patternType

2016-07-22 Thread Junio C Hamano

Jeff King  writes:

>> This comes from b22520a3 (grep: allow -E and -n to be turned on by
>> default via configuration, 2011-03-30) back when we didn't have a
>> more generic grep.patternType configuration mechanism in v1.7.5
>> days, and it probably need to be deprecated to maintain our sanity.
>> ...
> I am not even sure we need to deprecate it. Once it becomes merely a
> historical synonym for "grep.patternType=extended" we can live with it
> indefinitely (and I do not think we need a deprecation period to go
> there; the existing behavior is simply buggy).

I grossed over an important detail.

Pretending as if grep.patternType=extended were given when we see
grep.extendedregexp=true and grep.patternType=basic is given when
grep.extendedregexp=false changes the behaviour in a way that can be
seen as the violation of (crazy) expectations t7810 makes.

Any user who depends on that crazy expectation will be broken by
such a change, even if we do not deprecate and remove the
configuration variable.

 grep.c | 4 ++--
 grep.h | 1 -
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/grep.c b/grep.c
index 394c856..7b1d423 100644
--- a/grep.c
+++ b/grep.c
@@ -73,9 +73,9 @@ int grep_config(const char *var, const char *value, void *cb)
 
if (!strcmp(var, "grep.extendedregexp")) {
if (git_config_bool(var, value))
-   opt->extended_regexp_option = 1;
+   opt->pattern_type_option = GREP_PATTERN_ERE;
else
-   opt->extended_regexp_option = 0;
+   opt->pattern_type_option = GREP_PATTERN_BRE;
return 0;
}
 
diff --git a/grep.h b/grep.h
index cee4357..fc36c2a 100644
--- a/grep.h
+++ b/grep.h
@@ -119,7 +119,6 @@ struct grep_opt {
int max_depth;
int funcname;
int funcbody;
-   int extended_regexp_option;
int pattern_type_option;
char color_context[COLOR_MAXLEN];
char color_filename[COLOR_MAXLEN];
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 6/6] date: add "unix" format

2016-07-22 Thread Jeff King

We already have "--date=raw", which is a Unix epoch
timestamp plus a contextual timezone (either the author's or
the local). But one may not care about the timezone and just
want the epoch timestamp by itself. It's not hard to parse
the two apart, but if you are using a pretty-print format,
you may want git to show the "finished" form that the user
will see.

We can accomodate this by adding a new date format, "unix",
which is basically "raw" without the timezone.

Signed-off-by: Jeff King 
---
 Documentation/rev-list-options.txt | 4 
 builtin/blame.c| 3 +++
 cache.h| 3 ++-
 date.c | 8 
 t/t0006-date.sh| 2 ++
 5 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 3ec75d4..8b1c946 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -751,6 +751,10 @@ format. Note that the `-local` option does not affect the
 seconds-since-epoch value (which is always measured in UTC), but does
 switch the accompanying timezone value.
 +
+`--date=unix` shows the date as a Unix epoch timestamp (seconds since
+1970).  As with `--raw`, this is always in UTC and therefore `-local`
+has no effect.
++
 `--date=format:...` feeds the format `...` to your system `strftime`.
 Use `--date=format:%c` to show the date in your system locale's
 preferred format.  See the `strftime` manual for a complete list of
diff --git a/builtin/blame.c b/builtin/blame.c
index 8fec0e1..cb08127 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2625,6 +2625,9 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
case DATE_RAW:
blame_date_width = sizeof("1161298804 -0700");
break;
+   case DATE_UNIX:
+   blame_date_width = sizeof("1161298804");
+   break;
case DATE_SHORT:
blame_date_width = sizeof("2006-10-19");
break;
diff --git a/cache.h b/cache.h
index 2bf97cc..dd587b2 100644
--- a/cache.h
+++ b/cache.h
@@ -1224,7 +1224,8 @@ struct date_mode {
DATE_ISO8601_STRICT,
DATE_RFC2822,
DATE_STRFTIME,
-   DATE_RAW
+   DATE_RAW,
+   DATE_UNIX
} type;
const char *strftime_fmt;
int local;
diff --git a/date.c b/date.c
index 4c7aa9b..a996331 100644
--- a/date.c
+++ b/date.c
@@ -177,6 +177,12 @@ const char *show_date(unsigned long time, int tz, const 
struct date_mode *mode)
struct tm *tm;
static struct strbuf timebuf = STRBUF_INIT;
 
+   if (mode->type == DATE_UNIX) {
+   strbuf_reset(&timebuf);
+   strbuf_addf(&timebuf, "%lu", time);
+   return timebuf.buf;
+   }
+
if (mode->local)
tz = local_tzoffset(time);
 
@@ -792,6 +798,8 @@ static enum date_mode_type parse_date_type(const char 
*format, const char **end)
return DATE_NORMAL;
if (skip_prefix(format, "raw", end))
return DATE_RAW;
+   if (skip_prefix(format, "unix", end))
+   return DATE_UNIX;
if (skip_prefix(format, "format", end))
return DATE_STRFTIME;
 
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 482fec0..c0c9108 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -46,8 +46,10 @@ check_show rfc2822 "$TIME" 'Wed, 15 Jun 2016 16:13:20 +0200'
 check_show short "$TIME" '2016-06-15'
 check_show default "$TIME" 'Wed Jun 15 16:13:20 2016 +0200'
 check_show raw "$TIME" '146600 +0200'
+check_show unix "$TIME" '146600'
 check_show iso-local "$TIME" '2016-06-15 14:13:20 +'
 check_show raw-local "$TIME" '146600 +'
+check_show unix-local "$TIME" '146600'
 
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
-- 
2.9.2.512.gc1ef750
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 5/6] date: document and test "raw-local" mode

2016-07-22 Thread Jeff King

The "raw" format shows a Unix epoch timestamp, but with a
timezone tacked on. The timestamp is not _in_ that zone, but
it is extra information about the time (by default, the zone
the author was in).

The documentation claims that "raw-local" does not work. It
does, but the end result is rather subtle. Let's describe it
in better detail, and test to make sure it works (namely,
the epoch time doesn't change, but the zone does).

While we are rewording the documentation in this area, let's
not use the phrase "does not work" for the remaining option,
"--relative". It's vague; do we accept it or not? We do
accept it, but it has no effect (which is a reasonable
outcome).

Signed-off-by: Jeff King 
---
 Documentation/rev-list-options.txt | 9 ++---
 t/t0006-date.sh| 1 +
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 5d1de06..3ec75d4 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -725,8 +725,8 @@ include::pretty-options.txt[]
`iso-local`), the user's local time zone is used instead.
 +
 `--date=relative` shows dates relative to the current time,
-e.g. ``2 hours ago''. The `-local` option cannot be used with
-`--raw` or `--relative`.
+e.g. ``2 hours ago''. The `-local` option has no effect for
+`--relative`.
 +
 `--date=local` is an alias for `--date=default-local`.
 +
@@ -746,7 +746,10 @@ format, often found in email messages.
 +
 `--date=short` shows only the date, but not the time, in `-MM-DD` format.
 +
-`--date=raw` shows the date in the internal raw Git format `%s %z` format.
+`--date=raw` shows the date in the internal raw Git format `%s %z`
+format. Note that the `-local` option does not affect the
+seconds-since-epoch value (which is always measured in UTC), but does
+switch the accompanying timezone value.
 +
 `--date=format:...` feeds the format `...` to your system `strftime`.
 Use `--date=format:%c` to show the date in your system locale's
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 4c8cf58..482fec0 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -47,6 +47,7 @@ check_show short "$TIME" '2016-06-15'
 check_show default "$TIME" 'Wed Jun 15 16:13:20 2016 +0200'
 check_show raw "$TIME" '146600 +0200'
 check_show iso-local "$TIME" '2016-06-15 14:13:20 +'
+check_show raw-local "$TIME" '146600 +'
 
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
-- 
2.9.2.512.gc1ef750

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 2/6] doc/rev-list-options: explain "-g" output formats

2016-07-22 Thread Jeff King

We document that asking for HEAD@{now} will switch the
output to show HEAD@{timestamp}, but not that specifying
`--date` has a similar effect, or that it can be overridden
with HEAD@{0}. Let's do so.

These rules come from 794151e (reflog-walk: always make
HEAD@{0} show indexed selectors, 2012-05-04), though that is
simply the culmination of years of these heuristics growing
organically.

Signed-off-by: Jeff King 
---
 Documentation/rev-list-options.txt | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 1b56253..5d1de06 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -252,10 +252,25 @@ list.
 +
 With `--pretty` format other than `oneline` (for obvious reasons),
 this causes the output to have two extra lines of information
-taken from the reflog.  By default, `ref@{Nth}` notation is
-used in the output.  When the starting commit is specified as
-`ref@{now}`, output also uses `ref@{timestamp}` notation
-instead.  Under `--pretty=oneline`, the commit message is
+taken from the reflog.  The reflog designator in the output may be shown
+as `ref@{Nth}` (where `Nth` is the reverse-chronological index in the
+reflog) or as `ref@{timestamp}` (with the timestamp for that entry),
+depending on a few rules:
++
+--
+1. If the starting point is specified as `ref@{Nth}`, show the index
+format.
++
+2. If the starting point was specified as `ref@{now}`, show the
+timestamp format.
++
+3. If neither was used, but `--date` was given on the command line, show
+the timestamp in the format requested by `--date`.
++
+4. Otherwise, show the index format.
+--
++
+Under `--pretty=oneline`, the commit message is
 prefixed with this information on the same line.
 This option cannot be combined with `--reverse`.
 See also linkgit:git-reflog[1].
-- 
2.9.2.512.gc1ef750

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 3/6] doc/pretty-formats: describe index/time formats for %gd

2016-07-22 Thread Jeff King

The "reflog selector" format changes based on a series of
heuristics, and that applies equally to both stock "log -g"
output, as well as "--format=%gd". The documentation for
"%gd" doesn't cover this. Let's mention the multiple formats
and refer the user back to the "-g" section for the complete
rules.

Signed-off-by: Jeff King 
---
 Documentation/pretty-formats.txt | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 29b19b9..36a300a 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -147,8 +147,11 @@ endif::git-rev-list[]
   "U" for a good signature with unknown validity and "N" for no signature
 - '%GS': show the name of the signer for a signed commit
 - '%GK': show the key used to sign a signed commit
-- '%gD': reflog selector, e.g., `refs/stash@{1}`
-- '%gd': shortened reflog selector, e.g., `stash@{1}`
+- '%gD': reflog selector, e.g., `refs/stash@{1}` or
+  `refs/stash@{2 minutes ago`}; the format follows the rules described
+  for the `-g` option
+- '%gd': shortened reflog selector, e.g., `stash@{1}` or
+  `stash@{2 minutes ago}`
 - '%gn': reflog identity name
 - '%gN': reflog identity name (respecting .mailmap, see
   linkgit:git-shortlog[1] or linkgit:git-blame[1])
-- 
2.9.2.512.gc1ef750

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2 4/6] doc/pretty-formats: explain shortening of %gd

2016-07-22 Thread Jeff King

The actual shortening rules aren't that interesting and
probably not worth getting into (I gloss over them here as
"shortened for human readability"). But the fact that %gD
shows whatever you gave on the command line is subtle and
worth mentioning. Since most people will feed a shortened
refname in the first place, it otherwise makes it hard to
understand the difference between the two.

Signed-off-by: Jeff King 
---
 Documentation/pretty-formats.txt | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 36a300a..b95d67e 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -149,9 +149,12 @@ endif::git-rev-list[]
 - '%GK': show the key used to sign a signed commit
 - '%gD': reflog selector, e.g., `refs/stash@{1}` or
   `refs/stash@{2 minutes ago`}; the format follows the rules described
-  for the `-g` option
-- '%gd': shortened reflog selector, e.g., `stash@{1}` or
-  `stash@{2 minutes ago}`
+  for the `-g` option. The portion before the `@` is the refname as
+  given on the command line (so `git log -g refs/heads/master` would
+  yield `refs/heads/master@{0}`).
+- '%gd': shortened reflog selector; same as `%gD`, but the refname
+  portion is shortened for human readability (so `refs/heads/master`
+  becomes just `master`).
 - '%gn': reflog identity name
 - '%gN': reflog identity name (respecting .mailmap, see
   linkgit:git-shortlog[1] or linkgit:git-blame[1])
-- 
2.9.2.512.gc1ef750

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] submodule-helper: fix indexing in clone retry error reporting path

2016-07-22 Thread Junio C Hamano

Stefan Beller  writes:

> The approach to tests is different though. I like yours better than mine,
> as it doesn't add more tests, but strengthens existing tests.

So... are you retracting
http://thread.gmane.org/gmane.comp.version-control.git/25 and
instead giving an Ack to these two?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] submodule-helper: fix indexing in clone retry error reporting path

2016-07-22 Thread Junio C Hamano

Stefan Beller  writes:

>> I expect I'd queue these two instead, after seeing Acks from Stefan.
>>
>> Thanks.
>
> Sure. Please take these 2 patches instead of mine.

OK, I've already queued yours but I haven't started today's
integration cycle yet (I first pick up new topics and updates and
review them in isolation, and then re-review them after rebuilding
jch/pu branches in a separate phase), so I'll replace it with these
two patches.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] submodule-helper: fix indexing in clone retry error reporting path

2016-07-22 Thread Stefan Beller

On Fri, Jul 22, 2016 at 12:52 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> The approach to tests is different though. I like yours better than mine,
>> as it doesn't add more tests, but strengthens existing tests.
>
> So... are you retracting
> http://thread.gmane.org/gmane.comp.version-control.git/25 and
> instead giving an Ack to these two?
>

I like this series better
* for the approach
* for the tests
* for the commit message

So I do think this should be applied instead of what I sent.

I am tempted to send a squash proposal like:
(broken whitespaces)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6f6d67a..77be97e 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -796,6 +796,8 @@ static int update_clone_task_finished(int result,
  return 0;
  } else {
  idx -= suc->list.nr;
+ if (idx >= suc->failed_clones_nr)
+ die("BUG: idx too large???");
  ce  = suc->failed_clones[idx];
  strbuf_addf(err, _("Failed to clone '%s' a second time, aborting"),
 ce->name);
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git-prompt.sh incompatible with non-basic global grep.patternType

2016-07-22 Thread Jeff King

On Fri, Jul 22, 2016 at 12:51:00PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> >> This comes from b22520a3 (grep: allow -E and -n to be turned on by
> >> default via configuration, 2011-03-30) back when we didn't have a
> >> more generic grep.patternType configuration mechanism in v1.7.5
> >> days, and it probably need to be deprecated to maintain our sanity.
> >> ...
> > I am not even sure we need to deprecate it. Once it becomes merely a
> > historical synonym for "grep.patternType=extended" we can live with it
> > indefinitely (and I do not think we need a deprecation period to go
> > there; the existing behavior is simply buggy).
> 
> I grossed over an important detail.
> 
> Pretending as if grep.patternType=extended were given when we see
> grep.extendedregexp=true and grep.patternType=basic is given when
> grep.extendedregexp=false changes the behaviour in a way that can be
> seen as the violation of (crazy) expectations t7810 makes.
> 
> Any user who depends on that crazy expectation will be broken by
> such a change, even if we do not deprecate and remove the
> configuration variable.

Ah. Reading 84befcd (grep: add a grep.patternType configuration setting,
2012-08-03) explains the rules, although I agree they are crazy (mostly
because "basic" is different "fixed").

So I think there are two crazy things going on:

  1. grep.extendedregexp takes precedence over the command-line (or at
 least "--basic" on the command line).

  2. The weird rules in 84befcd, where patternType=fixed has higher
 precedence than extendedRegexp, but patternType=basic does not.

It seems like (1) is a bug, and one we should not have to worry about
compatibility while fixing. It stems from not telling the difference
between "the user asked for nothing, so we defaulted to basic" and "the
user explicitly asked for basic".

I think (2) _is_ kind of crazy, and stems from similar confusion. I
dunno. Part of me wants to say "I find it highly unlikely that anybody
ever actually relied on this, and therefore it is OK to bring it closer
to sanity without a deprecation period". But I admit that is just a gut
feeling.

But even that is a lesser breakage than taking away grep.extendedRegexp
entirely. Taking it away breaks anybody who uses it; tweaking (2) only
breaks people who set both config variables. But why would anyone do
that?

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] contrib/persistent-https: use Git version for build label

2016-07-22 Thread Parker Moore

> We tend to avoid catting a single file only to pipe the result into a 
> different command

You got it. Here's a new patch:

>From 432c0054a28a6c91b9fc1d9b53714061cb3d903c Mon Sep 17 00:00:00 2001
From: Parker Moore 
Date: Wed, 20 Jul 2016 18:33:28 -0600
Subject: [PATCH] contrib/persistent-https: use Git version for build label

The previous method simply used the UNIX timestamp of when the binary was
built as its build label.

$ make && ./git-remote-persistent-http -print_label
1469061546

This patch aims to align the label for this binary with the Git version
contained in the GIT-VERSION-FILE. This gives a better sense of the version
of the binary as it can be mapped to a particular revision or release of
Git itself. For example:

$ make && ./git-remote-persistent-http -print_label
2.9.1.275.g75676c8

Discussion of this patch is available on a related thread in the mailing
list surrounding this package called "contrib/persistent-https: update
ldflags syntax for Go 1.7+". The gmane.org link is:
http://article.gmane.org/gmane.comp.version-control.git/299653/

Signed-off-by: Parker Moore 
---
 contrib/persistent-https/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/persistent-https/Makefile
b/contrib/persistent-https/Makefile
index 92baa3b..9da1779 100644
--- a/contrib/persistent-https/Makefile
+++ b/contrib/persistent-https/Makefile
@@ -12,7 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-BUILD_LABEL=$(shell date +"%s")
+BUILD_LABEL=$(shell cut -d" " -f3 ../../GIT-VERSION-FILE)
 TAR_OUT=$(shell go env GOOS)_$(shell go env GOARCH).tar.gz

 all: git-remote-persistent-https git-remote-persistent-https--proxy \
-- 
2.9.2
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] Documentation: pack-protocol correct NAK response

2016-07-22 Thread Stefan Beller

In the transport protocol we use NAK to signal the non existence of a
common base, so fix the documentation. This helps readers of the document,
as they don't have to wonder about the difference between NAK and NACK.
As NACK is used in git archive and upload-archive, this is easy to get
wrong.

Signed-off-by: Stefan Beller 
---
 Documentation/technical/pack-protocol.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index 8b36343..d40ab65 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -307,7 +307,7 @@ In multi_ack mode:
 ready to make a packfile, it will blindly ACK all 'have' obj-ids
 back to the client.
 
-  * the server will then send a 'NACK' and then wait for another response
+  * the server will then send a 'NAK' and then wait for another response
 from the client - either a 'done' or another list of 'have' lines.
 
 In multi_ack_detailed mode:
-- 
2.9.2.370.g4a59376.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Documentation: pack-protocol correct NAK response

2016-07-22 Thread Junio C Hamano

On Fri, Jul 22, 2016 at 1:28 PM, Stefan Beller  wrote:
> In the transport protocol we use NAK to signal the non existence of a
> common base, so fix the documentation.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git-prompt.sh incompatible with non-basic global grep.patternType

2016-07-22 Thread Junio C Hamano

Jeff King  writes:

> But even that is a lesser breakage than taking away grep.extendedRegexp
> entirely. Taking it away breaks anybody who uses it; tweaking (2) only
> breaks people who set both config variables. But why would anyone do
> that?

OK.  Now we have an evidence that we have thought things through
that we can point at when people later complain, I'd feel much safer
to apply that "This will break t7810" patch, together with changes
to t7810 to drop these crazy expectations ;-)

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] git-svn: allow --version to work anywhere

2016-07-22 Thread Eric Wong

Junio C Hamano  wrote:
> Eric Wong  writes:
> >   I'm hoping "cd /" in the test will always succeed;
> >   but I suppose non-*nix systems might fail, here.
> 
> How about digging a few levels of directory hierarchy, exporting
> GIT_CEILING_DIRECTORIES so that we won't find any repository and
> going there to run these tests?

Ah, thanks; I missed that.

Repushed my master in case it's a convenient time to pull.

The following changes since commit 29493589e97a2de0c4c1c314f61ccafaee3b5caf:

  archive-tar: huge offset and future timestamps would not work on 32-bit 
(2016-07-15 10:51:55 -0700)

are available in the git repository at:

  git://bogomips.org/git-svn.git master

for you to fetch changes up to c0071ae5dc1c610ab3791ece7ccf7d4772fde151:

  git-svn: allow --version to work anywhere (2016-07-22 20:38:11 +)


Eric Wong (2):
  git-svn: document svn.authorsProg in config
  git-svn: allow --version to work anywhere

 Documentation/git-svn.txt |  3 +++
 git-svn.perl  |  4 ++--
 t/t9100-git-svn-basic.sh  | 19 +++
 3 files changed, 24 insertions(+), 2 deletions(-)

-8<-
Subject: [PATCH] git-svn: allow --version to work anywhere

Checking the version of the installed SVN libraries should not
require a git repository at all.  This matches the behavior of
"git --version".

Add a test for "git svn help" for the same behavior while we're
at it, too.

Signed-off-by: Eric Wong 
---
 git-svn.perl |  4 ++--
 t/t9100-git-svn-basic.sh | 19 +++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index f609e54..4d41d22 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -339,7 +339,7 @@ if ($cmd && $cmd =~ /(?:clone|init|multi-init)$/) {
die "failed to open $ENV{GIT_DIR}: $!\n";
$ENV{GIT_DIR} = $1 if <$fh> =~ /^gitdir: (.+)$/;
}
-} else {
+} elsif ($cmd) {
my ($git_dir, $cdup);
git_cmd_try {
$git_dir = command_oneline([qw/rev-parse --git-dir/]);
@@ -356,7 +356,7 @@ if ($cmd && $cmd =~ /(?:clone|init|multi-init)$/) {
 
 my %opts = %{$cmd{$cmd}->[2]} if (defined $cmd);
 
-read_git_config(\%opts);
+read_git_config(\%opts) if $ENV{GIT_DIR};
 if ($cmd && ($cmd eq 'log' || $cmd eq 'blame')) {
Getopt::Long::Configure('pass_through');
 }
diff --git a/t/t9100-git-svn-basic.sh b/t/t9100-git-svn-basic.sh
index 28082b1..ab6593b 100755
--- a/t/t9100-git-svn-basic.sh
+++ b/t/t9100-git-svn-basic.sh
@@ -19,6 +19,25 @@ case "$GIT_SVN_LC_ALL" in
;;
 esac
 
+deepdir=nothing-above
+ceiling=$PWD
+
+test_expect_success 'git svn --version works anywhere' '
+   mkdir -p "$deepdir" && (
+   export GIT_CEILING_DIRECTORIES="$ceiling" &&
+   cd "$deepdir" &&
+   git svn --version
+   )
+'
+
+test_expect_success 'git svn help works anywhere' '
+   mkdir -p "$deepdir" && (
+   export GIT_CEILING_DIRECTORIES="$ceiling" &&
+   cd "$deepdir" &&
+   git svn help
+   )
+'
+
 test_expect_success \
 'initialize git svn' '
mkdir import &&
-- 
EW
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v1 0/3] Git filter protocol

2016-07-22 Thread Junio C Hamano

larsxschnei...@gmail.com writes:

> The first two patches are cleanup patches which are not really necessary
> for the feature.

These two looked trivially good.

I think I can agree with what 3/3 wants to do in principle, but

 * "protocol" is not quite the right word.  The current way to
   interact with clean and smudge filters can be considered using a
   different "protocol", that conveys the data and the options via
   the command line and pipe.  The most distinguishing feature that
   differentiates the old way and the new style this change allows
   is that it allows you to have a single instance of the process
   running that can be reused?

 * I am not sure what's the pros-and-cons in forcing people writing
   a single program that can do both cleaning and smudging.  You
   cannot have only "smudge" side that uses the long-running process
   while "clean" side that runs single-shot invocation with this
   design, which I'd imagine would be a downside.  If you are going
   to use a long-running process interface for both sides, this
   design allows you to do it with fewer number of processes, which
   may be an upside.

 * The way the serialized access to these long-running processes
   work in 3/3 would make it harder or impossible to later
   parallelize conversion?  I am imagining a far future where we
   would run "git checkout ." using (say) two threads, one
   responsible for active_cache[0..active_nr/2] and the other
   responsible for the remainder.

> You will notice that I do not check the exact number of "clean" filter
> invocations in the tests.

That is a good thing to do.  You shouldn't really care for the
proper operation of the feature, reducing the number of them would
be an independent topic (see the work of Peff earlier today), and we
may even find a need to make _more_ calls for correctness (again,
see the work of Peff earlier today -- to a person who wants to keep
the number of requests to the attribute system low, the change may
look like a regression, but it is necessary for the overall system;
you may find a similar need to running "clean" more for some need of
the overall system that you do not anticipate right now).

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH/RFC] fast-import: Keep a fake pack window on the recently written data

2016-07-22 Thread Mike Hommey

Hi,

Any thoughts on the following?

Mike

On Mon, Jul 04, 2016 at 08:44:39AM +0900, Mike Hommey wrote:
> The are many ways in which fast-import ends up calling gfi_unpack_entry,
> and fery few work-arounds. I've patched fast-import for it to be smarter
> in corner cases, allowing some additional work-arounds, but it's just
> too easy to fall on gfi_unpack_entry again, so I abandonned that path.
> 
> The problem with gfi_unpack_entry is that if something has been written
> to the pack after last time it was used, it closes all pack windows. On
> OSX, this triggers munmap, which shows up in performance profiles.
> 
> To give an idea how bad this is, here is how long it takes to clone
> https://hg.mozilla.org/mozilla-unified/ with the master branch of
> git-cinnabar (which uses fast-import) on a mac mini: more than 5 hours.
> I can't actually give the exact number, because it was killed, after
> spending 2 hours importing 1.77M files and 3 hours importing 120k
> manifests.
> 
> The same clone, with a variant of this patch, *finished* in 2 hours and
> 10 minutes, spending 24 minutes importing the same 1.77M files and only
> 13 minutes to cover the same 120k manifests. It took an hour and 20
> minutes to cover the remaining 210k manifests. You can imagine how long
> it would have taken without the patch if it hadn't been killed...
> 
> Now, this is proof of concept level. There are many things that are not
> right with this patch, starting from the fact it doesn't handle
> checkpoints, and isn't safe for every kind of integer overflows. Or
> malloc'ating exactly packed_git_window_size bytes (which on 64-bits
> systems, is 1GB), etc.
> 
> But it feels to me this kind of fake pack window is a cheap way to
> counter the slowness of munmap on OSX, although the fact that it's a
> hack around the pack code in sha1_file.c is not very nice. Maybe a
> better start to fix the issue would be to add better interfaces to
> the pack code to handle pack writers that can read at the same time.
> 
> Thoughts?
> 
> Past related discussions:
>   $gmane/291717
>   $gmane/273465
> ---
>  fast-import.c | 90 
> +--
>  1 file changed, 63 insertions(+), 27 deletions(-)
> 
> diff --git a/fast-import.c b/fast-import.c
> index c504ef7..4e26883 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -316,6 +316,7 @@ static struct atom_str **atom_table;
>  static struct pack_idx_option pack_idx_opts;
>  static unsigned int pack_id;
>  static struct sha1file *pack_file;
> +static struct pack_window *pack_win;
>  static struct packed_git *pack_data;
>  static struct packed_git **all_packs;
>  static off_t pack_size;
> @@ -862,6 +863,39 @@ static struct tree_content *dup_tree_content(struct 
> tree_content *s)
>   return d;
>  }
>  
> +static void _sha1write(struct sha1file *f, const void *buf, unsigned int 
> count)
> +{
> + sha1write(f, buf, count);
> + /* Always last used */
> + pack_win->last_used = -1;
> + pack_win->inuse_cnt = -1;
> +
> + pack_data->pack_size += count;
> +
> + if (packed_git_window_size - pack_win->len >= count) {
> + memcpy(pack_win->base + pack_win->len - 20, buf, count);
> + pack_win->len += count;
> + } else {
> + struct pack_window *cursor = NULL;
> + /* We're sliding the window, so we don't need to memcpy
> +  * everything. */
> + pack_win->offset += ((pack_win->len - 20 + count)
> +  / packed_git_window_size) * packed_git_window_size;
> + pack_win->len = count % packed_git_window_size -
> + (packed_git_window_size - pack_win->len);
> + memcpy(pack_win->base, buf + count - pack_win->len + 20,
> +pack_win->len - 20);
> +
> + /* Ensure a pack window on the data before that, otherwise,
> +  * use_pack() may try to create a window that overlaps with
> +  * this one, and that won't work because it won't be complete. 
> */
> + sha1flush(f);
> + use_pack(pack_data, &cursor,
> +  pack_win->offset - packed_git_window_size, NULL);
> + unuse_pack(&cursor);
> + }
> +}
> +
>  static void start_packfile(void)
>  {
>   static char tmp_file[PATH_MAX];
> @@ -873,15 +907,22 @@ static void start_packfile(void)
> "pack/tmp_pack_XX");
>   FLEX_ALLOC_STR(p, pack_name, tmp_file);
>   p->pack_fd = pack_fd;
> + p->pack_size = 20;
>   p->do_not_close = 1;
>   pack_file = sha1fd(pack_fd, p->pack_name);
>  
> + p->windows = pack_win = xcalloc(1, sizeof(*p->windows));
> + pack_win->offset = 0;
> + pack_win->len = 20;
> + pack_win->base = xmalloc(packed_git_window_size);
> + pack_win->next = NULL;
> +
>   hdr.hdr_signature = htonl(PACK_SIGNATURE);
>   hdr.hdr_version = htonl(2);
>   hdr.hdr_entries = 0;
> - sha1write(pa

Re: [PATCH v1 3/3] convert: add filter..useProtocol option

2016-07-22 Thread Torsten Bögershausen




On 07/22/2016 05:49 PM, larsxschnei...@gmail.com wrote:

From: Lars Schneider 

Git's clean/smudge mechanism invokes an external filter process for every
single blob that is affected by a filter. If Git filters a lot of blobs
then the startup time of the external filter processes can become a
significant part of the overall Git execution time.

This patch adds the filter..useProtocol option which, if enabled,
keeps the external filter process running and processes all blobs with
the following protocol over stdin/stdout.

1. Git starts the filter on first usage and expects a welcome message
with protocol version number:
Git <-- Filter: "git-filter-protocol\n"
Git <-- Filter: "version 1"

Is there no terminator here ?
How long should the reading side wait without a '\n', or should it read
"version 1\n" ?



2. Git sends the command (either "smudge" or "clean"), the filename, the
content size in bytes, and the content separated by a newline character:
Git --> Filter: "smudge\n"
Git --> Filter: "testfile.dat\n"
Git --> Filter: "7\n"
Git --> Filter: "CONTENT"

3. The filter is expected to answer with the result content size in
bytes and the result content separated by a newline character:
Git <-- Filter: "15\n"
Git <-- Filter: "SMUDGED_CONTENT"

4. The filter is expected to wait for the next file in step 2, again.

Please note that the protocol filters do not support stream processing
with this implemenatation because the filter needs to know the length of

typo

the result in advance. A protocol version 2 could address this in a
future patch.

Signed-off-by: Lars Schneider 
Helped-by: Martin-Louis Bright 
---
 Documentation/gitattributes.txt |  41 +++-
 convert.c   | 210 ++--
 t/t0021-conversion.sh   | 170 
 t/t0021/rot13.pl|  80 +++
 4 files changed, 494 insertions(+), 7 deletions(-)
 create mode 100755 t/t0021/rot13.pl

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 8882a3e..7026d62 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -300,7 +300,10 @@ checkout, when the `smudge` command is specified, the 
command is
 fed the blob object from its standard input, and its standard
 output is used to update the worktree file.  Similarly, the
 `clean` command is used to convert the contents of worktree file
-upon checkin.
+upon checkin. By default these commands process only a single
+blob and terminate. If the setting filter..useProtocol is
+enabled then Git can process all blobs with a single filter command
+invocation (see filter protocol below).

 One use of the content filtering is to massage the content into a shape
 that is more convenient for the platform, filesystem, and the user to use.
@@ -375,6 +378,42 @@ substitution.  For example:
 


+Filter Protocol
+^^^
+
+If the setting filter..useProtocol is enabled then Git
+can process all blobs with a single filter command invocation
+by talking with the following protocol over stdin/stdout.
+
+Git starts the filter on first usage and expects a welcome
+message with protocol version number:
+
+Git < Filter: "git-filter-protocol\n"
+Git < Filter: "version 1"
+
+
+Afterwards Git sends a blob command (either "smudge" or "clean"),
+the filename, the content size in bytes, and the content separated
+by a newline character:
+
+Git > Filter: "smudge\n"
+Git > Filter: "testfile.dat\n"
+Git > Filter: "7\n"
+Git > Filter: "CONTENT"
+
+
+The filter is expected to answer with the result content size in
+bytes and the result content separated by a newline character:
+
+Git < Filter: "15\n"
+Git < Filter: "SMUDGED_CONTENT"
+
+
+Afterwards the filter is expected to wait for the next blob process
+command. A demo implementation can be found in `t/t0021/rot13.pl`
+located in the Git core repository.
+
+
 Interaction between checkin/checkout attributes
 ^^^

diff --git a/convert.c b/convert.c
index 522e2c5..91ce86f 100644
--- a/convert.c
+++ b/convert.c
@@ -481,12 +481,188 @@ static int apply_filter(const char *path, const char 
*src, size_t len, int fd,
return ret;
 }

+static int cmd_process_map_init = 0;
+static struct hashmap cmd_process_map;
+
+struct cmd2process {
+   struct hashmap_entry ent; /* must be the first member! */
+   const char *cmd;
+   long protocol;
+   struct child_process process;
+};
+
+static int cmd2process_cmp(const struct cmd2process *e1,
+   const struct 
cmd2process *e2,
+   const void *unused)
+{
+   return

Re: [PATCH v4 4/8] doc: give headings for the two and three dot notations

2016-07-22 Thread Junio C Hamano

Philip Oakley  writes:

> +Special '{caret}' Shorthand Notations
> +~~
> +Two other shorthands exist, particularly useful for merge commits, is
> +for naming a set that is formed by a commit and its parent commits.

As these are not all that "special", how about retitling this
section as:

Other Shorthand Notations
~


> -To summarize:
> +The 'r1{caret}@' notation means all parents of 'r1'.
> +
> +'r1{caret}!' includes commit 'r1' but excludes all of its parents.
> +
> +Revision Range Summary
> +--
>  
>  ''::
>   Include commits that are reachable from (i.e. ancestors of)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] format-patch: escape "From " lines recognized by mailsplit

2016-07-22 Thread Eric Wong

Users have mistakenly copied "From " lines into commit messages
in the past, and will certainly make the same mistakes in the
future.  Since not everyone uses mboxrd, yet, we should at least
prevent miss-split mails by always escaping "From " lines based
on the check used by mailsplit.

mailsplit will not perform unescaping by default, yet, as it
could cause further invocations of format-patch from old
versions of git to generate bad output.  Propagating the mboxo
escaping is preferable to miss-split patches.  Unescaping may
still be performed via "--mboxrd".

ref: https://public-inbox.org/git/20160606230248.ga15...@dcvr.yhbt.net/T/#u

Signed-off-by: Eric Wong 
---
 builtin/mailsplit.c | 32 +---
 mailinfo.c  | 31 +++
 mailinfo.h  |  1 +
 pretty.c| 16 +---
 t/t4014-format-patch.sh | 14 ++
 5 files changed, 60 insertions(+), 34 deletions(-)

diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c
index 3068168..bb8f9c9 100644
--- a/builtin/mailsplit.c
+++ b/builtin/mailsplit.c
@@ -8,41 +8,11 @@
 #include "builtin.h"
 #include "string-list.h"
 #include "strbuf.h"
+#include "mailinfo.h"
 
 static const char git_mailsplit_usage[] =
 "git mailsplit [-d] [-f] [-b] [--keep-cr] -o 
[(|)...]";
 
-static int is_from_line(const char *line, int len)
-{
-   const char *colon;
-
-   if (len < 20 || memcmp("From ", line, 5))
-   return 0;
-
-   colon = line + len - 2;
-   line += 5;
-   for (;;) {
-   if (colon < line)
-   return 0;
-   if (*--colon == ':')
-   break;
-   }
-
-   if (!isdigit(colon[-4]) ||
-   !isdigit(colon[-2]) ||
-   !isdigit(colon[-1]) ||
-   !isdigit(colon[ 1]) ||
-   !isdigit(colon[ 2]))
-   return 0;
-
-   /* year */
-   if (strtol(colon+3, NULL, 10) <= 90)
-   return 0;
-
-   /* Ok, close enough */
-   return 1;
-}
-
 static struct strbuf buf = STRBUF_INIT;
 static int keep_cr;
 static int mboxrd;
diff --git a/mailinfo.c b/mailinfo.c
index 9f19ca1..0ebd953 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -1035,3 +1035,34 @@ void clear_mailinfo(struct mailinfo *mi)
 
strbuf_release(&mi->log_message);
 }
+
+int is_from_line(const char *line, int len)
+{
+   const char *colon;
+
+   if (len < 20 || memcmp("From ", line, 5))
+   return 0;
+
+   colon = line + len - 2;
+   line += 5;
+   for (;;) {
+   if (colon < line)
+   return 0;
+   if (*--colon == ':')
+   break;
+   }
+
+   if (!isdigit(colon[-4]) ||
+   !isdigit(colon[-2]) ||
+   !isdigit(colon[-1]) ||
+   !isdigit(colon[ 1]) ||
+   !isdigit(colon[ 2]))
+   return 0;
+
+   /* year */
+   if (strtol(colon+3, NULL, 10) <= 90)
+   return 0;
+
+   /* Ok, close enough */
+   return 1;
+}
diff --git a/mailinfo.h b/mailinfo.h
index 93776a7..c1430a0 100644
--- a/mailinfo.h
+++ b/mailinfo.h
@@ -37,5 +37,6 @@ struct mailinfo {
 extern void setup_mailinfo(struct mailinfo *);
 extern int mailinfo(struct mailinfo *, const char *msg, const char *patch);
 extern void clear_mailinfo(struct mailinfo *);
+int is_from_line(const char *line, int len);
 
 #endif /* MAILINFO_H */
diff --git a/pretty.c b/pretty.c
index 9fa42c2..898c0a3 100644
--- a/pretty.c
+++ b/pretty.c
@@ -10,6 +10,7 @@
 #include "color.h"
 #include "reflog-walk.h"
 #include "gpg-interface.h"
+#include "mailinfo.h"
 
 static char *user_format;
 static struct cmt_fmt_map {
@@ -1745,9 +1746,18 @@ void pp_remainder(struct pretty_print_context *pp,
strbuf_add_tabexpand(sb, pp->expand_tabs_in_log,
 line, linelen);
else {
-   if (pp->fmt == CMIT_FMT_MBOXRD &&
-   is_mboxrd_from(line, linelen))
-   strbuf_addch(sb, '>');
+   switch (pp->fmt) {
+   case CMIT_FMT_EMAIL:
+   if (is_from_line(line, linelen))
+   strbuf_addch(sb, '>');
+   break;
+   case CMIT_FMT_MBOXRD:
+   if (is_mboxrd_from(line, linelen))
+   strbuf_addch(sb, '>');
+   break;
+   default:
+   break;
+   }
 
strbuf_add(sb, line, linelen);
}
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 1206c48..8fa3982 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -1606,4 +1606,18 @@ test_expect_success 'format-patch --pretty=mbox

Re: [PATCH v1 3/3] convert: add filter..useProtocol option

2016-07-22 Thread Ramsay Jones



On 22/07/16 16:49, larsxschnei...@gmail.com wrote:
> From: Lars Schneider 
> 
> Git's clean/smudge mechanism invokes an external filter process for every
> single blob that is affected by a filter. If Git filters a lot of blobs
> then the startup time of the external filter processes can become a
> significant part of the overall Git execution time.
> 
> This patch adds the filter..useProtocol option which, if enabled,
> keeps the external filter process running and processes all blobs with
> the following protocol over stdin/stdout.
> 
> 1. Git starts the filter on first usage and expects a welcome message
> with protocol version number:
>   Git <-- Filter: "git-filter-protocol\n"
>   Git <-- Filter: "version 1"

Hmm, I was a bit surprised to see a 'filter' talk first (but so long as the
interaction is fully defined, I guess it doesn't matter).

[If you wanted to check for a version, you could add a "version" command
instead, just like "clean" and "smudge".]

[snip]

> diff --git a/convert.c b/convert.c
> index 522e2c5..91ce86f 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -481,12 +481,188 @@ static int apply_filter(const char *path, const char 
> *src, size_t len, int fd,
>   return ret;
>  }
>  
> +static int cmd_process_map_init = 0;
> +static struct hashmap cmd_process_map;
> +
> +struct cmd2process {
> + struct hashmap_entry ent; /* must be the first member! */
> + const char *cmd;
> + long protocol;
> + struct child_process process;
> +};
> +
> +static int cmd2process_cmp(const struct cmd2process *e1,
> + const struct 
> cmd2process *e2,
> + const void *unused)
> +{
> + return strcmp(e1->cmd, e2->cmd);
> +}
> +
> +static struct cmd2process *find_protocol_filter_entry(const char *cmd)
> +{
> + struct cmd2process k;
> + hashmap_entry_init(&k, strhash(cmd));
> + k.cmd = cmd;
> + return hashmap_get(&cmd_process_map, &k, NULL);
> +}
> +
> +static void stop_protocol_filter(struct cmd2process *entry) {
> + if (!entry)
> + return;
> + sigchain_push(SIGPIPE, SIG_IGN);
> + close(entry->process.in);
> + close(entry->process.out);
> + sigchain_pop(SIGPIPE);
> + finish_command(&entry->process);
> + child_process_clear(&entry->process);
> + hashmap_remove(&cmd_process_map, entry, NULL);
> + free(entry);
> +}
> +
> +static struct cmd2process *start_protocol_filter(const char *cmd)
> +{
> + int ret = 1;
> + struct cmd2process *entry = NULL;
> + struct child_process *process = NULL;
> + struct strbuf nbuf = STRBUF_INIT;
> + struct string_list split = STRING_LIST_INIT_NODUP;
> + const char *argv[] = { NULL, NULL };
> + const char *header = "git-filter-protocol\nversion";
> +
> + entry = xmalloc(sizeof(*entry));
> + hashmap_entry_init(entry, strhash(cmd));
> + entry->cmd = cmd;
> + process = &entry->process;
> +
> + child_process_init(process);
> + argv[0] = cmd;
> + process->argv = argv;
> + process->use_shell = 1;
> + process->in = -1;
> + process->out = -1;
> +
> + if (start_command(process)) {
> + error("cannot fork to run external persistent filter '%s'", 
> cmd);
> + return NULL;
> + }
> + strbuf_reset(&nbuf);
> +
> + sigchain_push(SIGPIPE, SIG_IGN);
> + ret &= strbuf_read_once(&nbuf, process->out, 0) > 0;

Hmm, how much will be read into nbuf by this single call?
Since strbuf_read_once() makes a single call to xread(), with
a len argument that will probably be 8192, you can not really
tell how much it will read, in general. (xread() does not
guarantee how many bytes it will read.)

In particular, it could be less than strlen(header).

> + sigchain_pop(SIGPIPE);
> +
> + strbuf_stripspace(&nbuf, 0);
> + string_list_split_in_place(&split, nbuf.buf, ' ', 2);
> + ret &= split.nr > 1;
> + ret &= strncmp(header, split.items[0].string, strlen(header)) == 0;
> + if (ret) {
> + entry->protocol = strtol(split.items[1].string, NULL, 10);
> + switch (entry->protocol) {
> + case 1:
> + break;
> + default:
> + ret = 0;
> + error("unsupported protocol version %s for 
> external persistent filter '%s'",
> + nbuf.buf, cmd);
> + }
> + }
> + string_list_clear(&split, 0);
> + strbuf_release(&nbuf);
> +
> + if (!ret) {
> + error("initialization for external persistent filter '%s' 
> failed", cmd);
> + return NULL;
> + }
> +
> + hashmap_add(&cmd_process_map, entry);
> + return entry;
> +}
> +
> +static int apply_protocol_filter(const char *path, const char *src, size_t 
> len,
> + int fd, struct strbuf *dst, 
> co

Re: git-prompt.sh incompatible with non-basic global grep.patternType

2016-07-22 Thread Eric Sunshine

On Fri, Jul 22, 2016 at 3:21 PM, Junio C Hamano  wrote:
> Subject: [PATCH] grep: further simplify setting the pattern type
>
> When c5c31d33 (grep: move pattern-type bits support to top-level
> grep.[ch], 2012-10-03) introduced grep_commit_pattern_type() helper
> function, the intention was to allow the users of grep API to having
> to fiddle only with .pattern_type_option (which can be set to "fixed",
> "basic", "extended", and "pcre"), and then immediately before compiling
> the pattern strings for use, call grep_commit_pattern_type() to have
> it prepare various bits in the grep_opt structure (like .fixed,
> .regflags, etc.).
>
> However, grep_set_pattern_type_option() helper function the grep API
> internally uses were left as an external function by mistake.  This
> function shouldn't have been made callable by the users of the API.
>
> Later when the grep API was used in revision graversal machinery,

s/graversal/traversal/

> the caller then mistakenly started calling the function around
> 34a4ae55 (log --grep: use the same helper to set -E/-F options as
> "git grep", 2012-10-03), instead of setting the .pattern_type_option
> field and letting the grep_commit_pattern_type() to take care of the
> details.
>
> This caused an unnecessary bug that made a configured
> grep.patternType take precedence over the command line options
> (e.g. --basic-regexp, --fixed-strings) in "git log" family of
> commands.
>
> Signed-off-by: Junio C Hamano 
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v1 3/3] convert: add filter..useProtocol option

2016-07-22 Thread Ramsay Jones


Hi Lars,

On 23/07/16 00:19, Ramsay Jones wrote:
> 
> 
> On 22/07/16 16:49, larsxschnei...@gmail.com wrote:
>> From: Lars Schneider 
>>
>> Git's clean/smudge mechanism invokes an external filter process for every
>> single blob that is affected by a filter. If Git filters a lot of blobs
>> then the startup time of the external filter processes can become a
>> significant part of the overall Git execution time.
>>
>> This patch adds the filter..useProtocol option which, if enabled,
>> keeps the external filter process running and processes all blobs with
>> the following protocol over stdin/stdout.
>>
>> 1. Git starts the filter on first usage and expects a welcome message
>> with protocol version number:
>>  Git <-- Filter: "git-filter-protocol\n"
>>  Git <-- Filter: "version 1"
> 
> Hmm, I was a bit surprised to see a 'filter' talk first (but so long as the
> interaction is fully defined, I guess it doesn't matter).
> 
> [If you wanted to check for a version, you could add a "version" command
> instead, just like "clean" and "smudge".]
> 
> [snip]
> 
>> diff --git a/convert.c b/convert.c
>> index 522e2c5..91ce86f 100644
>> --- a/convert.c
>> +++ b/convert.c
>> @@ -481,12 +481,188 @@ static int apply_filter(const char *path, const char 
>> *src, size_t len, int fd,
>>  return ret;
>>  }
>>  
>> +static int cmd_process_map_init = 0;
>> +static struct hashmap cmd_process_map;
>> +
>> +struct cmd2process {
>> +struct hashmap_entry ent; /* must be the first member! */
>> +const char *cmd;
>> +long protocol;
>> +struct child_process process;
>> +};
>> +
>> +static int cmd2process_cmp(const struct cmd2process *e1,
>> +const struct 
>> cmd2process *e2,
>> +const void *unused)
>> +{
>> +return strcmp(e1->cmd, e2->cmd);
>> +}
>> +
>> +static struct cmd2process *find_protocol_filter_entry(const char *cmd)
>> +{
>> +struct cmd2process k;
>> +hashmap_entry_init(&k, strhash(cmd));
>> +k.cmd = cmd;
>> +return hashmap_get(&cmd_process_map, &k, NULL);
>> +}
>> +
>> +static void stop_protocol_filter(struct cmd2process *entry) {
>> +if (!entry)
>> +return;
>> +sigchain_push(SIGPIPE, SIG_IGN);
>> +close(entry->process.in);
>> +close(entry->process.out);
>> +sigchain_pop(SIGPIPE);
>> +finish_command(&entry->process);
>> +child_process_clear(&entry->process);
>> +hashmap_remove(&cmd_process_map, entry, NULL);
>> +free(entry);
>> +}
>> +
>> +static struct cmd2process *start_protocol_filter(const char *cmd)
>> +{
>> +int ret = 1;
>> +struct cmd2process *entry = NULL;
>> +struct child_process *process = NULL;
>> +struct strbuf nbuf = STRBUF_INIT;
>> +struct string_list split = STRING_LIST_INIT_NODUP;
>> +const char *argv[] = { NULL, NULL };
>> +const char *header = "git-filter-protocol\nversion";
>> +
>> +entry = xmalloc(sizeof(*entry));
>> +hashmap_entry_init(entry, strhash(cmd));
>> +entry->cmd = cmd;
>> +process = &entry->process;
>> +
>> +child_process_init(process);
>> +argv[0] = cmd;
>> +process->argv = argv;
>> +process->use_shell = 1;
>> +process->in = -1;
>> +process->out = -1;
>> +
>> +if (start_command(process)) {
>> +error("cannot fork to run external persistent filter '%s'", 
>> cmd);
>> +return NULL;
>> +}
>> +strbuf_reset(&nbuf);
>> +
>> +sigchain_push(SIGPIPE, SIG_IGN);
>> +ret &= strbuf_read_once(&nbuf, process->out, 0) > 0;
> 
> Hmm, how much will be read into nbuf by this single call?
> Since strbuf_read_once() makes a single call to xread(), with
> a len argument that will probably be 8192, you can not really
> tell how much it will read, in general. (xread() does not
> guarantee how many bytes it will read.)
> 
> In particular, it could be less than strlen(header).

Please ignore this email, it's late ... ;-)

Sorry for the noise.

[Off to bed now]

ATB,
Ramsay Jones

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v1 3/3] convert: add filter..useProtocol option

2016-07-22 Thread Jakub Narębski

W dniu 2016-07-22 o 17:49, larsxschnei...@gmail.com pisze:
> From: Lars Schneider 

Nb. this line is only needed if you want author name and/or date
different from the email sender, or if you have sender line misconfigured
(e.g. lacking the human readable name).

> 
> Git's clean/smudge mechanism invokes an external filter process for every
> single blob that is affected by a filter. If Git filters a lot of blobs
> then the startup time of the external filter processes can become a
> significant part of the overall Git execution time.

Do I understand it correctly (from the commit description) that with
this new option you start one filter process for the whole life of
the Git command (e.g. `git add .`), which may perform more than one
cleanup, but do not create a daemon or daemon-like process which
would live for several commands invocations?

> 
> This patch adds the filter..useProtocol option which, if enabled,
> keeps the external filter process running and processes all blobs with
> the following protocol over stdin/stdout.

I agree with Junio that the name "useProtocol" is bad, and not quite
right. Perhaps "persistent" would be better? Also, what is the value
of `filter..useProtocol`: boolean? or a script name?

I also agree that we might wat to be able to keep clean and smudge
filters separate, but be able to run a single program if they are
both the same. I think there is a special case for filter unset,
and/or filter being "cat" -- we would want to keep that.

My proposal is to use `filter..persistent` as an addition
to 'clean' and 'smudge' variables, with the following possible
values:

  * none (the default)
  * clean
  * smudge
  * both

I assume that either Git would have to start multiple filter
commands for multi-threaded operation, or the protocol would have
to be extended to make persistent filter fork itself.


BTW. what would happen in your original proposal if the user had
*both* filter..useProtocol and filter..smudge
(and/or filter..clean) set?

> 
> 1. Git starts the filter on first usage and expects a welcome message
> with protocol version number:
>   Git <-- Filter: "git-filter-protocol\n"
>   Git <-- Filter: "version 1"

I was wondering how Git would know that filter executable was started,
but then I realized it was once-per-command invocation, not a daemon.

I agree with Torsten that there should be a terminator after the
version number.

Also, for future extendability this should be probably followed by
possibly empty list of script capabilities, that is:

Git <-- Filter: "git-filter-protocol\n"
Git <-- Filter: "version 1.1\n"
Git <-- Filter: "capabilities clean smudge\n"

Or we can add capabilities in later version...

BTW. why not follow e.g. HTTP protocol example, and use

Git <-- Filter: "git-filter-protocol/1\n"

> 2. Git sends the command (either "smudge" or "clean"), the filename, the
> content size in bytes, and the content separated by a newline character:
>   Git --> Filter: "smudge\n"

Would it help (for some cases) to pass the name of filter that
is being invoked?

>   Git --> Filter: "testfile.dat\n"

Unfortunately, while sane filenames should not contain newlines[1],
the unfortunate fact is that *filenames can include newlines*, and
you need to be able to handle that[2].  Therefore you need either to
choose a different separator (the only one that can be safely used
is "\0", i.e. the NUL character - but it is not something easy to
handle by shell scripts), or C-quote filenames as needed, or always
C-quote filenames.  C-quoting at minimum should include quoting newline
character, and the escape character itself.

BTW. is it the basename of a file, or a full pathname?

[1]: http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html
[2]: http://www.dwheeler.com/essays/filenames-in-shell.html

>   Git --> Filter: "7\n"

That's the content size in bytes written as an ASCII number.

>   Git --> Filter: "CONTENT"

Can filter ignore the content size, and just read all what it was
sent, that is until eof or something?

> 
> 3. The filter is expected to answer with the result content size in
> bytes and the result content separated by a newline character:
>   Git <-- Filter: "15\n"
>   Git <-- Filter: "SMUDGED_CONTENT"

I wonder how hard would be to write filters for this protocol...

> 
> 4. The filter is expected to wait for the next file in step 2, again.
> 
> Please note that the protocol filters do not support stream processing
> with this implemenatation because the filter needs to know the length of
~^^

implementation

> the result in advance. A protocol version 2 could address this in a
> future patch.
> 
> Signed-off-by: Lars Schneider 
> Helped-by: Martin-Louis Bright 
> ---
>  Documentation/gitattributes.txt |  41 +++-
>  convert.c   | 210 
> ++--
>  t/t0021-conversion.sh   | 170

[PATCH 0/2] fix git-svn HTTP tests under Apache 2.4

2016-07-22 Thread Eric Wong

This resurrects the series started by Michael in April 2015 at:

  https://public-inbox.org/git/cover.1428505184.git@drmicha.warpmail.net/T/

But only keeps only his first patch.  PATCH 2/2 reuses the
work Clemens introduced in t/lib-httpd.sh back in 2008 to
enable mod_dav_svn.

The following changes since commit 08bb3500a2a718c3c78b0547c68601cafa7a8fd9:

  Sixth batch of topics for 2.10 (2016-07-19 13:26:16 -0700)

are available in the git repository at:

  git://bogomips.org/git-svn.git svn-httpd

for you to fetch changes up to cbc1a1fbd2911b851467862b7163c303a14b17ab:

  git svn: migrate tests to use lib-httpd (2016-07-23 03:31:21 +)


Eric Wong (1):
  git svn: migrate tests to use lib-httpd

Michael J Gruber (1):
  t/t91*: do not say how to avoid the tests

 t/lib-git-svn.sh  | 91 +--
 t/lib-httpd.sh|  8 ++-
 t/lib-httpd/apache.conf   |  4 +-
 t/t9100-git-svn-basic.sh  |  2 -
 t/t9115-git-svn-dcommit-funky-renames.sh  |  7 ++-
 t/t9118-git-svn-funky-branch-names.sh |  2 +-
 t/t9120-git-svn-clone-with-percent-escapes.sh |  2 +-
 t/t9142-git-svn-shallow-clone.sh  |  2 +-
 t/t9158-git-svn-mergeinfo.sh  |  2 -
 t/t9160-git-svn-preserve-empty-dirs.sh|  1 -
 10 files changed, 30 insertions(+), 91 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/2] git svn: migrate tests to use lib-httpd

2016-07-22 Thread Eric Wong

This allows us to use common test infrastructure and parallelize
the tests.  For now, GIT_SVN_TEST_HTTPD=true needs to be set to
enable the SVN HTTP tests because we reuse the same test cases
for both file:// and http:// SVN repositories.  SVN_HTTPD_PORT
is no longer honored.

Tested under Apache 2.2 and 2.4 on Debian 7.x (wheezy) and
8.x (jessie), respectively.

Cc: Clemens Buchacher 
Cc: Michael J Gruber 
Signed-off-by: Eric Wong 
---
 t/lib-git-svn.sh  | 91 +--
 t/lib-httpd.sh|  8 ++-
 t/lib-httpd/apache.conf   |  4 +-
 t/t9115-git-svn-dcommit-funky-renames.sh  |  7 ++-
 t/t9118-git-svn-funky-branch-names.sh |  2 +-
 t/t9120-git-svn-clone-with-percent-escapes.sh |  2 +-
 t/t9142-git-svn-shallow-clone.sh  |  2 +-
 7 files changed, 30 insertions(+), 86 deletions(-)

diff --git a/t/lib-git-svn.sh b/t/lib-git-svn.sh
index fb88232..688313e 100644
--- a/t/lib-git-svn.sh
+++ b/t/lib-git-svn.sh
@@ -65,81 +65,22 @@ svn_cmd () {
svn "$orig_svncmd" --config-dir "$svnconf" "$@"
 }
 
-prepare_httpd () {
-   for d in \
-   "$SVN_HTTPD_PATH" \
-   /usr/sbin/apache2 \
-   /usr/sbin/httpd \
-   ; do
-   if test -f "$d"
-   then
-   SVN_HTTPD_PATH="$d"
-   break
-   fi
-   done
-   if test -z "$SVN_HTTPD_PATH"
-   then
-   echo >&2 '*** error: Apache not found'
-   return 1
-   fi
-   for d in \
-   "$SVN_HTTPD_MODULE_PATH" \
-   /usr/lib/apache2/modules \
-   /usr/libexec/apache2 \
-   ; do
-   if test -d "$d"
-   then
-   SVN_HTTPD_MODULE_PATH="$d"
-   break
-   fi
-   done
-   if test -z "$SVN_HTTPD_MODULE_PATH"
-   then
-   echo >&2 '*** error: Apache module dir not found'
-   return 1
-   fi
-   if test ! -f "$SVN_HTTPD_MODULE_PATH/mod_dav_svn.so"
-   then
-   echo >&2 '*** error: Apache module "mod_dav_svn" not found'
-   return 1
-   fi
-
-   repo_base_path="${1-svn}"
-   mkdir "$GIT_DIR"/logs
-
-   cat > "$GIT_DIR/httpd.conf" <
-   DAV svn
-   SVNPath "$rawsvnrepo"
-
-EOF
-}
-
-start_httpd () {
-   if test -z "$SVN_HTTPD_PORT"
-   then
-   echo >&2 'SVN_HTTPD_PORT is not defined!'
-   return
-   fi
-
-   prepare_httpd "$1" || return 1
-
-   "$SVN_HTTPD_PATH" -f "$GIT_DIR"/httpd.conf -k start
-   svnrepo="http://127.0.0.1:$SVN_HTTPD_PORT/$repo_base_path";
-}
-
-stop_httpd () {
-   test -z "$SVN_HTTPD_PORT" && return
-   test ! -f "$GIT_DIR/httpd.conf" && return
-   "$SVN_HTTPD_PATH" -f "$GIT_DIR"/httpd.conf -k stop
+maybe_start_httpd () {
+   loc=${1-svn}
+
+   test_tristate GIT_SVN_TEST_HTTPD
+   case $GIT_SVN_TEST_HTTPD in
+   true)
+   . "$TEST_DIRECTORY"/lib-httpd.sh
+   LIB_HTTPD_SVN="$loc"
+   start_httpd
+   ;;
+   *)
+   stop_httpd () {
+   : noop
+   }
+   ;;
+   esac
 }
 
 convert_to_rev_db () {
diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index ac2cbee..435a374 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -24,7 +24,7 @@
 #LIB_HTTPD_MODULE_PATH   web server modules path
 #LIB_HTTPD_PORT  listening port
 #LIB_HTTPD_DAV   enable DAV
-#LIB_HTTPD_SVN   enable SVN
+#LIB_HTTPD_SVN   enable SVN at given location (e.g. "svn")
 #LIB_HTTPD_SSL   enable SSL
 #
 # Copyright (c) 2008 Clemens Buchacher 
@@ -162,8 +162,10 @@ prepare_httpd() {
if test -n "$LIB_HTTPD_SVN"
then
HTTPD_PARA="$HTTPD_PARA -DSVN"
-   rawsvnrepo="$HTTPD_ROOT_PATH/svnrepo"
-   svnrepo="http://127.0.0.1:$LIB_HTTPD_PORT/svn";
+   LIB_HTTPD_SVNPATH="$rawsvnrepo"
+   svnrepo="http://127.0.0.1:$LIB_HTTPD_PORT/";
+   svnrepo="$svnrepo$LIB_HTTPD_SVN"
+   export LIB_HTTPD_SVN LIB_HTTPD_SVNPATH
fi
fi
 }
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 018a83a..c3e6313 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -208,8 +208,8 @@ RewriteRule ^/half-auth-complete/ - [E=AUTHREQUIRED:yes]
 
LoadModule dav_svn_module modules/mod_dav_svn.so
 
-   
+   
DAV svn
-   SVNPath svnrepo
+   SVNPath "${LIB_HTTPD_SVNPATH}"

 
diff --git a/t/t9115-git-svn-dcommit-funky-renames.sh 
b/t/t9115-git-svn-dcommit-funky-renames.sh
index a87d3d3..64bb495 100755
--- a/t/t9115-git-svn-dcommit-funky

[PATCH 1/2] t/t91*: do not say how to avoid the tests

2016-07-22 Thread Eric Wong

From: Michael J Gruber 

Some of the tests "say" how to stop the svn tests from running, some do
not.

The test suite is directed at people reading t/README where we keep all
information about running the test suite (partly, with options etc.).

Remove said "say" occurences.

Signed-off-by: Michael J Gruber 
Signed-off-by: Eric Wong 
---
 t/t9100-git-svn-basic.sh   | 2 --
 t/t9158-git-svn-mergeinfo.sh   | 2 --
 t/t9160-git-svn-preserve-empty-dirs.sh | 1 -
 3 files changed, 5 deletions(-)

diff --git a/t/t9100-git-svn-basic.sh b/t/t9100-git-svn-basic.sh
index 28082b1..c23b11f 100755
--- a/t/t9100-git-svn-basic.sh
+++ b/t/t9100-git-svn-basic.sh
@@ -8,8 +8,6 @@ GIT_SVN_LC_ALL=${LC_ALL:-$LANG}
 
 . ./lib-git-svn.sh
 
-say 'define NO_SVN_TESTS to skip git svn tests'
-
 case "$GIT_SVN_LC_ALL" in
 *.UTF-8)
test_set_prereq UTF8
diff --git a/t/t9158-git-svn-mergeinfo.sh b/t/t9158-git-svn-mergeinfo.sh
index 13f78f2..a875b45 100755
--- a/t/t9158-git-svn-mergeinfo.sh
+++ b/t/t9158-git-svn-mergeinfo.sh
@@ -7,8 +7,6 @@ test_description='git svn mergeinfo propagation'
 
 . ./lib-git-svn.sh
 
-say 'define NO_SVN_TESTS to skip git svn tests'
-
 test_expect_success 'initialize source svn repo' '
svn_cmd mkdir -m x "$svnrepo"/trunk &&
svn_cmd co "$svnrepo"/trunk "$SVN_TREE" &&
diff --git a/t/t9160-git-svn-preserve-empty-dirs.sh 
b/t/t9160-git-svn-preserve-empty-dirs.sh
index b4a4434..0ede3cf 100755
--- a/t/t9160-git-svn-preserve-empty-dirs.sh
+++ b/t/t9160-git-svn-preserve-empty-dirs.sh
@@ -11,7 +11,6 @@ local Git repository with placeholder files.'
 
 . ./lib-git-svn.sh
 
-say 'define NO_SVN_TESTS to skip git svn tests'
 GIT_REPO=git-svn-repo
 
 test_expect_success 'initialize source svn repo containing empty dirs' '
-- 
EW

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

90 matches

Mail list logo