Re: What's cooking in git.git (Aug 2018, #04; Fri, 17)

2018-08-18 Thread Jonathan Nieder
Hi,

Junio C Hamano wrote:

> Usually, I refrain from merging larger topics in 'next' down to
> 'master' when we get close to -rc0, but I am wondering if it is
> better to merge all of them to 'master', even the ones on the larger
> and possibly undercooked side, expecting that we collectively spend
> effort on hunting and fixing bugs in them during the pre-release
> freeze period.  If we were to go that route, I'd want everybody's
> buy-in and I'll promise to ignore any shiny new toys that appear on
> list that are not regression fixes to topics merged to 'master'
> since the end of the previous cycle to make sure people are not
> distracted.

Based on what I see in 'next' (midx, range-diff, etc), I quite like
this idea.  In private I may still have ideas for new features and
hack away at them but I would avoid sending new feature topics and
would focus on helping polish what's already been sent out.  Let's
make 2.19 easy for people to adopt.

For whatever that's worth.  I'm only one person; others may have their
own wishes.

Thanks,
Jonathan


[PATCH] t2024: mark test using "checkout -p" with PERL prerequisite

2018-08-18 Thread Ævar Arnfjörð Bjarmason
Checkout with the -p switch uses the "add interactive" framework which
is written in Perl.

One test added in 8d7b558bae ("checkout & worktree: introduce
checkout.defaultRemote", 2018-06-05) didn't declare the PERL
prerequisite, breaking the test when built with NO_PERL.

Reported-by: CB Bailey 
Signed-off-by: CB Bailey 
Signed-off-by: Ævar Arnfjörð Bjarmason 
---

On Sat, Aug 18, 2018 at 6:43 AM, CB Bailey  wrote:
> checkout with the -p switch uses the "add interactive" framework which
> is written in Perl. Add a PERL prerequisite to skip this test when built
> with NO_PERL.

Thanks, and sorry about my buggy code. I didn't consider the
interaction between -p and NO_PERL.

Your patch works, but I think just splitting the test up is better, so
we're not skipping things unrelated to "checkout -p" under NO_PERL.

I added your SOB since I stole significant parts of your commit
message.

Junio: We'd want one patch or the other before 2.19 so that release
doesn't break tests under NO_PERL.

 t/t2024-checkout-dwim.sh | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/t/t2024-checkout-dwim.sh b/t/t2024-checkout-dwim.sh
index f79dfbbdd6..69b6774d10 100755
--- a/t/t2024-checkout-dwim.sh
+++ b/t/t2024-checkout-dwim.sh
@@ -86,8 +86,13 @@ test_expect_success 'checkout of branch from multiple 
remotes fails with advice'
checkout foo 2>stderr &&
test_branch master &&
status_uno_is_clean &&
-   test_i18ngrep ! "^hint: " stderr &&
-   # Make sure the likes of checkout -p do not print this hint
+   test_i18ngrep ! "^hint: " stderr
+'
+
+test_expect_success PERL 'checkout -p with multiple remotes does not print 
advice' '
+   git checkout -B master &&
+   test_might_fail git branch -D foo &&
+
git checkout -p foo 2>stderr &&
test_i18ngrep ! "^hint: " stderr &&
status_uno_is_clean
-- 
2.18.0.865.gffc8e1a3cd6



Re: [PATCH/RFC] commit: add short option for --amend

2018-08-18 Thread Simon Ruderich
On Thu, Aug 16, 2018 at 08:31:17PM +0200, Nguyễn Thái Ngọc Duy wrote:
> I just realized how often I type "git ci --amend". Looking back at my
> ~/.bash_history (only 10k lines) this is the second most often git
> command I type which may justify a short option for it (assuming that
> other people use this option often too, of course).

Why not add another alias? As you're already using the ci alias,
maybe cia? Personally I have the following aliases for
committing:

c   = commit --verbose
ca  = commit --verbose --amend
cad = commit --verbose --amend --date=now

Besides the obvious g=git alias in the shell. I really like one
character aliases for often used commands/subcommands.

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9


Re: What's cooking in git.git (Aug 2018, #04; Fri, 17)

2018-08-18 Thread Christian Couder
On Sat, Aug 18, 2018 at 12:44 AM Junio C Hamano  wrote:

> * cc/delta-islands (2018-08-16) 7 commits
>  - pack-objects: move 'layer' into 'struct packing_data'
>  - pack-objects: move tree_depth into 'struct packing_data'
>  - t5320: tests for delta islands
>  - repack: add delta-islands support
>  - pack-objects: add delta-islands support
>  - pack-objects: refactor code into compute_layer_order()
>  - Add delta-islands.{c,h}
>
>  Lift code from GitHub to restrict delta computation so that an
>  object that exists in one fork is not made into a delta against
>  another object that does not appear in the same forked repository.
>
>  What's the doneness of this topic?

All the suggestions by Peff, you, Duy, Dscho, Ramsay and Szeder Gabor
have been taken into account in the v5 that is in pu.

Except the suggestion by Duy to move 2 fields from 'struct
object_entry' to 'struct packing_data' (which is implemented in
patches 6/7 and 7/7) the suggestions have all been about relatively
small things (documentation, code modernization, regex check,
translation strings, ...) So the code is very similar to the original
code in https://github.com/peff/git/commits/jk/delta-islands which has
been used for years by GitHub in production.

FYI this has been requested from GitLab by Drupal (as well as others)
see 
https://www.drupal.org/drupalorg/blog/developer-tools-initiative-part-5-gitlab-partnership
which contains:

"The timeline for Phase 2 is dependent on GitLab’s resolution of a
diskspace deduplication issue, which they have committed to on our
behalf: https://gitlab.com/gitlab-org/gitlab-ce/issues/23029";


Re: What's cooking in git.git (Aug 2018, #04; Fri, 17)

2018-08-18 Thread Ævar Arnfjörð Bjarmason


On Fri, Aug 17 2018, Junio C Hamano wrote:

> Usually, I refrain from merging larger topics in 'next' down to
> 'master' when we get close to -rc0, but I am wondering if it is
> better to merge all of them to 'master', even the ones on the larger
> and possibly undercooked side, expecting that we collectively spend
> effort on hunting and fixing bugs in them during the pre-release
> freeze period.  If we were to go that route, I'd want everybody's
> buy-in and I'll promise to ignore any shiny new toys that appear on
> list that are not regression fixes to topics merged to 'master'
> since the end of the previous cycle to make sure people are not
> distracted.

I'm very much up for this. We have some very useful things (like
range-diff and midx) sitting in "next". I think if you merge next down
to master now we'll have plenty of time to smoke out bugs & regressions
before 2.19.


Re: What's cooking in git.git (Aug 2018, #04; Fri, 17)

2018-08-18 Thread Ævar Arnfjörð Bjarmason


On Sat, Aug 18 2018, Christian Couder wrote:

> On Sat, Aug 18, 2018 at 12:44 AM Junio C Hamano  wrote:
>
>> * cc/delta-islands (2018-08-16) 7 commits
>>  - pack-objects: move 'layer' into 'struct packing_data'
>>  - pack-objects: move tree_depth into 'struct packing_data'
>>  - t5320: tests for delta islands
>>  - repack: add delta-islands support
>>  - pack-objects: add delta-islands support
>>  - pack-objects: refactor code into compute_layer_order()
>>  - Add delta-islands.{c,h}
>>
>>  Lift code from GitHub to restrict delta computation so that an
>>  object that exists in one fork is not made into a delta against
>>  another object that does not appear in the same forked repository.
>>
>>  What's the doneness of this topic?
>
> All the suggestions by Peff, you, Duy, Dscho, Ramsay and Szeder Gabor
> have been taken into account in the v5 that is in pu.
>
> Except the suggestion by Duy to move 2 fields from 'struct
> object_entry' to 'struct packing_data' (which is implemented in
> patches 6/7 and 7/7) the suggestions have all been about relatively
> small things (documentation, code modernization, regex check,
> translation strings, ...) So the code is very similar to the original
> code in https://github.com/peff/git/commits/jk/delta-islands which has
> been used for years by GitHub in production.
>
> FYI this has been requested from GitLab by Drupal (as well as others)
> see 
> https://www.drupal.org/drupalorg/blog/developer-tools-initiative-part-5-gitlab-partnership
> which contains:
>
> "The timeline for Phase 2 is dependent on GitLab’s resolution of a
> diskspace deduplication issue, which they have committed to on our
> behalf: https://gitlab.com/gitlab-org/gitlab-ce/issues/23029";

This is not a critique of the delta islands feature, just something I'm
curious about.

Why is Drupal blocked on something like delta-islands? The blog post
mentions they have 45k projects, which can be browsed at
https://cgit.drupalcode.org

Almost all of those are completely independent projects, so they
wouldn't benefit from delta islands, and I'd guess >98% are of them are
in an obscure long tail and probably won't have even a single fork.

That leaves forks of say drupal.git, which is ~150MB, the mirror on
GitHub has 1500 forks: https://github.com/drupal Even if there were 5000
forks of that that would be 750G of disk space.

So accounting for backups, me being off by a lot etc. let's say that's
5TB. That's relatively cheap today. Are they really just holding up
their GitLab migration plans to save something on the order of that disk
space, or have I missed something here?

Again, not a critique of delta-islands, because it's most certainly
useful for the likes of github/gitlab, but I wonder if for this
particular problem it wouldn't be more straightforward of a solution for
GitLab to allow anyone to push to
refs/for-merge// on any
repository. Then they could open a MR for an existing branch in the repo
(which GitLab already supports).


Re: [GSoC][PATCH v7 09/26] stash: implement the "list" command in the builtin

2018-08-18 Thread Paul Sebastian Ungureanu
On Wed, Aug 15, 2018 at 10:41 PM, Thomas Gummerer  wrote:
>> Subject: stash: implement the "list" command in the builtin
>
> Nit: The previous commit messages all have the format "stash: convert
>  to builtin", maybe follow the same pattern here?

At first, the subject of the commit matched the pattern. I think I
changed it in order
to avoid any confusion with "list" as data structure. Now, I guess
that "stash:" at the
beginning of the commit message removes any doubt.


Re: [GSoC][PATCH v7 10/26] stash: convert show to builtin

2018-08-18 Thread Paul Sebastian Ungureanu
On Wed, Aug 15, 2018 at 11:20 PM, Thomas Gummerer  wrote:
> '--quiet' doesn't make too much sense to use with 'git stash show', so
> I'm not sure whether or not it makes sense to support it at all.  But
> we do promise to pass all options through to in our documentation, so
> the new behaviour is what we are documenting.

Indeed, I guess it doesn't make sense at all. I cannot come up
with a situation where `--quiet` would be useful. I will think more
about it and come back with a reply if I find any good example
where it would be useful.


Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Ævar Arnfjörð Bjarmason


On Sat, Aug 18 2018, Jonathan Nieder wrote:

> Hi,
>
> Michael Muré wrote:
>
>> I released today git-bug, a distributed bug tracker that embeds in
>> git. It use git's internal storage to store bugs information in a way
>> that can be merged without conflict. You can push/pull to the normal
>> git remote you are already using to interact with other people. Normal
>> code and bugs are completely separated and no files are added in the
>> regular branches.
>
> I am a bit unhappy about the namespace grab.  Not for trademark
> reasons: the Git trademark rules are pretty clear about this kind of
> usage being okay.  Instead, the unhappiness comes because a future Git
> command like "git bug" to produce a bug report with appropriate
> diagnostics for a bug in Git seems like a likely and useful thing to
> get added to Git some day.  And now the name's taken.
>
> Is it too late to ask if it's possible to come up with a less generic
> name?

Wouldn't we call such a thing "git-reportbug", or "git gitbug", with
reference to Debian reportbug or perl's perlbug?

Addressing the more general issue, if we're concerned with 3rd party
tools usurping the core namespace trying to convince individual authors
of 3rd party tools to change the names of those tools to something more
unique is pissing in the wind.

That's never going to make a dent in the vast amount of git-whatever
tools, most of which won't be discussed as ideas on this mailing list
before they're released.

I think we basically have these options:

1) Accept the status quo where people do create third party tools, much
   of which are way too obscure to matter (e.g. I'm sure someone's
   created a tool/alias called range-diff before, but we didn't
   care).

   If those tools become popular enough in the wild they get own that
   namespace, e.g. we're not going to ship a "git-annex" or "git-lfs"
   ourselves implementing some unrelated features (re parallel on-list
   discussion; "git annex" could also be a "git commit --amend" alias).

2) #1, but hope we catch new tools early enough to convince their
   authors to change the names.

3) Make some structural change to git where only things we ourselves
   compile get to be called as "git ", and you'd need to call
   e.g. "git-bug" as "git ext::bug" or something. We'd need to have a
   large hardcoded list of older tools (lfs, annex, ...) to grandfather
   in if we went for this approach.

I think we should just go for #1, and if we're concerned about #1 not
being OK we really need something like #3, because #2 isn't going to
work.

> Separately from that, I'm happy to see progress being made in the
> distributed bug tracker world; thanks for that!
>
> Thanks,
> Jonathan


Re: What's cooking in git.git (Aug 2018, #04; Fri, 17)

2018-08-18 Thread Christian Couder
On Sat, Aug 18, 2018 at 1:34 PM Ævar Arnfjörð Bjarmason
 wrote:
> On Sat, Aug 18 2018, Christian Couder wrote:

> > FYI this has been requested from GitLab by Drupal (as well as others)
> > see 
> > https://www.drupal.org/drupalorg/blog/developer-tools-initiative-part-5-gitlab-partnership
> > which contains:
> >
> > "The timeline for Phase 2 is dependent on GitLab’s resolution of a
> > diskspace deduplication issue, which they have committed to on our
> > behalf: https://gitlab.com/gitlab-org/gitlab-ce/issues/23029";
>
> This is not a critique of the delta islands feature, just something I'm
> curious about.
>
> Why is Drupal blocked on something like delta-islands? The blog post
> mentions they have 45k projects, which can be browsed at
> https://cgit.drupalcode.org
>
> Almost all of those are completely independent projects, so they
> wouldn't benefit from delta islands, and I'd guess >98% are of them are
> in an obscure long tail and probably won't have even a single fork.
>
> That leaves forks of say drupal.git, which is ~150MB, the mirror on
> GitHub has 1500 forks: https://github.com/drupal Even if there were 5000
> forks of that that would be 750G of disk space.
>
> So accounting for backups, me being off by a lot etc. let's say that's
> 5TB. That's relatively cheap today. Are they really just holding up
> their GitLab migration plans to save something on the order of that disk
> space, or have I missed something here?

I am not sure why. I haven't been in touch directly with Drupal people
and I haven't followed all the discussions with them.

When I discussed this topic with someone responsible for another
significant open source project. He told me that they don't want forks
at all on their self hosted GitLab instance because of the disk space
and management burden that would come with them. But they would like
people to fork on a separate GitLab instance like gitlab.com or
github.com (but then be able to send merge/pull requests to their self
hosted GitLab instance) because they know that developers like to have
their own fork :-)

So my guess is that Drupal people are also afraid of the possible disk
space burden, even if your numbers seem to say that they shouldn't.

> Again, not a critique of delta-islands, because it's most certainly
> useful for the likes of github/gitlab, but I wonder if for this
> particular problem it wouldn't be more straightforward of a solution for
> GitLab to allow anyone to push to
> refs/for-merge// on any
> repository. Then they could open a MR for an existing branch in the repo
> (which GitLab already supports).

My guess is that people are used to forks on GitHub and they like them
and want to have the same thing and same workflow on GitLab.

The delta islands documentation though says that it's possible to use
git namespaces along with delta islands, and if forks are indeed
implemented as namespaces, it will be kind of similar in the GitLab
internals as what you suggest. There are still discussions by the way
in the GitLab issue referenced above about whether it's better for
GitLab to use git namespaces or alternates to implement deduplication
in forks (along with delta islands).


Re: Re* [PATCH v7 1/1] sideband: highlight keywords in remote sideband output

2018-08-18 Thread Junio C Hamano
Jonathan Nieder  writes:

>> --- a/sideband.c
>> +++ b/sideband.c
>> @@ -75,7 +75,7 @@ static void maybe_colorize_sideband(struct strbuf *dest, 
>> const char *src, int n)
>
> Not about this patch: should the 'n' parameter be a size_t instead of
> an int?  It doesn't matter in practice (since the caller has an int,
> it can never be more than INT_MAX) but it might make the intent
> clearer.

I tend to agree, but I think a separate "clean-up" patch to do so is
more appropriate than rolling it into this fix.

>>  /*
>>   * Match case insensitively, so we colorize output from existing
>>   * servers regardless of the case that they use for their
>>   * messages. We only highlight the word precisely, so
>>   * "successful" stays uncolored.
>>   */
>>  if (!strncasecmp(p->keyword, src, len) && !isalnum(src[len])) {
>
> Not about this patch: should this check "&& src[len] == ':'" instead,
> as discussed upthread?

I originally was of an opinion that we should take only lowercase
keyword followed by a colon, primarily because that is what we
produce.  Then "the real world need" told us that we are better off
catching the keyword case-insensitively.  Recalling that lesson, I
am not sure I would support "let's limit to the colon, rejecting any
other punctionation letter".

In any case, we should make such a policy decision outside a patch
like this one that is about fixing a behaviour which all users would
consider as a bug regardless of the policy they support.

>> @@ -100,8 +103,8 @@ static void maybe_colorize_sideband(struct strbuf *dest, 
>> const char *src, int n)
>>  }
>>  }
>>  
>> -strbuf_add(dest, src, n);
>> +if (0 < n)
>> +strbuf_add(dest, src, n);
>
> This check seems unnecessary.  strbuf_add can cope fine with !n.

I was primarily interested in catching negatives, and !n was a mere
optimization, but you are correct to point out that negative n at
this point in the codeflow is a BUG().


[PATCH v5 7/7] cache-tree: verify valid cache-tree in the test suite

2018-08-18 Thread Nguyễn Thái Ngọc Duy
This makes sure that cache-tree is consistent with the index. The main
purpose is to catch potential problems by saving the index in
unpack_trees() but the line in write_index() would also help spot
missing invalidation in other code.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 cache-tree.c   | 78 ++
 cache-tree.h   |  1 +
 read-cache.c   |  3 ++
 t/test-lib.sh  |  6 
 unpack-trees.c |  2 ++
 5 files changed, 90 insertions(+)

diff --git a/cache-tree.c b/cache-tree.c
index caafbff2ff..c3c206427c 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -4,6 +4,7 @@
 #include "tree-walk.h"
 #include "cache-tree.h"
 #include "object-store.h"
+#include "replace-object.h"
 
 #ifndef DEBUG
 #define DEBUG 0
@@ -732,3 +733,80 @@ int update_main_cache_tree(int flags)
the_index.cache_tree = cache_tree();
return cache_tree_update(&the_index, flags);
 }
+
+static void verify_one(struct index_state *istate,
+  struct cache_tree *it,
+  struct strbuf *path)
+{
+   int i, pos, len = path->len;
+   struct strbuf tree_buf = STRBUF_INIT;
+   struct object_id new_oid;
+
+   for (i = 0; i < it->subtree_nr; i++) {
+   strbuf_addf(path, "%s/", it->down[i]->name);
+   verify_one(istate, it->down[i]->cache_tree, path);
+   strbuf_setlen(path, len);
+   }
+
+   if (it->entry_count < 0 ||
+   /* no verification on tests (t7003) that replace trees */
+   lookup_replace_object(the_repository, &it->oid) != &it->oid)
+   return;
+
+   if (path->len) {
+   pos = index_name_pos(istate, path->buf, path->len);
+   pos = -pos - 1;
+   } else {
+   pos = 0;
+   }
+
+   i = 0;
+   while (i < it->entry_count) {
+   struct cache_entry *ce = istate->cache[pos + i];
+   const char *slash;
+   struct cache_tree_sub *sub = NULL;
+   const struct object_id *oid;
+   const char *name;
+   unsigned mode;
+   int entlen;
+
+   if (ce->ce_flags & (CE_STAGEMASK | CE_INTENT_TO_ADD | 
CE_REMOVE))
+   BUG("%s with flags 0x%x should not be in cache-tree",
+   ce->name, ce->ce_flags);
+   name = ce->name + path->len;
+   slash = strchr(name, '/');
+   if (slash) {
+   entlen = slash - name;
+   sub = find_subtree(it, ce->name + path->len, entlen, 0);
+   if (!sub || sub->cache_tree->entry_count < 0)
+   BUG("bad subtree '%.*s'", entlen, name);
+   oid = &sub->cache_tree->oid;
+   mode = S_IFDIR;
+   i += sub->cache_tree->entry_count;
+   } else {
+   oid = &ce->oid;
+   mode = ce->ce_mode;
+   entlen = ce_namelen(ce) - path->len;
+   i++;
+   }
+   strbuf_addf(&tree_buf, "%o %.*s%c", mode, entlen, name, '\0');
+   strbuf_add(&tree_buf, oid->hash, the_hash_algo->rawsz);
+   }
+   hash_object_file(tree_buf.buf, tree_buf.len, tree_type, &new_oid);
+   if (oidcmp(&new_oid, &it->oid))
+   BUG("cache-tree for path %.*s does not match. "
+   "Expected %s got %s", len, path->buf,
+   oid_to_hex(&new_oid), oid_to_hex(&it->oid));
+   strbuf_setlen(path, len);
+   strbuf_release(&tree_buf);
+}
+
+void cache_tree_verify(struct index_state *istate)
+{
+   struct strbuf path = STRBUF_INIT;
+
+   if (!istate->cache_tree)
+   return;
+   verify_one(istate, istate->cache_tree, &path);
+   strbuf_release(&path);
+}
diff --git a/cache-tree.h b/cache-tree.h
index 9799e894f7..c1fde531f9 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -32,6 +32,7 @@ struct cache_tree *cache_tree_read(const char *buffer, 
unsigned long size);
 
 int cache_tree_fully_valid(struct cache_tree *);
 int cache_tree_update(struct index_state *, int);
+void cache_tree_verify(struct index_state *);
 
 int update_main_cache_tree(int);
 
diff --git a/read-cache.c b/read-cache.c
index 5ce40f39b3..41f313bc9e 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2744,6 +2744,9 @@ int write_locked_index(struct index_state *istate, struct 
lock_file *lock,
int new_shared_index, ret;
struct split_index *si = istate->split_index;
 
+   if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
+   cache_tree_verify(istate);
+
if ((flags & SKIP_IF_UNCHANGED) && !istate->cache_changed) {
if (flags & COMMIT_LOCK)
rollback_lock_file(lock);
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 78f7097746..5b50f6e2e6 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1083,6 +1083,12 

[PATCH v5 0/7] Speed up unpack_trees()

2018-08-18 Thread Nguyễn Thái Ngọc Duy
v5 fixes some minor comments from round 4 and a big mistake in 5/5.
Junio's scary feeling turns out true. There is a missing invalidation
in keep_entry() which is not added in 6/7. 7/7 makes sure that similar
problems will not slip through.

I had to rebase this series on top of 'master' because 7/7 caught a
bad cache-tree situation that has been fixed by Elijah in ad3762042a
(read-cache: fix directory/file conflict handling in
read_index_unmerged() - 2018-07-31). I believe the issue was we prime
cache-tree in 'git reset --hard' even though the index has conflicts.

Range-diff (before the rebase):

1:  a192faf79e ! 1:  ed8763726b trace.h: support nested performance tracing
@@ -49,13 +49,16 @@
struct untracked_cache_dir *untracked;
 -  uint64_t start = getnanotime();
  
-   if (has_symlink_leading_path(path, len))
+-  if (has_symlink_leading_path(path, len))
++  trace_performance_enter();
++
++  if (has_symlink_leading_path(path, len)) {
++  trace_performance_leave("read directory %.*s", len, path);
return dir->nr;
++  }
  
-+  trace_performance_enter();
untracked = validate_untracked_cache(dir, len, pathspec);
if (!untracked)
-   /*
 @@
dir->nr = i;
}
2:  9afe7c488a = 2:  9b70652fa2 unpack-trees: add performance tracing
3:  74101edb60 ! 3:  8b3cfea623 unpack-trees: optimize walking same trees with 
cache-tree
@@ -141,7 +141,7 @@
 +
 +  /*
 +   * Do what unpack_callback() and unpack_nondirectories() normally
-+   * do. But we walk all paths recursively in just one loop instead.
++   * do. But we walk all paths in an iterative loop instead.
 +   *
 +   * D/F conflicts and higher stage entries are not a concern
 +   * because cache-tree would be invalidated and we would never
4:  9261c5920e = 4:  5af28d44ca unpack-trees: reduce malloc in cache-tree walk
5:  43fac1154f = 5:  5657c92fe9 unpack-trees: reuse (still valid) cache-tree 
from src_index
-:  -- > 6:  3b91783afc unpack-trees: add missing cache invalidation
-:  -- > 7:  0d5464c0dc cache-tree: verify valid cache-tree in the test 
suite

Nguyễn Thái Ngọc Duy (7):
  trace.h: support nested performance tracing
  unpack-trees: add performance tracing
  unpack-trees: optimize walking same trees with cache-tree
  unpack-trees: reduce malloc in cache-tree walk
  unpack-trees: reuse (still valid) cache-tree from src_index
  unpack-trees: add missing cache invalidation
  cache-tree: verify valid cache-tree in the test suite

 cache-tree.c|  80 +
 cache-tree.h|   1 +
 diff-lib.c  |   4 +-
 dir.c   |   9 ++-
 name-hash.c |   4 +-
 preload-index.c |   4 +-
 read-cache.c|  16 +++--
 t/test-lib.sh   |   6 ++
 trace.c |  69 --
 trace.h |  15 +
 unpack-trees.c  | 154 +++-
 11 files changed, 340 insertions(+), 22 deletions(-)

-- 
2.18.0.1004.g6639190530



[PATCH v5 2/7] unpack-trees: add performance tracing

2018-08-18 Thread Nguyễn Thái Ngọc Duy
We're going to optimize unpack_trees() a bit in the following
patches. Let's add some tracing to measure how long it takes before
and after. This is the baseline ("git checkout -" on webkit.git, 275k
files on worktree)

performance: 0.056651714 s:  read cache .git/index
performance: 0.183101080 s:  preload index
performance: 0.008584433 s:  refresh index
performance: 0.633767589 s:   traverse_trees
performance: 0.340265448 s:   check_updates
performance: 0.381884638 s:   cache_tree_update
performance: 1.401562947 s:  unpack_trees
performance: 0.338687914 s:  write index, changed mask = 2e
performance: 0.411927922 s:traverse_trees
performance: 0.23335 s:check_updates
performance: 0.423697246 s:   unpack_trees
performance: 0.423708360 s:  diff-index
performance: 2.559524127 s: git command: git checkout -

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 
---
 cache-tree.c   | 2 ++
 unpack-trees.c | 9 -
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/cache-tree.c b/cache-tree.c
index 181d5919f0..caafbff2ff 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -433,7 +433,9 @@ int cache_tree_update(struct index_state *istate, int flags)
 
if (i)
return i;
+   trace_performance_enter();
i = update_one(it, cache, entries, "", 0, &skip, flags);
+   trace_performance_leave("cache_tree_update");
if (i < 0)
return i;
istate->cache_changed |= CACHE_TREE_CHANGED;
diff --git a/unpack-trees.c b/unpack-trees.c
index f9efee0836..6d9f692ea6 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -345,6 +345,7 @@ static int check_updates(struct unpack_trees_options *o)
struct checkout state = CHECKOUT_INIT;
int i;
 
+   trace_performance_enter();
state.force = 1;
state.quiet = 1;
state.refresh_cache = 1;
@@ -414,6 +415,7 @@ static int check_updates(struct unpack_trees_options *o)
errs |= finish_delayed_checkout(&state);
if (o->update)
git_attr_set_direction(GIT_ATTR_CHECKIN, NULL);
+   trace_performance_leave("check_updates");
return errs != 0;
 }
 
@@ -1285,6 +1287,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
if (len > MAX_UNPACK_TREES)
die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
 
+   trace_performance_enter();
memset(&el, 0, sizeof(el));
if (!core_apply_sparse_checkout || !o->update)
o->skip_sparse_checkout = 1;
@@ -1357,7 +1360,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
}
}
 
-   if (traverse_trees(len, t, &info) < 0)
+   trace_performance_enter();
+   ret = traverse_trees(len, t, &info);
+   trace_performance_leave("traverse_trees");
+   if (ret < 0)
goto return_failed;
}
 
@@ -1449,6 +1455,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
o->src_index = NULL;
 
 done:
+   trace_performance_leave("unpack_trees");
clear_exclude_list(&el);
return ret;
 
-- 
2.18.0.1004.g6639190530



[PATCH v5 5/7] unpack-trees: reuse (still valid) cache-tree from src_index

2018-08-18 Thread Nguyễn Thái Ngọc Duy
We do n-way merge by walking the source index and n trees at the same
time and add merge results to a new temporary index called o->result.
The merge result for any given path could be either

- keep_entry(): same old index entry in o->src_index is reused
- merged_entry(): either a new entry is added, or an existing one updated
- deleted_entry(): one entry from o->src_index is removed

For some reason [1] we keep making sure that the source index's
cache-tree is still valid if used by o->result: for all those
merged/deleted entries, we invalidate the same path in o->src_index,
so only cache-trees covering the "keep_entry" parts remain good.

Because of this, the cache-tree from o->src_index can be perfectly
reused in o->result. And in fact we already rely on this logic to
reuse untracked cache in edf3b90553 (unpack-trees: preserve index
extensions - 2017-05-08). Move the cache-tree to o->result before
doing cache_tree_update() to reduce hashing cost.

Since cache_tree_update() has risen up as one of the most expensive
parts in unpack_trees() after the last few patches. This does help
reduce unpack_trees() time significantly (on webkit.git):

before   after
  
0.080394752  0.051258167 s:  read cache .git/index
0.216010838  0.212106298 s:  preload index
0.008534301  0.280521764 s:  refresh index
0.251992198  0.218160442 s:   traverse_trees
0.377031383  0.374948191 s:   check_updates
0.372768105  0.037040114 s:   cache_tree_update
1.045887251  0.672031609 s:  unpack_trees
0.314983512  0.317456290 s:  write index, changed mask = 2e
0.062572653  0.038382654 s:traverse_trees
0.22544  0.42731 s:check_updates
0.073795585  0.050930053 s:   unpack_trees
0.073807557  0.051099735 s:  diff-index
1.938191592  1.614241153 s: git command: git checkout -

[1] I'm pretty sure the reason is an oversight in 34110cd4e3 (Make
'unpack_trees()' have a separate source and destination index -
2008-03-06). That patch aims to _not_ update the source index at
all. The invalidation should have been done on o->result in that
patch. But then there was no cache-tree on o->result even then so
it's pointless to do so.

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 
---
 read-cache.c   | 2 ++
 unpack-trees.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 1c9c88c130..5ce40f39b3 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2940,6 +2940,8 @@ void move_index_extensions(struct index_state *dst, 
struct index_state *src)
 {
dst->untracked = src->untracked;
src->untracked = NULL;
+   dst->cache_tree = src->cache_tree;
+   src->cache_tree = NULL;
 }
 
 struct cache_entry *dup_cache_entry(const struct cache_entry *ce,
diff --git a/unpack-trees.c b/unpack-trees.c
index dbef6e1b8a..aa80b65ee1 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1576,6 +1576,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
 
ret = check_updates(o) ? (-2) : 0;
if (o->dst_index) {
+   move_index_extensions(&o->result, o->src_index);
if (!ret) {
if (!o->result.cache_tree)
o->result.cache_tree = cache_tree();
@@ -1584,7 +1585,6 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
struct unpack_trees_options
  WRITE_TREE_SILENT |
  WRITE_TREE_REPAIR);
}
-   move_index_extensions(&o->result, o->src_index);
discard_index(o->dst_index);
*o->dst_index = o->result;
} else {
-- 
2.18.0.1004.g6639190530



[PATCH v5 1/7] trace.h: support nested performance tracing

2018-08-18 Thread Nguyễn Thái Ngọc Duy
Performance measurements are listed right now as a flat list, which is
fine when we measure big blocks. But when we start adding more and
more measurements, some of them could be just part of a bigger
measurement and a flat list gives a wrong impression that they are
executed at the same level instead of nested.

Add trace_performance_enter() and trace_performance_leave() to allow
indent these nested measurements. For now it does not help much
because the only nested thing is (lazy) name hash initialization
(e.g. called in diff-index from "git status"). This will help more
because I'm going to add some more tracing that's actually nested.

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 
---
 diff-lib.c  |  4 +--
 dir.c   |  9 ---
 name-hash.c |  4 +--
 preload-index.c |  4 +--
 read-cache.c| 11 
 trace.c | 69 -
 trace.h | 15 +++
 7 files changed, 96 insertions(+), 20 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index 732f684a49..d5bbb7ea50 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -518,11 +518,11 @@ static int diff_cache(struct rev_info *revs,
 int run_diff_index(struct rev_info *revs, int cached)
 {
struct object_array_entry *ent;
-   uint64_t start = getnanotime();
 
if (revs->pending.nr != 1)
BUG("run_diff_index must be passed exactly one tree");
 
+   trace_performance_enter();
ent = revs->pending.objects;
if (diff_cache(revs, &ent->item->oid, ent->name, cached))
exit(128);
@@ -531,7 +531,7 @@ int run_diff_index(struct rev_info *revs, int cached)
diffcore_fix_diff_index(&revs->diffopt);
diffcore_std(&revs->diffopt);
diff_flush(&revs->diffopt);
-   trace_performance_since(start, "diff-index");
+   trace_performance_leave("diff-index");
return 0;
 }
 
diff --git a/dir.c b/dir.c
index 32f5f72759..18b57b94cc 100644
--- a/dir.c
+++ b/dir.c
@@ -2263,10 +2263,13 @@ int read_directory(struct dir_struct *dir, struct 
index_state *istate,
   const char *path, int len, const struct pathspec *pathspec)
 {
struct untracked_cache_dir *untracked;
-   uint64_t start = getnanotime();
 
-   if (has_symlink_leading_path(path, len))
+   trace_performance_enter();
+
+   if (has_symlink_leading_path(path, len)) {
+   trace_performance_leave("read directory %.*s", len, path);
return dir->nr;
+   }
 
untracked = validate_untracked_cache(dir, len, pathspec);
if (!untracked)
@@ -2302,7 +2305,7 @@ int read_directory(struct dir_struct *dir, struct 
index_state *istate,
dir->nr = i;
}
 
-   trace_performance_since(start, "read directory %.*s", len, path);
+   trace_performance_leave("read directory %.*s", len, path);
if (dir->untracked) {
static int force_untracked_cache = -1;
static struct trace_key trace_untracked_stats = 
TRACE_KEY_INIT(UNTRACKED_STATS);
diff --git a/name-hash.c b/name-hash.c
index 163849831c..1fcda73cb3 100644
--- a/name-hash.c
+++ b/name-hash.c
@@ -578,10 +578,10 @@ static void threaded_lazy_init_name_hash(
 
 static void lazy_init_name_hash(struct index_state *istate)
 {
-   uint64_t start = getnanotime();
 
if (istate->name_hash_initialized)
return;
+   trace_performance_enter();
hashmap_init(&istate->name_hash, cache_entry_cmp, NULL, 
istate->cache_nr);
hashmap_init(&istate->dir_hash, dir_entry_cmp, NULL, istate->cache_nr);
 
@@ -602,7 +602,7 @@ static void lazy_init_name_hash(struct index_state *istate)
}
 
istate->name_hash_initialized = 1;
-   trace_performance_since(start, "initialize name hash");
+   trace_performance_leave("initialize name hash");
 }
 
 /*
diff --git a/preload-index.c b/preload-index.c
index 4d08d44874..d7f7919ba2 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -78,7 +78,6 @@ static void preload_index(struct index_state *index,
 {
int threads, i, work, offset;
struct thread_data data[MAX_PARALLEL];
-   uint64_t start = getnanotime();
 
if (!core_preload_index)
return;
@@ -88,6 +87,7 @@ static void preload_index(struct index_state *index,
threads = 2;
if (threads < 2)
return;
+   trace_performance_enter();
if (threads > MAX_PARALLEL)
threads = MAX_PARALLEL;
offset = 0;
@@ -109,7 +109,7 @@ static void preload_index(struct index_state *index,
if (pthread_join(p->pthread, NULL))
die("unable to join threaded lstat");
}
-   trace_performance_since(start, "preload index");
+   trace_performance_leave("preload index");
 }
 #endif
 
diff --git a/read-cache.c b/read-cache.c
index c5fabc844a..1c9c88c130 100644
--- a/read-cache.c
+++ b/read-cache.c
@

[PATCH v5 4/7] unpack-trees: reduce malloc in cache-tree walk

2018-08-18 Thread Nguyễn Thái Ngọc Duy
This is a micro optimization that probably only shines on repos with
deep directory structure. Instead of allocating and freeing a new
cache_entry in every iteration, we reuse the last one and only update
the parts that are new each iteration.

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 
---
 unpack-trees.c | 29 -
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 8376663b59..dbef6e1b8a 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -685,6 +685,8 @@ static int traverse_by_cache_tree(int pos, int nr_entries, 
int nr_names,
 {
struct cache_entry *src[MAX_UNPACK_TREES + 1] = { NULL, };
struct unpack_trees_options *o = info->data;
+   struct cache_entry *tree_ce = NULL;
+   int ce_len = 0;
int i, d;
 
if (!o->merge)
@@ -699,30 +701,39 @@ static int traverse_by_cache_tree(int pos, int 
nr_entries, int nr_names,
 * get here in the first place.
 */
for (i = 0; i < nr_entries; i++) {
-   struct cache_entry *tree_ce;
-   int len, rc;
+   int new_ce_len, len, rc;
 
src[0] = o->src_index->cache[pos + i];
 
len = ce_namelen(src[0]);
-   tree_ce = xcalloc(1, cache_entry_size(len));
+   new_ce_len = cache_entry_size(len);
+
+   if (new_ce_len > ce_len) {
+   new_ce_len <<= 1;
+   tree_ce = xrealloc(tree_ce, new_ce_len);
+   memset(tree_ce, 0, new_ce_len);
+   ce_len = new_ce_len;
+
+   tree_ce->ce_flags = create_ce_flags(0);
+
+   for (d = 1; d <= nr_names; d++)
+   src[d] = tree_ce;
+   }
 
tree_ce->ce_mode = src[0]->ce_mode;
-   tree_ce->ce_flags = create_ce_flags(0);
tree_ce->ce_namelen = len;
oidcpy(&tree_ce->oid, &src[0]->oid);
memcpy(tree_ce->name, src[0]->name, len + 1);
 
-   for (d = 1; d <= nr_names; d++)
-   src[d] = tree_ce;
-
rc = call_unpack_fn((const struct cache_entry * const *)src, o);
-   free(tree_ce);
-   if (rc < 0)
+   if (rc < 0) {
+   free(tree_ce);
return rc;
+   }
 
mark_ce_used(src[0], o);
}
+   free(tree_ce);
if (o->debug_unpack)
printf("Unpacked %d entries from %s to %s using cache-tree\n",
   nr_entries,
-- 
2.18.0.1004.g6639190530



[PATCH v5 3/7] unpack-trees: optimize walking same trees with cache-tree

2018-08-18 Thread Nguyễn Thái Ngọc Duy
In order to merge one or many trees with the index, unpack-trees code
walks multiple trees in parallel with the index and performs n-way
merge. If we find out at start of a directory that all trees are the
same (by comparing OID) and cache-tree happens to be available for
that directory as well, we could avoid walking the trees because we
already know what these trees contain: it's flattened in what's called
"the index".

The upside is of course a lot less I/O since we can potentially skip
lots of trees (think subtrees). We also save CPU because we don't have
to inflate and apply the deltas. The downside is of course more
fragile code since the logic in some functions are now duplicated
elsewhere.

"checkout -" with this patch on webkit.git (275k files):

baseline  new
  
0.056651714   0.080394752 s:  read cache .git/index
0.183101080   0.216010838 s:  preload index
0.008584433   0.008534301 s:  refresh index
0.633767589   0.251992198 s:   traverse_trees
0.340265448   0.377031383 s:   check_updates
0.381884638   0.372768105 s:   cache_tree_update
1.401562947   1.045887251 s:  unpack_trees
0.338687914   0.314983512 s:  write index, changed mask = 2e
0.411927922   0.062572653 s:traverse_trees
0.23335   0.22544 s:check_updates
0.423697246   0.073795585 s:   unpack_trees
0.423708360   0.073807557 s:  diff-index
2.559524127   1.938191592 s: git command: git checkout -

Another measurement from Ben's running "git checkout" with over 500k
trees (on the whole series):

baselinenew
  --
0.535510167 0.556558733 s: read cache .git/index
0.3057373   0.3147105   s: initialize name hash
0.0184082   0.023558433 s: preload index
0.086910967 0.089085967 s: refresh index
7.889590767 2.191554433 s: unpack trees
0.120760833 0.131941267 s: update worktree after a merge
2.2583504   2.572663167 s: repair cache-tree
0.8916137   0.959495233 s: write index, changed mask = 28
3.405199233 0.2710663   s: unpack trees
0.000999667 0.0021554   s: update worktree after a merge
3.4063306   0.273318333 s: diff-index
16.9524923  9.462943133 s: git command: git.exe checkout

This command calls unpack_trees() twice, the first time on 2way merge
and the second 1way merge. In both times, "unpack trees" time is
reduced to one third. Overall time reduction is not that impressive of
course because index operations take a big chunk. And there's that
repair cache-tree line.

PS. A note about cache-tree invalidation and the use of it in this
code.

We do invalidate cache-tree in _source_ index when we add new entries
to the (temporary) "result" index. But we also use the cache-tree from
source index in this optimization. Does this mean we end up having no
cache-tree in the source index to activate this optimization?

The answer is twisted: the order of finding a good cache-tree and
invalidating it matters. In this case we check for a good cache-tree
first in all_trees_same_as_cache_tree(), then we start to merge things
and potentially invalidate that same cache-tree in the process. Since
cache-tree invalidation happens after the optimization kicks in, we're
still good. But we may lose that cache-tree at the very first
call_unpack_fn() call in traverse_by_cache_tree().

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 
---
 unpack-trees.c | 127 +
 1 file changed, 127 insertions(+)

diff --git a/unpack-trees.c b/unpack-trees.c
index 6d9f692ea6..8376663b59 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -635,6 +635,102 @@ static inline int are_same_oid(struct name_entry *name_j, 
struct name_entry *nam
return name_j->oid && name_k->oid && !oidcmp(name_j->oid, name_k->oid);
 }
 
+static int all_trees_same_as_cache_tree(int n, unsigned long dirmask,
+   struct name_entry *names,
+   struct traverse_info *info)
+{
+   struct unpack_trees_options *o = info->data;
+   int i;
+
+   if (!o->merge || dirmask != ((1 << n) - 1))
+   return 0;
+
+   for (i = 1; i < n; i++)
+   if (!are_same_oid(names, names + i))
+   return 0;
+
+   return cache_tree_matches_traversal(o->src_index->cache_tree, names, 
info);
+}
+
+static int index_pos_by_traverse_info(struct name_entry *names,
+ struct traverse_info *info)
+{
+   struct unpack_trees_options *o = info->data;
+   int len = traverse_path_len(info, names);
+   char *name = xmalloc(len + 1 /* slash */ + 1 /* NUL */);
+   int pos;
+
+   make_traverse_path(name, info, names);
+   name[len++] 

[PATCH v5 6/7] unpack-trees: add missing cache invalidation

2018-08-18 Thread Nguyễn Thái Ngọc Duy
Any changes to the output index should be (confusingly) marked in the
source index with invalidate_ce_path(). This is used to make sure we
still have valid untracked cache and cache-tree extensions in the end.

We do a pretty good job of invalidating except in two places.
verify_clean_subdirectory() is part of verify_absent() and
verify_absent_sparse(). The former is usually called by merged_entry()
or directly in threeway_merge(). The latter is obviously used by
sparse checkout.

In these three call sites, only merged_entry() follows up with
invalidate_ce_path(). The other two don't, but they should not trigger
this ce removal because this is about D/F conflicts [1]. But let's be
safe and invalidate_ce_path() here as well.

The second place is keep_entry() which is also used by threeway_merge()
to keep higher stage entries. In order to reuse cache-tree we need to
invalidate these paths as well. It's not a problem in the past because
whenever a higher stage entry is present, cache-tree will not be
created [2]. Now we salvage cache-tree even when higher stage entries
are present, we need more invalidation.

[1] c81935348b (Fix switching to a branch with D/F when current branch
has file D. - 2007-03-15)

[2] This is probably too strict. We should be able to create and save
cache-tree for the directories that do not have conflict entries
in cache_tree_update(). And this becomes more important when
cache-tree plays bigger role in terms of performance.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 unpack-trees.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/unpack-trees.c b/unpack-trees.c
index aa80b65ee1..bc43922922 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1774,6 +1774,7 @@ static int verify_clean_subdirectory(const struct 
cache_entry *ce,
if (verify_uptodate(ce2, o))
return -1;
add_entry(o, ce2, CE_REMOVE, 0);
+   invalidate_ce_path(ce, o);
mark_ce_used(ce2, o);
}
cnt++;
@@ -2033,6 +2034,8 @@ static int keep_entry(const struct cache_entry *ce,
  struct unpack_trees_options *o)
 {
add_entry(o, ce, 0, 0);
+   if (ce_stage(ce))
+   invalidate_ce_path(ce, o);
return 1;
 }
 
-- 
2.18.0.1004.g6639190530



Re: [RFC/PATCH] drop vcs-svn experiment

2018-08-18 Thread Jeff King
On Fri, Aug 17, 2018 at 10:26:05PM -0700, Jonathan Nieder wrote:

> > We also ship contrib/svn-fe, which builds on the vcs-svn
> > work. However, it does not seem to build out of the box for
> > me, as the link step misses some required libraries for
> > using libgit.a.
> 
> What libraries do you mean?  It builds and runs fine for me with
> 
>  $ git diff
>  diff --git i/contrib/svn-fe/Makefile w/contrib/svn-fe/Makefile
>  index e8651aaf4b5..bd709f8d83b 100644
>  --- i/contrib/svn-fe/Makefile
>  +++ w/contrib/svn-fe/Makefile
>  @@ -4,7 +4,7 @@ CC = cc
>   RM = rm -f
>   MV = mv
>  
>  -CFLAGS = -g -O2 -Wall
>  +CFLAGS = -g -O2 -Wall -pthread
>   LDFLAGS =
>   EXTLIBS = -lz
> 
> which appears to be platform related, not due to some internal change
> in Git.

Yes, it works for me with that, too[1]. So clearly there's some system
dependence. But I suspect it's broken for every system with pthreads,
which is most of them. And older versions _do_ compile out of the box,
even on my modern system. For completeness, here's what I dug up:

 - it builds fine up through v1.8.2

 - after eff80a9fd9 (Allow custom "comment char", 2013-01-16), it breaks
   with a ton of undefined references during the link stage, including
   SHA1_* and some xdl_* functions. I still have no idea why, as that
   commit is fairly mundane, but I guess it just somehow tickles
   something in the linker or the way we build libgit.a.

 - after da011cb0e7 (contrib/svn-fe: fix Makefile, 2014-08-28), the
   error becomes:

 /usr/bin/ld: ../../libgit.a(sha1_file.o): undefined reference to symbol 
'SHA1_Update@@OPENSSL_1_1_0'
 /usr/bin/ld: //usr/lib/x86_64-linux-gnu/libcrypto.so.1.1: error adding 
symbols: DSO missing from command line

   Presumably this did work for the author at the time. It's seems quite
   plausible that older versions of openssl did not exhibit this
   problem, and that it's system-specific. Or that it was possible to
   build with BLK_SHA1.

 - after e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17),
   the openssl error goes away (naturally), but is replaced with:

 /usr/bin/ld: ../../libgit.a(run-command.o): undefined reference to symbol 
'pthread_sigmask@@GLIBC_2.2.5'
 /usr/bin/ld: //lib/x86_64-linux-gnu/libpthread.so.0: error adding symbols: 
DSO missing from command line

   If I go back to 2014-era and start building with NO_OPENSSL, then
   even da011cb0e7 fails with:

 /usr/bin/ld: ../../libgit.a(run-command.o): undefined reference to symbol 
'pthread_setspecific@@GLIBC_2.2.5'

   So again, assuming it worked back then for the author of that commit,
   that's something that has changed on the system, and we can't figure
   out through bisecting git when that became common.

So that does mean it's possible that it works for some people on some
systems today (though it was also probably broken for everybody for a
year and a half in 2013 with nobody noticing).

[1] That patch actually doesn't quite work out of the box, because we
also include config.mak, and mine overrides CFLAGS. It also doesn't
seem to work with USE_LIBPCRE. But those are only evidence that the
Makefile is not very mature, not that people aren't using it for
out-of-the-box config.

> > Of course, I could be completely wrong about people using this. Maybe
> > svn-fe builds are just completely broken on my system, and maybe people
> > really do use testsvn::. But if so, they certainly aren't talking about
> > it on the mailing list. :)
> 
> My take:
> 
>  - svn-fe works fine and has been useful to me, though its Makefile
>could likely be simplified and made more user-friendly
> 
>  - I've benefited from the test coverage of having this in-tree
> 
>  - testsvn:: is a demo and at a minimum we ought not to install it
>with "make install"
> 
>  - keeping this in-tree for the benefit of just one user is excessive,
>so removing it is probably the right thing

Thanks, all of that sounds sensible to me.

>  - it would be nice if the commit removing this code from Git includes
>a note to help people find its new home
> 
> Would you mind holding off until I'm able to arrange that last bit?

Not at all. This patch was mostly meant to start the discussion. Mission
accomplished. ;)

-Peff


Re: [GSoC][PATCH v7 12/26] stash: refactor `show_stash()` to use the diff API

2018-08-18 Thread Paul Sebastian Ungureanu
On Thu, Aug 16, 2018 at 12:01 AM, Thomas Gummerer  wrote:
> On 08/08, Paul-Sebastian Ungureanu wrote:
>> Currently, `show_stash()` uses `cmd_diff()` to generate
>> the output. After this commit, the output will be generated
>> using the internal API.
>>
>> Before this commit, `git stash show --quiet` would act like
>> `git diff` and error out if the stash is not empty. Now, the
>> `--quiet` option does not error out given an empty stash.
>
> I think this needs a bit more justification.  As I mentioned in my
> comment to a previous patch, I'm not sure '--quiet' makes much sense
> with 'git stash show' (it will show nothing, and will always exit with
> an error code, as the stash will always contain something), but if we
> are supporting the same flags as 'git diff', and essentially just
> forwarding them, shouldn't they keep the same behaviour as well?

If we are going to support them, I guess there wouldn't be a problem
if any change in behaviour is noted in documentation (but as you said,
the next commit should be squashed into this). Indeed, the big question
is if we should support them. I would say no considering there is no
benefit.

>> Signed-off-by: Paul-Sebastian Ungureanu 
>> ---
>>  builtin/stash--helper.c | 73 +
>>  1 file changed, 45 insertions(+), 28 deletions(-)
>>
>> diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c
>> index 0c1efca6b..ec8c38c6f 100644
>> --- a/builtin/stash--helper.c
>> +++ b/builtin/stash--helper.c
>> @@ -10,6 +10,8 @@
>>  #include "run-command.h"
>>  #include "dir.h"
>>  #include "rerere.h"
>> +#include "revision.h"
>> +#include "log-tree.h"
>>
>>  static const char * const git_stash_helper_usage[] = {
>>   N_("git stash--helper list []"),
>> @@ -662,56 +664,71 @@ static int git_stash_config(const char *var, const 
>> char *value, void *cb)
>>
>>  static int show_stash(int argc, const char **argv, const char *prefix)
>>  {
>> - int i, ret = 0;
>> - struct child_process cp = CHILD_PROCESS_INIT;
>> - struct argv_array args_refs = ARGV_ARRAY_INIT;
>> + int i;
>> + int flags = 0;
>>   struct stash_info info;
>> + struct rev_info rev;
>> + struct argv_array stash_args = ARGV_ARRAY_INIT;
>>   struct option options[] = {
>>   OPT_END()
>>   };
>>
>> - argc = parse_options(argc, argv, prefix, options,
>> -  git_stash_helper_show_usage,
>> -  PARSE_OPT_KEEP_UNKNOWN);
>> + init_diff_ui_defaults();
>> + git_config(git_diff_ui_config, NULL);
>>
>> - cp.git_cmd = 1;
>> - argv_array_push(&cp.args, "diff");
>> + init_revisions(&rev, prefix);
>>
>> - /* Push arguments which are not options into args_refs. */
>> - for (i = 0; i < argc; ++i) {
>> - if (argv[i][0] == '-')
>> - argv_array_push(&cp.args, argv[i]);
>> + /* Push arguments which are not options into stash_args. */
>> + for (i = 1; i < argc; ++i) {
>> + if (argv[i][0] != '-')
>> + argv_array_push(&stash_args, argv[i]);
>>   else
>> - argv_array_push(&args_refs, argv[i]);
>> - }
>> -
>> - if (get_stash_info(&info, args_refs.argc, args_refs.argv)) {
>> - child_process_clear(&cp);
>> - argv_array_clear(&args_refs);
>> - return -1;
>> + flags++;
>>   }
>>
>>   /*
>>* The config settings are applied only if there are not passed
>>* any flags.
>>*/
>> - if (cp.args.argc == 1) {
>> + if (!flags) {
>>   git_config(git_stash_config, NULL);
>>   if (show_stat)
>> - argv_array_push(&cp.args, "--stat");
>> + rev.diffopt.output_format |= DIFF_FORMAT_DIFFSTAT;
>> + if (show_patch) {
>> + rev.diffopt.output_format = ~DIFF_FORMAT_NO_OUTPUT;
>> + rev.diffopt.output_format |= DIFF_FORMAT_PATCH;
>> + }
>
> I failed to notice this in the previous patch (the problem existed
> there as well), but this changes the behaviour of 'git -c
> stash.showStat=false stash show '.  Previously doing this would
> not show anything, which is the correct behaviour, while now still
> shows the diffstat.
>
> I think the show_stat variable is interpreted the wrong way around in
> the previous patch.
>
> Something else I noticed now that I was playing around more with the
> config options is that the parsing of the config options is not
> correctly done in the previous patch.  It does a 'strcmp(var,
> "stash.showStat"))', but the config API makes all variables lowercase
> (config options are case insensitive, and making everything lowercase
> is the way to ensure that), so it should be 'strcmp(var, "stash.showstat"))',
> and similar for the 'stash.showpatch' config option.
>
> This all sounds like it would be nice to have some tests for these
> config opti

Re: [GSoC][PATCH v7 18/26] stash: convert push to builtin

2018-08-18 Thread Thomas Gummerer
On 08/08, Paul-Sebastian Ungureanu wrote:
> Add stash push to the helper.
> ---

This (and the previous two and I think most subsequent patches) are
missing your sign-off.

>  builtin/stash--helper.c | 209 
>  git-stash.sh|   6 +-
>  2 files changed, 213 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c
> index f905d3908..c26cad3d5 100644
> --- a/builtin/stash--helper.c
> +++ b/builtin/stash--helper.c
> @@ -23,6 +23,9 @@ static const char * const git_stash_helper_usage[] = {
>   N_("git stash--helper clear"),
>   N_("git stash--helper store [-m|--message ] [-q|--quiet] 
> "),
>   N_("git stash--helper create []"),
> + N_("git stash--helper [push [-p|--patch] [-k|--[no-]keep-index] 
> [-q|--quiet]\n"
> +"  [-u|--include-untracked] [-a|--all] [-m|--message 
> ]\n"
> +"  [--] [...]]"),
>   NULL
>  };
>  
> @@ -71,6 +74,13 @@ static const char * const git_stash_helper_create_usage[] 
> = {
>   NULL
>  };
>  
> +static const char * const git_stash_helper_push_usage[] = {
> + N_("git stash--helper [push [-p|--patch] [-k|--[no-]keep-index] 
> [-q|--quiet]\n"
> +"  [-u|--include-untracked] [-a|--all] [-m|--message 
> ]\n"
> +"  [--] [...]]"),
> + NULL
> +};
> +
>  static const char *ref_stash = "refs/stash";
>  static int quiet;
>  static struct strbuf stash_index_path = STRBUF_INIT;
> @@ -1210,6 +1220,203 @@ static int create_stash(int argc, const char **argv, 
> const char *prefix)
>   return ret < 0;
>  }
>  
> +static int do_push_stash(int argc, const char **argv, const char *prefix,
> +  int keep_index, int patch_mode, int include_untracked,
> +  int quiet, const char *stash_msg)
> +{
> + int ret = 0;
> + struct pathspec ps;
> + struct stash_info info;
> + if (patch_mode && keep_index == -1)
> + keep_index = 1;
> +
> + if (patch_mode && include_untracked) {
> + fprintf_ln(stderr, "Can't use --patch and --include-untracked 
> or --all at the same time");

This should be marked for translation.  Similar for the messages
below.  I noticed this in a previous patch as well, so it may be worth
reviewing all the output, and checking that it's going to the right
stream and is marked for translation.

> + return -1;
> + }
> +
> + parse_pathspec(&ps, 0, PATHSPEC_PREFER_FULL, prefix, argv);
> +
> + if (read_cache() < 0)
> + die(_("index file corrupt"));
> +
> + if (!include_untracked && ps.nr) {
> + int i;
> + char *ps_matched = xcalloc(ps.nr, 1);
> +
> + for (i = 0; i < active_nr; ++i) {
> + const struct cache_entry *ce = active_cache[i];
> + if (!ce_path_match(ce, &ps, ps_matched))
> + continue;
> + }
> +
> + if (report_path_error(ps_matched, &ps, prefix)) {
> + fprintf_ln(stderr, "Did you forget to 'git add'?");
> + return -1;
> + }
> + }
> +
> + read_cache_preload(NULL);

Instead of doing a 'read_cache' before and 'read_cache_preload(NULL)'
here, we could just use 'read_cache_preload(NULL)' above.
'read_cache' does return early if the index has already been read, so
there's no big harm in doing this twice, but just having one call is
still neater I think.

It would make the command slightly slower in the error case above, but
I doubt that's worth worrying about.

> + if (refresh_cache(REFRESH_QUIET))
> + return -1;
> +
> + if (!check_changes(argv, include_untracked, prefix)) {
> + fprintf_ln(stdout, "No local changes to save");
> + return 0;
> + }
> +
> + if (!reflog_exists(ref_stash) && do_clear_stash()) {
> + fprintf_ln(stderr, "Cannot initialize stash");
> + return -1;
> + }
> +
> + if ((ret = do_create_stash(argc, argv, prefix, &stash_msg,
> +include_untracked, patch_mode, &info)))
> + return ret;

Should this be 'return ret < 0'?  'ret == 1' means there are no
changes, for which we currently get a 0 exit code.  Though on second
thought that can't happen, because we already have 'check_changes'
above.  Why do we want the 'ret' variable here?

Something I notice here is that we are passing 'argc' and 'argv'
around a lot.  We passed that through parse-options already, and it
seems to me that we're mostly left with pathspecs here, rather than
'argv'.  It looks to me like we could just parse the pathspecs in the
callers (which we do in some places, but maybe not in all of them) and
then pass 'struct pathspec' around instead of the leftover argv, which
is easier to understand, and gives all these functions a neater/easier
to understand interface.

Also looking at 'do_create_stash', the 'ar

Re: [GSoC][PATCH v7 15/26] stash: convert create to builtin

2018-08-18 Thread Paul Sebastian Ungureanu
On Thu, Aug 16, 2018 at 1:13 AM, Thomas Gummerer  wrote:
> On 08/08, Paul-Sebastian Ungureanu wrote:
>> Add stash create to the helper.
>>
>> Signed-off-by: Paul-Sebastian Ungureanu 
>> ---
>>  builtin/stash--helper.c | 406 
>>  git-stash.sh|   2 +-
>>  2 files changed, 407 insertions(+), 1 deletion(-)
>>
>> diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c
>> index 5ff810f8c..a4e57899b 100644
>> --- a/builtin/stash--helper.c
>> +++ b/builtin/stash--helper.c
>> @@ -21,6 +21,7 @@ static const char * const git_stash_helper_usage[] = {
>>   N_("git stash--helper branch  []"),
>>   N_("git stash--helper clear"),
>>   N_("git stash--helper store [-m|--message ] [-q|--quiet] 
>> "),
>> + N_("git stash--helper create []"),
>>   NULL
>>  };
>>
>> @@ -64,6 +65,11 @@ static const char * const git_stash_helper_store_usage[] 
>> = {
>>   NULL
>>  };
>>
>> +static const char * const git_stash_helper_create_usage[] = {
>> + N_("git stash--helper create []"),
>> + NULL
>> +};
>> +
>>  static const char *ref_stash = "refs/stash";
>>  static int quiet;
>>  static struct strbuf stash_index_path = STRBUF_INIT;
>> @@ -781,6 +787,404 @@ static int store_stash(int argc, const char **argv, 
>> const char *prefix)
>>   return do_store_stash(argv[0], stash_msg, quiet);
>>  }
>>
>> [...]
>>
>> +
>> +static int do_create_stash(int argc, const char **argv, const char *prefix,
>> +const char **stash_msg, int include_untracked,
>> +int patch_mode, struct stash_info *info)
>> +{
>> + int untracked_commit_option = 0;
>> + int ret = 0;
>> + int subject_len;
>> + int flags;
>> + const char *head_short_sha1 = NULL;
>> + const char *branch_ref = NULL;
>> + const char *head_subject = NULL;
>> + const char *branch_name = "(no branch)";
>> + struct commit *head_commit = NULL;
>> + struct commit_list *parents = NULL;
>> + struct strbuf msg = STRBUF_INIT;
>> + struct strbuf commit_tree_label = STRBUF_INIT;
>> + struct strbuf out = STRBUF_INIT;
>> + struct strbuf final_stash_msg = STRBUF_INIT;
>> +
>> + read_cache_preload(NULL);
>> + refresh_cache(REFRESH_QUIET);
>> +
>> + if (!check_changes(argv, include_untracked, prefix)) {
>> + ret = 1;
>> + goto done;
>
> I wonder if we can just 'exit(0)' here, instead of returning.  This
> whole command is a builtin, and I *think* outside of 'libgit.a' exiting
> early is fine.  It does mean that we're not free'ing the memory
> though, which means a leak checker would probably complain.  So
> dunno.  It would simplify the code a little, but not sure it's worth it.

Indeed, there shouldn't be any problem by calling exit(0).

>> + }
>> +
>> + if (get_oid("HEAD", &info->b_commit)) {
>> + fprintf_ln(stderr, "You do not have the initial commit yet");
>> + ret = -1;
>> + goto done;
>> + } else {
>> + head_commit = lookup_commit(the_repository, &info->b_commit);
>> + }
>> +
>> + branch_ref = resolve_ref_unsafe("HEAD", 0, NULL, &flags);
>> + if (flags & REF_ISSYMREF)
>> + branch_name = strrchr(branch_ref, '/') + 1;
>> + head_short_sha1 = find_unique_abbrev(&head_commit->object.oid,
>> +  DEFAULT_ABBREV);
>> + subject_len = find_commit_subject(get_commit_buffer(head_commit, NULL),
>> +   &head_subject);
>> + strbuf_addf(&msg, "%s: %s %.*s\n", branch_name, head_short_sha1,
>> + subject_len, head_subject);
>
> I think this can be written in a slightly simpler way:
>
> head_short_sha1 = find_unique_abbrev(&head_commit->object.oid,
>  DEFAULT_ABBREV);
> strbuf_addf(&msg, "%s: %s", branch_name, head_short_sha1);
> pp_commit_easy(CMIT_FMT_ONELINE, head_commit, &msg);
> strbuf_addch(&msg, '\n');
>
> The other advantage this brings is that it is consistent with other
> places where we print/use the subject of a commit (e.g. in 'git reset
> --hard').

Thanks for this suggestion.

>> +
>> + strbuf_addf(&commit_tree_label, "index on %s\n", msg.buf);
>> + commit_list_insert(head_commit, &parents);
>> + if (write_cache_as_tree(&info->i_tree, 0, NULL) ||
>> + commit_tree(commit_tree_label.buf, commit_tree_label.len,
>> + &info->i_tree, parents, &info->i_commit, NULL, NULL)) {
>> + fprintf_ln(stderr, "Cannot save the current index state");
>
> Looks like this message is translated in the current 'git stash'
> implementation, so it should be here as well.  Same for the messages
> below.
>
>> + ret = -1;
>> + goto done;
>> + }
>> +
>> + if (include_untracked && get_untracked_files(argv, 1,
>> +  include_untrac

Re: [GSoC][PATCH v7 19/26] stash: make push to be quiet

2018-08-18 Thread Thomas Gummerer
> Subject: stash: make push to be quiet

Nit: maybe "stash: make push -q quiet"?  I think the subject should at
least mention the -q option.

On 08/08, Paul-Sebastian Ungureanu wrote:
> There is a change in behaviour with this commit. When there was
> no initial commit, the shell version of stash would still display
> a message. This commit makes `push` to not display any message if
> `--quiet` or `-q` is specified.

Yeah, not being quiet here cna be considered a bug, so this change in
behaviour makes sense.

Should the "No changes selected" message in 'stash_patch' also be made
quiet?

> ---
>  builtin/stash--helper.c | 41 +++--
>  1 file changed, 27 insertions(+), 14 deletions(-)
> 
> diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c
> index c26cad3d5..4fd79532c 100644
> --- a/builtin/stash--helper.c
> +++ b/builtin/stash--helper.c
> @@ -1079,7 +1079,7 @@ static int stash_working_tree(struct stash_info *info,
>  
>  static int do_create_stash(int argc, const char **argv, const char *prefix,
>  const char **stash_msg, int include_untracked,
> -int patch_mode, struct stash_info *info)
> +int patch_mode, struct stash_info *info, int quiet)
>  {
>   int untracked_commit_option = 0;
>   int ret = 0;
> @@ -1105,7 +1105,8 @@ static int do_create_stash(int argc, const char **argv, 
> const char *prefix,
>   }
>  
>   if (get_oid("HEAD", &info->b_commit)) {
> - fprintf_ln(stderr, "You do not have the initial commit yet");
> + if (!quiet)
> + fprintf_ln(stderr, "You do not have the initial commit 
> yet");
>   ret = -1;
>   goto done;
>   } else {
> @@ -1127,7 +1128,8 @@ static int do_create_stash(int argc, const char **argv, 
> const char *prefix,
>   if (write_cache_as_tree(&info->i_tree, 0, NULL) ||
>   commit_tree(commit_tree_label.buf, commit_tree_label.len,
>   &info->i_tree, parents, &info->i_commit, NULL, NULL)) {
> - fprintf_ln(stderr, "Cannot save the current index state");
> + if (!quiet)
> + fprintf_ln(stderr, "Cannot save the current index 
> state");
>   ret = -1;
>   goto done;
>   }
> @@ -1135,7 +1137,8 @@ static int do_create_stash(int argc, const char **argv, 
> const char *prefix,
>   if (include_untracked && get_untracked_files(argv, 1,
>include_untracked, &out)) {
>   if (save_untracked_files(info, &msg, &out)) {
> - printf_ln("Cannot save the untracked files");
> + if (!quiet)
> + printf_ln("Cannot save the untracked files");
>   ret = -1;
>   goto done;
>   }
> @@ -1144,14 +1147,16 @@ static int do_create_stash(int argc, const char 
> **argv, const char *prefix,
>   if (patch_mode) {
>   ret = stash_patch(info, argv);
>   if (ret < 0) {
> - printf_ln("Cannot save the current worktree state");
> + if (!quiet)
> + printf_ln("Cannot save the current worktree 
> state");
>   goto done;
>   } else if (ret > 0) {
>   goto done;
>   }
>   } else {
>   if (stash_working_tree(info, argv, prefix)) {
> - printf_ln("Cannot save the current worktree state");
> + if (!quiet)
> + printf_ln("Cannot save the current worktree 
> state");
>   ret = -1;
>   goto done;
>   }
> @@ -1176,7 +1181,8 @@ static int do_create_stash(int argc, const char **argv, 
> const char *prefix,
>  
>   if (commit_tree(*stash_msg, strlen(*stash_msg), &info->w_tree,
>   parents, &info->w_commit, NULL, NULL)) {
> - printf_ln("Cannot record working tree state");
> + if (!quiet)
> + printf_ln("Cannot record working tree state");
>   ret = -1;
>   goto done;
>   }
> @@ -1208,7 +1214,7 @@ static int create_stash(int argc, const char **argv, 
> const char *prefix)
>0);
>  
>   ret = do_create_stash(argc, argv, prefix, &stash_msg,
> -   include_untracked, 0, &info);
> +   include_untracked, 0, &info, 0);
>  
>   if (!ret)
>   printf_ln("%s", oid_to_hex(&info.w_commit));
> @@ -1261,25 +1267,31 @@ static int do_push_stash(int argc, const char **argv, 
> const char *prefix,
>   return -1;
>  
>   if (!check_changes(argv, include_untracked, prefix)) {
> - fprintf_ln(stdout, "No local changes to save");
> + if (!quiet)
> + 

Re: [PATCH 11/18] builtin rebase: support `--autostash` option

2018-08-18 Thread Duy Nguyen
On Wed, Aug 8, 2018 at 5:26 PM Pratik Karki  wrote:
> @@ -224,13 +219,56 @@ static int read_basic_state(struct rebase_options *opts)
> return 0;
>  }
>
> +static int apply_autostash(struct rebase_options *opts)
> +{
> +   const char *path = state_dir_path("autostash", opts);
> +   struct strbuf autostash = STRBUF_INIT;
> +   struct child_process stash_apply = CHILD_PROCESS_INIT;
> +
> +   if (!file_exists(path))
> +   return 0;
> +
> +   if (read_one(state_dir_path("autostash", opts), &autostash))
> +   return error(_("Could not read '%s'"), path);
> +   argv_array_pushl(&stash_apply.args,
> +"stash", "apply", autostash.buf, NULL);
> +   stash_apply.git_cmd = 1;
> +   stash_apply.no_stderr = stash_apply.no_stdout =
> +   stash_apply.no_stdin = 1;
> +   if (!run_command(&stash_apply))
> +   printf("Applied autostash.\n");

I think you need _() here.

> +   else {
> +   struct argv_array args = ARGV_ARRAY_INIT;
> +   int res = 0;
> +
> +   argv_array_pushl(&args,
> +"stash", "store", "-m", "autostash", "-q",
> +autostash.buf, NULL);
> +   if (run_command_v_opt(args.argv, RUN_GIT_CMD))
> +   res = error(_("Cannot store %s"), autostash.buf);
> +   argv_array_clear(&args);
> +   strbuf_release(&autostash);
> +   if (res)
> +   return res;
> +
> +   fprintf(stderr,
> +   _("Applying autostash resulted in conflicts.\n"
> + "Your changes are safe in the stash.\n"
> + "You can run \"git stash pop\" or \"git stash 
> drop\" "
> + "at any time.\n"));
> +   }
> +
> +   strbuf_release(&autostash);
> +   return 0;
> +}
> +
>  static int finish_rebase(struct rebase_options *opts)
>  {
> struct strbuf dir = STRBUF_INIT;
> const char *argv_gc_auto[] = { "gc", "--auto", NULL };
>
> delete_ref(NULL, "REBASE_HEAD", NULL, REF_NO_DEREF);
> -   apply_autostash();
> +   apply_autostash(opts);
> close_all_packs(the_repository->objects);
> /*
>  * We ignore errors in 'gc --auto', since the
-- 
Duy


Re: Re* [PATCH v7 1/1] sideband: highlight keywords in remote sideband output

2018-08-18 Thread Junio C Hamano
Junio C Hamano  writes:

>>> -   strbuf_add(dest, src, n);
>>> +   if (0 < n)
>>> +   strbuf_add(dest, src, n);
>>
>> This check seems unnecessary.  strbuf_add can cope fine with !n.
>
> I was primarily interested in catching negatives, and !n was a mere
> optimization, but you are correct to point out that negative n at
> this point in the codeflow is a BUG().

Actually, let's just lose the conditional.  strbuf_add() would catch
and issue an error message when it notices that we fed negative
count anyway ;-).


Re: [PATCH v2 2/2] repack: repack promisor objects if -a or -A is set

2018-08-18 Thread Duy Nguyen
On Thu, Aug 9, 2018 at 12:35 AM Jonathan Tan  wrote:
> @@ -179,6 +179,76 @@ static void prepare_pack_objects(struct child_process 
> *cmd,
> cmd->out = -1;
>  }
>
> +/*
> + * Write oid to the given struct child_process's stdin, starting it first if
> + * necessary.
> + */
> +static int write_oid(const struct object_id *oid, struct packed_git *pack,
> +uint32_t pos, void *data)
> +{
> +   struct child_process *cmd = data;
> +
> +   if (cmd->in == -1) {
> +   if (start_command(cmd))
> +   die("Could not start pack-objects to repack promisor 
> objects");

_() ? (also other messages)
-- 
Duy


Re: Re* [PATCH v7 1/1] sideband: highlight keywords in remote sideband output

2018-08-18 Thread Junio C Hamano
Jonathan Nieder  writes:

> And here are some tests to squash in.

Thanks, will do.


Re: [GSoC][PATCH v7 05/26] stash: convert apply to builtin

2018-08-18 Thread Duy Nguyen
On Wed, Aug 8, 2018 at 9:00 PM Paul-Sebastian Ungureanu
 wrote:
> +   strbuf_init(&info->revision, 0);
> +   if (!commit) {
> +   if (!ref_exists(ref_stash)) {
> +   free_stash_info(info);
> +   fprintf_ln(stderr, "No stash entries found.");

Maybe _() ?
-- 
Duy


Re: [PATCH 7/7] submodule--helper: introduce new update-module-mode helper

2018-08-18 Thread Duy Nguyen
On Tue, Aug 14, 2018 at 12:45 AM Stefan Beller  wrote:
> +static int module_update_module_mode(int argc, const char **argv, const char 
> *prefix)
> +{
> +   const char *path, *update = NULL;
> +   int just_cloned;
> +   struct submodule_update_strategy update_strategy = { .type = 
> SM_UPDATE_CHECKOUT };
> +
> +   if (argc < 3 || argc > 4)
> +   die("submodule--helper update-module-clone expects 
>   []");

Maybe _() ?
-- 
Duy


Re: [GSoC][PATCH v7 20/26] stash: add tests for `git stash push -q`

2018-08-18 Thread Thomas Gummerer
On 08/08, Paul-Sebastian Ungureanu wrote:
> This commit introduces more tests for the quiet option of
> `git stash push`.

I think this commit should be squashed into the previous one, so we
have implementation and tests in one commit.  That way it's easier to
see during review that there are tests for the change.  For more
discussion on that also see [1].

[1]: 
https://public-inbox.org/git/20180806144726.gb97...@aiede.svl.corp.google.com/

> ---
>  t/t3903-stash.sh | 21 +
>  1 file changed, 21 insertions(+)
> 
> diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
> index 8d002a7f2..b78db74ae 100755
> --- a/t/t3903-stash.sh
> +++ b/t/t3903-stash.sh
> @@ -1064,6 +1064,27 @@ test_expect_success 'push:  not in the 
> repository errors out' '
>   test_path_is_file untracked
>  '
>  
> +test_expect_success 'push: -q is quiet with changes' '
> + >foo &&
> + git stash push -q >output 2>&1 &&

We create an untracked file here and then call 'git stash push', which
will not create a new stash, as we don't use the --include-untracked
option.  In fact, right now this test is doing the same thing as the
test below.  There should be a 'git add foo' above the 'git stash
push' call to test what we're claiming to test here.

> + test_must_be_empty output
> +'
> +
> +test_expect_success 'push: -q is quiet with no changes' '
> + git stash push -q >output 2>&1 &&
> + test_must_be_empty output
> +'
> +
> +test_expect_success 'push: -q is quiet even if there is no initial commit' '
> + git init foo_dir &&
> + cd foo_dir &&
> + touch bar &&

The typical style in the test suite for creating a new file is to use
'>bar', unless you care about the 'mtime' the file has.  We don't seem
to care about that in this test, so avoiding 'touch' would be better.

> + test_must_fail git stash push -q >output 2>&1 &&
> + test_must_be_empty output &&
> + cd .. &&

The above should be in a subshell, i.e.

(
cd foo_dir &&
touch bar &&
test_must_fail git stash push -q >output 2>&1 &&
test_must_be_empty output &&
)

then you don't have to do the 'cd ..' in the end.  With the 'cd ..' in
the end, if one of the commands between the 'cd foo_dir' and 'cd ..'
fails, all subsequent tests will be run inside of 'foo_dir', which
puts them in a different environment than they expect.  That can cause
all kinds of weirdness.

If inside a subshell, the current working directory of the parent
shell is unaffected, so we don't have to worry about cd'ing back, and
subsequent tests will get the correct cwd even if things go wrong in
this test.


> + rm -rf foo_dir

We'll want to run this cleanup to run even if the test fails.  To do
so, the 'test_when_finished' helper can be used.  Using that, this
would go at the top of the test, as 'test_when_finished rm -rf
foo_dir'.  Otherwise if any of the commands above fail, 'foo_dir' will
not be removed, and may interfere with subsequent tests.

> +'
> +
>  test_expect_success 'untracked files are left in place when -u is not given' 
> '
>   >file &&
>   git add file &&
> -- 
> 2.18.0.573.g56500d98f
> 


Re: Re* [PATCH v7 1/1] sideband: highlight keywords in remote sideband output

2018-08-18 Thread Junio C Hamano
Junio C Hamano  writes:

> Junio C Hamano  writes:
>
 -  strbuf_add(dest, src, n);
 +  if (0 < n)
 +  strbuf_add(dest, src, n);
>>>
>>> This check seems unnecessary.  strbuf_add can cope fine with !n.
>>
>> I was primarily interested in catching negatives, and !n was a mere
>> optimization, but you are correct to point out that negative n at
>> this point in the codeflow is a BUG().
>
> Actually, let's just lose the conditional.  strbuf_add() would catch
> and issue an error message when it notices that we fed negative
> count anyway ;-).

So I'll have this applied on top of the original topic to prevent a
buggy version from escaping the lab.

-- >8 --
Subject: [PATCH] sideband: do not read beyond the end of input

The caller of maybe_colorize_sideband() gives a counted buffer
, but the callee checked src[] as if it were a NUL terminated
buffer.  If src[] had all isspace() bytes in it, we would have made
n negative, and then

 (1) made number of strncasecmp() calls to see if the remaining
 bytes in src[] matched keywords, reading beyond the end of the
 array (this actually happens even if n does not go negative),
 and/or

 (2) called strbuf_add() with negative count, most likely triggering
 the "you want to use way too much memory" error due to unsigned
 integer overflow.

Fix both issues by making sure we do not go beyond &src[n].

In the longer term we may want to accept size_t as parameter for
clarity (even though we know that a sideband message we are painting
typically would fit on a line on a terminal and int is sufficient).
Write it down as a NEEDSWORK comment.

Helped-by: Jonathan Nieder 
Signed-off-by: Junio C Hamano 
---
 sideband.c  |  8 ++--
 t/t5409-colorize-remote-messages.sh | 14 ++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/sideband.c b/sideband.c
index 1c6bb0e25b..368647acf8 100644
--- a/sideband.c
+++ b/sideband.c
@@ -65,6 +65,8 @@ void list_config_color_sideband_slots(struct string_list 
*list, const char *pref
  * Optionally highlight one keyword in remote output if it appears at the start
  * of the line. This should be called for a single line only, which is
  * passed as the first N characters of the SRC array.
+ *
+ * NEEDSWORK: use "size_t n" instead for clarity.
  */
 static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int 
n)
 {
@@ -75,7 +77,7 @@ static void maybe_colorize_sideband(struct strbuf *dest, 
const char *src, int n)
return;
}
 
-   while (isspace(*src)) {
+   while (0 < n && isspace(*src)) {
strbuf_addch(dest, *src);
src++;
n--;
@@ -84,6 +86,9 @@ static void maybe_colorize_sideband(struct strbuf *dest, 
const char *src, int n)
for (i = 0; i < ARRAY_SIZE(keywords); i++) {
struct keyword_entry *p = keywords + i;
int len = strlen(p->keyword);
+
+   if (n <= len)
+   continue;
/*
 * Match case insensitively, so we colorize output from existing
 * servers regardless of the case that they use for their
@@ -101,7 +106,6 @@ static void maybe_colorize_sideband(struct strbuf *dest, 
const char *src, int n)
}
 
strbuf_add(dest, src, n);
-
 }
 
 
diff --git a/t/t5409-colorize-remote-messages.sh 
b/t/t5409-colorize-remote-messages.sh
index eb1b8aa05d..f81b6813c0 100755
--- a/t/t5409-colorize-remote-messages.sh
+++ b/t/t5409-colorize-remote-messages.sh
@@ -15,6 +15,8 @@ test_expect_success 'setup' '
echo warning: warning
echo prefixerror: error
echo " " "error: leading space"
+   echo ""
+   echo Err
exit 0
EOF
echo 1 >file &&
@@ -44,6 +46,12 @@ test_expect_success 'whole words at line start' '
grep "prefixerror: error" decoded
 '
 
+test_expect_success 'short line' '
+   git -C child -c color.remote=always push -f origin HEAD:short-line 
2>output &&
+   test_decode_color decoded &&
+   grep "remote: Err" decoded
+'
+
 test_expect_success 'case-insensitive' '
git --git-dir child/.git -c color.remote=always push -f origin 
HEAD:refs/heads/case-insensitive 2>output &&
cat output &&
@@ -58,6 +66,12 @@ test_expect_success 'leading space' '
grep "  error: leading space" decoded
 '
 
+test_expect_success 'spaces only' '
+   git -C child -c color.remote=always push -f origin HEAD:only-space 
2>output &&
+   test_decode_color decoded &&
+   grep "remote: " decoded
+'
+
 test_expect_success 'no coloring for redirected output' '
git --git-dir child/.git push -f origin 
HEAD:refs/heads/redirected-output 2>output &&
test_decode_color decoded &&
-- 
2.18.0-748-gfa03cdc39b



Re: [PATCH v6 6/6] list-objects-filter: implement filter tree:0

2018-08-18 Thread Duy Nguyen
On Thu, Aug 16, 2018 at 1:54 AM Matthew DeVore  wrote:
> diff --git a/list-objects-filter.c b/list-objects-filter.c
> index a0ba78b20..8e3caf5bf 100644
> --- a/list-objects-filter.c
> +++ b/list-objects-filter.c
> @@ -80,6 +80,55 @@ static void *filter_blobs_none__init(
> return d;
>  }
>
> +/*
> + * A filter for list-objects to omit ALL trees and blobs from the traversal.
> + * Can OPTIONALLY collect a list of the omitted OIDs.
> + */
> +struct filter_trees_none_data {
> +   struct oidset *omits;
> +};
> +
> +static enum list_objects_filter_result filter_trees_none(
> +   enum list_objects_filter_situation filter_situation,
> +   struct object *obj,
> +   const char *pathname,
> +   const char *filename,
> +   void *filter_data_)
> +{
> +   struct filter_trees_none_data *filter_data = filter_data_;
> +
> +   switch (filter_situation) {
> +   default:
> +   die("unknown filter_situation");

This sounds like BUG() than die() without _(). And you probably want
to report filter_situation value too.

> +   return LOFR_ZERO;

And neither BUG() or die() returns.
-- 
Duy


Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Junio C Hamano
Michael Muré  writes:

> I released today git-bug, a distributed bug tracker that embeds in
> git. It use git's internal storage to store bugs information in a way
> that can be merged without conflict. You can push/pull to the normal
> git remote you are already using to interact with other people. Normal
> code and bugs are completely separated and no files are added in the
> regular branches.

This reminds me of a demo Scott Chacon showed us ages ago, the name
of which escapes me.  I guess great minds think alike, or something?


Re: [GSoC][PATCH v7 22/26] stash: convert save to builtin

2018-08-18 Thread Thomas Gummerer
On 08/08, Paul-Sebastian Ungureanu wrote:
> Add stash save to the helper and delete functions which are no
> longer needed (`show_help()`, `save_stash()`, `push_stash()`,
> `create_stash()`, `clear_stash()`, `untracked_files()` and
> `no_changes()`).
> ---
>  builtin/stash--helper.c |  48 +++
>  git-stash.sh| 311 +---
>  2 files changed, 50 insertions(+), 309 deletions(-)
> 
> diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c
> index 5c27f5dcf..f54a476e3 100644
> --- a/builtin/stash--helper.c
> +++ b/builtin/stash--helper.c
> @@ -26,6 +26,8 @@ static const char * const git_stash_helper_usage[] = {
>   N_("git stash--helper [push [-p|--patch] [-k|--[no-]keep-index] 
> [-q|--quiet]\n"
>  "  [-u|--include-untracked] [-a|--all] [-m|--message 
> ]\n"
>  "  [--] [...]]"),
> + N_("git stash--helper save [-p|--patch] [-k|--[no-]keep-index] 
> [-q|--quiet]\n"
> +"  [-u|--include-untracked] [-a|--all] []"),
>   NULL
>  };
>  
> @@ -81,6 +83,12 @@ static const char * const git_stash_helper_push_usage[] = {
>   NULL
>  };
>  
> +static const char * const git_stash_helper_save_usage[] = {
> + N_("git stash--helper save [-p|--patch] [-k|--[no-]keep-index] 
> [-q|--quiet]\n"
> +"  [-u|--include-untracked] [-a|--all] []"),
> + NULL
> +};
> +
>  static const char *ref_stash = "refs/stash";
>  static int quiet;
>  static struct strbuf stash_index_path = STRBUF_INIT;
> @@ -1445,6 +1453,44 @@ static int push_stash(int argc, const char **argv, 
> const char *prefix)
>include_untracked, quiet, stash_msg);
>  }
>  
> +static int save_stash(int argc, const char **argv, const char *prefix)
> +{
> + int i;
> + int keep_index = -1;
> + int patch_mode = 0;
> + int include_untracked = 0;
> + int quiet = 0;
> + char *stash_msg = NULL;
> + struct strbuf alt_stash_msg = STRBUF_INIT;
> + struct option options[] = {
> + OPT_SET_INT('k', "keep-index", &keep_index,
> + N_("keep index"), 1),
> + OPT_BOOL('p', "patch", &patch_mode,
> + N_("stash in patch mode")),
> + OPT_BOOL('q', "quiet", &quiet,
> + N_("quiet mode")),

This could be OPT__QUIET again as mentioned in the previous patch as
well.

> + OPT_BOOL('u', "include-untracked", &include_untracked,
> +  N_("include untracked files in stash")),
> + OPT_SET_INT('a', "all", &include_untracked,
> + N_("include ignore files"), 2),
> + OPT_STRING('m', "message", &stash_msg, N_("message"),
> +  N_("stash message")),
> + OPT_END()
> + };
> +
> + argc = parse_options(argc, argv, prefix, options,
> +  git_stash_helper_save_usage,
> +  0);
> +
> + for (i = 0; i < argc; ++i)
> + strbuf_addf(&alt_stash_msg, "%s ", argv[i]);
> +
> + stash_msg = strbuf_detach(&alt_stash_msg, NULL);

We unconditionally overwrite 'stash_msg' here, even if a '-m'
parameter was given earlier, which I don't think is what we intended
here.  I think we started "supporting" (not erroring out on rather)
the '-m' flag accidentally (my bad, sorry) in 'git stash save' rather
than intentionally, and indeed it has the same behaviour as the code
above.

However I think we should just not add support the '-m' flag here, as
it doesn't make a lot of sense to have two ways of passing a message.

We also never free the memory we get back here from 'strbuf_detach'.
As this is not code in 'libgit.a' that's probably fine, and we can
just add an 'UNLEAK(stash_msg)' here I think.

It may generally be interesting to consider using a leak checker, and
see how far we can get in making this leak free.  It may not be
possible to make 'git stash' completely leak free, as the underlying
APIs may not be, but it may be interesting to see how far we can get
for the new code only.

> +
> + return do_push_stash(0, NULL, prefix, keep_index, patch_mode,
> +  include_untracked, quiet, stash_msg);
> +}
> +
>  int cmd_stash__helper(int argc, const char **argv, const char *prefix)
>  {
>   pid_t pid = getpid();
> @@ -1485,6 +1531,8 @@ int cmd_stash__helper(int argc, const char **argv, 
> const char *prefix)
>   return !!create_stash(argc, argv, prefix);
>   else if (!strcmp(argv[0], "push"))
>   return !!push_stash(argc, argv, prefix);
> + else if (!strcmp(argv[0], "save"))
> + return !!save_stash(argc, argv, prefix);
>  
>   usage_msg_opt(xstrfmt(_("unknown subcommand: %s"), argv[0]),
> git_stash_helper_usage, options);
>
> [...]


Re: [GSoC][PATCH v7 23/26] stash: convert `stash--helper.c` into `stash.c`

2018-08-18 Thread Thomas Gummerer
On 08/08, Paul-Sebastian Ungureanu wrote:
> The old shell script `git-stash.sh`  was removed and replaced
> entirely by `builtin/stash.c`. In order to do that, `create` and
> `push` were adapted to work without `stash.sh`. For example, before
> this commit, `git stash create` called `git stash--helper create
> --message "$*"`. If it called `git stash--helper create "$@"`, then
> some of these changes wouldn't have been necessary.
> 
> This commit also removes the word `helper` since now stash is
> called directly and not by a shell script.
> ---
>  .gitignore   |   1 -
>  Makefile |   3 +-
>  builtin.h|   2 +-
>  builtin/{stash--helper.c => stash.c} | 132 ---
>  git-stash.sh | 153 ---
>  git.c|   2 +-
>  6 files changed, 74 insertions(+), 219 deletions(-)
>  rename builtin/{stash--helper.c => stash.c} (91%)
>  delete mode 100755 git-stash.sh
> 
> diff --git a/builtin/stash--helper.c b/builtin/stash.c
> similarity index 91%
> rename from builtin/stash--helper.c
> rename to builtin/stash.c
> index f54a476e3..0ef88408a 100644
> --- a/builtin/stash--helper.c
> +++ b/builtin/stash.c
>
> [...]
>
> @@ -1445,9 +1448,10 @@ static int push_stash(int argc, const char **argv, 
> const char *prefix)
>   OPT_END()
>   };
>  
> - argc = parse_options(argc, argv, prefix, options,
> -  git_stash_helper_push_usage,
> -  0);
> + if (argc)
> + argc = parse_options(argc, argv, prefix, options,
> +  git_stash_push_usage,
> +  0);

This change is a bit surprising here.  Why is this necessary?  I
thought parse_options would handle no arguments just fine?

>   return do_push_stash(argc, argv, prefix, keep_index, patch_mode,
>include_untracked, quiet, stash_msg);
> @@ -1479,7 +1483,7 @@ static int save_stash(int argc, const char **argv, 
> const char *prefix)
>   };
>  
>   argc = parse_options(argc, argv, prefix, options,
> -  git_stash_helper_save_usage,
> +  git_stash_save_usage,
>0);
>  
>   for (i = 0; i < argc; ++i)
> @@ -1491,7 +1495,7 @@ static int save_stash(int argc, const char **argv, 
> const char *prefix)
>include_untracked, quiet, stash_msg);
>  }
>  
> -int cmd_stash__helper(int argc, const char **argv, const char *prefix)
> +int cmd_stash(int argc, const char **argv, const char *prefix)
>  {
>   pid_t pid = getpid();
>   const char *index_file;
> @@ -1502,16 +1506,16 @@ int cmd_stash__helper(int argc, const char **argv, 
> const char *prefix)
>  
>   git_config(git_default_config, NULL);
>  
> - argc = parse_options(argc, argv, prefix, options, 
> git_stash_helper_usage,
> + argc = parse_options(argc, argv, prefix, options, git_stash_usage,
>PARSE_OPT_KEEP_UNKNOWN | PARSE_OPT_KEEP_DASHDASH);
>  
>   index_file = get_index_file();
>   strbuf_addf(&stash_index_path, "%s.stash.%" PRIuMAX, index_file,
>   (uintmax_t)pid);
>  
> - if (argc < 1)
> - usage_with_options(git_stash_helper_usage, options);
> - if (!strcmp(argv[0], "apply"))
> + if (argc == 0)
> + return !!push_stash(0, NULL, prefix);
> + else if (!strcmp(argv[0], "apply"))
>   return !!apply_stash(argc, argv, prefix);
>   else if (!strcmp(argv[0], "clear"))
>   return !!clear_stash(argc, argv, prefix);
> @@ -1533,7 +1537,13 @@ int cmd_stash__helper(int argc, const char **argv, 
> const char *prefix)
>   return !!push_stash(argc, argv, prefix);
>   else if (!strcmp(argv[0], "save"))
>   return !!save_stash(argc, argv, prefix);
> + if (*argv[0] == '-') {
> + struct argv_array args = ARGV_ARRAY_INIT;
> + argv_array_push(&args, "push");
> + argv_array_pushv(&args, argv);
> + return !!push_stash(args.argc, args.argv, prefix);
> + }

This is a bit different than what the current code does.  Currently
the rules for when a plain 'git stash' becomes 'git stash push' are
the following:

- If there are no arguments.
- If all arguments are option arguments.
- If the first argument of 'git stash' is '-p'.
- If the first argument of 'git stash' is '--'.

This is to avoid someone typing 'git stash -q drop' for example, and
then being surprised that a new stash was created instead of an old
one being dropped, which what we have above would do.

For more reasoning about these aliasing rules see also the thread at [1].

[1]: 
https://public-inbox.org/git/20170213200950.m3bcyp52wd25p...@sigill.intra.peff.net/

>  
>   usage_msg_opt(xstrfmt(_("unknown subcommand: %s"), argv[0]),
> -   

[RFC PATCH] t7508-status: fix a bogus check in a submodule-related test

2018-08-18 Thread SZEDER Gábor
Since 90e14525f2 (Add tests for the diff.ignoreSubmodules config
option, 2010-08-06) the test '.gitmodules ignore=dirty suppresses
submodules with untracked content' in 't7508-status.sh' contains a
check that is bogus for two reasons.

The commit in question added the following three lines at the
beginning of that test:

  git config diff.ignoreSubmodules dirty &&
  git status >output &&
  ! test -s actual &&

That 'test' is problematic, because:

  - The output of 'git status' is saved in the file 'output', but the
subsequent 'test' looks at the file 'actual'.  This is the first
mention of 'actual' in t7508, so that file doesn't exist at that
point, and, consequently, the 'test' itself fails.  However, since
there is a '!' in front to flip the exit code, the command as a
whole succeeds.
I guess that this 'test' should look at the file 'output' instead,
but...

  - This whole command checks that the given file is empty, i.e. that,
supposedly, 'git status' produced no output.  However, in this
case 'git status' does produce output, and indeed it should
produce the same output as already expected in the neighbouring
tests or even later in the same test, all running 'git status' in
similar situations.

So drop that bogus check, and verify that 'git status's output matches
what's otherwise expected in similar cases.

Signed-off-by: SZEDER Gábor 
---

This is a submodules-related test, and I'm not very well versed in
submodules, hence the RFC.

 t/t7508-status.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7508-status.sh b/t/t7508-status.sh
index e1f11293e2..4ea528785a 100755
--- a/t/t7508-status.sh
+++ b/t/t7508-status.sh
@@ -1183,7 +1183,7 @@ test_expect_success '--ignore-submodules=dirty suppresses 
submodules with untrac
 test_expect_success '.gitmodules ignore=dirty suppresses submodules with 
untracked content' '
test_config diff.ignoreSubmodules dirty &&
git status >output &&
-   ! test -s actual &&
+   test_i18ncmp expect output &&
git config --add -f .gitmodules submodule.subname.ignore dirty &&
git config --add -f .gitmodules submodule.subname.path sm &&
git status >output &&
-- 
2.18.0.903.gab616d7dc6



Re: [GSoC][PATCH v7 15/26] stash: convert create to builtin

2018-08-18 Thread Thomas Gummerer
On 08/18, Paul Sebastian Ungureanu wrote:
> On Thu, Aug 16, 2018 at 1:13 AM, Thomas Gummerer  wrote:
> > On 08/08, Paul-Sebastian Ungureanu wrote:
> >>
> >>  [...]
> >>
> >> + ret = -1;
> >> + goto done;
> >> + }
> >> + untracked_commit_option = 1;
> >> + }
> >> + if (patch_mode) {
> >> + ret = stash_patch(info, argv);
> >> + if (ret < 0) {
> >> + printf_ln("Cannot save the current worktree state");
> >> + goto done;
> >> + } else if (ret > 0) {
> >> + goto done;
> >> + }
> >> + } else {
> >> + if (stash_working_tree(info, argv)) {
> >> + printf_ln("Cannot save the current worktree state");
> >> + ret = -1;
> >> + goto done;
> >> + }
> >> + }
> >> +
> >> + if (!*stash_msg || !strlen(*stash_msg))
> >> + strbuf_addf(&final_stash_msg, "WIP on %s", msg.buf);
> >> + else
> >> + strbuf_addf(&final_stash_msg, "On %s: %s\n", branch_name,
> >> + *stash_msg);
> >> + *stash_msg = strbuf_detach(&final_stash_msg, NULL);
> >
> > strbuf_detach means we're taking ownership of the memory, so we'll
> > have to free it afterwards. Looking at this we may not even want to
> > re-use the 'stash_msg' variable here, but instead introduce another
> > variable for it, so it doesn't have a dual meaning in this function.
> >
> >> +
> >> + /*
> >> +  * `parents` will be empty after calling `commit_tree()`, so there is
> >> +  * no need to call `free_commit_list()`
> >> +  */
> >> + parents = NULL;
> >> + if (untracked_commit_option)
> >> + commit_list_insert(lookup_commit(the_repository, 
> >> &info->u_commit), &parents);
> >> + commit_list_insert(lookup_commit(the_repository, &info->i_commit), 
> >> &parents);
> >> + commit_list_insert(head_commit, &parents);
> >> +
> >> + if (commit_tree(*stash_msg, strlen(*stash_msg), &info->w_tree,
> >> + parents, &info->w_commit, NULL, NULL)) {
> >> + printf_ln("Cannot record working tree state");
> >> + ret = -1;
> >> + goto done;
> >> + }
> >> +
> >> +done:
> >> + strbuf_release(&commit_tree_label);
> >> + strbuf_release(&msg);
> >> + strbuf_release(&out);
> >> + strbuf_release(&final_stash_msg);
> >> + return ret;
> >> +}
> >> +
> >> +static int create_stash(int argc, const char **argv, const char *prefix)
> >> +{
> >> + int include_untracked = 0;
> >> + int ret = 0;
> >> + const char *stash_msg = NULL;
> >> + struct stash_info info;
> >> + struct option options[] = {
> >> + OPT_BOOL('u', "include-untracked", &include_untracked,
> >> +  N_("include untracked files in stash")),
> >> + OPT_STRING('m', "message", &stash_msg, N_("message"),
> >> +  N_("stash message")),
> >> + OPT_END()
> >> + };
> >> +
> >> + argc = parse_options(argc, argv, prefix, options,
> >> +  git_stash_helper_create_usage,
> >> +  0);
> >> +
> >> + ret = do_create_stash(argc, argv, prefix, &stash_msg,
> >> +   include_untracked, 0, &info);
> >
> > stash_msg doesn't have to be passed as a pointer to a pointer here, as
> > we never need the modified value after this function returns.  I think
> > just passing 'stash_msg' here instead of '&stash_msg' will make
> > 'do_create_stash' slightly easier to read.
> 
> That's right, but `do_create_stash()` is also called by
> `do_push_stash()`, which will need the modified value.

Ah right, I didn't read that far yet when leaving this comment :)

Reading the original push_stash again though, do we actually need the
modified value in 'do_push_stash()'?  The original lines are:

create_stash -m "$stash_msg" -u "$untracked" -- "$@"
store_stash -m "$stash_msg" -q $w_commit ||
die "$(gettext "Cannot save the current status")"

'$stash_msg' gets passed in to 'create_stash()', but is the
'stash_msg' variable inside of 'create_stash()' is local and only the
local copy is modified, so 'store_stash()' would still get the
original.  Or am I missing something?


Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Jonathan Nieder
Hi,

Ævar Arnfjörð Bjarmason wrote:
> On Sat, Aug 18 2018, Jonathan Nieder wrote:
>> Michael Muré wrote:

>>> I released today git-bug, a distributed bug tracker
[...]
>> I am a bit unhappy about the namespace grab.  Not for trademark
>> reasons: the Git trademark rules are pretty clear about this kind of
>> usage being okay.  Instead, the unhappiness comes because a future Git
>> command like "git bug" to produce a bug report with appropriate
>> diagnostics for a bug in Git seems like a likely and useful thing to
>> get added to Git some day.  And now the name's taken.
>>
>> Is it too late to ask if it's possible to come up with a less generic
>> name?
>
> Wouldn't we call such a thing "git-reportbug", or "git gitbug", with
> reference to Debian reportbug or perl's perlbug?

I hope you're kidding about "git gitbug".

[...]
> 1) Accept the status quo where people do create third party tools, much
>of which are way too obscure to matter (e.g. I'm sure someone's
>created a tool/alias called range-diff before, but we didn't
>care).
>
>If those tools become popular enough in the wild they get own that
>namespace, e.g. we're not going to ship a "git-annex" or "git-lfs"
>ourselves implementing some unrelated features

That's fair.  Let me spell out my thinking a little more.

This framework would lead me to rephrase my question to Michael a
different way.  Instead of saying that I'm not happy with the
namespace grab, I should say something more severe:

  Don't be surprised if Git itself makes a "git bug" command in the
  future, and be prepared to rename.

Is that preferable, in your opinion?

I still think it's a reasonable thing for me to ask about, if only to
save Michael some trouble later.

Thanks,
Jonathan


Re: [PATCHv3 0/5] Simple fixes to t7406

2018-08-18 Thread Elijah Newren
Hi,

On Mon, Aug 13, 2018 at 1:28 PM SZEDER Gábor  wrote:
> On Tue, Aug 7, 2018 at 6:49 PM Elijah Newren  wrote:
>
> > Since folks like to notice other problems with t7406 while reading my
> > patches, here's a challenge:
> >
> >   Find something *else* wrong with t7406 that neither I nor any of the
> >   reviewers so far have caught that could be fixed.
>
> Well, I'd hate to be that guy...  but since those who already
> commented on previous rounds are not explicitly excluded from the
> challenge, let's see.
>
> - There are still a few command substitutions running git commands,
>   where the exit status of that command is ignored; just look for the
>   '[^=]$(' pattern in the test script.
>
>   (Is not noticing those cases considered as "flubbing"?)

Hmm, borderline.

> - The 'compare_head' helper function defined in this test script looks
>   very similar to the generally available 'test_cmp_rev' function,
>   which has the benefit to provide some visible output on failure
>   (though, IMO, not a particularly useful output, because the diff of
>   two OIDs is not very informative, but at least it's something as
>   opposed to the silence of 'test $this = $that").
>
>   Now, since 'compare_head' always compares the same two revisions,
>   namely 'master' and HEAD, replacing 'compare_head' with an
>   appropriate 'test_cmp_rev' call would result in repeating 'master'
>   and 'HEAD' arguments all over the test script.  I'm not sure whether
>   that's good or bad.  Anyway, I think that 'compare_head' could be
>   turned into a wrapper around 'test_cmp_rev'.

Ooh, that does sound better.

> > - You get bonus points if that thing is in the context region for
> >   one of my five patches.
> > - Extra bonus points if the thing needing fixing was on a line I
> >   changed.
> > - You win outright if it's something big enough that I give up and
> >   request to just have my series merged as-is and punt your
> >   suggested fixes down the road to someone else.
>
> Well, there's always the indentation of the commands run in subshells,
> which doesn't conform to our coding style...
>
> Gah, now you made me that guy ;)

I read this on Monday and got a really good laugh.  I meant to fix it
up, but fell asleep too soon the first couple nights...and now this
series is in next anyway and there are a couple other git things that
have my attention.  You have pointed out a couple additional nice
fixups, like you always do, but I think at this point I'm just going
to declare you the winner and label these as #leftoverbits.

Thanks for always thoroughly looking over the testcase patches and
your constant work to improve the testsuite.


Re: [GSoC][PATCH v7 16/26] stash: replace spawning a "read-tree" process

2018-08-18 Thread Thomas Gummerer
On 08/08, Paul-Sebastian Ungureanu wrote:
> Instead of spawning a child process, make use of `reset_tree()`
> function already implemented in `stash-helper.c`.
> ---
>  builtin/stash--helper.c | 9 +++--
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c
> index a4e57899b..887b78d05 100644
> --- a/builtin/stash--helper.c
> +++ b/builtin/stash--helper.c
> @@ -984,21 +984,18 @@ static int stash_patch(struct stash_info *info, const 
> char **argv)
>  static int stash_working_tree(struct stash_info *info, const char **argv)
>  {
>   int ret = 0;
> - struct child_process cp0 = CHILD_PROCESS_INIT;
>   struct child_process cp1 = CHILD_PROCESS_INIT;
>   struct child_process cp2 = CHILD_PROCESS_INIT;
>   struct child_process cp3 = CHILD_PROCESS_INIT;
>   struct strbuf out1 = STRBUF_INIT;
>   struct strbuf out3 = STRBUF_INIT;
>  
> - cp0.git_cmd = 1;
> - argv_array_push(&cp0.args, "read-tree");
> - argv_array_pushf(&cp0.args, "--index-output=%s", stash_index_path.buf);
> - argv_array_pushl(&cp0.args, "-m", oid_to_hex(&info->i_tree), NULL);
> - if (run_command(&cp0)) {
> + set_alternate_index_output(stash_index_path.buf);
> + if (reset_tree(&info->i_tree, 0, 0)) {
>   ret = -1;
>   goto done;
>   }
> + set_alternate_index_output(".git/index");

I think this second 'set_alternate_index_output()' should be
'set_alternate_index_output(NULL)', which has slightly different
semantics than setting it to '.git/index'.  Having it set means that
the index is written unconditionally even if it is not set.

Also the index file could be something other than ".git/index", if
the GIT_INDEX_FILE environment variable is set, so it should be
replaced with 'get_index_file()' if we were to keep this.

I was also wondering if we could avoid writing a temporary index to
disk, in 'stash_working_tree', but I don't see an easy way for doing
that.

>  
>   cp1.git_cmd = 1;
>   argv_array_pushl(&cp1.args, "diff-index", "--name-only", "-z",
> -- 
> 2.18.0.573.g56500d98f
> 


Re: [PATCH v5 7/7] cache-tree: verify valid cache-tree in the test suite

2018-08-18 Thread Elijah Newren
On Sat, Aug 18, 2018 at 7:41 AM Nguyễn Thái Ngọc Duy  wrote:
...
> diff --git a/read-cache.c b/read-cache.c
> index 5ce40f39b3..41f313bc9e 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2744,6 +2744,9 @@ int write_locked_index(struct index_state *istate, 
> struct lock_file *lock,
> int new_shared_index, ret;
> struct split_index *si = istate->split_index;
>
> +   if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
> +   cache_tree_verify(istate);
> +
> if ((flags & SKIP_IF_UNCHANGED) && !istate->cache_changed) {
> if (flags & COMMIT_LOCK)
> rollback_lock_file(lock);
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 78f7097746..5b50f6e2e6 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1083,6 +1083,12 @@ else
> test_set_prereq C_LOCALE_OUTPUT
>  fi
>
> +if test -z "$GIT_TEST_CHECK_CACHE_TREE"
> +then
> +   GIT_TEST_CHECK_CACHE_TREE=true
> +   export GIT_TEST_CHECK_CACHE_TREE
> +fi
> +
>  test_lazy_prereq PIPE '
> # test whether the filesystem supports FIFOs
> test_have_prereq !MINGW,!CYGWIN &&
> diff --git a/unpack-trees.c b/unpack-trees.c
> index bc43922922..3394540842 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1578,6 +1578,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, 
> struct unpack_trees_options
> if (o->dst_index) {
> move_index_extensions(&o->result, o->src_index);
> if (!ret) {
> +   if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
> +   cache_tree_verify(&o->result);
> if (!o->result.cache_tree)
> o->result.cache_tree = cache_tree();
> if (!cache_tree_fully_valid(o->result.cache_tree))
> --
> 2.18.0.1004.g6639190530

Should documentation of GIT_TEST_CHECK_CACHE_TREE be added in
t/README, int the "Running tests with special setups" section?


Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Ævar Arnfjörð Bjarmason


On Sat, Aug 18 2018, Jonathan Nieder wrote:

> Hi,
>
> Ævar Arnfjörð Bjarmason wrote:
>> On Sat, Aug 18 2018, Jonathan Nieder wrote:
>>> Michael Muré wrote:
>
 I released today git-bug, a distributed bug tracker
> [...]
>>> I am a bit unhappy about the namespace grab.  Not for trademark
>>> reasons: the Git trademark rules are pretty clear about this kind of
>>> usage being okay.  Instead, the unhappiness comes because a future Git
>>> command like "git bug" to produce a bug report with appropriate
>>> diagnostics for a bug in Git seems like a likely and useful thing to
>>> get added to Git some day.  And now the name's taken.
>>>
>>> Is it too late to ask if it's possible to come up with a less generic
>>> name?
>>
>> Wouldn't we call such a thing "git-reportbug", or "git gitbug", with
>> reference to Debian reportbug or perl's perlbug?
>
> I hope you're kidding about "git gitbug".

It sounds a bit silly, but such a tool is going to be rarely used enough
that we probably don't want to squat a 3 letter command to invoke it.

> [...]
>> 1) Accept the status quo where people do create third party tools, much
>>of which are way too obscure to matter (e.g. I'm sure someone's
>>created a tool/alias called range-diff before, but we didn't
>>care).
>>
>>If those tools become popular enough in the wild they get own that
>>namespace, e.g. we're not going to ship a "git-annex" or "git-lfs"
>>ourselves implementing some unrelated features
>
> That's fair.  Let me spell out my thinking a little more.
>
> This framework would lead me to rephrase my question to Michael a
> different way.  Instead of saying that I'm not happy with the
> namespace grab, I should say something more severe:
>
>   Don't be surprised if Git itself makes a "git bug" command in the
>   future, and be prepared to rename.
>
> Is that preferable, in your opinion?

We're not going to make some blanket policy that doesn't recognize the
difference between say git-lfs and git-tool_nobody_has_ever_heard_of,
and then decide that it would be just as reasonable for us to ship a new
git-lfs ourselves (which would do something different) as it were for us
to ship git-tool_nobody_has_ever_heard_of.

The reason I can drop a "git-whatever" in my $PATH and invoke it as "git
whatever" is just a historical accident of how git was implemented.

But because that feature has been exposed since the very beginning it's
become an implicit API. There's thousands of git-whatever tools, and
people do use these. The likes of git-lfs and git-annex are used a *lot*
more than some builtins we ship.

So we don't get to say "you never asked us about git-annex, we're using
that name now" without considering how widely used it is. It's us who
decided to expose the API of seamlessly integrating 3rd party tools.


Re: [PATCH v5 0/7] Speed up unpack_trees()

2018-08-18 Thread Elijah Newren
On Sat, Aug 18, 2018 at 7:41 AM Nguyễn Thái Ngọc Duy  wrote:
>
> v5 fixes some minor comments from round 4 and a big mistake in 5/5.
> Junio's scary feeling turns out true. There is a missing invalidation
> in keep_entry() which is not added in 6/7. 7/7 makes sure that similar

I'm having trouble parsing this.  Did you mean "...which is now
added..."?  Also, if 6/7 represents a fix to the "big mistake in 5/5",
why is 6/7 separate from 5/7 instead of squashed in?

> problems will not slip through.
>
> I had to rebase this series on top of 'master' because 7/7 caught a
> bad cache-tree situation that has been fixed by Elijah in ad3762042a

Cool, glad that helped.

...
> Nguyễn Thái Ngọc Duy (7):
>   trace.h: support nested performance tracing
>   unpack-trees: add performance tracing
>   unpack-trees: optimize walking same trees with cache-tree
>   unpack-trees: reduce malloc in cache-tree walk
>   unpack-trees: reuse (still valid) cache-tree from src_index
>   unpack-trees: add missing cache invalidation
>   cache-tree: verify valid cache-tree in the test suite

I read through the new series and only had one small comment.  I'm not
up to speed on cache-tree stuff, still, so don't feel qualified to
give an Ack on it.


Re: [GSoC][PATCH v7 17/26] stash: avoid spawning a "diff-index" process

2018-08-18 Thread Thomas Gummerer
On 08/08, Paul-Sebastian Ungureanu wrote:
> This commits replaces spawning `diff-index` child process by using
> the already existing `diff` API

I think this should be squashed into the previous commit.  It's easier
to review a commit that replaces all the 'run_command'/'pipe_command'
calls in one function, rather than doing it call by call, especially
if they interact with eachother.

E.g. I was going to suggest replacing the 'write_tree' call as well,
but reading ahead in the series I see that that's already being done :)  

While replacing all the calls of one type with the internal API call
is probably easiest for writing the patches, at least I would find it
easier to review replacing the run-command API calls in one codepath
at a time.

> ---
>  builtin/stash--helper.c | 56 ++---
>  1 file changed, 42 insertions(+), 14 deletions(-)
> 
> diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c
> index 887b78d05..f905d3908 100644
> --- a/builtin/stash--helper.c
> +++ b/builtin/stash--helper.c
> @@ -12,6 +12,7 @@
>  #include "rerere.h"
>  #include "revision.h"
>  #include "log-tree.h"
> +#include "diffcore.h"
>  
>  static const char * const git_stash_helper_usage[] = {
>   N_("git stash--helper list []"),
> @@ -297,6 +298,18 @@ static int reset_head(const char *prefix)
>   return run_command(&cp);
>  }
>  
> +static void add_diff_to_buf(struct diff_queue_struct *q,
> + struct diff_options *options,
> + void *data)
> +{
> + int i;
> + for (i = 0; i < q->nr; i++) {
> + struct diff_filepair *p = q->queue[i];
> + strbuf_addstr(data, p->one->path);
> + strbuf_addch(data, '\n');

What about filenames that include a '\n'?  I think this in combination
with removing the '-z' flag from the 'update-index' call will break
with filenames that have a LF in them.  This should be a '\0' instead
of a '\n', and we should still be using the '-z' flag in
'update-index'.

> + }
> +}
> +
>  static int get_newly_staged(struct strbuf *out, struct object_id *c_tree)
>  {
>   struct child_process cp = CHILD_PROCESS_INIT;
> @@ -981,14 +994,16 @@ static int stash_patch(struct stash_info *info, const 
> char **argv)
>   return ret;
>  }
>  
> -static int stash_working_tree(struct stash_info *info, const char **argv)
> +static int stash_working_tree(struct stash_info *info,
> +   const char **argv, const char *prefix)
>  {
>   int ret = 0;
> - struct child_process cp1 = CHILD_PROCESS_INIT;
>   struct child_process cp2 = CHILD_PROCESS_INIT;
>   struct child_process cp3 = CHILD_PROCESS_INIT;
> - struct strbuf out1 = STRBUF_INIT;
>   struct strbuf out3 = STRBUF_INIT;

We're left with cp{2,3} and out3 here, which is a bit weird.  Then
again renaming them in this patch adds more churn, making it harder to
review.  Maybe instead of numbering them it would be better to name
them after the child process they are calling?  e.g. 'cp1' could
become 'cp_di', and so on?

> + struct argv_array args = ARGV_ARRAY_INIT;
> + struct strbuf diff_output = STRBUF_INIT;
> + struct rev_info rev;
>  
>   set_alternate_index_output(stash_index_path.buf);
>   if (reset_tree(&info->i_tree, 0, 0)) {
> @@ -997,26 +1012,36 @@ static int stash_working_tree(struct stash_info *info, 
> const char **argv)
>   }
>   set_alternate_index_output(".git/index");
>  
> - cp1.git_cmd = 1;
> - argv_array_pushl(&cp1.args, "diff-index", "--name-only", "-z",
> - "HEAD", "--", NULL);
> + argv_array_push(&args, "dummy");

Not being familiar with the setup_revisions code, I had to dig a bit
to figure out why this makes sense.  This is a dummy replacement for
argv[0] in normal operation.  In retrospect it's kind of obvious, but
maybe call "dummy" "fake_argv0" instead, to help nudge future readers
in the right direction?

>   if (argv)
> - argv_array_pushv(&cp1.args, argv);
> - argv_array_pushf(&cp1.env_array, "GIT_INDEX_FILE=%s",
> -  stash_index_path.buf);
> + argv_array_pushv(&args, argv);
> + git_config(git_diff_basic_config, NULL);
> + init_revisions(&rev, prefix);
> + args.argc = setup_revisions(args.argc, args.argv, &rev, NULL);
> +
> + rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
> + rev.diffopt.format_callback = add_diff_to_buf;
> + rev.diffopt.format_callback_data = &diff_output;
> +
> + if (read_cache_preload(&rev.diffopt.pathspec) < 0) {
> + ret = -1;
> + goto done;
> + }
>  
> - if (pipe_command(&cp1, NULL, 0, &out1, 0, NULL, 0)) {
> + add_pending_object(&rev, parse_object(the_repository, &info->b_commit), 
> "");
> + if (run_diff_index(&rev, 0)) {
>   ret = -1;
>   goto done;
>   }
>  
>   cp2.git_cmd = 1;
> - argv_array_pushl(&cp2.args, "update-index"

Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Jonathan Nieder
Ævar Arnfjörð Bjarmason wrote:

> The reason I can drop a "git-whatever" in my $PATH and invoke it as "git
> whatever" is just a historical accident of how git was implemented.

No.  This is a very deliberate design decision, to allow people to
prototype new Git commands (and to create the kind of ecosystem that
allows commands to be implemented outside Git.

[...]
> So we don't get to say "you never asked us about git-annex, we're using
> that name now" without considering how widely used it is. It's us who
> decided to expose the API of seamlessly integrating 3rd party tools.

I think we're talking past each other.  I haven't proposed any blanket
policy.  I'm saying that "git bug" is a bad name for this tool:

 - it's hard to find with search engines
 - it conflicts with some likely good future changes to Git
 - it assumes that no one else will have some other refinement of the
   Git bugtracker concept, that it is the only "git bug" tool

It's a namespace grab.  There's nothing stopping someone from naming a
command "bug", either, but that doesn't make it a good idea.  (I'm not
saying that was the intent --- that's just the effect.)

Meanwhile it looks like a neat tool, and I'm very supportive of the
idea.  But you certainly still have not convinced me that the name is
a good idea, or that I shouldn't be bringing this up.

I'm not sure *what* you're trying to convince me of, actually.

Jonathan


Re: [GSoC][PATCH v7 21/26] stash: replace spawning `git ls-files` child process

2018-08-18 Thread Thomas Gummerer
On 08/08, Paul-Sebastian Ungureanu wrote:
> This commit replaces spawning `git ls-files` child process with
> API calls to get the untracked files.
> ---
>  builtin/stash--helper.c | 49 +++--
>  1 file changed, 32 insertions(+), 17 deletions(-)
> 
> diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c
> index 4fd79532c..5c27f5dcf 100644
> --- a/builtin/stash--helper.c
> +++ b/builtin/stash--helper.c
> @@ -813,27 +813,42 @@ static int store_stash(int argc, const char **argv, 
> const char *prefix)
>  /*
>   * `out` will be filled with the names of untracked files. The return value 
> is:
>   *
> - * < 0 if there was a bug (any arg given outside the repo will be detected
> - * by `setup_revision()`)
>   * = 0 if there are not any untracked files
>   * > 0 if there are untracked files
>   */
> -static int get_untracked_files(const char **argv, int line_term,
> +static int get_untracked_files(const char **argv, const char *prefix,
>  int include_untracked, struct strbuf *out)
>  {
> - struct child_process cp = CHILD_PROCESS_INIT;
> - cp.git_cmd = 1;
> - argv_array_pushl(&cp.args, "ls-files", "-o", NULL);
> - if (line_term)
> - argv_array_push(&cp.args, "-z");
> + int max_len;
> + int i;
> + char *seen;
> + struct dir_struct dir;
> + struct pathspec pathspec;
> +
> + memset(&dir, 0, sizeof(dir));
>   if (include_untracked != 2)
> - argv_array_push(&cp.args, "--exclude-standard");
> - argv_array_push(&cp.args, "--");
> - if (argv)
> - argv_array_pushv(&cp.args, argv);
> + setup_standard_excludes(&dir);
>  
> - if (pipe_command(&cp, NULL, 0, out, 0, NULL, 0))
> - return -1;
> + parse_pathspec(&pathspec, 0,
> +PATHSPEC_PREFER_FULL,
> +prefix, argv);
> + seen = xcalloc(pathspec.nr, 1);
> +
> + max_len = fill_directory(&dir, the_repository->index, &pathspec);
> + for (i = 0; i < dir.nr; i++) {
> + struct dir_entry *ent = dir.entries[i];
> + if (!dir_path_match(ent, &pathspec, max_len, seen)) {
> + free(ent);
> + continue;
> + }
> + strbuf_addf(out, "%s\n", ent->name);

As mentioned in my comments about the 'diff-index' replacement, this
'\n' should probably be '\0', and we should keep the '-z' flag in 'git
update-index', in case somebody has a '\n' in their filenames.

While creating such a file is probably not a good idea anyway, we
should still be able to handle it (and have been before this, so we
shouldn't break it).

> + free(ent);
> + }
> +
> + free(dir.entries);
> + free(dir.ignored);
> + clear_directory(&dir);
> + free(seen);
>   return out->len;
>  }
>  
> @@ -888,7 +903,7 @@ static int check_changes(const char **argv, int 
> include_untracked,
>   goto done;
>   }
>  
> - if (include_untracked && get_untracked_files(argv, 0,
> + if (include_untracked && get_untracked_files(argv, prefix,
>include_untracked, &out))
>   ret = 1;
>  
> @@ -908,7 +923,7 @@ static int save_untracked_files(struct stash_info *info, 
> struct strbuf *msg,
>   struct child_process cp2 = CHILD_PROCESS_INIT;
>  
>   cp.git_cmd = 1;
> - argv_array_pushl(&cp.args, "update-index", "-z", "--add",
> + argv_array_pushl(&cp.args, "update-index", "--add",
>"--remove", "--stdin", NULL);
>   argv_array_pushf(&cp.env_array, "GIT_INDEX_FILE=%s",
>stash_index_path.buf);
> @@ -1134,7 +1149,7 @@ static int do_create_stash(int argc, const char **argv, 
> const char *prefix,
>   goto done;
>   }
>  
> - if (include_untracked && get_untracked_files(argv, 1,
> + if (include_untracked && get_untracked_files(argv, prefix,
>include_untracked, &out)) {
>   if (save_untracked_files(info, &msg, &out)) {
>   if (!quiet)
> -- 
> 2.18.0.573.g56500d98f
> 


Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Ævar Arnfjörð Bjarmason


On Sat, Aug 18 2018, Jonathan Nieder wrote:

> Ævar Arnfjörð Bjarmason wrote:
>
>> The reason I can drop a "git-whatever" in my $PATH and invoke it as "git
>> whatever" is just a historical accident of how git was implemented.
>
> No.  This is a very deliberate design decision, to allow people to
> prototype new Git commands (and to create the kind of ecosystem that
> allows commands to be implemented outside Git.
>
> [...]
>> So we don't get to say "you never asked us about git-annex, we're using
>> that name now" without considering how widely used it is. It's us who
>> decided to expose the API of seamlessly integrating 3rd party tools.
>
> I think we're talking past each other.  I haven't proposed any blanket
> policy.  I'm saying that "git bug" is a bad name for this tool:
>
>  - it's hard to find with search engines
>  - it conflicts with some likely good future changes to Git
>  - it assumes that no one else will have some other refinement of the
>Git bugtracker concept, that it is the only "git bug" tool
>
> It's a namespace grab.  There's nothing stopping someone from naming a
> command "bug", either, but that doesn't make it a good idea.  (I'm not
> saying that was the intent --- that's just the effect.)
>
> Meanwhile it looks like a neat tool, and I'm very supportive of the
> idea.  But you certainly still have not convinced me that the name is
> a good idea, or that I shouldn't be bringing this up.
>
> I'm not sure *what* you're trying to convince me of, actually.

I'm not saying the git-bug name is a good idea, or that it isn't. I
don't care about this particular case when it comes to naming.

I'm just pointing out in the more general case that if someone comes up
with a badly named git-xyz it doesn't scale to try to point this out to
them before git-xyz is widely deployed.

So we must either let it go (solution #1), or come up with some
API-level solution that makes it a non-issue (my #3).


Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Jonathan Nieder
Ævar Arnfjörð Bjarmason wrote:

> I'm just pointing out in the more general case that if someone comes up
> with a badly named git-xyz it doesn't scale to try to point this out to
> them before git-xyz is widely deployed.
>
> So we must either let it go (solution #1), or come up with some
> API-level solution that makes it a non-issue (my #3).

How about solution #4: live and let live when it comes down to it, but
act like actual people and talk to avoid negative consequences?

Some social problems don't have technical solutions.

Talking scales, in strange ways.  For example, people are able to look
at the list archive, people are able to spread thoughts they have
found interesting, and so on.

Jonathan


Re: [GSoC][PATCH v7 24/26] stash: optimize `get_untracked_files()` and `check_changes()`

2018-08-18 Thread Thomas Gummerer
On 08/08, Paul-Sebastian Ungureanu wrote:
> This commits introduces a optimization by avoiding calling the
> same functions again. For example, `git stash push -u`
> would call at some points the following functions:
> 
>  * `check_changes()`
>  * `do_create_stash()`, which calls: `check_changes()` and
> `get_untracked_files()`
> 
> Note that `check_changes()` also calls `get_untracked_files()`.
> So, `check_changes()` is called 2 times and `get_untracked_files()`
> 3 times. By checking at the beginning of the function if we already
> performed a check, we can avoid making useless calls.

While I can see that this may give us some performance gains, what's
being described above sounds like we should look into why we are
making these duplicate calls in the first place, rather than trying to
return early from them.  I feel like the duplicate calls are mostly a
remnant from the way the shell script was written, but not inherent to
the design of 'git stash'. 

For example 'check_changes' could be called from 'create_stash'
directly, so we don't have to call it in 'do_create_stash', in the
process removing the duplicate call from the 'git stash push'
codepath.  That would provide the same improvements, and keep the code
cleaner, rather than introducing more special cases for these
functions.

I haven't looked into the 'get_untracked_files()' call chain yet, but
I imagine we can do something similar for that.


Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Jonathan Nieder
(cc-ing Elijah Newren for the points about merging)
Hi again,

To avoid the other thread shadowing more important things:

Michael Muré wrote:

> Someone suggested in the Hacker News thread [0] to post it here as well.

Thanks to Ævar for that.

[...]
> git-bug use as identifier the hash of the first commit in the chain
> of commit of the bug.

Clever!  I like this approach to the naming problem.

[...]
> Git doesn't provide a low-level command to rebase a branch onto
> another without touching the index.

Thanks for pointing this out.  There's been some recent work to make
Git's merge code (also used for cherry-pick) less reliant on the index
and worktree.  See https://crbug.com/git/12 for some references.
There's also been some heavy refactoring of "git rebase" code to be in
C and be able to make use of library functions instead of being a
shell script.

That's all to say that we're in a pretty good place to consider
introducing commands like

  git cherry-pick --onto= 

In absence of that kind of thing, you can run commands that need to
touch the index (but not the working tree) by setting the GIT_INDEX
environment variable to point to a temporary index file.

> I'd love to have some feedback from you. Contribution are also very
> much welcomed.

Can you say more about the federation model it intends to support?
For example, do you imagine

- having multiple copies of a git bugs repo that automatically fetch
  updates from each other

- having explicit "pull request" synchronization moments when the
  owners of one copy of a bug tracker push or request a fetch of
  changes that have been happening on another

- individual contributors using an offline copy of the bug tracker
  and pushing push/pull mostly to synchronize with a single
  centralized copy

- something else?

Thanks,
Jonathan


Re: Re* [PATCH v7 1/1] sideband: highlight keywords in remote sideband output

2018-08-18 Thread Jeff King
On Sat, Aug 18, 2018 at 09:16:28AM -0700, Junio C Hamano wrote:

> -- >8 --
> Subject: [PATCH] sideband: do not read beyond the end of input
> 
> The caller of maybe_colorize_sideband() gives a counted buffer
> , but the callee checked src[] as if it were a NUL terminated
> buffer.  If src[] had all isspace() bytes in it, we would have made
> n negative, and then
> 
>  (1) made number of strncasecmp() calls to see if the remaining
>  bytes in src[] matched keywords, reading beyond the end of the
>  array (this actually happens even if n does not go negative),
>  and/or
> 
>  (2) called strbuf_add() with negative count, most likely triggering
>  the "you want to use way too much memory" error due to unsigned
>  integer overflow.
> 
> Fix both issues by making sure we do not go beyond &src[n].

Thanks. I've been sporadically seeing "fatal: you want to use way too
much memory" the last few days while running 'next', and finally managed
to catch a reproducible case. This patch definitely fixes it.

> In the longer term we may want to accept size_t as parameter for
> clarity (even though we know that a sideband message we are painting
> typically would fit on a line on a terminal and int is sufficient).
> Write it down as a NEEDSWORK comment.

This "typically" made me nervous about what happens in the non-typical
case, but I think we can say something even stronger: the length comes
from a pktline, so the maximum is less than 16 bits. I wondered if we
might ever call this on the accumulated string from multiple sidebands,
but it doesn't look like it.

-Peff


Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Michael Muré
Here was my reasoning for the naming choice:

- I need something meaningful
- I need something that encompass the idea and features of a bug
tracker because the narrower ideas and actions will be in sub commands
- other projects already used other words, in particular "issue"
- it kind of sounds and looks good

You say that it's a namespace grab and I understand that, but in the
other hand, there is not that much freedom when choosing a name. Sorry
if I'm stepping on someone's toe :-|

2018-08-19 0:50 GMT+02:00 Jonathan Nieder :
> (cc-ing Elijah Newren for the points about merging)
> Hi again,
>
> To avoid the other thread shadowing more important things:
>
> Michael Muré wrote:
>
>> Someone suggested in the Hacker News thread [0] to post it here as well.
>
> Thanks to Ævar for that.
>
> [...]
>> git-bug use as identifier the hash of the first commit in the chain
>> of commit of the bug.
>
> Clever!  I like this approach to the naming problem.
>
> [...]
>> Git doesn't provide a low-level command to rebase a branch onto
>> another without touching the index.
>
> Thanks for pointing this out.  There's been some recent work to make
> Git's merge code (also used for cherry-pick) less reliant on the index
> and worktree.  See https://crbug.com/git/12 for some references.
> There's also been some heavy refactoring of "git rebase" code to be in
> C and be able to make use of library functions instead of being a
> shell script.
>
> That's all to say that we're in a pretty good place to consider
> introducing commands like
>
>   git cherry-pick --onto= 
>
> In absence of that kind of thing, you can run commands that need to
> touch the index (but not the working tree) by setting the GIT_INDEX
> environment variable to point to a temporary index file.
>
>> I'd love to have some feedback from you. Contribution are also very
>> much welcomed.
>
> Can you say more about the federation model it intends to support?
> For example, do you imagine
>
> - having multiple copies of a git bugs repo that automatically fetch
>   updates from each other
>
> - having explicit "pull request" synchronization moments when the
>   owners of one copy of a bug tracker push or request a fetch of
>   changes that have been happening on another
>
> - individual contributors using an offline copy of the bug tracker
>   and pushing push/pull mostly to synchronize with a single
>   centralized copy
>
> - something else?
>
> Thanks,
> Jonathan



-- 
Michael


Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Michael Muré
> There's been some recent work to make
> Git's merge code (also used for cherry-pick) less reliant on the index
> and worktree.

Yes please ! In the mean time, someone suggested another trick [0].

> Can you say more about the federation model it intends to support?

My goal is to have a workflow similar as what git does, to be
versatile and leave to the users the choice of the topology they want
to use. Obviously, it will be most of the time a single remote where
they collaborate.

As a bug tracker is a different workflow than regular code, there will
be some tooling to help. For instance, automatic push/pull will help
make it easier to use and more "out of the way".

In the data model, each commit in the linear chain link to an array of
new edit operations. That means that each commit is strictly
independent from the others. When you get updates for a bug and you
need to merge them, you will simply rebase your new commits on top of
the linear chain.

This design has several properties:

- the merge happen on the user repo where git-bug is installed and can
validate new data
- the remote used doesn't have to be aware of git-bug
- when pushing an update of a ref, the remote will make sure that it's
fast forward, that is, no previous edit operations has been removed.
It ensure that the history is append only.

So for now, collaboration is based on push/pull to whatever remote you
want, as git does, with the exception of the Web UI. The goal here is
to have it running locally for each user but also to make it a public
interface for users that don't have write access to the repo, much
like any bug tracker has.

In the future, it could be possible to have more fancy features like a
federated forge with ActivityPub, but that's way outside of the scope
of the project for now.

[0]: https://github.com/MichaelMure/git-bug/issues/15

2018-08-19 2:45 GMT+02:00 Michael Muré :
> Here was my reasoning for the naming choice:
>
> - I need something meaningful
> - I need something that encompass the idea and features of a bug
> tracker because the narrower ideas and actions will be in sub commands
> - other projects already used other words, in particular "issue"
> - it kind of sounds and looks good
>
> You say that it's a namespace grab and I understand that, but in the
> other hand, there is not that much freedom when choosing a name. Sorry
> if I'm stepping on someone's toe :-|
>
> 2018-08-19 0:50 GMT+02:00 Jonathan Nieder :
>> (cc-ing Elijah Newren for the points about merging)
>> Hi again,
>>
>> To avoid the other thread shadowing more important things:
>>
>> Michael Muré wrote:
>>
>>> Someone suggested in the Hacker News thread [0] to post it here as well.
>>
>> Thanks to Ævar for that.
>>
>> [...]
>>> git-bug use as identifier the hash of the first commit in the chain
>>> of commit of the bug.
>>
>> Clever!  I like this approach to the naming problem.
>>
>> [...]
>>> Git doesn't provide a low-level command to rebase a branch onto
>>> another without touching the index.
>>
>> Thanks for pointing this out.  There's been some recent work to make
>> Git's merge code (also used for cherry-pick) less reliant on the index
>> and worktree.  See https://crbug.com/git/12 for some references.
>> There's also been some heavy refactoring of "git rebase" code to be in
>> C and be able to make use of library functions instead of being a
>> shell script.
>>
>> That's all to say that we're in a pretty good place to consider
>> introducing commands like
>>
>>   git cherry-pick --onto= 
>>
>> In absence of that kind of thing, you can run commands that need to
>> touch the index (but not the working tree) by setting the GIT_INDEX
>> environment variable to point to a temporary index file.
>>
>>> I'd love to have some feedback from you. Contribution are also very
>>> much welcomed.
>>
>> Can you say more about the federation model it intends to support?
>> For example, do you imagine
>>
>> - having multiple copies of a git bugs repo that automatically fetch
>>   updates from each other
>>
>> - having explicit "pull request" synchronization moments when the
>>   owners of one copy of a bug tracker push or request a fetch of
>>   changes that have been happening on another
>>
>> - individual contributors using an offline copy of the bug tracker
>>   and pushing push/pull mostly to synchronize with a single
>>   centralized copy
>>
>> - something else?
>>
>> Thanks,
>> Jonathan
>
>
>
> --
> Michael



-- 
Michael


Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Jonathan Nieder
(cc-ing Scott)
Hi Junio,

Junio C Hamano wrote:
> Michael Muré  writes:

>> I released today git-bug, a distributed bug tracker that embeds in
>> git. It use git's internal storage to store bugs information in a way
>> that can be merged without conflict. You can push/pull to the normal
>> git remote you are already using to interact with other people. Normal
>> code and bugs are completely separated and no files are added in the
>> regular branches.
>
> This reminds me of a demo Scott Chacon showed us ages ago, the name
> of which escapes me.  I guess great minds think alike, or something?

I believe you're thinking of TicGit[1].

Some other related work is listed at [2].  Most of these projects have
gone quiet:

- ditz[3]
- git-issues[4]
- cil[5]
- Bugs Everywhere[6]
- milli by Steve Kemp, which I haven't found a copy of
- simple defects[7]
- kipling[8]

http://www.cs.unb.ca/~bremner/blog/posts/git-issue-trackers/ gives a
nice overview, though it's rather old.

This area seems to have gone mostly quiet since 2014, so it's nice to
see new work.

Thanks,
Jonathan

[1] https://github.com/jeffWelling/ticgit
[2] 
https://git.wiki.kernel.org/index.php/InterfacesFrontendsAndTools#Bug.2Fissue_trackers.2C_etc.
[3] https://github.com/jashmenn/ditz
[4] https://github.com/duplys/git-issues
[5] https://github.com/chilts/cil
[6] http://bugseverywhere.org/
[7] https://syncwith.us/sd/, https://gitorious.org/prophet/sd
[8] https://gitorious.org/kipling/mainline


Re: [PATCH v3] checkout: optimize "git checkout -b "

2018-08-18 Thread Elijah Newren
On Fri, Aug 17, 2018 at 5:41 AM Ben Peart  wrote:
> On 8/16/2018 2:37 PM, Duy Nguyen wrote:
> > On Thu, Aug 16, 2018 at 8:27 PM Ben Peart  wrote:
> >>
> >> From: Ben Peart 
> >>
> >> Skip merging the commit, updating the index and working directory if and
> >> only if we are creating a new branch via "git checkout -b ."
> >> Any other checkout options will still go through the former code path.
> >>
> >> If sparse_checkout is on, require the user to manually opt in to this
> >> optimzed behavior by setting the config setting checkout.optimizeNewBranch
> >> to true as we will no longer update the skip-worktree bit in the index, nor
> >> add/remove files in the working directory to reflect the current sparse
> >> checkout settings.
> >>
> >> For comparison, running "git checkout -b " on a large repo 
> >> takes:
> >>
> >> 14.6 seconds - without this patch
> >> 0.3 seconds - with this patch
> >
> > I still don't think we should do this. If you want lightning fast
> > branch creation, just use 'git branch'. From the timing breakdown you
> > shown in the other thread it looks like sparse checkout still takes
> > seconds, which could be optimized (or even excluded, I mentioned this
> > too). And split index (or something similar if you can't use it) would
> > give you saving across the board. There is still one idea Elijah gave
> > me that should further lower traverse_trees()  cost.
> >
>
> We have investigated some of these already - split index ended up
> slowing things down more than it sped them up do to the higher compute
> costs.  Sparse checkout we've already optimized significantly - limiting
> the patterns we accept so that we can do the lookup via a hashmap
> instead of the robust pattern matching.  We will continue to look for
> other optimizations and appreciate any and all ideas!
>
> In the end, this optimization makes a huge performance improvement by
> avoiding doing a lot of work that isn't necessary.  Taking a command
> from 14+ seconds to sub-second is just too much of a win for us to ignore.
>
> > But anyway, it's not my call. I'll stop here.

It's even less of my call, but since things seem to be stuck in
what-should-we-do state (as per Junio's comments on this patch in the
last two "What's cooking" emails), and since Ben and Duy obviously
have opposite opinions on Ben's patch, let's see if I might be able to
help at all.  Here's my summary and my findings:

== The pain ==
- For repositories with a really large number of entries (500K as Ben
says), some operations are much slower than it feels like they should
be.
- This does not seem to be GFVS-specific in any way, I can duplicate
slowness with a simple git-bomb[1]-like repo that has a sparse
checkout pattern ignoring the "bomb" side.  (It has 1M+1 entries in
the index, and .git/info/sparse-checkout ignores the 1M so the working
copy only has 1 entry).  The timings on my repo for "git checkout -b
$NEWBRANCH" are almost exactly double what Ben reports he gets on
their repo.

[1] https://kate.io/blog/making-your-own-exploding-git-repos/

== Short term solutions ==
- Alternative git commands exist today to do a fast checkout of a new
branch in a huge repo.  I also get sub-second timings in my
even-bigger repo with this:
   git branch $NEWBRANCH && git symbolic-ref HEAD $NEWBRANCH
But I do understand that wrapping this into a script or executable
(git-fast-new-branch?) and asking users to use it is a usability
problem and an uphill battle.  (Sidenote: this isn't quite the same
operation; it's missing a reflog update.  The -m option to
symbolic-ref doesn't seem to help; I guess the fact that HEAD's
sha1sum is not changing is viewed as not an update?  However, that
could also be scripted around.)
- Ben's patch successfully drops the time for "git checkout -b
$NEWBRANCH" from 26+ seconds (in my cooked-up testcase) to sub-second
(in fact, under .1 seconds for me).  That's a _huge_ win.

== unpack_trees optimization notes ==
- Ben's patch is extremely focused.  It only affects "git checkout -b
$NEWBRANCH".  If someone runs "git branch $NEWBRANCH && git checkout
$NEWBRANCH", they get the old 26+ second timing.  They also suddenly
get the really long timing if they add any other flags or checkout a
commit that differs in only a single entry in the entire tree.  It
would be nice if we did general optimization for all issues rather
than just special casing such narrow cases.
- However, optimizing unpack_trees is hard.  It's really easy to get
lost trying to look at the code.  Time has been spent trying to
optimizing it.  Ben really likes the speedup factors of 2-3 that Duy
has produced.  But he's pessimistic we'll find enough to bridge the
gap for this case.  And he's worried about breaking unrelated stuff
due to the complexity of unpack_trees.
- Duy is pretty sure we can optimize unpack_trees in at least one more
way.  I've tried looking through the code and think there are others,
but then again I'm known to get lost and confused in unpack_trees.

== 

Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Elijah Newren
On Sat, Aug 18, 2018 at 3:50 PM Jonathan Nieder  wrote:
> (cc-ing Elijah Newren for the points about merging)

> [...]
> > Git doesn't provide a low-level command to rebase a branch onto
> > another without touching the index.
>
> Thanks for pointing this out.  There's been some recent work to make
> Git's merge code (also used for cherry-pick) less reliant on the index
> and worktree.  See https://crbug.com/git/12 for some references.
> There's also been some heavy refactoring of "git rebase" code to be in
> C and be able to make use of library functions instead of being a
> shell script.
>
> That's all to say that we're in a pretty good place to consider
> introducing commands like
>
>   git cherry-pick --onto= 

Yes, indeed, after the merge refactoring/rewriting stuff is complete,
this is one thing already on my list that I wanted to do with it.
Another thing I'd like to investigate with it is how much "in-memory"
merges could speed up interactive rebases, as suggested by Dscho[1].
Once we do "in-memory" merges for interactive rebases for performance
reasons, we're pretty close to having a
rebase-without-touching-index-or-worktree that we can make accessible
to other scripts like git-bug.  However, we have to have a pretty good
answer about what to do when we hit conflicts.

[1] 
https://public-inbox.org/git/nycvar.qro.7.76.6.180616000...@tvgsbejvaqbjf.bet/


Re: Git Project Leadership Committee

2018-08-18 Thread Elijah Newren
On Thu, Aug 16, 2018 at 3:43 PM Jeff King  wrote:
>  - we should avoid anyone who is affiliated with a company that already
>has a member on the committee. So nobody from Google and nobody from
>GitHub. I would extend that to Microsoft, too, given a certain
>impending acquisition. I'd expect anybody who is affiliated with a
>company to recuse themselves from decisions that directly affect that
>company (which is what we've done so far).

That might make it hard for some of us to nominate others, since as
far as I can tell (e.g. looking at shortlog -sne output) few git
contributors use work email addresses to do so and we don't
necessarily know employers of other contributors.

> So here are the nominations I came up with. If you'd like to nominate
> somebody else (or yourself!), please do. If you have opinions, let me
> know (public or private, as you prefer).
>
>  - Christian Couder
>  - Ævar Arnfjörð Bjarmason

I think either of them would be excellent choices.


Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Kyle Meyer
[+cc Stefan Monnier]

Jonathan Nieder  writes:

> (cc-ing Scott)

[...]

> I believe you're thinking of TicGit[1].
>
> Some other related work is listed at [2].  Most of these projects have
> gone quiet:
>
> - ditz[3]
> - git-issues[4]
> - cil[5]
> - Bugs Everywhere[6]
> - milli by Steve Kemp, which I haven't found a copy of
> - simple defects[7]
> - kipling[8]

To add to that list: There's also BuGit [1,2], though it too seems to
have gone quiet.

[1]: https://gitlab.com/monnier/bugit
[2]: 
https://public-inbox.org/git/jwva8psr6vr.fsf-monnier+gmane.comp.version-control@gnu.org/


-- 
Kyle


Re: Git Project Leadership Committee

2018-08-18 Thread Jeff King
On Sat, Aug 18, 2018 at 07:32:38PM -0700, Elijah Newren wrote:

> On Thu, Aug 16, 2018 at 3:43 PM Jeff King  wrote:
> >  - we should avoid anyone who is affiliated with a company that already
> >has a member on the committee. So nobody from Google and nobody from
> >GitHub. I would extend that to Microsoft, too, given a certain
> >impending acquisition. I'd expect anybody who is affiliated with a
> >company to recuse themselves from decisions that directly affect that
> >company (which is what we've done so far).
> 
> That might make it hard for some of us to nominate others, since as
> far as I can tell (e.g. looking at shortlog -sne output) few git
> contributors use work email addresses to do so and we don't
> necessarily know employers of other contributors.

I would say to err on the side of nominating, and the candidate can
provide the information.

-Peff


Re: git-bug: Distributed bug tracker embedded in git

2018-08-18 Thread Jonathan Nieder
(+ git-dit authors)
Kyle Meyer wrote:
> Jonathan Nieder  writes:

>> I believe you're thinking of TicGit[1].
>>
>> Some other related work is listed at [2].  Most of these projects have
>> gone quiet:
>>
>> - ditz[3]
>> - git-issues[4]
>> - cil[5]
>> - Bugs Everywhere[6]
>> - milli by Steve Kemp, which I haven't found a copy of
>> - simple defects[7]
>> - kipling[8]
>
> To add to that list: There's also BuGit [1,2], though it too seems to
> have gone quiet.

I just found a good list on a non Git-specific wiki[3] that is more
helpful (more up to date and has more discussion) than the list on the
Git wiki.

It might be a good place to coordinate and compare notes.  git-dit[4]
in particular seems to have very similar goals and a similar data
model.  In my ideal world there may be a path forward that involves
working more closely together.

Thanks,
Jonathan

> [1]: https://gitlab.com/monnier/bugit
> [2]: 
> https://public-inbox.org/git/jwva8psr6vr.fsf-monnier+gmane.comp.version-control@gnu.org/
[3] https://dist-bugs.branchable.com/software/
[4] https://github.com/neithernut/git-dit


Re: [PATCH v5 0/7] Speed up unpack_trees()

2018-08-18 Thread Duy Nguyen
On Sun, Aug 19, 2018 at 12:01 AM Elijah Newren  wrote:
>
> On Sat, Aug 18, 2018 at 7:41 AM Nguyễn Thái Ngọc Duy  
> wrote:
> >
> > v5 fixes some minor comments from round 4 and a big mistake in 5/5.
> > Junio's scary feeling turns out true. There is a missing invalidation
> > in keep_entry() which is not added in 6/7. 7/7 makes sure that similar
>
> I'm having trouble parsing this.  Did you mean "...which is now
> added..."?

Oops. Yes.

>  Also, if 6/7 represents a fix to the "big mistake in 5/5",
> why is 6/7 separate from 5/7 instead of squashed in?

I felt that was cramming up too much in the commit message. But if
it's the right thing to do, I'll reroll and combine 5/7 and 6/7 .
-- 
Duy