Re: [PATCH 3/3] send-email: also pick up cc addresses from -by trailers
On 2018-10-16 07:57, Junio C Hamano wrote:
> Rasmus Villemoes writes:
>
>>> I wonder if it would make it easier to grok if we made the logic
>>> inside out, i.e.
>>>
>>>	if ($sc eq $sender) {
>>>		...
>>>	} else {
>>>		if ($what =~ /^Signed-off-by$/i) {
>>>			next if $suppress_cc{'sob'};
>>>		} elsif ($what =~ /-by$/i) {
>>>			next if $suppress_cc{'misc'};
>>>		} elsif ($what =~ /^Cc$/i) {
>>>			next if $suppress_cc{'bodycc'};
>>>		}
>>>	}
>>
>> ...yes, that's probably more readable.
>
> OK, unless there are more comments and suggestions for improvements,
> can you send in a final version sometime in the not so distant future
> so that we won't forget?

Will do, I was just waiting for more comments to come in.

> It may be surprising to existing users that the command now suddenly
> adds more addresses the user did not think would be added, but it
> would probably be easy enough for them to work around.

Yeah, I thought about that, but unfortunately the whole auto-cc
business is not built around config options where we could simply add
a new one that defaults to false. Also note that there are cases
currently where the user sends off a patch series and is surprised
that lots of intended recipients were not cc'ed (that's how I picked
this subject up again: I had a long series where I had put specific
Cc's in each patch; at v2, some of those people had given a
Reviewed-by, so I changed the tags, and a --dry-run told me they
wouldn't get the new version).

I suppose one could make the use of -by addresses dependent on a new
opt-in config option, but that's not very elegant. Another option is,
for a release or two, to make a little (more) noise when picking up a
-by address - something like setting a flag in the ($what =~ /-by/)
branch, and testing that flag somewhere in send_message(). But I
suppose the message printed when needs_confirm eq "inform" is generic
enough.

Rasmus
[PATCH v2 3/3] send-email: also pick up cc addresses from -by trailers
When rerolling a patch series, including various Reviewed-by etc. that
may have come in, it is quite convenient to have git-send-email
automatically cc those people. So pick up any *-by lines, with a new
suppression category 'misc-by', but special-case Signed-off-by, since
that already has its own suppression category. It seems natural to
make 'misc-by' implied by 'body'.

Based-on-patch-by: Joe Perches
Signed-off-by: Rasmus Villemoes
---
 Documentation/git-send-email.txt |  5 ++++-
 git-send-email.perl              | 19 ++++++++++++-------
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index ea6ea512fe..f6010ac68b 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -329,8 +329,11 @@ Automating
   patch body (commit message) except for self (use 'self' for that).
 - 'sob' will avoid including anyone mentioned in Signed-off-by lines except
   for self (use 'self' for that).
+- 'misc-by' will avoid including anyone mentioned in Acked-by,
+  Reviewed-by, Tested-by and other "-by" lines in the patch body,
+  except Signed-off-by (use 'sob' for that).
 - 'cccmd' will avoid running the --cc-cmd.
-- 'body' is equivalent to 'sob' + 'bodycc'.
+- 'body' is equivalent to 'sob' + 'bodycc' + 'misc-by'.
 - 'all' will suppress all auto cc values.
 --
 +
diff --git a/git-send-email.perl b/git-send-email.perl
index 1916159d2a..58c6aa9d0e 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -94,7 +94,7 @@ sub usage {
    --identity              <str>  * Use the sendemail.<id> options.
    --to-cmd                <str>  * Email To: via `<str> \$patch_path`
    --cc-cmd                <str>  * Email Cc: via `<str> \$patch_path`
-   --suppress-cc           <str>  * author, self, sob, cc, cccmd, body, bodycc, all.
+   --suppress-cc           <str>  * author, self, sob, cc, cccmd, body, bodycc, misc-by, all.
    --[no-]cc-cover                * Email Cc: addresses in the cover letter.
    --[no-]to-cover                * Email To: addresses in the cover letter.
    --[no-]signed-off-by-cc        * Send to Signed-off-by: addresses. Default on.
@@ -454,13 +454,13 @@ sub read_config {
 	if (@suppress_cc) {
 		foreach my $entry (@suppress_cc) {
 			die sprintf(__("Unknown --suppress-cc field: '%s'\n"), $entry)
-				unless $entry =~ /^(?:all|cccmd|cc|author|self|sob|body|bodycc)$/;
+				unless $entry =~ /^(?:all|cccmd|cc|author|self|sob|body|bodycc|misc-by)$/;
 			$suppress_cc{$entry} = 1;
 		}
 	}
 
 	if ($suppress_cc{'all'}) {
-		foreach my $entry (qw (cccmd cc author self sob body bodycc)) {
+		foreach my $entry (qw (cccmd cc author self sob body bodycc misc-by)) {
 			$suppress_cc{$entry} = 1;
 		}
 		delete $suppress_cc{'all'};
@@ -471,7 +471,7 @@ sub read_config {
 	$suppress_cc{'sob'} = !$signed_off_by_cc if defined $signed_off_by_cc;
 
 	if ($suppress_cc{'body'}) {
-		foreach my $entry (qw (sob bodycc)) {
+		foreach my $entry (qw (sob bodycc misc-by)) {
 			$suppress_cc{$entry} = 1;
 		}
 		delete $suppress_cc{'body'};
@@ -1681,7 +1681,7 @@ sub process_file {
 	# Now parse the message body
 	while(<$fh>) {
 		$message .= $_;
-		if (/^(Signed-off-by|Cc): (.*)/i) {
+		if (/^([a-z-]*-by|Cc): (.*)/i) {
 			chomp;
 			my ($what, $c) = ($1, $2);
 			# strip garbage for the address we'll use:
@@ -1691,8 +1691,13 @@ sub process_file {
 			if ($sc eq $sender) {
 				next if ($suppress_cc{'self'});
 			} else {
-				next if $suppress_cc{'sob'} and $what =~ /Signed-off-by/i;
-				next if $suppress_cc{'bodycc'} and $what =~ /Cc/i;
+				if ($what =~ /^Signed-off-by$/i) {
+					next if $suppress_cc{'sob'};
+				} elsif ($what =~ /-by$/i) {
+					next if $suppress_cc{'misc-by'};
+				} elsif ($what =~ /Cc/i) {
+					next if $suppress_cc{'bodycc'};
+				}
 			}
 			if ($c !~ /.+@.+|<.+>/) {
 				printf("(body) Ignoring %s from line '%s'\n",
-- 
2.19.1.6.gbde171bbf5
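As an illustration, picking up the new category from the command line
might look like this (a usage sketch, not from the original thread;
the address and patch file name are placeholders):

	# Keep Cc: and Signed-off-by: pickup, but skip other -by trailers:
	git send-email --suppress-cc=misc-by \
		--to=maintainer@example.org 0001-example.patch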
[PATCH v2 2/3] send-email: only consider lines containing @ or <> for automatic Cc'ing
While the address sanitization routines do accept local addresses,
that is almost never what is meant in a Cc or Signed-off-by trailer.

Looking through all the signed-off-by lines in the linux kernel tree
without a @, there are mostly two patterns: Either just a full name,
or a full name followed by <user at domain> (i.e., with the word at
instead of a @), and minor variations. For cc lines, the same patterns
appear, along with lots of "cc stable" variations that do not actually
name sta...@vger.kernel.org

Cc: stable # introduced pre-git times
cc: stable.kernel.org

In the <user at domain> cases, one gets a chance to interactively fix
it. But when there is no <> pair, it seems we end up just using the
first word as a (local) address.

As the number of cases where a local address really was meant is
likely (and anecdotally) quite small compared to the number of cases
where we end up cc'ing a garbage address, insist on at least a @ or a
<> pair being present.

This is also preparation for the next patch, where we are likely to
encounter even more non-addresses in -by lines, such as

Reported-by: Coverity
Patch-generated-by: Coccinelle

Signed-off-by: Rasmus Villemoes
---
 git-send-email.perl | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2be5dac337..1916159d2a 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1694,6 +1694,11 @@ sub process_file {
 				next if $suppress_cc{'sob'} and $what =~ /Signed-off-by/i;
 				next if $suppress_cc{'bodycc'} and $what =~ /Cc/i;
 			}
+			if ($c !~ /.+@.+|<.+>/) {
+				printf("(body) Ignoring %s from line '%s'\n",
+					$what, $_) unless $quiet;
+				next;
+			}
 			push @cc, $c;
 			printf(__("(body) Adding cc: %s from line '%s'\n"),
 				$c, $_) unless $quiet;
-- 
2.19.1.6.gbde171bbf5
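To see what the new check accepts and rejects, here is a small
demonstration of the same regex from the patch; the sample strings are
made up:

	perl -e '
		for my $c ("Coverity", "stable # introduced pre-git times",
			   "user\@example.org", "Jane Doe <jane\@example.org>") {
			# same test as the patch: require a @ or a <> pair
			print "$c => ", ($c =~ /.+@.+|<.+>/ ? "kept" : "ignored"), "\n";
		}
	'

The first two produce "ignored" (no @, no <> pair); the last two are
kept.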
[PATCH v2 0/3] send-email: Also pick up cc addresses from -by trailers
This series extends the logic in git-send-email so that addresses
appearing in various *-by: trailers (e.g. Reviewed-by, Acked-by,
Tested-by) are picked up and added to the Cc list, in addition to the
current logic that picks up Cc: and Signed-off-by: lines.

This deliberately only picks up -by trailers (as opposed to any
trailer), based on the heuristic that the -by suffix strongly suggests
there's a (name +) email address after the colon. This avoids having
to deal with BugID:, Link:, or other such tags. Still, widening to any
-by trailer does increase the risk that we will pick up stuff that is
not an email address, such as

Reported-by: Coverity
Patch-generated-by: Coccinelle

where send-email then ends up cc'ing the local 'coverity' user. Patch
2 tries to weed out the obvious no-email-address-here cases, which
should also help avoid cc'ing garbage (local) addresses for malformed
cc and signed-off-by lines.

Patch 3 is then mostly mechanical, introducing the misc-by suppression
category and changing the regexp for matching trailer lines to include
.*-by.

Changes in v2: Rework logic in patch 3 as suggested by Junio.

v1 cover letter:

This has been attempted multiple times before, but I hope that this
can make it in this time around. That *-by addresses are not
automatically Cc'ed certainly still surprises people from time to
time. I hope that this addresses all the concerns Junio had in
https://lkml.org/lkml/2016/8/31/768 .

For the name, I chose 'misc-by', since that has -by in its name. I am
fine with absolutely any other name (bodyby, body-by, by-trailers,
...). I doubt we can find a short token that is completely
self-explanatory, and note that one has to look in the man page anyway
to know what 'sob' means in this line from 'git send-email -h':

   --suppress-cc           <str>  * author, self, sob, cc, cccmd, body, bodycc, all.

Rasmus Villemoes (3):
  Documentation/git-send-email.txt: style fixes
  send-email: only consider lines containing @ or <> for automatic
    Cc'ing
  send-email: also pick up cc addresses from -by trailers

 Documentation/git-send-email.txt | 11 +++++++----
 git-send-email.perl              | 24 +++++++++++++++++-------
 2 files changed, 24 insertions(+), 11 deletions(-)

-- 
2.19.1.6.gbde171bbf5
[PATCH v2 1/3] Documentation/git-send-email.txt: style fixes
For consistency, add full stops in a few places and outdent a line by
one space.

Signed-off-by: Rasmus Villemoes
---
 Documentation/git-send-email.txt | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index 465a4ecbed..ea6ea512fe 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -321,16 +321,16 @@ Automating
 auto-cc of:
 +
 --
-- 'author' will avoid including the patch author
-- 'self' will avoid including the sender
+- 'author' will avoid including the patch author.
+- 'self' will avoid including the sender.
 - 'cc' will avoid including anyone mentioned in Cc lines in the patch header
   except for self (use 'self' for that).
 - 'bodycc' will avoid including anyone mentioned in Cc lines in the patch
   body (commit message) except for self (use 'self' for that).
 - 'sob' will avoid including anyone mentioned in Signed-off-by lines except
-  for self (use 'self' for that).
+ for self (use 'self' for that).
 - 'cccmd' will avoid running the --cc-cmd.
-- 'body' is equivalent to 'sob' + 'bodycc'
+- 'body' is equivalent to 'sob' + 'bodycc'.
 - 'all' will suppress all auto cc values.
 --
 +
-- 
2.19.1.6.gbde171bbf5
Re: [PATCH 3/3] send-email: also pick up cc addresses from -by trailers
Rasmus Villemoes writes:

>> It may be surprising to existing users that
>> the command now suddenly adds more addresses the user did not think
>> would be added, but it would probably be easy enough for them to
>> work around.
>
> Yeah, I thought about that, but unfortunately the whole auto-cc business
> is not built around some config options where we can add a new and
> default false. Also note that there are also cases currently where the
> user sends off a patch series and is surprised that lots of intended
> recipients were not cc'ed (that's how I picked this subject up again; I

That "also note ... people who are not familiar are surprised" is,
quite honestly, irrelevant. The behaviour is documented, and the users
are supposed to be used to it.

Changing the behaviour in quite a different way from what existing
users are used to is a very different matter. No matter how you cut
it, a change of behaviour like this is a regression for some existing
users, while helping others, and it does not matter if it helps many
more users than it hurts---a regression is a regression to those who
are affected negatively.

At least this is a deliberate one we are making, and I think it is OK
as long as both the change in behaviour and the way to get back the
old behaviour are advertised properly.

Thanks.
[PATCH 1/1] subtree: make install targets depend on build targets
From: Christian Hesse

Now that we have build targets let the install targets depend on them.
Also make the targets phony.

Signed-off-by: Christian Hesse
---
 contrib/subtree/Makefile | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/contrib/subtree/Makefile b/contrib/subtree/Makefile
index 6906aae441..4a10a020a0 100644
--- a/contrib/subtree/Makefile
+++ b/contrib/subtree/Makefile
@@ -69,11 +69,11 @@ install: $(GIT_SUBTREE)
 
 install-doc: install-man install-html
 
-install-man: $(GIT_SUBTREE_DOC)
+install-man: man
 	$(INSTALL) -d -m 755 $(DESTDIR)$(man1dir)
 	$(INSTALL) -m 644 $^ $(DESTDIR)$(man1dir)
 
-install-html: $(GIT_SUBTREE_HTML)
+install-html: html
 	$(INSTALL) -d -m 755 $(DESTDIR)$(htmldir)
 	$(INSTALL) -m 644 $^ $(DESTDIR)$(htmldir)
 
@@ -98,4 +98,4 @@ clean:
 	$(RM) $(GIT_SUBTREE)
 	$(RM) *.xml *.html *.1
 
-.PHONY: FORCE
+.PHONY: FORCE man html install-man install-html
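To see why the .PHONY addition matters: without it, a stray file named
after a target would shadow the target and silently suppress the
rebuild. A quick demonstration (the touched file is purely
hypothetical, and building the docs requires asciidoc/xmlto):

	touch contrib/subtree/man        # a stray file shadowing the target
	make -C contrib/subtree man      # without .PHONY: "'man' is up to date"
	                                 # with this patch: rebuilds regardless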
Re: [PATCH v2 0/3] send-email: Also pick up cc addresses from -by trailers
Rasmus Villemoes writes:

> This series extends the logic in git-send-email so that addresses
> appearing in various *-by: trailers (e.g. Reviewed-by, Acked-by,
> Tested-by) are picked up and added to the Cc list, in addition to the
> current logic that picks up Cc: and Signed-off-by: lines.

Thanks. Will replace. I think this is ready for 'next', so let's see
if somebody else has more comments for a few days and then start
merging it down.
Re: [PATCH 1/1] subtree: make install targets depend on build targets
Christian Hesse writes:

> From: Christian Hesse
>
> Now that we have build targets let the install targets depend on them.
> Also make the targets phony.
>
> Signed-off-by: Christian Hesse
> ---
>  contrib/subtree/Makefile | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Thanks, will queue.

> diff --git a/contrib/subtree/Makefile b/contrib/subtree/Makefile
> index 6906aae441..4a10a020a0 100644
> --- a/contrib/subtree/Makefile
> +++ b/contrib/subtree/Makefile
> @@ -69,11 +69,11 @@ install: $(GIT_SUBTREE)
>  
>  install-doc: install-man install-html
>  
> -install-man: $(GIT_SUBTREE_DOC)
> +install-man: man
>  	$(INSTALL) -d -m 755 $(DESTDIR)$(man1dir)
>  	$(INSTALL) -m 644 $^ $(DESTDIR)$(man1dir)
>  
> -install-html: $(GIT_SUBTREE_HTML)
> +install-html: html
>  	$(INSTALL) -d -m 755 $(DESTDIR)$(htmldir)
>  	$(INSTALL) -m 644 $^ $(DESTDIR)$(htmldir)
>  
> @@ -98,4 +98,4 @@ clean:
>  	$(RM) $(GIT_SUBTREE)
>  	$(RM) *.xml *.html *.1
>  
> -.PHONY: FORCE
> +.PHONY: FORCE man html install-man install-html
Re: [PATCH v2 12/13] README: add a build badge (status of the Azure Pipelines build)
Hi Eric,

On Mon, 15 Oct 2018, Eric Sunshine wrote:

> On Mon, Oct 15, 2018 at 6:12 AM Johannes Schindelin via GitGitGadget
> wrote:
> > Just like so many other OSS projects, we now also have a build badge.
> >
> > Signed-off-by: Johannes Schindelin
> > ---
> > diff --git a/README.md b/README.md
> > @@ -1,3 +1,5 @@
> > +[](https://dev.azure.com/git/git/_build/latest?definitionId=2)
>
> The first URL is broken "https:/..." rather than "https://...".

I *swear* I fixed this already. Darn.

Thank you,
Dscho
Re: [PATCH v2 09/13] git-p4: use `test_atexit` to kill the daemon
Hi Eric,

On Mon, 15 Oct 2018, Eric Sunshine wrote:

> On Mon, Oct 15, 2018 at 6:12 AM Johannes Schindelin via GitGitGadget
> wrote:
> > This should be more reliable than the current method, and prepares the
> > test suite for a consistent way to clean up before re-running the tests
> > with different options.
> >
> > Signed-off-by: Johannes Schindelin
> > ---
> > diff --git a/t/t-basic.sh b/t/t-basic.sh
> > @@ -134,6 +134,7 @@ check_sub_test_lib_test_err () {
> > +cat >/dev/null <<\DDD
> >  test_expect_success 'pretend we have a fully passing test suite' "
> >  	run_sub_test_lib_test full-pass '3 passing tests' <<-\\EOF &&
> >  	for i in 1 2 3
> > @@ -820,6 +821,7 @@ test_expect_success 'tests clean up even on failures' "
> >  	> 1..2
> >  	EOF
> > "
> > +DDD
>
> Is this "DDD" here-doc leftover debugging goop?

Oy, oy, oy. This is definitely a left-over from debugging (as you can
imagine, it is pretty slow to run t-init.sh on Windows, and if I add a
test case, the development cycle is much faster with the trick you see
above).

This left-over even made it into Git for Windows' `master` branch!
(Which is the reason I missed it before contributing v2.)

Will fix,
Dscho
Re: Ignored files being silently overwritten when switching branches
On Tue, Oct 16 2018, Jeff King wrote:

> On Mon, Oct 15, 2018 at 01:01:50PM +0000, Per Lundberg wrote:
>
>> Sorry if this question has been asked before; I skimmed through the list
>> archives and the FAQ but couldn't immediately find it - please point me
>> in the right direction if it has indeed been discussed before.
>
> It is a frequently asked question, but it doesn't seem to be in any FAQ
> that I could find. The behavior you're seeing is intended. See this
> message (and the rest of the thread) for discussion:
>
>   https://public-inbox.org/git/7viq39avay@alter.siamese.dyndns.org/
>
>> So my question is: is this by design or should this be considered a bug
>> in git? Of course, it depends largely on what .gitignore is being used
>> for - if we are talking about files which can easily be regenerated
>> (build artifacts, node_modules folders etc.) I can totally understand
>> the current behavior, but when dealing with more sensitive & important
>> content it's a bit inconvenient.
>
> Basically: yes. It would be nice to have that "do not track this, but do
> not trash it either" state for a file, but Git does not currently
> support that.

There's some patches in that thread that could be picked up by someone
interested. I think the approach mentioned by Matthieu Moy here makes
the most sense:
https://public-inbox.org/git/vpqd3t9656k@bauges.imag.fr/

I don't think the rationale mentioned by Junio in
https://public-inbox.org/git/7v4oepaup7@alter.siamese.dyndns.org/ is
very convincing. The question is not whether .gitignore is intended to
be used in some specific way, e.g. only ignoring *.o files, but
whether we can reasonably suspect that users use the combination of
the features we expose in such a way that their precious data gets
destroyed. User data should get the benefit of the doubt.

Off the top of my head, I can imagine many ways in which this'll go
wrong:

1. Even if you're using .gitignore only for "trashable" files as Junio
   mentions, git not trashing your data depends on everyone who
   modifies .gitignore in your project having enough situational
   awareness not to inadvertently add a glob to the file which
   *accidentally* ignores existing files, and *nothing warns about
   this*.

   Between the caveat noted in "It is not possible to re-include[...]"
   in gitignore(5) and negative pathspecs it can be really easy to get
   this wrong.

   So e.g. in git.git I can add a line with "*" to .gitignore, and
   nothing will complain or look unusual as long as I'm not
   introducing new files, and I'll only find out when some-new-file.c
   of mine gets trashed.

2. Related, the UI "git add <file>" presents is just "Use -f if you
   really want to add them". Users who aren't careful will just think
   "oh, I just need -f in this case" and not alter .gitignore, leaving
   a timebomb for future users.

   Those new users will have no way of knowing that they've cloned a
   repo with a broken overzealous .gitignore, e.g. there's nothing on
   clone that says "you've just cloned a repo with N files, all of
   which are ignored, so git clean etc. will likely wipe out anything
   you have in the checkout".

3. Since we implicitly expose this "you need a one-off action to
   override .gitignore" noted in #2, users can and *do* use this for
   "soft" ignores. E.g. in a big work repo there's an ignore for
   *.png, even though the repo has thousands of such files, because
   it's not considered good practice to add them anymore (there's
   another static repo), and someone thought to use .gitignore to
   enforce that suggestion.

I have a personal repo where I only want *.gpg files, and due to the
inability to re-include files recursively (noted in #1) I just ignore
'*' and use git veeery carefully. I was only worried about 'git clean'
so far, but now I see I need to worry about "checkout" as well.

But maybe the use-cases I'm mentioning are highly unusual and the
repos at work have ended up in some bizarre state and nobody else
cares about this.

It would be interesting if someone at a big git hosting provider
(hint: Jeff :) could provide some numbers about how common it is to
have a repository containing tracked files ignored by a .gitignore the
repository itself carries. This wouldn't cover all of #1-3 above, but
is probably a pretty good proxy metric. I thought this could be done
by:

    git ls-tree -r --name-only HEAD |
    git check-ignore --no-index --stdin

But I see that e.g. on git.git this goes wrong due to
t/helper/.gitignore. So I don't know how one would answer "does this
repo have .gitignored files tracked?" in a one-liner.
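For what it's worth, adding --verbose to the same pipeline shows which
pattern from which file matched each path, which at least makes it
possible to tell project-wide ignores apart from matches coming out of
sub-directory files like t/helper/.gitignore; a rough sketch:

	# Each output line is "source:lineno:pattern<TAB>path"; inspect
	# the source column to separate repo-wide ignores from local ones.
	git ls-tree -r --name-only HEAD |
	git check-ignore --no-index --verbose --stdin

It is still not a one-liner answer, but it narrows down where the
false hits come from.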
Re: [PATCH 0/9] Offer to run CI/PR builds in Visual Studio Team Services
On Tue, Oct 16, 2018 at 6:50 AM Junio C Hamano wrote:
>
> Johannes Schindelin writes:
>
> > AFAIR Junio does not push to github.com/git/git, it is an automatic
> > mirror.
> >
> > GitLab could easily do the same.
>
> It used to be in the early days but these days git/git and
> gitster/git are updated in a same for loop that pushes to various
> destinations. You are correct that GitLab or any other hosting
> sites could do the same polling and mirroring. I am just too lazy
> to open a new account at yet another hosting site to add that for
> loop, but I may choose to when I am absolutely bored and nothing
> else to do ;-).

Do you mind if I squat gitlab.com/git/git in the meantime (i.e. create
an org etc.) and have it mirror github.com/git/git? I'll hand the keys
over to you if/when you'd like to update your for-loop :)

In the meantime, sometimes GitLab's UI is better than GitHub's, or the
other way around, so it would be useful to have it there as a mirror
for that reason alone, and then we can always experiment with a
.gitlab-ci.yml in-repo.
Re: [PATCH v2 09/13] git-p4: use `test_atexit` to kill the daemon
Hi Luke,

On Mon, 15 Oct 2018, Luke Diamand wrote:

> On Mon, 15 Oct 2018 at 16:02, Johannes Schindelin wrote:
> >
> > Hi Luke,
> >
> > On Mon, 15 Oct 2018, Luke Diamand wrote:
> >
> > > On Mon, 15 Oct 2018 at 11:12, Johannes Schindelin via GitGitGadget
> > > wrote:
> > > >
> > > > From: Johannes Schindelin
> > > >
> > > > This should be more reliable than the current method, and prepares the
> > > > test suite for a consistent way to clean up before re-running the tests
> > > > with different options.
> > >
> > > I'm finding that it's leaving p4d processes lying around.
> >
> > That's a bummer!
> >
> > > e.g.
> > >
> > > $ ./t9820-git-p4-editor-handling.sh
> > >
> > > $ ./t9820-git-p4-editor-handling.sh
> >
> > Since I do not currently have a setup with p4d installed, can you run
> > that with `sh -x ...` and see whether this code path is hit?
>
> All you need to do is to put p4 and p4d in your PATH.
>
> https://www.perforce.com/downloads/helix-core-p4d
> https://www.perforce.com/downloads/helix-command-line-client-p4

I did download p4d.exe and p4.exe to $HOME/custom/p4/ (similar to the
way ci/install-dependencies.sh does it), from

	http://filehost.perforce.com/perforce/r18.1/bin.ntx64/p4d.exe
	http://filehost.perforce.com/perforce/r18.1/bin.ntx64/p4.exe

and then prefixed PATH with $HOME/custom/p4, just to run t9820.
However, it does not work here at all:

-- snip --
[...]
++ git p4 clone '--dest=/usr/src/git/azure-pipelines/t/trash directory.t9820-git-p4-editor-handling/git' //depot
Usage: git-p4 clone [options] //depot/path[@revRange]
[...]
-- snap --

I *think* the reason for that is that the MSYS path mangling kicks in
when `git.exe` is called with the `//depot` argument (the leading `//`
can be used in MSYS and MSYS2 to indicate that the non-MSYS .exe wants
an argument that starts with a single slash, but is not a path that
needs to be translated to a Windows path).

So I did the next best thing I could do: try things in the Windows
Subsystem for Linux (AKA Bash on Ubuntu on Windows).

> The server is free to use for a small number of users, you don't need
> to do anything to make it go.
>
> >  test_done () {
> >  	GIT_EXIT_OK=t
> >
> > +	test -n "$immediate" || test_atexit_handler

On a slight tangent: I made up my mind on the `test -n "$immediate"`
part: I will skip that. The daemon should be stopped also when `-i` was
passed to the test script (to stop at the first failing test case).
There is very, very little use in keeping the daemon alive in that
case. The commit message of "tests: introduce `test_atexit`" already
made the case for that, but somehow I had not made up my mind yet.

> > +
>
> + test -n
> + test_atexit_handler
> ./t9820-git-p4-editor-handling.sh: 764: ./t9820-git-p4-editor-handling.sh: test_atexit_handler: not found
>
> Is that expected?

No, that is not expected. For me, it looks quite a bit different,
though:

-- snip --
[...]
+ test_done
+ GIT_EXIT_OK=t
+ test -n
+ test_atexit_handler
+ test : != { kill_p4d
} && (exit "$eval_ret"); eval_ret=$?; :
+ setup_malloc_check
+ MALLOC_CHECK_=3 MALLOC_PERTURB_=165
+ export MALLOC_CHECK_ MALLOC_PERTURB_
+ test_eval_ { kill_p4d
} && (exit "$eval_ret"); eval_ret=$?; :
+ test_eval_inner_ { kill_p4d
} && (exit "$eval_ret"); eval_ret=$?; :
+ eval want_trace && set -x
{ kill_p4d
} && (exit "$eval_ret"); eval_ret=$?; :
+ want_trace
+ test = t
+ kill_p4d
+ cat /home/wsl/git/wip/t/trash directory.t9820-git-p4-editor-handling/p4d.pid
+ pid=20093
+ retry_until_fail kill 20093
+ time_in_seconds
+ cd /
+ /usr/bin/python -c import time; print(int(time.time()))
+ timeout=1539681637
+ kill 20093
+ time_in_seconds
+ cd /
+ /usr/bin/python -c import time; print(int(time.time()))
+ test 1539681577 -gt 1539681637
+ sleep 1
+ true
+ time_in_seconds
+ cd /
+ /usr/bin/python -c import time; print(int(time.time()))
+ test 1539681578 -gt 1539681871
+ sleep 1
+ kill 20093
+ retry_until_fail kill -9 20093
+ time_in_seconds
+ cd /
+ /usr/bin/python -c import time; print(int(time.time()))
+ timeout=1539681638
+ kill -9 20093
+ test_must_fail kill 20093
+ _test_ok=
+ kill 20093
+ exit_code=1
+ test 1 -eq 0
+ test_match_signal 13 1
+ test 1 = 141
+ test 1 = 269
+ return 1
+ test 1 -gt 129
+ test 1 -eq 127
+ test 1 -eq 126
+ return 0
[...]
-- snap --

As you can see, it works quite well over here.

Maybe I could trouble you to fetch the `vsts-ci` branch from
https://github.com/dscho/git and test things on that one, just in case
that anything important got "lost in the mail"?

> >  	if test -n "$write_junit_xml" && test -n "$junit_xml_path"
> >  	then
>
> > > And also
> > >
> > > $ ./t9800-git-p4-basic.sh
> > >
> > > Ctrl-C
> >
> > Oh, you're right.

So I tested this in WSL, and the relevant part of the output is this:

-- snip --
[...]
+ eval test_prereq_lazily_UTF8_NFD_TO_NFC=$2
+ test_prereq
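For readers following the trace: retry_until_fail is not quoted in
full anywhere in this thread, but its observable behaviour suggests a
helper along these lines (a sketch reconstructed from the trace, with
an assumed one-minute timeout; the real helper computes time via a
time_in_seconds wrapper):

	retry_until_fail () {
		timeout=$(($(date +%s) + 60))
		while "$@"	# keep going while the command still succeeds
		do
			test $(date +%s) -gt $timeout && return 1
			sleep 1
		done
		return 0	# the command failed, i.e. the process is gone
	}

	# escalate to SIGKILL only if the gentle kill times out
	retry_until_fail kill "$pid" ||
	retry_until_fail kill -9 "$pid"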
Re: [PATCH v2 13/13] travis: fix skipping tagged releases
On Mon, Oct 15, 2018 at 03:12:17AM -0700, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin
>
> When building a PR, TRAVIS_BRANCH refers to the *target branch*.
> Therefore, if a PR targets `master`, and `master` happened to be tagged,
> we skipped the build by mistake.
>
> Fix this by using TRAVIS_PULL_REQUEST_BRANCH (i.e. the *source branch*)
> when available, falling back to TRAVIS_BRANCH (i.e. for CI builds, also
> known as "push builds").

This all makes sense, but this patch is fixing a long-standing issue
in our Travis CI build scripts (present since 09f5e9746c (travis-ci:
skip a branch build if equal tag is present, 2017-09-10)), so it
should be the first in the series. That way it could be picked up and
perhaps even graduated faster than the rest of this patch series.

> Signed-off-by: Johannes Schindelin
> ---
>  ci/lib.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ci/lib.sh b/ci/lib.sh
> index 584abcd529..e1858ae609 100755
> --- a/ci/lib.sh
> +++ b/ci/lib.sh
> @@ -3,7 +3,7 @@
>  if test true = "$TRAVIS"
>  then
>  	# We are running within Travis CI
> -	CI_BRANCH="$TRAVIS_BRANCH"
> +	CI_BRANCH="${TRAVIS_PULL_REQUEST_BRANCH:-$TRAVIS_BRANCH}"
>  	CI_COMMIT="$TRAVIS_COMMIT"
>  	CI_JOB_ID="$TRAVIS_JOB_ID"
>  	CI_JOB_NUMBER="$TRAVIS_JOB_NUMBER"
> -- 
> gitgitgadget
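The fix hinges on the ${var:-fallback} form of parameter expansion,
which falls back when the variable is unset *or* empty; Travis CI sets
TRAVIS_PULL_REQUEST_BRANCH to the empty string for push builds. A
quick illustration:

	TRAVIS_BRANCH=master
	TRAVIS_PULL_REQUEST_BRANCH=	# empty, as in a push build
	echo "${TRAVIS_PULL_REQUEST_BRANCH:-$TRAVIS_BRANCH}"	# => master
	TRAVIS_PULL_REQUEST_BRANCH=my-topic			# a PR build
	echo "${TRAVIS_PULL_REQUEST_BRANCH:-$TRAVIS_BRANCH}"	# => my-topic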
Re: [PATCH v2 05/13] ci/lib.sh: add support for Azure Pipelines
On Mon, Oct 15, 2018 at 03:12:05AM -0700, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin
>
> This patch introduces a conditional arm that defines some environment
> variables and a function that displays the URL given the job id (to
> identify previous runs for known-good trees).
>
> Signed-off-by: Johannes Schindelin
> ---
>  ci/lib.sh | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
>
> diff --git a/ci/lib.sh b/ci/lib.sh
> index 8532555b4e..584abcd529 100755
> --- a/ci/lib.sh
> +++ b/ci/lib.sh
> @@ -19,6 +19,29 @@ then
>  	BREW_INSTALL_PACKAGES="git-lfs gettext"

So we install these two additional packages in the macOS build jobs on
Travis CI ...

>  	export GIT_PROVE_OPTS="--timer --jobs 3 --state=failed,slow,save"
>  	export GIT_TEST_OPTS="--verbose-log -x --immediate"
> +elif test -n "$SYSTEM_TASKDEFINITIONSURI"
> +then
> +	# We are running in Azure Pipelines
> +	CI_BRANCH="$BUILD_SOURCEBRANCH"
> +	CI_COMMIT="$BUILD_SOURCEVERSION"
> +	CI_JOB_ID="$BUILD_BUILDID"
> +	CI_JOB_NUMBER="$BUILD_BUILDNUMBER"
> +	CI_OS_NAME="$(echo "$AGENT_OS" | tr A-Z a-z)"
> +	test darwin != "$CI_OS_NAME" || CI_OS_NAME=osx
> +	CI_REPO_SLUG="$(expr "$BUILD_REPOSITORY_URI" : '.*/\([^/]*/[^/]*\)$')"
> +	CC="${CC:-gcc}"
> +
> +	# use a subdirectory of the cache dir (because the file share is shared
> +	# among *all* phases)
> +	cache_dir="$HOME/test-cache/$SYSTEM_PHASENAME"
> +
> +	url_for_job_id () {
> +		echo "$SYSTEM_TASKDEFINITIONSURI$SYSTEM_TEAMPROJECT/_build/results?buildId=$1"
> +	}
> +
> +	BREW_INSTALL_PACKAGES=

... but not on Azure Pipelines. Is this mere oversight or intentional?
If it's intentional, then I think the commit message should mention
why.

> +	export GIT_PROVE_OPTS="--timer --jobs 10 --state=failed,slow,save"
> +	export GIT_TEST_OPTS="--quiet --write-junit-xml"
>  fi
>  
>  skip_branch_tip_with_tag () {
> -- 
> gitgitgadget
Re: [PATCH v2 06/13] Add a build definition for Azure DevOps
Hi Junio,

On Tue, 16 Oct 2018, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" writes:
>
> > Also, we make use of the shiny new feature we just introduced where the
> > test suite can output JUnit-style .xml files. This information is made
> > available in a nice UI that allows the viewer to filter by phase and/or
> > test number, and to see trends such as: number of (failing) tests, time
> > spent running the test suite, etc.
> >
> > Signed-off-by: Johannes Schindelin
> > ---
> >  azure-pipelines.yml   | 319 ++++++++++++++++++++++++++++++++++++++++++
> >  ci/mount-fileshare.sh |  26 ++++
> >  2 files changed, 345 insertions(+)
> >  create mode 100644 azure-pipelines.yml
> >  create mode 100755 ci/mount-fileshare.sh
>
> I wonder if there is a need to keep what is tested by this and
> Travis in sync in any way, but most of the logic is not defined in
> these "steps" but implemented in ci/*.sh scripts to be shared, so it
> would be OK, I guess.

Indeed, that was my intention. These ci scripts are not only useful
for Travis and Azure Pipelines, after all, but also good documentation
of how to test locally.

For example, to repeat Luke's Perforce testing, I could simply use the
URLs listed in those ci scripts to get almost the same setup locally
(close enough for testing).

So it is not only about sharing, and ease of maintenance, but it is
also about documenting. And yes, sharing means that we do not have to
waste brain cycles on keeping things in sync.

> > diff --git a/ci/mount-fileshare.sh b/ci/mount-fileshare.sh
> > new file mode 100755
> > index 0000000000..5fb5f74b70
> > --- /dev/null
> > +++ b/ci/mount-fileshare.sh
> > @@ -0,0 +1,26 @@
> > +#!/bin/sh
> > +
> > +die () {
> > +	echo "$*" >&2
> > +	exit 1
> > +}
> > +
> > +test $# = 4 ||
> > +die "Usage: $0 <share> <username> <password> <directory"
>
> Missing closing '>'.

Thanks!

> > +
> > +mkdir -p "$4" || die "Could not create $4"
> > +
> > +case "$(uname -s)" in
> > +Linux)
> > +	sudo mount -t cifs -o vers=3.0,username="$2",password="$3",dir_mode=0777,file_mode=0777,serverino "$1" "$4"
> > +	;;
> > +Darwin)
> > +	pass="$(echo "$3" | sed -e 's/\//%2F/g' -e 's/+/%2B/g')" &&
> > +	mount -t smbfs,soft "smb://$2:$pass@${1#//}" "$4"
> > +	;;
> > +*)
> > +	die "No support for $(uname -s)"
> > +	;;
> > +esac ||
> > +die "Could not mount $4"
> > +
>
> Trailing blank line.
>
> Thanks.

Thank you, both issues will be fixed in the next iteration,
Dscho
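For concreteness, an invocation matching the script's four positional
arguments might look like this (the share host, username, and paths
below are all placeholders, not taken from the thread):

	ci/mount-fileshare.sh //example.file.core.windows.net/test-cache \
		azureuser "$FILESHARE_PASSWORD" "$HOME/test-cache"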
Re: [PATCH v2 10/13] tests: include detailed trace logs with --write-junit-xml upon failure
On Mon, Oct 15, 2018 at 03:12:12AM -0700, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin
>
> The JUnit XML format lends itself to be presented in a powerful UI,
> where you can drill down to the information you are interested in very
> quickly.
>
> For test failures, this usually means that you want to see the detailed
> trace of the failing tests.
>
> With Travis CI, we passed the `--verbose-log` option to get those
> traces. However, that seems excessive, as we do not need/use the logs in

As someone who has dug into a few occasional failures found by Travis
CI, I'd say that the output of '--verbose-log -x' is not excessive,
but downright essential.

> almost all of those cases: only when a test fails do we have a way to
> include the trace.
>
> So let's do something different when using Azure DevOps: let's run all
> the tests with `--quiet` first, and only if a failure is encountered,
> try to trace the commands as they are executed.
>
> Of course, we cannot turn on `--verbose-log` after the fact. So let's
> just re-run the test with all the same options, adding `--verbose-log`.
> And then munging the output file into the JUnit XML on the fly.
>
> Note: there is an off chance that re-running the test in verbose mode
> "fixes" the failures (and this does happen from time to time!). That is
> a possibility we should be able to live with.

Any CI system worth its salt should provide as much information about
any failures as possible, especially when it was lucky enough to
stumble upon a rare and hard to reproduce non-deterministic failure.

> Ideally, we would label this as "Passed upon rerun", and Azure
> Pipelines even know about that outcome, but it is not available when
> using the JUnit XML format for now:
> https://github.com/Microsoft/azure-pipelines-agent/blob/master/src/Agent.Worker/TestResults/JunitResultReader.cs
>
> Signed-off-by: Johannes Schindelin
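In shell terms, the strategy described in the quoted commit message
amounts to something like the following (the script name and exact
option combination are illustrative, not lifted from the patch):

	# quiet first pass; on failure, re-run the same test with tracing
	# so the detailed log can be attached to the JUnit result
	./t1234-example.sh --quiet --write-junit-xml ||
	./t1234-example.sh --verbose-log -x --write-junit-xml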
Re: [PATCH v10 00/21] Convert "git stash" to C builtin
Hi Thomas,

On Mon, 15 Oct 2018, Thomas Gummerer wrote:

>  2:  63f2e0e6f9 !  2:  2d45985676 strbuf.c: add `strbuf_join_argv()`
>      @@ -14,19 +14,17 @@
>        	strbuf_setlen(sb, sb->len + sb2->len);
>        }
>
>      -+const char *strbuf_join_argv(struct strbuf *buf,
>      -+			     int argc, const char **argv, char delim)
>      ++void strbuf_join_argv(struct strbuf *buf,
>      ++		      int argc, const char **argv, char delim)

While the patch series does not use the return value, I have to ask
whether it would really be useful to change it to return `void`. I
could imagine that there may already be quite a few code paths that
would love to use strbuf_join_argv(), *and* would benefit from the
`const char *` return value.

In other words: just because the *current* patches do not make use of
that quite convenient return value does not mean that we should remove
that convenience.

>  7:  a2abd1b4bd !  8:  974dbaa492 stash: convert apply to builtin
>      @@ -370,18 +370,20 @@
>       +
>       +	if (diff_tree_binary(&out, &info->w_commit)) {
>       +		strbuf_release(&out);
>      -+		return -1;
>      ++		return error(_("Could not generate diff %s^!."),
>      ++			     oid_to_hex(&info->w_commit));

Please start the argument of an `error()` call with a lower-case
letter.

>       +	}
>       +
>       +	ret = apply_cached(&out);
>       +	strbuf_release(&out);
>       +	if (ret)
>      -+		return -1;
>      ++		return error(_("Conflicts in index."
>      ++			       "Try without --index."));

Same here.

>       +
>       +	discard_cache();
>       +	read_cache();
>       +	if (write_cache_as_tree(&index_tree, 0, NULL))
>      -+		return -1;
>      ++		return error(_("Could not save index tree"));

And here.

> 15:  bd827be103 ! 15:  989db67e9a stash: convert create to builtin
>      @@ -119,7 +119,6 @@
>       +static int check_changes(struct pathspec ps, int include_untracked)
>       +{
>       +	int result;
>      -+	int ret = 0;

I was curious about this change, and could not find it in the
git-stash-v10 tag of https://github.com/ungps/git...

> 18:  1c501ad666 ! 18:  c90e30173a stash: convert save to builtin
>      @@ -72,8 +72,10 @@
>       +			     git_stash_helper_save_usage,
>       +			     PARSE_OPT_KEEP_DASHDASH);
>       +
>      -+	if (argc)
>      -+		stash_msg = (char*) strbuf_join_argv(&buf, argc, argv, ' ');
>      ++	if (argc) {
>      ++		strbuf_join_argv(&buf, argc, argv, ' ');
>      ++		stash_msg = buf.buf;
>      ++	}

Aha! So there *was* a user of that return value. I really would prefer
a non-void return value here.

> 19:  c4401b21db ! 19:  4360ea875d stash: convert `stash--helper.c` into `stash.c`
>      @@ -264,9 +320,9 @@
>      -	argc = parse_options(argc, argv, prefix, options,
>      -			     git_stash_helper_create_usage,
>      -			     0);
>      -+	/* Startinf with argv[1], since argv[0] is "create" */
>      -+	stash_msg = (char*) strbuf_join_argv(&stash_msg_buf, argc - 1,
>      -+					     ++argv, ' ');
>      ++	/* Starting with argv[1], since argv[0] is "create" */
>      ++	strbuf_join_argv(&stash_msg_buf, argc - 1, ++argv, ' ');
>      ++	stash_msg = stash_msg_buf.buf;

Again, I would strongly prefer the convenience of assigning the return
value directly, rather than having two lines.

>      @@ -375,10 +425,8 @@
>       +			 * they need to be immediately followed by a string
>       +			 * (i.e.`-m"foobar"` or `--message="foobar"`).
>       +			 */
>      -+			if ((strlen(argv[i]) > 2 &&
>      -+			     !strncmp(argv[i], "-m", 2)) ||
>      -+			    (strlen(argv[i]) > 10 &&
>      -+			     !strncmp(argv[i], "--message=", 10)))
>      ++			if (starts_with(argv[i], "-m") ||
>      ++			    starts_with(argv[i], "--message="))

Very nice.

> 20:  92dc11fd16 ! 20:  a384b05008 stash: optimize `get_untracked_files()` and `check_changes()`
>      @@ -52,7 +52,6 @@
>       +static int check_changes_tracked_files(struct pathspec ps)
>        {
>        	int result;
>      --	int ret = 0;

I also wonder about this change, in light of...

>        	struct re
Re: [PATCH v2 00/13] Offer to run CI/PR builds in Azure Pipelines
On Mon, Oct 15, 2018 at 07:22:15AM -0700, Taylor Blau wrote:
> Would we like to abandon Travis as our main CI service for upstream
> git.git, and build on Azure Pipelines only?

It's not only about "upstream git.git", but also about contributors,
who might have enabled Travis CI integration on their forks on GitHub.
Having a '.travis.yml' and associated 'ci/*' scripts in git.git makes
it possible for them to easily build and test their branches on their
own.
Re: [PATCH 0/4] Bloom filter experiment
On 10/16/2018 12:45 AM, Junio C Hamano wrote:
> Derrick Stolee writes:
>
>> 2. The filters are sized according to the number of changes in each
>> commit, with a minimum of one 64-bit word.
>> ...
>> 6. When we compute the Bloom filters, we don't store a filter for
>> commits whose first-parent diff has more than 512 paths.
>
> Just being curious but was 512 taken out of thin air or is there
> some math behind it, e.g. to limit the false positive rate down to a
> certain threshold? With a wide-enough bitset, you could store an
> arbitrarily large number of paths with low enough false positives, I
> guess, but is there a point where there are too many paths in the
> change that gives us diminishing returns and it is not worth having
> a filter in the first place?

512 is somewhat arbitrary, but having a maximum size is not.

> In a normal source-code-control context, the set of paths modified
> by any single commit ought to be a small subset of the entire paths,
> and whole-tree changes ought to be fairly rare. In a project for
> which that assumption does not hold, it might help to have a
> negative bloom filter (i.e. "git log -- A" asks "does the commit
> modify A?" and the filter would say "we know it does not, because we
> threw all the paths that are not touched into the bloom filter"),
> but I think that would optimize for a wrong case.

A commit with many changed paths is very rare. The 512 I picked above
is enough to cover 99% of commits in each of the repos I sampled when
first investigating Bloom filters.

When a Bloom filter response says "maybe yes" (in our case, "maybe not
TREESAME"), then we need to verify that it is correct. In the extreme
case that every path is changed, the Bloom filter does nothing but add
extra work.

These extreme cases are also not unprecedented: in our Azure Repos
codebase, we were using core.autocrlf to smudge CRLFs to LFs, but when
it was time to dogfood VFS for Git, we needed to turn off the smudge
filter. So, there is one commit that converts every LF to a CRLF in
every text file. Storing a Bloom filter for those ~250,000 entries
would take ~256KB for essentially no value. By not storing a filter
for this commit, we go immediately to the regular TREESAME check,
which would happen for most pathspecs.

This is all to say: having a maximum size is good. 512 is big enough
to cover _most_ commits, but not so big that we may store _really_ big
filters.

Thanks,
-Stolee
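As a back-of-the-envelope check (my arithmetic, not figures stated in
the series): the ~256KB-for-~250,000-paths example above implies a
filter budget of roughly

	m/n \approx (256 \cdot 1024 \cdot 8) / 250000 \approx 8.4 \text{ bits per path}

and with the optimal number of hash functions k = (m/n) \ln 2 \approx 5.8,
the standard Bloom-filter estimate for the false-positive rate is

	p = \left(1 - e^{-kn/m}\right)^k \approx 2^{-k} \approx 2\%

so under these assumptions a "maybe not TREESAME" answer is wrong only
about once in fifty queries.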
Re: [PATCH v2 00/13] Offer to run CI/PR builds in Azure Pipelines
On Tue, Oct 16, 2018 at 4:55 AM Taylor Blau wrote:
>
> On Mon, Oct 15, 2018 at 04:55:25PM +0200, Johannes Schindelin wrote:
> > Another really good reason for me to do this is that I can prod the Azure
> > Pipelines team directly. And I even get an answer, usually within minutes.
> > Which is a lot faster than the Travis team answers my questions, which
> > is... not yet? (I tried to get in contact with them in late 2015 or early
> > 2016, and I tried again a year later, and then a couple of months later,
> > and I have yet to hear back.)
>
> Certainly a good reason. To be clear/fair, I've sent in a number of
> support tickets to Travis CI over the years, and have always been
> responded to in a short amount of time with helpful answers. I think
> that we would really be fine in either case, TBH.

Was this in the context of "I'm this random dude using Travis" or "as
you guys know I work for GitHub, your biggest? customer..." ? :)
Re: [PATCH 2/3] http: add support for disabling SSL revocation checks in cURL
Hi Eric,

On Mon, 15 Oct 2018, Eric Sunshine wrote:

> On Mon, Oct 15, 2018 at 6:14 AM Brendan Forster via GitGitGadget
> wrote:
> > This config value is only used if http.sslBackend is set to "schannel",
> > which forces cURL to use the Windows Certificate Store when validating
> > server certificates associated with a remote server.
> >
> > This is only supported in cURL 7.44 or later.
> > [...]
> > Signed-off-by: Brendan Forster
> > ---
> > diff --git a/http.c b/http.c
> > @@ -811,6 +818,16 @@ static CURL *get_curl_handle(void)
> > +	if (http_ssl_backend && !strcmp("schannel", http_ssl_backend) &&
> > +	    !http_schannel_check_revoke) {
> > +#if LIBCURL_VERSION_NUM >= 0x072c00
> > +		curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, CURLSSLOPT_NO_REVOKE);
> > +#else
> > +		warning("CURLSSLOPT_NO_REVOKE not applied to curl SSL options because\n"
> > +			"your curl version is too old (>= 7.44.0)");
>
> This message is confusing. If your curl is too old, shouldn't the ">=" be a "<"?

Absolutely correct.

Will fix,
Dscho

> > +#endif
> > +	}
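For reference, enabling the behaviour this patch adds would presumably
look like this on the user's side (config names taken from this
thread; a sketch, not documentation):

	git config http.sslBackend schannel
	git config http.schannelCheckRevoke false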
Re: [PATCH 2/3] http: add support for disabling SSL revocation checks in cURL
Hi Junio,

On Tue, 16 Oct 2018, Junio C Hamano wrote:

> "Brendan Forster via GitGitGadget" writes:
>
> > Note: an earlier iteration tried to use the config setting
> > http.schannel.checkRevoke, but the http.* config settings can be limited
> > to specific URLs via http.<url>.* (which would mistake `schannel` for a
> > URL).
>
> Yeah, "http.schannel.anything" would not work, but is this note
> relevant here? As far as the git development community as a whole
> is concerned, this is the first iteration of the patch we see and
> review.

I did review that commit message before typing /submit, and I figured
that it would still make sense to leave a comment there why we did
*not* use http.schannel.checkRevoke. I know that *my* review comments
would include a question about this, had I not been part of the
development of this patch.

> In any case, you can use "http.<url>.$variable" to say "I want the
> http.$variable to be in effect but only when I am talking to <url>".
> Does it make sense for this new variable, too? That is, does it
> benefit the users to be able to do something like this?
>
>	[http] schannelCheckRevoke = no
>	[http "https://microsoft.com/"] schannelCheckRevoke = yes
>
> I am guessing that the answer is yes.

Frankly, I do not know. Does it hurt, though? This patch neither
introduces the `http.<url>.` feature nor prevents it. Its original
design (which I found more logical than the current one), however,
clashes with that feature.

> I guess the same comment applies to the previous step, but I suspect
> that the code structure may not allow us to switch the SSL backend
> so late in the game (e.g. "when talking to microsoft, use schannel,
> but when talking to github, use openssl").

That's a really good question, and I suspect that it is actually not
too late. Let me try. *clicketyclick*

-- snip --
$ git config --show-origin http.sslbackend
file:C:/Program Files/Git/mingw64/etc/gitconfig	schannel

$ GIT_TRACE_CURL=1 git -c 'http.https://github.com/dscho/images.sslbackend=openssl' ls-remote https://github.com/dscho/images
14:03:52.319986 http.c:724              == Info: Couldn't find host github.com in the _netrc file; using defaults
14:03:52.366858 http.c:724              == Info:   Trying 192.30.253.113...
14:03:52.366858 http.c:724              == Info: TCP_NODELAY set
14:03:52.482825 http.c:724              == Info: Connected to github.com (192.30.253.113) port 443 (#0)
14:03:52.721173 http.c:724              == Info: ALPN, offering http/1.1
14:03:52.721173 http.c:724              == Info: Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
14:03:52.721173 http.c:724              == Info: error setting certificate verify locations:
  CAfile: C:/Program Files/Git/mingw64/libexec/ssl/certs/ca-bundle.crt
  CApath: none
fatal: unable to access 'https://github.com/dscho/images/': error setting certificate verify locations:
  CAfile: C:/Program Files/Git/mingw64/libexec/ssl/certs/ca-bundle.crt
  CApath: none
-- snap --

Please allow me to help you understand this log. First, I verified
that the currently-configured backend is Secure Channel. Then, I asked
Git to list the remote refs of a small repository, special-casing it
to the OpenSSL backend. Crucially, this fails.

The short version is: this is good! Because it means that Git used the
OpenSSL backend, as clearly intended. Why does it fail? Two reasons:

1) Git for Windows has to disable the certificate bundle. The gist of
   it is: Git LFS uses Git for Windows' certificate bundle, if
   configured, and that would override the Windows Certificate Store.
   When users configure Secure Channel, however, they *want* to use
   the Windows Certificate Store, so to accommodate Git LFS, we
   "unconfigure" http.sslCAInfo in that case.

2) The libcurl used by Git for Windows has some smarts built in to
   understand relative paths to its "Unix pseudo root". However, this
   fails when libcurl is loaded from libexec/git-core/ (which is the
   case, to avoid any libcurl-4.dll in C:\Windows\system32 from being
   picked up by mistake).

If this explanation sounds obscure, the reason is that it *is*
obscure. If you are truly interested in the details, I can point you
to the relevant tickets on Git for Windows' bug tracker.

> > +#if LIBCURL_VERSION_NUM >= 0x072c00
> > +	curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, CURLSSLOPT_NO_REVOKE);
> > +#else
> > +	warning("CURLSSLOPT_NO_REVOKE not applied to curl SSL options because\n"
> > +		"your curl version is too old (>= 7.44.0)");
> > +#endif
>
> That ">=" is hard to grok. I think you meant it to be pronounced
> "requires at least", but that is not a common reading. People more
> commonly pronounce it "is greater than or equal to".

Eric made the same point. I will change it to '<'.

Side note: this is the kind of change that I am completely comfortable
making, even on patches that were battle-tested, because those kinds
of changes are certain not to aff
Re: [PATCH 2/3] http: add support for disabling SSL revocation checks in cURL
Hi Peff,

On Tue, 16 Oct 2018, Jeff King wrote:

> On Tue, Oct 16, 2018 at 01:23:25PM +0900, Junio C Hamano wrote:
>
> > > +#if LIBCURL_VERSION_NUM >= 0x072c00
> > > +	curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, CURLSSLOPT_NO_REVOKE);
> > > +#else
> > > +	warning("CURLSSLOPT_NO_REVOKE not applied to curl SSL options because\n"
> > > +		"your curl version is too old (>= 7.44.0)");
> > > +#endif
> >
> > That ">=" is hard to grok. I think you meant it to be pronounced
> > "requires at least", but that is not a common reading. People more
> > commonly pronounce it "is greater than or equal to".
>
> This seemed oddly familiar:
>
>   https://public-inbox.org/git/8da9d436-88b9-7959-dd9c-65bdd376b...@mail.mipt.ru/
>
> Since this one is clearly copied from there, it may be worth fixing the
> original.

Good memory. I just integrated the patch here. It was not signed off,
but it is too obvious to be protected under copyright, so I re-did it,
adding a nice commit message.

Ciao,
Dscho
Re: [PATCH 3/3] mingw: use domain information for default email
Hi Eric,

On Mon, 15 Oct 2018, Eric Sunshine wrote:

> On Mon, Oct 15, 2018 at 5:47 AM Johannes Schindelin via GitGitGadget
> wrote:
> > When a user is registered in a Windows domain, it is really easy to
> > obtain the email address. So let's do that.
> > [...]
> > Signed-off-by: Johannes Schindelin
> > ---
> > diff --git a/compat/mingw.c b/compat/mingw.c
> > @@ -1826,6 +1826,11 @@ static char *get_extended_user_info(enum EXTENDED_NAME_FORMAT type)
> > +char *mingw_query_user_email(void)
> > +{
> > +	return get_extended_user_info(NameUserPrincipal);
> > +}
> > diff --git a/ident.c b/ident.c
> > @@ -168,6 +168,9 @@ const char *ident_default_email(void)
> > +	} else if ((email = query_user_email()) && email[0]) {
> > +		strbuf_addstr(&git_default_email, email);
> > +		free((char *)email);
>
> If query_user_email(), which calls get_extended_user_info(), returns
> NULL, then we do nothing (and don't worry about freeing the result).
> However, if query_user_email() returns a zero-length allocated string,
> then this conditional will leak that string (due to the email[0]
> check). From patch 2/3, we see in get_extended_user_info():
>
> +static char *get_extended_user_info(enum EXTENDED_NAME_FORMAT type)
> +{
> +	if (GetUserNameExW(type, wbuffer, &len)) {
> +		char *converted = xmalloc((len *= 3));
> +		if (xwcstoutf(converted, wbuffer, len) >= 0)
> +			return converted;
>
> that it may possibly return a zero-length string (due to the ">=0").
> Do we care about this potential leak (or is GetUserNameExW()
> guaranteed never to return such a string)?

Quite honestly, I think that this is so rare an instance (if it *can*
happen at all) that I simply don't care ;-)

Ciao,
Dscho
Re: [PATCH 2/3] getpwuid(mingw): provide a better default for the user name
Hi Eric,

On Mon, 15 Oct 2018, Eric Sunshine wrote:

> On Mon, Oct 15, 2018 at 5:47 AM Johannes Schindelin via GitGitGadget
> wrote:
> > We do have the excellent GetUserInfoEx() function to obtain more
> > detailed information of the current user (if the user is part of a
> > Windows domain); Let's use it.
> > [...]
> > Signed-off-by: Johannes Schindelin
> > ---
> > diff --git a/compat/mingw.c b/compat/mingw.c
> > @@ -1798,6 +1799,33 @@ int mingw_getpagesize(void)
> > +static char *get_extended_user_info(enum EXTENDED_NAME_FORMAT type)
> > +{
> > +	DECLARE_PROC_ADDR(secur32.dll, BOOL, GetUserNameExW,
> > +		enum EXTENDED_NAME_FORMAT, LPCWSTR, PULONG);
> > +	static wchar_t wbuffer[1024];
>
> Does this need to be static? It's not being returned to the caller.

It does not. Fixed.

> > +	len = ARRAY_SIZE(wbuffer);
> > +	if (GetUserNameExW(type, wbuffer, &len)) {
> > +		char *converted = xmalloc((len *= 3));

Oh wow. I *just* realized this. It is an off-by-one: the `xmalloc()`
call needs to receive `len + 1` to accommodate for the trailing NUL.
Will fix, too.

Ciao,
Dscho

> > +		if (xwcstoutf(converted, wbuffer, len) >= 0)
> > +			return converted;
> > +		free(converted);
> > +	}
>
> If xwcstoutf() fails, 'converted' is freed; otherwise, the allocated
> 'converted' is stored in the caller's statically held 'passwd' struct.
> Okay.
>
> > +	return NULL;
> > +}
Re: [PATCH 2/3] getpwuid(mingw): provide a better default for the user name
On Tue, Oct 16, 2018 at 8:38 AM Johannes Schindelin wrote:
> On Mon, 15 Oct 2018, Eric Sunshine wrote:
> > On Mon, Oct 15, 2018 at 5:47 AM Johannes Schindelin via GitGitGadget
> > wrote:
> > > +	len = ARRAY_SIZE(wbuffer);
> > > +	if (GetUserNameExW(type, wbuffer, &len)) {
> > > +		char *converted = xmalloc((len *= 3));
>
> Oh wow. I *just* realized this. It is an off-by-one: the `xmalloc()` call
> needs to receive `len + 1` to accommodate for the trailing NUL. Will fix,
> too.

Or, xmallocz() (note the "z") which gives you overflow-checking of the
+1 for free.
Re: [PATCH v2 13/13] travis: fix skipping tagged releases
Hi Junio,

On Tue, 16 Oct 2018, SZEDER Gábor wrote:

> On Mon, Oct 15, 2018 at 03:12:17AM -0700, Johannes Schindelin via
> GitGitGadget wrote:
> > From: Johannes Schindelin
> >
> > When building a PR, TRAVIS_BRANCH refers to the *target branch*.
> > Therefore, if a PR targets `master`, and `master` happened to be tagged,
> > we skipped the build by mistake.
> >
> > Fix this by using TRAVIS_PULL_REQUEST_BRANCH (i.e. the *source branch*)
> > when available, falling back to TRAVIS_BRANCH (i.e. for CI builds, also
> > known as "push builds").
>
> This all makes sense, but this patch is fixing a long-standing issue
> in our Travis CI build scripts (present since 09f5e9746c (travis-ci:
> skip a branch build if equal tag is present, 2017-09-10)), so it
> should be the first in the series. So it could be picked up and
> perhaps even graduated faster than the rest of this patch series.

Makes sense.

Thanks,
Dscho

> > Signed-off-by: Johannes Schindelin
> > ---
> >  ci/lib.sh | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/ci/lib.sh b/ci/lib.sh
> > index 584abcd529..e1858ae609 100755
> > --- a/ci/lib.sh
> > +++ b/ci/lib.sh
> > @@ -3,7 +3,7 @@
> >  if test true = "$TRAVIS"
> >  then
> >  	# We are running within Travis CI
> > -	CI_BRANCH="$TRAVIS_BRANCH"
> > +	CI_BRANCH="${TRAVIS_PULL_REQUEST_BRANCH:-$TRAVIS_BRANCH}"
> >  	CI_COMMIT="$TRAVIS_COMMIT"
> >  	CI_JOB_ID="$TRAVIS_JOB_ID"
> >  	CI_JOB_NUMBER="$TRAVIS_JOB_NUMBER"
> > -- 
> > gitgitgadget
Re: [PATCH v2 05/13] ci/lib.sh: add support for Azure Pipelines
Hi Gábor,

On Tue, 16 Oct 2018, SZEDER Gábor wrote:

> On Mon, Oct 15, 2018 at 03:12:05AM -0700, Johannes Schindelin via
> GitGitGadget wrote:
> > From: Johannes Schindelin
> >
> > This patch introduces a conditional arm that defines some environment
> > variables and a function that displays the URL given the job id (to
> > identify previous runs for known-good trees).
> >
> > Signed-off-by: Johannes Schindelin
> > ---
> >  ci/lib.sh | 23 +++++++++++++++++++++++
> >  1 file changed, 23 insertions(+)
> >
> > diff --git a/ci/lib.sh b/ci/lib.sh
> > index 8532555b4e..584abcd529 100755
> > --- a/ci/lib.sh
> > +++ b/ci/lib.sh
> > @@ -19,6 +19,29 @@ then
> >  	BREW_INSTALL_PACKAGES="git-lfs gettext"
>
> So we install these two additional packages in the macOS build jobs on
> Travis CI ...
>
> >  	export GIT_PROVE_OPTS="--timer --jobs 3 --state=failed,slow,save"
> >  	export GIT_TEST_OPTS="--verbose-log -x --immediate"
> > +elif test -n "$SYSTEM_TASKDEFINITIONSURI"
> > +then
> > +	# We are running in Azure Pipelines
> > +	CI_BRANCH="$BUILD_SOURCEBRANCH"
> > +	CI_COMMIT="$BUILD_SOURCEVERSION"
> > +	CI_JOB_ID="$BUILD_BUILDID"
> > +	CI_JOB_NUMBER="$BUILD_BUILDNUMBER"
> > +	CI_OS_NAME="$(echo "$AGENT_OS" | tr A-Z a-z)"
> > +	test darwin != "$CI_OS_NAME" || CI_OS_NAME=osx
> > +	CI_REPO_SLUG="$(expr "$BUILD_REPOSITORY_URI" : '.*/\([^/]*/[^/]*\)$')"
> > +	CC="${CC:-gcc}"
> > +
> > +	# use a subdirectory of the cache dir (because the file share is shared
> > +	# among *all* phases)
> > +	cache_dir="$HOME/test-cache/$SYSTEM_PHASENAME"
> > +
> > +	url_for_job_id () {
> > +		echo "$SYSTEM_TASKDEFINITIONSURI$SYSTEM_TEAMPROJECT/_build/results?buildId=$1"
> > +	}
> > +
> > +	BREW_INSTALL_PACKAGES=
>
> ... but not on Azure Pipelines. Is this mere oversight or
> intentional? If it's intentional, then I think the commit message
> should mention why.

Both packages are already available on Azure Pipelines' VMs. I will
mention this in the commit message.

Thanks,
Dscho

> > +	export GIT_PROVE_OPTS="--timer --jobs 10 --state=failed,slow,save"
> > +	export GIT_TEST_OPTS="--quiet --write-junit-xml"
> >  fi
> >
> >  skip_branch_tip_with_tag () {
> > -- 
> > gitgitgadget
Re: [PATCH 0/4] Bloom filter experiment
On Tue, Oct 16 2018, Derrick Stolee wrote: > On 10/16/2018 12:45 AM, Junio C Hamano wrote: >> Derrick Stolee writes: >> >>> 2. The filters are sized according to the number of changes in each >>> commit, with a minimum of one 64-bit word. >>> ... >>> 6. When we compute the Bloom filters, we don't store a filter for >>> commits whose first-parent diff has more than 512 paths. >> Just being curious but was 512 taken out of thin air or is there >> some math behind it, e.g. to limit false positive rate down to >> certain threshold? With a wide-enough bitset, you could store >> arbitrary large number of paths with low enough false positive, I >> guess, but is there a point where there is too many paths in the >> change that gives us diminishing returns and not worth having a >> filter in the first place? > 512 is somewhat arbitrary, but having a maximum size is not. >> In a normal source-code-control context, the set of paths modified >> by any single commit ought to be a small subset of the entire paths, >> and whole-tree changes ought to be fairly rare. In a project for >> which that assumption does not hold, it might help to have a >> negative bloom filter (i.e. "git log -- A" asks "does the commit >> modify A?" and the filter would say "we know it does not, because we >> threw all the paths that are not touched to the bloom filter"), but >> I think that would optimize for a wrong case. > > A commit with many changed paths is very rare. The 512 I picked above > is enough to cover 99% of commits in each of the repos I sampled when > first investigating Bloom filters. > > When a Bloom filter response says "maybe yes" (in our case, "maybe not > TREESAME"), then we need to verify that it is correct. In the extreme > case that every path is changed, then the Bloom filter does nothing > but add extra work. > > These extreme cases are also not unprecedented: in our Azure Repos > codebase, we were using core.autocrlf to smudge CRLFs to LFs, but > when it was time to dogfood VFS for Git, we needed to turn off the > smudge filter. So, there is one commit that converts every LF to a > CRLF in every text file. Storing a Bloom filter for those ~250,000 > entries would take ~256KB for essentially no value. By not storing a > filter for this commit, we go immediately to the regular TREESAME > check, which would happen for most pathspecs. > > This is all to say: having a maximum size is good. 512 is big enough > to cover _most_ commits, but not so big that we may store _really_ big > filters. Makes sense. 512 is good enough to hardcode initially, but I couldn't tell from briefly skimming the patches if it was possible to make this size dynamic per-repo when the graph/filter is written. I.e. we might later add some discovery step where we look at N number of commits at random, until we're satisfied that we've come up with some average/median number of total (recursive) tree entries & how many tend to be changed per-commit. I.e. I can imagine repositories (with some automated changes) where we have 10k files and tend to change 1k per commit, or ones with 10k files where we tend to change just 1-10 per commit, which would mean a larger/smaller filter would be needed / would do.
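As a rough, self-contained sketch of the scheme under discussion — one filter per commit, sized from the number of changed paths, with a hard cap above which no filter is stored — consider the following. The constants (10 bits per path, 7 probes, the FNV-style hash) are illustrative assumptions, not values from the actual patches:

#include <stdint.h>
#include <stdlib.h>

#define MAX_CHANGED_PATHS 512
#define BITS_PER_PATH 10	/* ~1% false-positive rate, assumed */
#define NR_PROBES 7

struct bloom_filter {
	uint64_t *bits;
	size_t nr_bits;
};

static uint64_t hash_path(const char *path, uint64_t seed)
{
	uint64_t h = 14695981039346656037ULL ^ seed;	/* FNV-1a variant */
	while (*path)
		h = (h ^ (unsigned char)*path++) * 1099511628211ULL;
	return h;
}

/* Returns NULL for oversized commits: fall back to a real diff. */
static struct bloom_filter *bloom_create(const char **paths, size_t nr)
{
	struct bloom_filter *f;
	size_t i, k;

	if (nr > MAX_CHANGED_PATHS)
		return NULL;
	f = malloc(sizeof(*f));
	f->nr_bits = (nr ? nr : 1) * BITS_PER_PATH;
	f->bits = calloc((f->nr_bits + 63) / 64, sizeof(uint64_t));
	for (i = 0; i < nr; i++)
		for (k = 0; k < NR_PROBES; k++) {
			uint64_t b = hash_path(paths[i], k) % f->nr_bits;
			f->bits[b / 64] |= 1ULL << (b % 64);
		}
	return f;
}

/* 0 means "definitely did not change this path"; 1 means "verify". */
static int bloom_maybe_changed(const struct bloom_filter *f, const char *path)
{
	size_t k;
	for (k = 0; k < NR_PROBES; k++) {
		uint64_t b = hash_path(path, k) % f->nr_bits;
		if (!(f->bits[b / 64] & (1ULL << (b % 64))))
			return 0;
	}
	return 1;
}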
Re: [PATCH v2 10/13] tests: include detailed trace logs with --write-junit-xml upon failure
Hi Gábor, On Tue, 16 Oct 2018, SZEDER Gábor wrote: > On Mon, Oct 15, 2018 at 03:12:12AM -0700, Johannes Schindelin via > GitGitGadget wrote: > > From: Johannes Schindelin > > > > The JUnit XML format lends itself to be presented in a powerful UI, > > where you can drill down to the information you are interested in very > > quickly. > > > > For test failures, this usually means that you want to see the detailed > > trace of the failing tests. > > > > With Travis CI, we passed the `--verbose-log` option to get those > > traces. However, that seems excessive, as we do not need/use the logs in > > As someone who has dug into a few occasional failures found by Travis > CI, I'd say that the output of '--verbose-log -x' is not excessive, > but downright essential. I agree that the output is essential for drilling down into failures. This paragraph, however, talks about the general case: where there are *no* failures. See here: > > almost all of those cases: only when a test fails do we have a way to > > include the trace. > > > > So let's do something different when using Azure DevOps: let's run all > > the tests with `--quiet` first, and only if a failure is encountered, > > try to trace the commands as they are executed. > > > > Of course, we cannot turn on `--verbose-log` after the fact. So let's > > just re-run the test with all the same options, adding `--verbose-log`. > > And then munging the output file into the JUnit XML on the fly. > > > > Note: there is an off chance that re-running the test in verbose mode > > "fixes" the failures (and this does happen from time to time!). That is > > a possibility we should be able to live with. > > Any CI system worth its salt should provide as much information about > any failures as possible, especially when it was lucky enough to > stumble upon a rare and hard to reproduce non-deterministic failure. I would agree with you if more people started to pay attention to our CI failures. And if we had some sort of a development model where a CI failure would halt development on that particular topic until the failure is fixed, with the responsibility assigned to somebody to fix it. This is not the case here, though. pu is broken for ages, at least on Windows, and even a *single* topic is enough to do that. And this is even worse with flakey tests. I cannot remember *how often* I saw CI failures in t5570-git-daemon.sh, for example. It is rare enough that it is obvious that this is a problem of the *regression test*, rather than a problem of the code that is to be tested. So I would suggest to go forward with my proposed strategy for the moment, right up until the time when we have had the resources to fix t5570, for starters. Ciao, Dscho > > Ideally, we would label this as "Passed upon rerun", and Azure > > Pipelines even know about that outcome, but it is not available when > > using the JUnit XML format for now: > > https://github.com/Microsoft/azure-pipelines-agent/blob/master/src/Agent.Worker/TestResults/JunitResultReader.cs > > > > Signed-off-by: Johannes Schindelin >
Re: [PATCH 0/4] Bloom filter experiment
On 10/16/2018 8:57 AM, Ævar Arnfjörð Bjarmason wrote: > On Tue, Oct 16 2018, Derrick Stolee wrote: >> On 10/16/2018 12:45 AM, Junio C Hamano wrote: >>> Derrick Stolee writes: >>>> 2. The filters are sized according to the number of changes in each commit, with a minimum of one 64-bit word. >>>> ... >>>> 6. When we compute the Bloom filters, we don't store a filter for commits whose first-parent diff has more than 512 paths. >>> Just being curious but was 512 taken out of thin air or is there some math behind it, e.g. to limit false positive rate down to certain threshold? With a wide-enough bitset, you could store arbitrary large number of paths with low enough false positive, I guess, but is there a point where there is too many paths in the change that gives us diminishing returns and not worth having a filter in the first place? >> 512 is somewhat arbitrary, but having a maximum size is not. >>> In a normal source-code-control context, the set of paths modified by any single commit ought to be a small subset of the entire paths, and whole-tree changes ought to be fairly rare. In a project for which that assumption does not hold, it might help to have a negative bloom filter (i.e. "git log -- A" asks "does the commit modify A?" and the filter would say "we know it does not, because we threw all the paths that are not touched to the bloom filter"), but I think that would optimize for a wrong case. >> A commit with many changed paths is very rare. The 512 I picked above is enough to cover 99% of commits in each of the repos I sampled when first investigating Bloom filters. >> When a Bloom filter response says "maybe yes" (in our case, "maybe not TREESAME"), then we need to verify that it is correct. In the extreme case that every path is changed, then the Bloom filter does nothing but add extra work. >> These extreme cases are also not unprecedented: in our Azure Repos codebase, we were using core.autocrlf to smudge CRLFs to LFs, but when it was time to dogfood VFS for Git, we needed to turn off the smudge filter. So, there is one commit that converts every LF to a CRLF in every text file. Storing a Bloom filter for those ~250,000 entries would take ~256KB for essentially no value. By not storing a filter for this commit, we go immediately to the regular TREESAME check, which would happen for most pathspecs. >> This is all to say: having a maximum size is good. 512 is big enough to cover _most_ commits, but not so big that we may store _really_ big filters. > Makes sense. 512 is good enough to hardcode initially, but I couldn't tell from briefly skimming the patches if it was possible to make this size dynamic per-repo when the graph/filter is written. My proof-of-concept has it as a constant, but part of my plan is to make these all config options, as of this item in my message: >>> 2. We need config values for writing and consuming bloom filters, but also to override the default settings. > I.e. we might later add some discovery step where we look at N number of commits at random, until we're satisfied that we've come up with some average/median number of total (recursive) tree entries & how many tend to be changed per-commit. I.e. I can imagine repositories (with some automated changes) where we have 10k files and tend to change 1k per commit, or ones with 10k files where we tend to change just 1-10 per commit, which would mean a larger/smaller filter would be needed / would do. I'm not sure a dynamic approach would be worth the effort, but I'm open to hearing the results of an experiment. Thanks, -Stolee
Re: [PATCH 2/3] getpwuid(mingw): provide a better default for the user name
Hi Eric, On Tue, 16 Oct 2018, Eric Sunshine wrote: > On Tue, Oct 16, 2018 at 8:38 AM Johannes Schindelin > wrote: > > On Mon, 15 Oct 2018, Eric Sunshine wrote: > > > On Mon, Oct 15, 2018 at 5:47 AM Johannes Schindelin via GitGitGadget > > > wrote: > > > > + len = ARRAY_SIZE(wbuffer); > > > > + if (GetUserNameExW(type, wbuffer, &len)) { > > > > + char *converted = xmalloc((len *= 3)); > > > > Oh wow. I *just* realized this. It is an off-by-one: the `xmalloc()` call > > needs to receive `len + 1` to accommodate for the trailing NUL. Will fix, > > too. > > Or, xmallocz() (note the "z") which gives you overflow-checking of the > +1 for free. Very good point! Thanks for the suggestion, Dscho
Re: What's cooking in git.git (Oct 2018, #01; Wed, 10)
On 12/10/2018 14:36, Junio C Hamano wrote: > Phillip Wood writes: >> It would be nice if the parsing used starts_with(option_name, user_text) rather than strcmp() as well. Also I think --color-moved=no is valid as a synonym of --no-color-moved but --color-moved-ws=no is not supported. > I am not sure about starts_with(). Do you mean we should accept "--color-mo", as that is a prefix of "--color-moved" that is not shared with any existing option, until we gain a different option "--color-more"? I was thinking of the option arguments rather than the option names, although being able to abbreviate the names in the same way as the commands that use parse_options() would be good too (I seem to remember someone saying they had some rough patches to use parse_options() for diff and log in a discussion of adding completion support to parse_options()). > If you mean "--color-moved-ws=no" (or "--no-color-moved-ws") as a way to countermand an earlier --color-moved-ws=<value> on the command line, I fully agree that it is a good idea. Oh I assumed --no-color-moved-ws was allowed but it isn't. Allowing --color-moved-ws=no as well would match what is allowed for --color-moved. I'll try and look at that. Best Wishes Phillip
Re: [PATCH] branch: trivial style fix
>Sorry for the slow reply. For some reason I do not think your message >here made it to the list (but I don't see anything obviously wrong with >it). Yes, I can't send messages to the list using my email client; I don't know why. The only way I can make it work is using `git send-email` (including this message).
[PATCH v2] branch: trivial style fix
Signed-off-by: Tao Qingyun --- builtin/branch.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/builtin/branch.c b/builtin/branch.c index c396c41533..0aa3dac27b 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -716,8 +716,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix) print_columns(&output, colopts, NULL); string_list_clear(&output, 0); return 0; - } - else if (edit_description) { + } else if (edit_description) { const char *branch_name; struct strbuf branch_ref = STRBUF_INIT; -- 2.19.0
Re: [PATCH] branch: trivial style fix
On Tue, Oct 16, 2018 at 4:16 PM Tao Qingyun wrote: > > >Sorry for the slow reply. For some reason I do not think your message > >here made it to the list (but I don't see anything obviously wrong with > >it). > Yes, I can't send messages to the list using my email client; I don't > know why. The only way I can make it work is using `git send-email` > (including this message). It's almost certainly because your message contains an HTML part. See if you can find how to disable that in your mailer and send plain-text only. The vger.kernel.org mailer just refuses all HTML E-Mail as a spam heuristic.
Re: [PATCH v2 10/13] Add a base implementation of SHA-256 support
On Tue, Oct 16, 2018 at 1:31 AM brian m. carlson wrote: > > On Mon, Oct 15, 2018 at 04:59:12PM +0200, Duy Nguyen wrote: > > On Mon, Oct 15, 2018 at 4:23 AM brian m. carlson > > wrote: > > > > > > SHA-1 is weak and we need to transition to a new hash function. For > > > some time, we have referred to this new function as NewHash. Recently, > > > we decided to pick SHA-256 as NewHash. > > > > > > Add a basic implementation of SHA-256 based off libtomcrypt, which is in > > > the public domain. Optimize it and restructure it to meet our coding > > > standards. Place it in a directory called "sha256" where it and any > > > future implementations can live so as to avoid a proliferation of > > > implementation directories. > > > > > > Wire up SHA-256 in the list of hash algorithms, and add a test that the > > > algorithm works correctly. > > > > > > Note that with this patch, it is still not possible to switch to using > > > SHA-256 in Git. Additional patches are needed to prepare the code to > > > handle a larger hash algorithm and further test fixes are needed. > > > > At some point I assume SHA-256 will become functional and be part of a > > git release without all file formats updated to support multiple > > hashes. Should we somehow discourage the user from using it because it > > will break when all file formats are finally updated? > > In order to activate SHA-256 in the codebase, currently you need a patch > to force it on. Otherwise, the code is simply inert and does nothing > (other than in the test-tool). I've included the patch below so you can > see what it does (or if you want to play around with it). > > Without this patch, Git remains fully SHA-1 and can't access any of the > SHA-256 code. I have some very preliminary patches that do wire up > extensions.objectFormat (branch object-id-part15 [sic]) but I haven't > picked them up in a while. (I need to finish test fixes first.) Ah, I thought that extensions.objectFormat and setup changes already landed (I think I saw that series on this list). Sorry for the noise. -- Duy
Re: [PATCH v12 0/8] filter: support for excluding all trees and blobs
On Sun, Oct 14, 2018 at 8:43 PM Junio C Hamano wrote: > > Junio C Hamano writes: > > > Matthew DeVore writes: > > > >> Here is a re-roll-up since I haven't received any additional corrections > >> for > >> almost a week. > > > > Sorry, but doesn't this topic already sit in 'next'? If so, please make > > these small clean-ups as incremental patches. > > Here is what I'd queue for now, with forged s-o-by from you ;-). > Yes, this is fine, thank you! I've reapplied the patch in my own repo on top of "next" in case I need to fix it and re-send, but please queue what you have as-is.
[PATCH v2] builtin/branch.c: remove useless branch_get
branch_get sometimes returns current_branch, which can be NULL (e.g., if you're on a detached HEAD). Try: $ git branch HEAD fatal: no such branch 'HEAD' $ git branch '' fatal: no such branch '' However, it seems weird that we'd check those cases here (and provide such lousy messages). And indeed, dropping that and letting us eventually hit create_branch() gives a much better message: $ git branch HEAD fatal: 'HEAD' is not a valid branch name. $ git branch '' fatal: '' is not a valid branch name. Signed-off-by: Tao Qingyun --- builtin/branch.c | 5 - 1 file changed, 5 deletions(-) diff --git a/builtin/branch.c b/builtin/branch.c index c396c41533..2367703034 100644 --- a/builtin/branch.c +++ b/builtin/branch.c @@ -809,11 +809,6 @@ int cmd_branch(int argc, const char **argv, const char *prefix) git_config_set_multivar(buf.buf, NULL, NULL, 1); strbuf_release(&buf); } else if (argc > 0 && argc <= 2) { - struct branch *branch = branch_get(argv[0]); - - if (!branch) - die(_("no such branch '%s'"), argv[0]); - if (filter.kind != FILTER_REFS_BRANCHES) die(_("-a and -r options to 'git branch' do not make sense with a branch name")); -- 2.19.0
Re: Ignored files being silently overwritten when switching branches
On Tue, Oct 16, 2018 at 11:12 AM Ævar Arnfjörð Bjarmason wrote: > > > On Tue, Oct 16 2018, Jeff King wrote: > > > On Mon, Oct 15, 2018 at 01:01:50PM +0000, Per Lundberg wrote: > > > >> Sorry if this question has been asked before; I skimmed through the list > >> archives and the FAQ but couldn't immediately find it - please point me > >> in the right direction if it has indeed been discussed before. > > > > It is a frequently asked question, but it doesn't seem to be in any FAQ > > that I could find. The behavior you're seeing is intended. See this > > message (and the rest of the thread) for discussion: > > > > https://public-inbox.org/git/7viq39avay@alter.siamese.dyndns.org/ > > > >> So my question is: is this by design or should this be considered a bug > >> in git? Of course, it depends largely on what .gitignore is being used > >> for - if we are talking about files which can easily be regenerated > >> (build artifacts, node_modules folders etc.) I can totally understand > >> the current behavior, but when dealing with more sensitive & important > >> content it's a bit inconvenient. > > > > Basically: yes. It would be nice to have that "do not track this, but do > > not trash it either" state for a file, but Git does not currently > > support that. > > There's some patches in that thread that could be picked up by someone > interested. I think the approach mentioned by Matthieu Moy here makes > the most sense: > https://public-inbox.org/git/vpqd3t9656k@bauges.imag.fr/ > > I don't think the rationale mentioned by Junio in > https://public-inbox.org/git/7v4oepaup7@alter.siamese.dyndns.org/ is > very convincing. Just fyi I also have some wip changes that add the fourth "precious" class in addition to tracked, untracked and ignored [1]. If someone has time it could be another option to pick up. [1] https://gitlab.com/pclouds/git/commit/0e7f7afa1879b055369ebd3f1224311c43c8a32b -- Duy
Re: [PATCH v2 01/13] sha1-file: rename algorithm to "sha1"
On Mon, Oct 15, 2018 at 4:21 AM brian m. carlson wrote: > > The transition plan anticipates us using a syntax such as "^{sha1}" for > disambiguation. Since this is a syntax some people will be typing a > lot, it makes sense to provide a short, easy-to-type syntax. Omitting > the dash doesn't create any ambiguity, but it does make it shorter and "but" or "and"? I think both clauses are on the same side ... or did you mean omitting the dash does create ambiguity? > easier to type, especially for touch typists. In addition, the > transition plan already uses "sha1" in this context. > > Rename the name of SHA-1 implementation to "sha1". > > Note that this change creates no backwards compatibility concerns, since > we haven't yet used this field in any serialized data formats. But we're not going to use this _string_ in any data format either because we'll stick to format_id field anyway, right? -- Duy
Re: [PATCH 2/3] http: add support for disabling SSL revocation checks in cURL
On Tue, Oct 16, 2018 at 02:25:57PM +0200, Johannes Schindelin wrote: > > > That ">=" is hard to grok. I think you meant it to be pronounced > > > "requires at least", but that is not a common reading. People more > > > commonly pronounce it "is greater than or equal to". > > > > This seemed oddly familiar: > > > > > > https://public-inbox.org/git/8da9d436-88b9-7959-dd9c-65bdd376b...@mail.mipt.ru/ > > > > Since this one is clearly copied from there, it may be worth fixing the > > original. > > Good memory. I just integrated the patch here. It was not signed off, but > it is too obvious to be protected under copyright, so I re-did it, adding > a nice commit message. Yay, thank you (I considered doing the same, but was scratching my head over how to deal with authorship and signoff). -Peff
Re: [PATCH] branch: trivial style fix
On Tue, Oct 16, 2018 at 04:27:44PM +0200, Ævar Arnfjörð Bjarmason wrote: > On Tue, Oct 16, 2018 at 4:16 PM Tao Qingyun wrote: > > > > >Sorry for the slow reply. For some reason I do not think your message > > >here made it to the list (but I don't see anything obviously wrong with > > >it). > > Yes, I can't send messages to the list using my email client; I don't > > know why. The only way I can make it work is using `git send-email` > > (including this message). > > It's almost certainly because your message contains an HTML part. See > if you can find how to disable that in your mailer and send plain-text > only. The vger.kernel.org mailer just refuses all HTML E-Mail as a > spam heuristic. That was my guess, too, but the message that I got did not have such a part. Weird. -Peff
Re: [PATCH v2] branch: trivial style fix
On Tue, Oct 16, 2018 at 10:19:20PM +0800, Tao Qingyun wrote: > Signed-off-by: Tao Qingyun > --- > builtin/branch.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/builtin/branch.c b/builtin/branch.c > index c396c41533..0aa3dac27b 100644 > --- a/builtin/branch.c > +++ b/builtin/branch.c > @@ -716,8 +716,7 @@ int cmd_branch(int argc, const char **argv, const char > *prefix) > print_columns(&output, colopts, NULL); > string_list_clear(&output, 0); > return 0; > - } > - else if (edit_description) { > + } else if (edit_description) { > const char *branch_name; > struct strbuf branch_ref = STRBUF_INIT; Yep, this looks reasonable. Normally I would complain about an empty commit message, but there is really not much else to say. ;) -Peff
Re: [PATCH v2 13/13] commit-graph: specify OID version for SHA-256
On Mon, Oct 15, 2018 at 4:23 AM brian m. carlson wrote: > > Since the commit-graph code wants to serialize the hash algorithm into > the data store, specify a version number for each supported algorithm. > Note that we don't use the values of the constants themselves, as they > are internal and could change in the future. > > Signed-off-by: brian m. carlson > --- > commit-graph.c | 9 - > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/commit-graph.c b/commit-graph.c > index 7a28fbb03f..e587c21bb6 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -45,7 +45,14 @@ char *get_commit_graph_filename(const char *obj_dir) > > static uint8_t oid_version(void) > { > - return 1; > + switch (hash_algo_by_ptr(the_hash_algo)) { > + case GIT_HASH_SHA1: > + return 1; > + case GIT_HASH_SHA256: > + return 2; Should we just increase this field to uint32_t and store format_id instead? That will keep oid version unique in all data formats. > + default: > + BUG("unknown hash algorithm"); > + } > } > > static struct commit_graph *alloc_commit_graph(void) -- Duy
Re: [PATCH v2 12/13] hash: add an SHA-256 implementation using OpenSSL
On Mon, Oct 15, 2018 at 4:22 AM brian m. carlson wrote: > > We already have OpenSSL routines available for SHA-1, so add routines > for SHA-256 as well. Since we have "hash-speed" tool now, it would be great to keep some numbers of these hash implementations in the commit message (and maybe sha1 as well just for comparison). > > Signed-off-by: brian m. carlson > --- > Makefile | 7 +++ > hash.h | 2 ++ > 2 files changed, 9 insertions(+) > > diff --git a/Makefile b/Makefile > index 3d91555a81..3164e2aeee 100644 > --- a/Makefile > +++ b/Makefile > @@ -183,6 +183,8 @@ all:: > # > # Define GCRYPT_SHA256 to use the SHA-256 routines in libgcrypt. > # > +# Define OPENSSL_SHA256 to use the SHA-256 routines in OpenSSL. > +# > # Define NEEDS_CRYPTO_WITH_SSL if you need -lcrypto when using -lssl > (Darwin). > # > # Define NEEDS_SSL_WITH_CRYPTO if you need -lssl when using -lcrypto > (Darwin). > @@ -1638,6 +1640,10 @@ endif > endif > endif > > +ifdef OPENSSL_SHA256 > + EXTLIBS += $(LIB_4_CRYPTO) > + BASIC_CFLAGS += -DSHA256_OPENSSL > +else > ifdef GCRYPT_SHA256 > BASIC_CFLAGS += -DSHA256_GCRYPT > EXTLIBS += -lgcrypt > @@ -1645,6 +1651,7 @@ else > LIB_OBJS += sha256/block/sha256.o > BASIC_CFLAGS += -DSHA256_BLK > endif > +endif > > ifdef SHA1_MAX_BLOCK_SIZE > LIB_OBJS += compat/sha1-chunked.o > diff --git a/hash.h b/hash.h > index 9df562f2f6..9df06d56b4 100644 > --- a/hash.h > +++ b/hash.h > @@ -17,6 +17,8 @@ > > #if defined(SHA256_GCRYPT) > #include "sha256/gcrypt.h" > +#elif defined(SHA256_OPENSSL) > +#include > #else > #include "sha256/block/sha256.h" > #endif -- Duy
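For reference, the OPENSSL_SHA256 knob ends up calling OpenSSL's classic SHA-256 entry points. A minimal standalone use, compiled with -lcrypto:

#include <stdio.h>
#include <openssl/sha.h>

int main(void)
{
	SHA256_CTX ctx;
	unsigned char digest[SHA256_DIGEST_LENGTH];
	int i;

	SHA256_Init(&ctx);
	SHA256_Update(&ctx, "abc", 3);
	SHA256_Final(digest, &ctx);

	for (i = 0; i < SHA256_DIGEST_LENGTH; i++)
		printf("%02x", digest[i]);
	printf("\n");	/* expect ba7816bf... for "abc" */
	return 0;
}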
Re: [PATCH v2 00/13] Base SHA-256 implementation
On Mon, Oct 15, 2018 at 4:21 AM brian m. carlson wrote: > > This series provides an actual SHA-256 implementation and wires it up, > along with some housekeeping patches to make it suitable for testing. > > New in this version is a patch which reverts the change to limit hashcmp > to 20 bytes. I've taken care to permit the compiler to inline as much > as possible for efficiency, but it would be helpful to see what the > performance impact of these changes is on real-life workflows, or with > MSVC and other non-GCC and non-clang compilers. The resulting change is > uglier and more duplicative than I wanted, but oh, well. > > I didn't attempt to pull in the full complement of code from libtomcrypt > to try to show the changes made, since that would have involved > importing a significant quantity of code in order to make things work. > > I realize the increase to GIT_MAX_HEXSZ will result in an increase in > memory usage, but we're going to need to do it at some point, and the > sooner the code is in the codebase, the sooner people can play around > with it and test it. > > This piece should be the final piece of preparatory work required for a > fully functioning SHA-256-only Git. Additional work should be able to > come into the testsuite and codebase without needing to build on work in > any series after this one. Thanks again for keeping working on this. I'm still sick and can't really do a deep review. With that in mind, the patches look good. -- Duy
Re: [PATCH v2 04/13] cache: make hashcmp and hasheq work with larger hashes
On Mon, Oct 15, 2018 at 4:21 AM brian m. carlson wrote: > diff --git a/cache.h b/cache.h > index a13d14ce0a..0b88c3a344 100644 > --- a/cache.h > +++ b/cache.h > @@ -1024,16 +1024,12 @@ extern const struct object_id null_oid; > static inline int hashcmp(const unsigned char *sha1, const unsigned char > *sha2) > { > /* > -* This is a temporary optimization hack. By asserting the size here, > -* we let the compiler know that it's always going to be 20, which > lets > -* it turn this fixed-size memcmp into a few inline instructions. > -* > -* This will need to be extended or ripped out when we learn about > -* hashes of different sizes. > +* Teach the compiler that there are only two possibilities of hash > size > +* here, so that it can optimize for this case as much as possible. > */ > - if (the_hash_algo->rawsz != 20) > - BUG("hash size not yet supported by hashcmp"); > - return memcmp(sha1, sha2, the_hash_algo->rawsz); > + if (the_hash_algo->rawsz == GIT_MAX_RAWSZ) It's tangent. But performance is probably another good reason to detach the_hash_algo from the_repository so we have one less dereference to do. (the other good reason is these hash operations should work in "no-repo" commands as well, where the_repository does not really make sense). > + return memcmp(sha1, sha2, GIT_MAX_RAWSZ); > + return memcmp(sha1, sha2, GIT_SHA1_RAWSZ); > } > > static inline int oidcmp(const struct object_id *oid1, const struct > object_id *oid2) > @@ -1043,7 +1039,13 @@ static inline int oidcmp(const struct object_id *oid1, > const struct object_id *o > > static inline int hasheq(const unsigned char *sha1, const unsigned char > *sha2) > { > - return !hashcmp(sha1, sha2); > + /* > +* We write this here instead of deferring to hashcmp so that the > +* compiler can properly inline it and avoid calling memcmp. > +*/ > + if (the_hash_algo->rawsz == GIT_MAX_RAWSZ) > + return !memcmp(sha1, sha2, GIT_MAX_RAWSZ); > + return !memcmp(sha1, sha2, GIT_SHA1_RAWSZ); > } > > static inline int oideq(const struct object_id *oid1, const struct object_id > *oid2) -- Duy
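The reason for spelling out the two constant-size branches is worth noting: a memcmp whose length is a compile-time constant can be expanded inline by the compiler (a handful of wide loads and compares), while a runtime length generally becomes an out-of-line library call. A contrived illustration:

#include <string.h>

/* Constant 20: compilers typically expand this inline,
 * with no function call. */
static inline int hasheq_const(const unsigned char *a,
			       const unsigned char *b)
{
	return !memcmp(a, b, 20);
}

/* Runtime n: usually an out-of-line call into libc. */
static inline int hasheq_var(const unsigned char *a,
			     const unsigned char *b, size_t n)
{
	return !memcmp(a, b, n);
}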
Re: [PATCH v2 13/13] commit-graph: specify OID version for SHA-256
On 10/16/2018 11:35 AM, Duy Nguyen wrote: > On Mon, Oct 15, 2018 at 4:23 AM brian m. carlson wrote: >> Since the commit-graph code wants to serialize the hash algorithm into >> the data store, specify a version number for each supported algorithm. >> Note that we don't use the values of the constants themselves, as they >> are internal and could change in the future. >> >> Signed-off-by: brian m. carlson >> --- >> commit-graph.c | 9 - >> 1 file changed, 8 insertions(+), 1 deletion(-) >> >> diff --git a/commit-graph.c b/commit-graph.c >> index 7a28fbb03f..e587c21bb6 100644 >> --- a/commit-graph.c >> +++ b/commit-graph.c >> @@ -45,7 +45,14 @@ char *get_commit_graph_filename(const char *obj_dir) >> >> static uint8_t oid_version(void) >> { >> - return 1; >> + switch (hash_algo_by_ptr(the_hash_algo)) { >> + case GIT_HASH_SHA1: >> + return 1; >> + case GIT_HASH_SHA256: >> + return 2; > Should we just increase this field to uint32_t and store format_id > instead? That will keep oid version unique in all data formats. Both the commit-graph and multi-pack-index store a single byte for the hash version, so that ship has sailed (without incrementing the full file version number in each format). It may be good to make this method accessible to both formats. I'm not sure if Brian's branch is built on top of the multi-pack-index code. Probably best to see if ds/multi-pack-verify is in the history. Thanks, -Stolee
Re: [PATCH v2 10/13] tests: include detailed trace logs with --write-junit-xml upon failure
On Tue, Oct 16, 2018 at 03:02:38PM +0200, Johannes Schindelin wrote: > Hi Gábor, > > On Tue, 16 Oct 2018, SZEDER Gábor wrote: > > > On Mon, Oct 15, 2018 at 03:12:12AM -0700, Johannes Schindelin via > > GitGitGadget wrote: > > > From: Johannes Schindelin > > > > > > The JUnit XML format lends itself to be presented in a powerful UI, > > > where you can drill down to the information you are interested in very > > > quickly. > > > > > > For test failures, this usually means that you want to see the detailed > > > trace of the failing tests. > > > > > > With Travis CI, we passed the `--verbose-log` option to get those > > > traces. However, that seems excessive, as we do not need/use the logs in > > > > As someone who has dug into a few occasional failures found by Travis > > CI, I'd say that the output of '--verbose-log -x' is not excessive, > > but downright essential. > > I agree that the output is essential for drilling down into failures. This > paragraph, however, talks about the general case: where there are *no* > failures. See here: But you don't know in advance whether there will be any failures or not, so it only makes sense to run all tests with '--verbose-log -x' by default, just in case a Heisenbug decides to make an appearance. > > > almost all of those cases: only when a test fails do we have a way to > > > include the trace. > > > > > > So let's do something different when using Azure DevOps: let's run all > > > the tests with `--quiet` first, and only if a failure is encountered, > > > try to trace the commands as they are executed. > > > > > > Of course, we cannot turn on `--verbose-log` after the fact. So let's > > > just re-run the test with all the same options, adding `--verbose-log`. > > > And then munging the output file into the JUnit XML on the fly. > > > > > > Note: there is an off chance that re-running the test in verbose mode > > > "fixes" the failures (and this does happen from time to time!). That is > > > a possibility we should be able to live with. > > > > Any CI system worth its salt should provide as much information about > > any failures as possible, especially when it was lucky enough to > > stumble upon a rare and hard to reproduce non-deterministic failure. > > I would agree with you if more people started to pay attention to our CI > failures. And if we had some sort of a development model where a CI > failure would halt development on that particular topic until the failure > is fixed, with the responsibility assigned to somebody to fix it. > > This is not the case here, though. pu is broken for ages, at least on > Windows, and even a *single* topic is enough to do that. And this is even > worse with flakey tests. I cannot remember *how often* I saw CI failures > in t5570-git-daemon.sh, for example. It is rare enough that it is obvious > that this is a problem of the *regression test*, rather than a problem of > the code that is to be tested. Some occasional failures in t5570 are actually caused by issues in Git on certain platforms: https://public-inbox.org/git/CAM0VKj=mcs+cmogzf_xypeb+qzrfmumh52-pv_ndmza9x+r...@mail.gmail.com/T/#u > So I would suggest to go forward with my proposed strategy for the moment, > right up until the time when we have had the resources to fix t5570, for > starters. I don't really understand what the occasional failures in t5570 have to do with the amount of information a CI system should gather about failures in general. Or how many people pay attention to it, or what kind of development model we have, for that matter. 
The way I see it these are unrelated issues, and a CI system should always provide as much information about failures as possible. If only a few people pay attention to it, then for the sake of those few. > > > Ideally, we would label this as "Passed upon rerun", and Azure > > > Pipelines even know about that outcome, but it is not available when > > > using the JUnit XML format for now: > > > https://github.com/Microsoft/azure-pipelines-agent/blob/master/src/Agent.Worker/TestResults/JunitResultReader.cs > > > > > > Signed-off-by: Johannes Schindelin > >
Re: [PATCH v2 13/13] commit-graph: specify OID version for SHA-256
On Tue, Oct 16, 2018 at 6:01 PM Derrick Stolee wrote: > > On 10/16/2018 11:35 AM, Duy Nguyen wrote: > > On Mon, Oct 15, 2018 at 4:23 AM brian m. carlson > > wrote: > >> Since the commit-graph code wants to serialize the hash algorithm into > >> the data store, specify a version number for each supported algorithm. > >> Note that we don't use the values of the constants themselves, as they > >> are internal and could change in the future. > >> > >> Signed-off-by: brian m. carlson > >> --- > >> commit-graph.c | 9 - > >> 1 file changed, 8 insertions(+), 1 deletion(-) > >> > >> diff --git a/commit-graph.c b/commit-graph.c > >> index 7a28fbb03f..e587c21bb6 100644 > >> --- a/commit-graph.c > >> +++ b/commit-graph.c > >> @@ -45,7 +45,14 @@ char *get_commit_graph_filename(const char *obj_dir) > >> > >> static uint8_t oid_version(void) > >> { > >> - return 1; > >> + switch (hash_algo_by_ptr(the_hash_algo)) { > >> + case GIT_HASH_SHA1: > >> + return 1; > >> + case GIT_HASH_SHA256: > >> + return 2; > > Should we just increase this field to uint32_t and store format_id > > instead? That will keep oid version unique in all data formats. > Both the commit-graph and multi-pack-index store a single byte for the > hash version, so that ship has sailed (without incrementing the full > file version number in each format). And it's probably premature to add the oid version field when multiple hash support has not been fully realized. Now we have different ways of storing hash id and need separate mappings. I would go for incrementing file version. Otherwise maybe we just update format_id to be one byte instead, and the way of storing hash version in commit-graph will be used everywhere. > It may be good to make this method accessible to both formats. I'm not > sure if Brian's branch is built on top of the multi-pack-index code. > Probably best to see if ds/multi-pack-verify is in the history. > > Thanks, > -Stolee -- Duy
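To picture Duy's suggestion: struct git_hash_algo already carries a 4-byte format_id (the bytes "sha1" for SHA-1, i.e. 0x73686131), so a shared helper might look roughly like the sketch below; the function name is made up and this is not from any posted patch:

#include "cache.h"	/* git-internal header providing the_hash_algo */

/* hash_format_id() is a made-up name; format_id is a real field. */
static uint32_t hash_format_id(void)
{
	/* 0x73686131 == "sha1" for SHA-1; SHA-256 gets its own tag */
	return the_hash_algo->format_id;
}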
Re: What's cooking in git.git (Oct 2018, #01; Wed, 10)
On Tue, Oct 16, 2018 at 6:39 AM Phillip Wood wrote: > > If you mean "--color-moved-ws=no" (or "--no-color-moved-ws") as a > > way to countermand an earlier --color-moved-ws= on the > > command line, I fully agree that it is a good idea. > > Oh I assumed --no-color-moved-ws was allowed but it isn't it. Allowing > --color-moved-ws=no as well would match what is allowed for > --color-moved. I'll try and look at that. Thanks for taking a look! Stefan
[PATCH] submodule helper: convert relative URL to absolute URL if needed
The submodule helper update_clone called by "git submodule update", clones submodules if needed. As submodules used to have the URL indicating if they were active, the step to resolve relative URLs was done in the "submodule init" step. Nowadays submodules can be configured active without calling an explicit init, e.g. via configuring submodule.active. When trying to obtain submodules that are set active this way, we'll fallback to the URL found in the .gitmodules, which may be relative to the superproject, but we do not resolve it, yet: git clone https://gerrit.googlesource.com/gerrit cd gerrit && grep url .gitmodules url = ../plugins/codemirror-editor ... git config submodule.active . git submodule update fatal: repository '../plugins/codemirror-editor' does not exist fatal: clone of '../plugins/codemirror-editor' into submodule path '/tmp/gerrit/plugins/codemirror-editor' failed Failed to clone 'plugins/codemirror-editor'. Retry scheduled [...] fatal: clone of '../plugins/codemirror-editor' into submodule path '/tmp/gerrit/plugins/codemirror-editor' failed Failed to clone 'plugins/codemirror-editor' a second time, aborting [...] To resolve the issue, factor out the function that resolves the relative URLs in "git submodule init" (in the submodule helper in the init_submodule function) and call it at the appropriate place in the update_clone helper. Reported-by: Jaewoong Jung Signed-off-by: Stefan Beller --- builtin/submodule--helper.c | 51 - t/t7400-submodule-basic.sh | 24 + 2 files changed, 58 insertions(+), 17 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index f6fb8991f3..13c2e4b556 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -584,6 +584,26 @@ static int module_foreach(int argc, const char **argv, const char *prefix) return 0; } +static char *compute_submodule_clone_url(const char *rel_url) +{ + char *remoteurl, *relurl; + char *remote = get_default_remote(); + struct strbuf remotesb = STRBUF_INIT; + + strbuf_addf(&remotesb, "remote.%s.url", remote); + if (git_config_get_string(remotesb.buf, &remoteurl)) { + warning(_("could not look up configuration '%s'. Assuming this repository is its own authoritative upstream."), remotesb.buf); + remoteurl = xgetcwd(); + } + relurl = relative_url(remoteurl, rel_url, NULL); + + free(remote); + free(remoteurl); + strbuf_release(&remotesb); + + return relurl; +} + struct init_cb { const char *prefix; unsigned int flags; @@ -634,21 +654,9 @@ static void init_submodule(const char *path, const char *prefix, /* Possibly a url relative to parent */ if (starts_with_dot_dot_slash(url) || starts_with_dot_slash(url)) { - char *remoteurl, *relurl; - char *remote = get_default_remote(); - struct strbuf remotesb = STRBUF_INIT; - strbuf_addf(&remotesb, "remote.%s.url", remote); - free(remote); - - if (git_config_get_string(remotesb.buf, &remoteurl)) { - warning(_("could not lookup configuration '%s'. 
Assuming this repository is its own authoritative upstream."), remotesb.buf); - remoteurl = xgetcwd(); - } - relurl = relative_url(remoteurl, url, NULL); - strbuf_release(&remotesb); - free(remoteurl); - free(url); - url = relurl; + char *oldurl = url; + url = compute_submodule_clone_url(oldurl); + free(oldurl); } if (git_config_set_gently(sb.buf, url)) @@ -1514,6 +1522,7 @@ static int prepare_to_clone_next_submodule(const struct cache_entry *ce, struct strbuf sb = STRBUF_INIT; const char *displaypath = NULL; int needs_cloning = 0; + int need_free_url = 0; if (ce_stage(ce)) { if (suc->recursive_prefix) @@ -1562,8 +1571,14 @@ static int prepare_to_clone_next_submodule(const struct cache_entry *ce, strbuf_reset(&sb); strbuf_addf(&sb, "submodule.%s.url", sub->name); - if (repo_config_get_string_const(the_repository, sb.buf, &url)) - url = sub->url; + if (repo_config_get_string_const(the_repository, sb.buf, &url)) { + if (starts_with_dot_slash(sub->url) || + starts_with_dot_dot_slash(sub->url)) { + url = compute_submodule_clone_url(sub->url); + need_free_url = 1; + } else + url = sub->url; +
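To make the resolution concrete: with the superproject cloned from https://gerrit.googlesource.com/gerrit, the relative URL ../plugins/codemirror-editor must be combined with the remote's URL to yield https://gerrit.googlesource.com/plugins/codemirror-editor. A toy resolver, ignoring the corner cases that git's real relative_url() handles:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy resolver: strip one path component per leading "../". */
static char *resolve_relative_url(const char *remote, const char *rel)
{
	char *base = strdup(remote);
	char *out;

	while (!strncmp(rel, "../", 3)) {
		char *slash = strrchr(base, '/');
		if (slash)
			*slash = '\0';
		rel += 3;
	}
	if (!strncmp(rel, "./", 2))
		rel += 2;
	out = malloc(strlen(base) + strlen(rel) + 2);
	sprintf(out, "%s/%s", base, rel);
	free(base);
	return out;
}

int main(void)
{
	char *url = resolve_relative_url(
		"https://gerrit.googlesource.com/gerrit",
		"../plugins/codemirror-editor");
	printf("%s\n", url);	/* .../plugins/codemirror-editor */
	free(url);
	return 0;
}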
Re: [PATCH v4 9/9] Documentation/config: add odb..promisorRemote
Hi Christian, On Tue, Sep 25, 2018, Christian Couder wrote: > In the cover letter there is a "Discussion" section which is about > this, but I agree that it might not be very clear. > > The main issue that this patch series tries to solve is that > extensions.partialclone config option limits the partial clone and > promisor features to only one remote. One related issue is that it > also prevents to have other kind of promisor/partial clone/odb > remotes. By other kind I mean remotes that would not necessarily be > git repos, but that could store objects (that's where ODB, for Object > DataBase, comes from) and could provide those objects to Git through a > helper (or driver) script or program. Thanks for this explanation. I took the opportunity to learn more while you were in the bay area for the google summer of code mentor summit and learned a little more, which was very helpful to me. The broader picture is that this is meant to make Git natively handle large blobs in a nicer way. The design in this series has a few components: 1. Teaching partial clone to attempt to fetch missing objects from multiple remotes instead of only one. This is useful because you can have a server that is nearby and cheaper to serve from (some kind of local cache server) that you make requests to first before falling back to the canonical source of objects. 2. Simplifying the protocol for fetching missing objects so that it can be satisfied by a lighter weight object storage system than a full Git server. The ODB helpers introduced in this series are meant to speak such a simpler protocol since they are only used for one-off requests of a collection of missing objects instead of needing to understand refs, Git's negotiation, etc. 3. (possibly, though not in this series) Making the criteria for what objects can be missing more aggressive, so that I can "git add" a large file and work with it using Git without even having a second copy of that object in my local object store. For (2), I would like to see us improve the remote helper infrastructure instead of introducing a new ODB helper. Remote helpers are already permitted to fetch some objects without listing refs --- perhaps we will want to i. split listing refs to a separate capability, so that a remote helper can advertise that it doesn't support that. (Alternatively the remote could advertise that it has no refs.) ii. Use the "long-running process" mechanism to improve how Git communicates with a remote helper. For (1), things get more tricky. In an object store from a partial clone today, we relax the ordinary "closure under reachability" invariant but in a minor way. We'll need to work out how this works with multiple promisor remotes. The idea today is that there are two kinds of packs: promisor packs (from the promisor remote) and non-promisor packs. Promisor packs are allowed to have reachability edges (for example a tree->blob edge) that point to a missing object, since the promisor remote has promised that we will be able to access that object on demand. Non-promisor packs are also allowed to have reachability edges that point to a missing object, as long as there is a reachability edge from an object in a promisor pack to the same object (because of the same promise). See "Handling Missing Objects" in Documentation/technical/partial-clone.txt for more details. 
To prevent older versions of Git from being confused by partial clone repositories, they use the repositoryFormatVersion mechanism: [core] repositoryFormatVersion = 1 [extensions] partialClone = ... If we change the invariant, we will need to use a new extensions.* key to ensure that versions of Git that are not aware of the new invariant do not operate on the repository. A promisor pack is indicated by there being a .promisor file next to the usual .pack file. Currently the .promisor file is empty. The previous idea was that once we want more metadata (e.g. for the sake of multiple promisor remotes), we could write it in that file. For example, remotes could be associated to a <name> and the .promisor file could indicate which <name> has promised to serve requests for objects reachable from objects in this pack. That will complicate the object access code as well, since currently we only find who has promised an object during "git fsck" and similar operations. During everyday access we do not care which promisor pack caused the object to be promised, since there is only one promisor remote to fetch from anyway. So much for the current setup. For (1), I believe you are proposing to still have only one effective <name>, so it doesn't necessarily require modifying the extensions.* configuration. Instead, the idea is that when trying to access an object, we would follow one of a list of steps: 1. First, check the local object store. If it's there, we're done. 2. Second, try alte
Re: [PATCH 2/4] merge-recursive: increase marker length with depth of recursion
On Mon, Oct 15, 2018 at 7:17 PM Junio C Hamano wrote: > > Elijah Newren writes: > > > Would you like me to edit the commit message to include this more > > difficult case? > > Neither. If the "marker length" change is required in a separate > series that will build on top of the current 4-patch series, I think > dropping this step from the current series and make it a part of the > series that deals with rename/rename would make more sense. Okay, I'll resubmit this series without this patch or associated testcase, and include those in the later file collision series.
[PATCH 1/9] sha1-array: provide oid_array_filter
Helped-by: Junio C Hamano Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- Documentation/technical/api-oid-array.txt | 5 + sha1-array.c | 17 + sha1-array.h | 3 +++ 3 files changed, 25 insertions(+) diff --git a/Documentation/technical/api-oid-array.txt b/Documentation/technical/api-oid-array.txt index 9febfb1d52..c97428c2c3 100644 --- a/Documentation/technical/api-oid-array.txt +++ b/Documentation/technical/api-oid-array.txt @@ -48,6 +48,11 @@ Functions is not sorted, this function has the side effect of sorting it. +`oid_array_filter`:: + Apply the callback function `want` to each entry in the array, + retaining only the entries for which the function returns true. + Preserve the order of the entries that are retained. + Examples diff --git a/sha1-array.c b/sha1-array.c index 265941fbf4..d505a004bb 100644 --- a/sha1-array.c +++ b/sha1-array.c @@ -77,3 +77,20 @@ int oid_array_for_each_unique(struct oid_array *array, } return 0; } + +void oid_array_filter(struct oid_array *array, + for_each_oid_fn want, + void *cb_data) +{ + unsigned nr = array->nr, src, dst; + struct object_id *oids = array->oid; + + for (src = dst = 0; src < nr; src++) { + if (want(&oids[src], cb_data)) { + if (src != dst) + oidcpy(&oids[dst], &oids[src]); + dst++; + } + } + array->nr = dst; +} diff --git a/sha1-array.h b/sha1-array.h index 232bf95017..55d016c4bf 100644 --- a/sha1-array.h +++ b/sha1-array.h @@ -22,5 +22,8 @@ int oid_array_for_each(struct oid_array *array, int oid_array_for_each_unique(struct oid_array *array, for_each_oid_fn fn, void *data); +void oid_array_filter(struct oid_array *array, + for_each_oid_fn want, + void *cbdata); #endif /* SHA1_ARRAY_H */ -- 2.19.0
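A hypothetical caller, to show the intended shape of the API (the predicate here is made up; the later patches in this series are the real consumers):

#include "cache.h"	/* git-internal: object_id, has_object_file */
#include "sha1-array.h"

/* Keep only the object ids we do not have locally (illustrative). */
static int is_missing_locally(const struct object_id *oid, void *data)
{
	return !has_object_file(oid);
}

static void drop_known_objects(struct oid_array *commits)
{
	oid_array_filter(commits, is_missing_locally, NULL);
}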
[PATCH 4/9] submodule.c: move global changed_submodule_names into fetch submodule struct
The `changed_submodule_names` are only used for fetching, so let's make it part of the struct that is passed around for fetching submodules. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- submodule.c | 42 +++--- 1 file changed, 23 insertions(+), 19 deletions(-) diff --git a/submodule.c b/submodule.c index 9fbfcfcfe1..6b4cee82bf 100644 --- a/submodule.c +++ b/submodule.c @@ -24,7 +24,7 @@ #include "object-store.h" static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF; -static struct string_list changed_submodule_names = STRING_LIST_INIT_DUP; + static int initialized_fetch_ref_tips; static struct oid_array ref_tips_before_fetch; static struct oid_array ref_tips_after_fetch; @@ -1124,7 +1124,22 @@ void check_for_new_submodule_commits(struct object_id *oid) oid_array_append(&ref_tips_after_fetch, oid); } -static void calculate_changed_submodule_paths(void) +struct submodule_parallel_fetch { + int count; + struct argv_array args; + struct repository *r; + const char *prefix; + int command_line_option; + int default_option; + int quiet; + int result; + + struct string_list changed_submodule_names; +}; +#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0, STRING_LIST_INIT_DUP } + +static void calculate_changed_submodule_paths( + struct submodule_parallel_fetch *spf) { struct argv_array argv = ARGV_ARRAY_INIT; struct string_list changed_submodules = STRING_LIST_INIT_DUP; @@ -1162,7 +1177,8 @@ static void calculate_changed_submodule_paths(void) continue; if (!submodule_has_commits(path, commits)) - string_list_append(&changed_submodule_names, name->string); + string_list_append(&spf->changed_submodule_names, + name->string); } free_submodules_oids(&changed_submodules); @@ -1199,18 +1215,6 @@ int submodule_touches_in_range(struct object_id *excl_oid, return ret; } -struct submodule_parallel_fetch { - int count; - struct argv_array args; - struct repository *r; - const char *prefix; - int command_line_option; - int default_option; - int quiet; - int result; -}; -#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0} - static int get_fetch_recurse_config(const struct submodule *submodule, struct submodule_parallel_fetch *spf) { @@ -1271,7 +1275,7 @@ static int get_next_submodule(struct child_process *cp, case RECURSE_SUBMODULES_ON_DEMAND: if (!submodule || !string_list_lookup( - &changed_submodule_names, + &spf->changed_submodule_names, submodule->name)) continue; default_argv = "on-demand"; @@ -1363,8 +1367,8 @@ int fetch_populated_submodules(struct repository *r, argv_array_push(&spf.args, "--recurse-submodules-default"); /* default value, "--submodule-prefix" and its value are added later */ - calculate_changed_submodule_paths(); - string_list_sort(&changed_submodule_names); + calculate_changed_submodule_paths(&spf); + string_list_sort(&spf.changed_submodule_names); run_processes_parallel(max_parallel_jobs, get_next_submodule, fetch_start_failure, @@ -1373,7 +1377,7 @@ int fetch_populated_submodules(struct repository *r, argv_array_clear(&spf.args); out: - string_list_clear(&changed_submodule_names, 1); + string_list_clear(&spf.changed_submodule_names, 1); return spf.result; } -- 2.19.0
[PATCH 5/9] submodule.c: do not copy around submodule list
'calculate_changed_submodule_paths' uses a local list to compute the changed submodules, and then produces the result by copying appropriate items into the result list. Instead use the result list directly and prune items afterwards using string_list_remove_empty_items. By doing so we'll have access for longer to the util pointer, which contains the commits that we need to fetch; this will be useful in a later patch. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- submodule.c | 19 ++- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/submodule.c b/submodule.c index 6b4cee82bf..cbefe5f54d 100644 --- a/submodule.c +++ b/submodule.c @@ -1142,8 +1142,7 @@ static void calculate_changed_submodule_paths( struct submodule_parallel_fetch *spf) { struct argv_array argv = ARGV_ARRAY_INIT; - struct string_list changed_submodules = STRING_LIST_INIT_DUP; - const struct string_list_item *name; + struct string_list_item *name; /* No need to check if there are no submodules configured */ if (!submodule_from_path(the_repository, NULL, NULL)) @@ -1160,9 +1159,9 @@ static void calculate_changed_submodule_paths( * Collect all submodules (whether checked out or not) for which new * commits have been recorded upstream in "changed_submodule_names". */ - collect_changed_submodules(&changed_submodules, &argv); + collect_changed_submodules(&spf->changed_submodule_names, &argv); - for_each_string_list_item(name, &changed_submodules) { + for_each_string_list_item(name, &spf->changed_submodule_names) { struct oid_array *commits = name->util; const struct submodule *submodule; const char *path = NULL; @@ -1176,12 +1175,14 @@ static void calculate_changed_submodule_paths( if (!path) continue; - if (!submodule_has_commits(path, commits)) - string_list_append(&spf->changed_submodule_names, - name->string); + if (submodule_has_commits(path, commits)) { + oid_array_clear(commits); + *name->string = '\0'; + } } - free_submodules_oids(&changed_submodules); + string_list_remove_empty_items(&spf->changed_submodule_names, 1); + argv_array_clear(&argv); oid_array_clear(&ref_tips_before_fetch); oid_array_clear(&ref_tips_after_fetch); @@ -1377,7 +1378,7 @@ int fetch_populated_submodules(struct repository *r, argv_array_clear(&spf.args); out: - string_list_clear(&spf.changed_submodule_names, 1); + free_submodules_oids(&spf.changed_submodule_names); return spf.result; } -- 2.19.0
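The idiom in this patch — overwrite an entry's string with the empty string, then prune — relies on string_list_remove_empty_items() dropping every item whose string is empty. In miniature (assuming git's string-list.h):

#include "string-list.h"	/* git-internal header */

static void demo(void)
{
	struct string_list list = STRING_LIST_INIT_DUP;

	string_list_append(&list, "keep");
	string_list_append(&list, "drop");

	*list.items[1].string = '\0';	/* mark "drop" for removal */
	string_list_remove_empty_items(&list, 0);
	/* list now holds a single item: "keep" */
	string_list_clear(&list, 0);
}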
[PATCH 6/9] repository: repo_submodule_init to take a submodule struct
When constructing a struct repository for a submodule for some revision of the superproject where the submodule is not contained in the index, it may not be present in the working tree currently either. In that situation, giving a 'path' argument is not useful. Upgrade the repo_submodule_init function to take a struct submodule instead. While we are at it, overhaul the repo_submodule_init function by renaming the submodule repository struct, which is to be initialized, to a name that is not confused with the struct submodule as easily. Also move its documentation into the header file. Signed-off-by: Stefan Beller --- builtin/grep.c | 17 +++- builtin/ls-files.c | 12 + repository.c | 27 repository.h | 11 ++-- t/helper/test-submodule-nested-repo-config.c | 8 +++--- 5 files changed, 41 insertions(+), 34 deletions(-) diff --git a/builtin/grep.c b/builtin/grep.c index 7da8fef31a..ba7634258a 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -418,7 +418,10 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject, const struct object_id *oid, const char *filename, const char *path) { - struct repository submodule; + struct repository subrepo; + const struct submodule *sub = submodule_from_path(superproject, + &null_oid, path); + int hit; /*
*/ - add_to_alternates_memory(submodule.objects->objectdir); + add_to_alternates_memory(subrepo.objects->objectdir); grep_read_unlock(); if (oid) { @@ -476,14 +479,14 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject, init_tree_desc(&tree, data, size); hit = grep_tree(opt, pathspec, &tree, &base, base.len, - object->type == OBJ_COMMIT, &submodule); + object->type == OBJ_COMMIT, &subrepo); strbuf_release(&base); free(data); } else { - hit = grep_cache(opt, &submodule, pathspec, 1); + hit = grep_cache(opt, &subrepo, pathspec, 1); } - repo_clear(&submodule); + repo_clear(&subrepo); return hit; } diff --git a/builtin/ls-files.c b/builtin/ls-files.c index 7f9919a362..4d1649c1b3 100644 --- a/builtin/ls-files.c +++ b/builtin/ls-files.c @@ -206,17 +206,19 @@ static void show_files(struct repository *repo, struct dir_struct *dir); static void show_submodule(struct repository *superproject, struct dir_struct *dir, const char *path) { - struct repository submodule; + struct repository subrepo; + const struct submodule *sub = submodule_from_path(superproject, + &null_oid, path); - if (repo_submodule_init(&submodule, superproject, path)) + if (repo_submodule_init(&subrepo, superproject, sub)) return; - if (repo_read_index(&submodule) < 0) + if (repo_read_index(&subrepo) < 0) die("index file corrupt"); - show_files(&submodule, dir); + show_files(&subrepo, dir); - repo_clear(&submodule); + repo_clear(&subrepo); } static void show_ce(struct repository *repo, struct dir_struct *dir, diff --git a/repository.c b/repository.c index 5dd1486718..aabe64ee5d 100644 --- a/repository.c +++ b/repository.c @@ -166,30 +166,23 @@ int repo_init(struct repository *repo, return -1; } -/* - * Initialize 'submodule' as the submodule given by 'path' in parent repository - * 'superproject'. - * Return 0 upon success and a non-zero value upon failure. - */ -int repo_submodule_init(struct repository *submodule, +int repo_submodule_init(struct repository *subrepo, struct repository *superproject, - const char *path) + const struct submodule *sub) { - const struct submodule *sub; struct strbuf gitdir = STRBUF_INIT; s
[PATCH 8/9] fetch: retry fetching submodules if needed objects were not fetched
Currently when git-fetch is asked to recurse into submodules, it dispatches a plain "git-fetch -C " (with some submodule-related options such as prefix and recursing strategy, but) without any information about the remote or the tip that should be fetched. This works surprisingly well in some workflows (such as using submodules as a third-party library), while not so well in other scenarios, such as in a Gerrit topic-based workflow, which can tie together changes (potentially across repositories) on the server side. One of the parts of such a Gerrit workflow is to download a change when wanting to examine it, and you'd want to have its submodule changes that are in the same topic downloaded as well. However these submodule changes reside in their own repository in their own ref (refs/changes/). Retry fetching a submodule by object name if the object id that the superproject points to cannot be found. This retrying does not yet happen when the "git fetch" done at the superproject is not storing the fetched results in remote-tracking branches (i.e. is instead just recording them to FETCH_HEAD); a later patch will fix this. builtin/fetch used to only inspect submodules when they were fetched "on-demand", as in either the on or off case it was clear whether the submodule needs to be fetched. However to know whether we need to try fetching the object ids, we need to identify the object names, which is done in check_for_new_submodule_commits(), so we'll also run that code when the submodule recursion is set to "on". Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- builtin/fetch.c | 9 +- submodule.c | 185 ++-- t/t5526-fetch-submodules.sh | 16 3 files changed, 177 insertions(+), 33 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 61bec5d213..95c44bf6ff 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -700,8 +700,7 @@ static int update_local_ref(struct ref *ref, what = _("[new ref]"); } - if ((recurse_submodules != RECURSE_SUBMODULES_OFF) && - (recurse_submodules != RECURSE_SUBMODULES_ON)) + if (recurse_submodules != RECURSE_SUBMODULES_OFF) check_for_new_submodule_commits(&ref->new_oid); r = s_update_ref(msg, ref, 0); format_display(display, r ? '!' : '*', what, @@ -716,8 +715,7 @@ static int update_local_ref(struct ref *ref, strbuf_add_unique_abbrev(&quickref, &current->object.oid, DEFAULT_ABBREV); strbuf_addstr(&quickref, ".."); strbuf_add_unique_abbrev(&quickref, &ref->new_oid, DEFAULT_ABBREV); - if ((recurse_submodules != RECURSE_SUBMODULES_OFF) && - (recurse_submodules != RECURSE_SUBMODULES_ON)) + if (recurse_submodules != RECURSE_SUBMODULES_OFF) check_for_new_submodule_commits(&ref->new_oid); r = s_update_ref("fast-forward", ref, 1); format_display(display, r ? '!' : ' ', quickref.buf, @@ -731,8 +729,7 @@ static int update_local_ref(struct ref *ref, strbuf_add_unique_abbrev(&quickref, &current->object.oid, DEFAULT_ABBREV); strbuf_addstr(&quickref, "..."); strbuf_add_unique_abbrev(&quickref, &ref->new_oid, DEFAULT_ABBREV); - if ((recurse_submodules != RECURSE_SUBMODULES_OFF) && - (recurse_submodules != RECURSE_SUBMODULES_ON)) + if (recurse_submodules != RECURSE_SUBMODULES_OFF) check_for_new_submodule_commits(&ref->new_oid); r = s_update_ref("forced-update", ref, 1); format_display(display, r ? '!' 
: '+', quickref.buf, diff --git a/submodule.c b/submodule.c index 30c06507e3..7246b776f3 100644 --- a/submodule.c +++ b/submodule.c @@ -1141,8 +1141,12 @@ struct submodule_parallel_fetch { int result; struct string_list changed_submodule_names; + struct get_next_submodule_task **retry; + int retry_nr, retry_alloc; }; -#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0, STRING_LIST_INIT_DUP } +#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0, \ + STRING_LIST_INIT_DUP, \ + NULL, 0, 0} static void calculate_changed_submodule_paths( struct submodule_parallel_fetch *spf) @@ -1247,6 +1251,56 @@ static int get_fetch_recurse_config(const struct submodule *submodule, return spf->default_option; } +struct get_next_submodule_task { + struct repository *repo; + const struct submodule *sub; + unsigned free_sub : 1; /* Do we need to free the submodule? */ + struct oid_array *commits; +}; + +static const struct submodule *get_default_submodule(const char *path) +{ + struct submodule *ret = NULL; + const char *name = default_name_or_path
[PATCH 9/9] builtin/fetch: check for submodule updates for non branch fetches
Gerrit, the code review tool, has a different workflow than our mailing-list-based approach. Usually users upload changes to a Gerrit server and continuous integration and testing happens by bots. Sometimes, however, a user wants to check out a change and look at it locally. For this use case, Gerrit offers a command line snippet to copy and paste to your terminal, which looks like git fetch https:///gerrit refs/changes/ && git checkout FETCH_HEAD For Gerrit changes that contain changing submodule gitlinks, it would be easy to extend both the fetch and checkout with the '--recurse-submodules' flag, such that this command line snippet would produce the state of a change locally. However the functionality added in the previous patch, which would ensure that we fetch the objects in the submodule that the gitlink pointed at, only works for remote-tracking branches so far, not for FETCH_HEAD. Make sure that fetching a superproject to its FETCH_HEAD also respects the existence checks for objects in the submodule recursion. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- builtin/fetch.c | 5 - t/t5526-fetch-submodules.sh | 2 +- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/builtin/fetch.c b/builtin/fetch.c index 95c44bf6ff..ea6ecd123e 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -887,11 +887,14 @@ static int store_updated_refs(const char *raw_url, const char *remote_name, rc |= update_local_ref(ref, what, rm, &note, summary_width); free(ref); - } else + } else { + check_for_new_submodule_commits(&rm->old_oid); format_display(&note, '*', *kind ? kind : "branch", NULL, *what ? what : "HEAD", "FETCH_HEAD", summary_width); + } + if (note.len) { if (verbosity >= 0 && !shown_url) { fprintf(stderr, _("From %.*s\n"), diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh index af12c50e7d..a509eabb04 100755 --- a/t/t5526-fetch-submodules.sh +++ b/t/t5526-fetch-submodules.sh @@ -615,7 +615,7 @@ test_expect_success "fetch new commits on-demand when they are not reachable" ' git update-ref refs/changes/2 $D && ( cd downstream && - git fetch --recurse-submodules --recurse-submodules-default on-demand origin refs/changes/2:refs/heads/my_branch && + git fetch --recurse-submodules origin refs/changes/2 && git -C submodule cat-file -t $C && git checkout --recurse-submodules FETCH_HEAD ) -- 2.19.0
[PATCH 0/9] Resending sb/submodule-recursive-fetch-gets-the-tip
This is based on ao/submodule-wo-gitmodules-checked-out. This resends origin/sb/submodule-recursive-fetch-gets-the-tip, resolving the issues pointed out via origin/xxx/sb-submodule-recursive-fetch-gets-the-tip-in-pu by basing this series on origin/ao/submodule-wo-gitmodules-checked-out A range-diff below shows how picking a different base changed the patches, apart from that no further adjustments have been made. Thanks, Stefan Stefan Beller (9): sha1-array: provide oid_array_filter submodule.c: fix indentation submodule.c: sort changed_submodule_names before searching it submodule.c: move global changed_submodule_names into fetch submodule struct submodule.c: do not copy around submodule list repository: repo_submodule_init to take a submodule struct submodule: fetch in submodules git directory instead of in worktree fetch: retry fetching submodules if needed objects were not fetched builtin/fetch: check for submodule updates for non branch fetches Documentation/technical/api-oid-array.txt| 5 + builtin/fetch.c | 14 +- builtin/grep.c | 17 +- builtin/ls-files.c | 12 +- repository.c | 27 +- repository.h | 11 +- sha1-array.c | 17 ++ sha1-array.h | 3 + submodule.c | 275 +++ t/helper/test-submodule-nested-repo-config.c | 8 +- t/t5526-fetch-submodules.sh | 23 +- 11 files changed, 315 insertions(+), 97 deletions(-) git range-diff origin/xxx/sb-submodule-recursive-fetch-gets-the-tip-in-pu... [...] 585: ac1f98a0df < -: -- doc: move git-rev-parse from porcelain to plumbing 586: 7cf1a0fbef = 1: a035323c49 sha1-array: provide oid_array_filter 587: 01077381d0 = 2: 30ed20b4f0 submodule.c: fix indentation 588: 4b0cdf5899 = 3: cd590ea88d submodule.c: sort changed_submodule_names before searching it 589: 78e5099ecc ! 4: ce959811ba submodule.c: move global changed_submodule_names into fetch submodule struct @@ -12,7 +12,7 @@ --- a/submodule.c +++ b/submodule.c @@ - #include "commit-reach.h" + #include "object-store.h" static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF; -static struct string_list changed_submodule_names = STRING_LIST_INIT_DUP; 590: d813f18bb3 = 5: 151f9a8ad4 submodule.c: do not copy around submodule list 591: a077d63af7 ! 6: 3a97743fa2 repository: repo_submodule_init to take a submodule struct @@ -15,7 +15,6 @@ Also move its documentation into the header file. Signed-off-by: Stefan Beller -Signed-off-by: Junio C Hamano diff --git a/builtin/grep.c b/builtin/grep.c --- a/builtin/grep.c @@ -31,12 +30,16 @@ + int hit; - if (!is_submodule_active(superproject, path)) + /* +@@ return 0; + } -- if (repo_submodule_init(&submodule, superproject, path)) -+ if (repo_submodule_init(&subrepo, superproject, sub)) +- if (repo_submodule_init(&submodule, superproject, path)) { ++ if (repo_submodule_init(&subrepo, superproject, sub)) { + grep_read_unlock(); return 0; + } - repo_read_gitmodules(&submodule); + repo_read_gitmodules(&subrepo); @@ -44,9 +47,9 @@ /* * NEEDSWORK: This adds the submodule's object directory to the list of @@ +* store is no longer global and instead is a member of the repository * object. 
*/ - grep_read_lock(); - add_to_alternates_memory(submodule.objects->objectdir); + add_to_alternates_memory(subrepo.objects->objectdir); grep_read_unlock(); @@ -100,19 +103,6 @@ static void show_ce(struct repository *repo, struct dir_struct *dir, -diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c a/builtin/submodule--helper.c -+++ b/builtin/submodule--helper.c -@@ - if (!sub) - BUG("We could get the submodule handle before?"); - -- if (repo_submodule_init(&subrepo, the_repository, path)) -+ if (repo_submodule_init(&subrepo, the_repository, sub)) - die(_("could not get a repository handle for submodule '%s'"), path); - - if (!repo_config_get_string(&subrepo, "core.worktree", &cw)) { - diff --git a/repository.c b/repository.c --- a/repository.c +++ b/repository.c @@ -197,3 +187,32 @@ void repo_clear(struct repository *repo); /* + +diff --git a/t/helper/test-submodule-nested-repo-config.c b/t/helper/test-submodule-nested-repo-config.c +--- a/t/helper/test-submodule-nested-repo-config.c b/t/helper/test-submodule-nested-repo-config.c
[PATCH 2/9] submodule.c: fix indentation
The submodule subsystem is really bad at staying within 80 characters. Fix it while we are here. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- submodule.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/submodule.c b/submodule.c index 2b7082b2db..e145ebbb16 100644 --- a/submodule.c +++ b/submodule.c @@ -1258,7 +1258,8 @@ static int get_next_submodule(struct child_process *cp, if (!submodule) { const char *name = default_name_or_path(ce->name); if (name) { - default_submodule.path = default_submodule.name = name; + default_submodule.path = name; + default_submodule.name = name; submodule = &default_submodule; } } @@ -1268,8 +1269,10 @@ static int get_next_submodule(struct child_process *cp, default: case RECURSE_SUBMODULES_DEFAULT: case RECURSE_SUBMODULES_ON_DEMAND: - if (!submodule || !unsorted_string_list_lookup(&changed_submodule_names, -submodule->name)) + if (!submodule || + !unsorted_string_list_lookup( + &changed_submodule_names, + submodule->name)) continue; default_argv = "on-demand"; break; -- 2.19.0
[PATCH 3/9] submodule.c: sort changed_submodule_names before searching it
We can use string_list_insert() to maintain sorted-ness of the list as we find new items, or we can use string_list_append() to build an unsorted list and sort it at the end just once. As we do not rely on the sortedness while building the list, we pick the "append and sort at the end" approach, as it has better worst-case execution time. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- submodule.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/submodule.c b/submodule.c index e145ebbb16..9fbfcfcfe1 100644 --- a/submodule.c +++ b/submodule.c @@ -1270,7 +1270,7 @@ static int get_next_submodule(struct child_process *cp, case RECURSE_SUBMODULES_DEFAULT: case RECURSE_SUBMODULES_ON_DEMAND: if (!submodule || - !unsorted_string_list_lookup( + !string_list_lookup( &changed_submodule_names, submodule->name)) continue; @@ -1364,6 +1364,7 @@ int fetch_populated_submodules(struct repository *r, /* default value, "--submodule-prefix" and its value are added later */ calculate_changed_submodule_paths(); + string_list_sort(&changed_submodule_names); run_processes_parallel(max_parallel_jobs, get_next_submodule, fetch_start_failure, -- 2.19.0
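To spell out the trade-off, here is a minimal sketch of the chosen approach (illustrative only, assuming git's string-list API; build_sorted() is not code from this series):

#include "string-list.h"

static void build_sorted(struct string_list *list,
			 const char **names, int nr)
{
	int i;

	/* cheap appends while collecting: O(1) amortized each */
	for (i = 0; i < nr; i++)
		string_list_append(list, names[i]);
	/* one O(n log n) sort once the list is complete */
	string_list_sort(list);
	/* from here on, string_list_lookup() can binary-search the list */
}

string_list_insert(), by contrast, keeps the list sorted on every call and has to shift the tail of the array each time, which is O(n) per insertion and O(n^2) for building the whole list.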
[PATCH 7/9] submodule: fetch in submodules git directory instead of in worktree
This patch started as a refactoring to make 'get_next_submodule' more readable, but upon doing so, I realized that "git fetch" of the submodule actually doesn't need to be run in the submodule's worktree. So let's run it in its git dir instead. That should pave the way towards fetching submodules that are currently not checked out. This patch leaks the cp->dir in get_next_submodule, as any further callback in run_processes_parallel doesn't have access to the child process any more. In an early iteration of this patch, the function get_submodule_repo_for directly returned the string containing the git directory, which would be a better design choice for this patch. However the next patch both fixes the memory leak of cp->dir and also has a use case for using the full repository handle of the submodule, so it makes sense to introduce get_submodule_repo_for here already. Signed-off-by: Stefan Beller Signed-off-by: Junio C Hamano --- submodule.c | 51 +++-- t/t5526-fetch-submodules.sh | 7 - 2 files changed, 44 insertions(+), 14 deletions(-) diff --git a/submodule.c b/submodule.c index cbefe5f54d..30c06507e3 100644 --- a/submodule.c +++ b/submodule.c @@ -495,6 +495,12 @@ void prepare_submodule_repo_env(struct argv_array *out) DEFAULT_GIT_DIR_ENVIRONMENT); } +static void prepare_submodule_repo_env_in_gitdir(struct argv_array *out) +{ + prepare_submodule_repo_env_no_git_dir(out); + argv_array_pushf(out, "%s=.", GIT_DIR_ENVIRONMENT); +} + /* Helper function to display the submodule header line prior to the full * summary output. If it can locate the submodule objects directory it will * attempt to lookup both the left and right commits and put them into the @@ -1241,6 +1247,29 @@ static int get_fetch_recurse_config(const struct submodule *submodule, return spf->default_option; } +static struct repository *get_submodule_repo_for(struct repository *r, +const struct submodule *sub) +{ + struct repository *ret = xmalloc(sizeof(*ret)); + + if (repo_submodule_init(ret, r, sub)) { + /* +* No entry in .gitmodules? Technically not a submodule, +* but historically we supported repositories that happen to be +* in-place where a gitlink is. Keep supporting them. 
+*/ + struct strbuf gitdir = STRBUF_INIT; + strbuf_repo_worktree_path(&gitdir, r, "%s/.git", sub->path); + if (repo_init(ret, gitdir.buf, NULL)) { + strbuf_release(&gitdir); + return NULL; + } + strbuf_release(&gitdir); + } + + return ret; +} + static int get_next_submodule(struct child_process *cp, struct strbuf *err, void *data, void **task_cb) { @@ -1248,12 +1277,11 @@ static int get_next_submodule(struct child_process *cp, struct submodule_parallel_fetch *spf = data; for (; spf->count < spf->r->index->cache_nr; spf->count++) { - struct strbuf submodule_path = STRBUF_INIT; - struct strbuf submodule_git_dir = STRBUF_INIT; struct strbuf submodule_prefix = STRBUF_INIT; const struct cache_entry *ce = spf->r->index->cache[spf->count]; - const char *git_dir, *default_argv; + const char *default_argv; const struct submodule *submodule; + struct repository *repo; struct submodule default_submodule = SUBMODULE_INIT; if (!S_ISGITLINK(ce->ce_mode)) @@ -1288,16 +1316,12 @@ static int get_next_submodule(struct child_process *cp, continue; } - strbuf_repo_worktree_path(&submodule_path, spf->r, "%s", ce->name); - strbuf_addf(&submodule_git_dir, "%s/.git", submodule_path.buf); strbuf_addf(&submodule_prefix, "%s%s/", spf->prefix, ce->name); - git_dir = read_gitfile(submodule_git_dir.buf); - if (!git_dir) - git_dir = submodule_git_dir.buf; - if (is_directory(git_dir)) { + repo = get_submodule_repo_for(spf->r, submodule); + if (repo) { child_process_init(cp); - cp->dir = strbuf_detach(&submodule_path, NULL); - prepare_submodule_repo_env(&cp->env_array); + prepare_submodule_repo_env_in_gitdir(&cp->env_array); + cp->dir = xstrdup(repo->gitdir); cp->git_cmd = 1; if (!spf->quiet) strbuf_addf(err, "Fetching submodule %s%s\n", @@ -1307,10 +1331,11 @@ static int get_next_submodule(struct child_process *cp, argv_array_push(&cp-
On overriding make variables from the environment...
Our Makefile has lines like these: CFLAGS = -g -O2 -Wall CC = cc AR = ar SPATCH = spatch Note the use of '=', not '?='. This means that these variables can be overridden from the command line, i.e. 'make CC=gcc-X.Y' will build with that particular GCC version, but not from the environment, i.e. 'CC=gcc-X.Y make' will still build with 'cc'. This can be confusing for developers who come from other projects where they used to run 'CC=whatever make'. And our build jobs on Travis CI are badly affected by this. We have dedicated build jobs to build Git with GCC and Clang both on Linux and OSX from the very beginning (522354d70f (Add Travis CI support, 2015-11-27)). But guess how Travis CI specifies which compiler to use! With 'export CC=gcc' and 'export CC=clang', respectively. Consequently, our Clang Linux build job has always used gcc, because that's where 'cc' points at on Linux by default, while the GCC OSX build job has always used Clang. Oh, well. Furthermore, see 37fa4b3c78 (travis-ci: run gcc-8 on linux-gcc jobs, 2018-05-19), where Duy added an 'export CC=gcc-8' in an attempt to use a more modern compiler, but this had no effect either. I'm not sure what to do. I'm fine with updating our 'ci/' scripts to explicitly respect CC in the environment (either by running 'make CC=$CC' or by writing $CC into 'config.mak'). Or I could update our Makefile to use '?=' for specific variables, but: - I'm afraid to break somebody's setup relying on the current behavior and CC having different values in the environment and in 'config.mak'. - Where to stop, IOW which variables should be set with '?='? CFLAGS, LDFLAGS, CC, AR, ..., SPATCH, SPATCH_FLAGS? Dunno. We already have 'STRIP ?= strip' and there are variables that are checked explicitly (e.g. 'DEVELOPER=y make' works). Note also that prior to b05701c5b4 (Make CFLAGS overridable from make command line., 2005-08-06) our Makefile used 'CC?=gcc' as well.
receive.denyCurrentBranch=updateInstead updates working tree even when receive.denyNonFastForwards rejects push
Hi, I am looking to report the below behavior which seems incorrect to me when receive.denyCurrentBranch is set to updateInstead and receive.denyNonFastForwards is set to true. Below are the steps to reproduce the scenario. Please excuse my ignorance if I'm missing something fundamental. Step 1 - Setup remote repository (remote host): git config --global receive.denyCurrentBranch updateInstead git config --global receive.denyNonFastForwards true mkdir /tmp/hello cd /tmp/hello git init echo hello > hello.txt git add . && git commit -m "hello.txt" Step 2 - Create 2 Clones (local host): git clone ssh://REMOTEIP/tmp/hello /tmp/hello1 git clone ssh://REMOTEIP/tmp/hello /tmp/hello2 Step 3 - Push a commit from Clone 1 cd /tmp/hello1 echo hello1 > hello1.txt git add . && git commit -m "hello1.txt" git push --> at this point the server's working tree contains hello1.txt, which is expected Step 4: Try to force push a commit from Clone 2 cd /tmp/hello2 echo hello2 > hello2.txt git add . && git commit -m "hello2.txt" git push --> Remote rejects with message that remote contains work I do not have locally git push --force --> Remote rejects again with error: denying non-fast-forward refs/heads/master (you should pull first) --> At this point, since the push is rejected, I expect the server's working tree to not contain any rejected changes. BUT the server's working tree got updated to delete hello1.txt and create hello2.txt. Push rejected but not really. I also noticed the same (incorrect) behavior when the update hook rejects changes on the server (but not the pre-receive hook). Thank you
Re: [PATCH v2 06/13] Add a build definition for Azure DevOps
On Mon, Oct 15, 2018 at 03:12:06AM -0700, Johannes Schindelin via GitGitGadget wrote: > diff --git a/azure-pipelines.yml b/azure-pipelines.yml > new file mode 100644 > index 00..b5749121d2 > --- /dev/null > +++ b/azure-pipelines.yml > @@ -0,0 +1,319 @@ > +resources: > +- repo: self > + fetchDepth: 1 > + > +phases: > +- phase: linux_clang > + displayName: linux-clang > + condition: succeeded() > + queue: > +name: Hosted Ubuntu 1604 > + steps: > + - bash: | > + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || > ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache > gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 > + > + sudo apt-get update && > + sudo apt-get -y install git gcc make libssl-dev libcurl4-openssl-dev > libexpat-dev tcl tk gettext git-email zlib1g-dev apache2-bin && > + > + export CC=clang || exit 1 > + > + ci/install-dependencies.sh I think you would want to 'exit 1' when this script fails. This applies to other build jobs (erm, phases?) below as well. > + ci/run-build-and-tests.sh || { > + ci/print-test-failures.sh > + exit 1 > + } > + > + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount > "$HOME/test-cache" || exit 1 > +displayName: 'ci/run-build-and-tests.sh' > +env: > + GITFILESHAREPWD: $(gitfileshare.pwd) > + - task: PublishTestResults@2 > +displayName: 'Publish Test Results **/TEST-*.xml' > +inputs: > + mergeTestResults: true > + testRunTitle: 'linux-clang' > + platform: Linux > + publishRunAttachments: false > +condition: succeededOrFailed() > + > +- phase: linux_gcc > + displayName: linux-gcc > + condition: succeeded() > + queue: > +name: Hosted Ubuntu 1604 > + steps: > + - bash: | > + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || > ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache > gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 > + > + sudo apt-get update && > + sudo apt-get -y install git gcc make libssl-dev libcurl4-openssl-dev > libexpat-dev tcl tk gettext git-email zlib1g-dev apache2-bin || exit 1 > + On Travis CI the Linux GCC build job uses gcc-8 instead of whatever the default is in that old-ish Ubuntu LTS; see 37fa4b3c78 (travis-ci: run gcc-8 on linux-gcc jobs, 2018-05-19). 
> + ci/install-dependencies.sh > + ci/run-build-and-tests.sh || { > + ci/print-test-failures.sh > + exit 1 > + } > + > + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount > "$HOME/test-cache" || exit 1 > +displayName: 'ci/run-build-and-tests.sh' > +env: > + GITFILESHAREPWD: $(gitfileshare.pwd) > + - task: PublishTestResults@2 > +displayName: 'Publish Test Results **/TEST-*.xml' > +inputs: > + mergeTestResults: true > + testRunTitle: 'linux-gcc' > + platform: Linux > + publishRunAttachments: false > +condition: succeededOrFailed() > + > +- phase: osx_clang > + displayName: osx-clang > + condition: succeeded() > + queue: > +name: Hosted macOS > + steps: > + - bash: | > + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || > ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache > gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 > + > + export CC=clang > + > + ci/install-dependencies.sh > + ci/run-build-and-tests.sh || { > + ci/print-test-failures.sh > + exit 1 > + } > + > + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || umount > "$HOME/test-cache" || exit 1 > +displayName: 'ci/run-build-and-tests.sh' > +env: > + GITFILESHAREPWD: $(gitfileshare.pwd) > + - task: PublishTestResults@2 > +displayName: 'Publish Test Results **/TEST-*.xml' > +inputs: > + mergeTestResults: true > + testRunTitle: 'osx-clang' > + platform: macOS > + publishRunAttachments: false > +condition: succeededOrFailed() > + > +- phase: osx_gcc > + displayName: osx-gcc > + condition: succeeded() > + queue: > +name: Hosted macOS > + steps: > + - bash: | > + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || > ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache > gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 > + Here you should 'export CC=gcc', because on macOS 'cc' is 'clang' by default. Note, however, that setting 'CC' in the environment alone has no effect on the build process, it will still use 'cc'. Keep an eye on where this thread will lead to: https://public-inbox.org/git/20181016184537.gn19...@szeder.dev/T/#u > + ci/install-dependencies.sh > + ci/run-build-and-tests.sh || { > + ci/print-test-failures.sh > + exit 1 > + } > + > + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || umount > "$HOME/test-cache" || exit 1 > +displayName: 'ci/run-bu
Re: [PATCH 17/19] submodule: use submodule repos for object lookup
On Thu, Oct 11, 2018 at 3:41 PM Jonathan Tan wrote: > > > +/* > > + * Initialize 'out' based on the provided submodule path. > > + * > > + * Unlike repo_submodule_init, this tolerates submodules not present > > + * in .gitmodules. NEEDSWORK: The repo_submodule_init behavior is > > + * preferrable. This function exists only to preserve historical behavior. > > What do you mean by "The repo_submodule_init behavior is preferable"? If > we need to preserve historical behavior, then it seems that the most > preferable one is something that meets our needs (like open_submodule() > in this patch). > > > +static int open_submodule(struct repository *out, const char *path) > > +{ > > + struct strbuf sb = STRBUF_INIT; > > + > > + if (submodule_to_gitdir(&sb, path) || repo_init(out, sb.buf, NULL)) { > > + strbuf_release(&sb); > > + return -1; > > + } > > + > > + out->submodule_prefix = xstrdup(path); > > Do we need to set submodule_prefix? > > > @@ -507,7 +533,7 @@ static void show_submodule_header(struct diff_options > > *o, const char *path, > > else if (is_null_oid(two)) > > message = "(submodule deleted)"; > > > > - if (add_submodule_odb(path)) { > > + if (open_submodule(sub, path) < 0) { > > This function, as a side effect, writes the open repository to "sub" for > its caller to use. I think it's better if its callers open "sub" and > then pass it to show_submodule_header() if successful. If opening the > submodule is expected to fail sometimes (like it seems here), then we > can allow callers to also pass NULL, and document what happens when NULL > is passed. Thanks for the review of the whole series! I have redone this series, addressing all your comments. I addressed this comment differently than suggested, and put the submodule repository argument at the end of the parameter list, such that it goes with all the other arguments to be filled in. I was about to resend the series, but test-merged with the other submodule series in flight (origin/sb/submodule-recursive-fetch-gets-the-tip) which had some conflicts that I can easily resolve by rebasing on top. It conflicts a lot when merging to next, due to the latest patch ("Apply semantic patches from previous patches"), so I am not sure how to proceed properly. Maybe we'd omit that patch and run 'make coccicheck' on next to apply the semantic patches there instead.
Re: [PATCH v10 00/21] Convert "git stash" to C builtin
On 10/16, Johannes Schindelin wrote: > Hi Thomas, > > On Mon, 15 Oct 2018, Thomas Gummerer wrote: > > > 2: 63f2e0e6f9 ! 2: 2d45985676 strbuf.c: add `strbuf_join_argv()` > > @@ -14,19 +14,17 @@ > > strbuf_setlen(sb, sb->len + sb2->len); > > } > > > > -+const char *strbuf_join_argv(struct strbuf *buf, > > -+ int argc, const char **argv, char delim) > > ++void strbuf_join_argv(struct strbuf *buf, > > ++int argc, const char **argv, char delim) > > While the patch series does not use the return value, I have to ask > whether it would really be useful to change it to return `void`. I could > imagine that there may already be quite a few code paths that would love > to use strbuf_join_argv(), *and* would benefit from the `const char *` > return value. Fair enough. I did suggest changing the return type to void here, as I found the API a bit odd compared to the rest of the strbuf API, however after looking at this again I agree with you, and returning a const char * here does seem more helpful. Sorry about the confusion Paul-Sebastian! > In other words: just because the *current* patches do not make use of that > quite convenient return value does not mean that we should remove that > convenience. > > > 7: a2abd1b4bd ! 8: 974dbaa492 stash: convert apply to builtin > > @@ -370,18 +370,20 @@ > > + > > + if (diff_tree_binary(&out, &info->w_commit)) { > > + strbuf_release(&out); > > -+ return -1; > > ++ return error(_("Could not generate diff > > %s^!."), > > ++ > > oid_to_hex(&info->w_commit)); > > Please start the argument of an `error()` call with a lower-case letter. I think this comes from your fixup! commit ;) But I do agree, these should be lower-case. > > + } > > + > > + ret = apply_cached(&out); > > + strbuf_release(&out); > > + if (ret) > > -+ return -1; > > ++ return error(_("Conflicts in index." > > ++ "Try without --index.")); > > Same here. > > > + > > + discard_cache(); > > + read_cache(); > > + if (write_cache_as_tree(&index_tree, 0, NULL)) > > -+ return -1; > > ++ return error(_("Could not save index > > tree")); > > And here. > > > 15: bd827be103 ! 15: 989db67e9a stash: convert create to builtin > > @@ -119,7 +119,6 @@ > > +static int check_changes(struct pathspec ps, int include_untracked) > > +{ > > + int result; > > -+ int ret = 0; > > I was curious about this change, and could not find it in the > git-stash-v10 tag of https://github.com/ungps/git... This line has been removed in v10, but did exist in v9, so the git-stash-v10 should indeed not have this line. I suggested removing it in [*1*], because it breaks compilation with DEVELOPER=1 at this step. > > 18: 1c501ad666 ! 18: c90e30173a stash: convert save to builtin > > @@ -72,8 +72,10 @@ > > + git_stash_helper_save_usage, > > + PARSE_OPT_KEEP_DASHDASH); > > + > > -+ if (argc) > > -+ stash_msg = (char*) strbuf_join_argv(&buf, argc, argv, > > ' '); > > ++ if (argc) { > > ++ strbuf_join_argv(&buf, argc, argv, ' '); > > ++ stash_msg = buf.buf; > > ++ } > > Aha! So there *was* a user of that return value. I really would prefer a > non-void return value here. Right, I'd argue we're mis-using the API here though. do_push_stash, which we later pass stash_msg to, takes ownership and later frees the memory before returning. This doesn't cause issues in the test suite at the moment, because do_create_stash() doesn't always free stash_msg before assigning a new value to the pointer, but would cause issues when do_create_stash exits early. 
Rather than the solution I proposed in I think it would be nicer to use 'stash_msg = strbuf_detach(...)' above. I'm still happy with the function returning buf->buf as const char *, but I'm not sure we should use that return value here. > > 19: c4401b21db ! 19: 4360ea875d stash: convert `stash--helper.c` into > > `stash.c` > > @@ -264,9 +320,9 @@ > > - argc = parse_options(argc, argv, prefix, options, > > - git_stash_helper_create_usage, > > - 0); > > -+ /* Startinf with argv[1], since argv[0] is "create" */ > > -+ stash_msg = (char*) strbuf_join_argv(&stash_msg_buf, argc - 1, >
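To make the suggested ownership handoff concrete, here is a minimal sketch, assuming the series' strbuf_join_argv() (whose exact return type is still under discussion above); join_message() is illustrative, not code from the series:

#include "strbuf.h"

/*
 * Join argv into a newly allocated string that the caller owns.
 * Detaching avoids aliasing buf.buf, whose memory the callee
 * (do_push_stash() in the series) would otherwise free out from
 * under the strbuf.
 */
static char *join_message(int argc, const char **argv)
{
	struct strbuf buf = STRBUF_INIT;

	strbuf_join_argv(&buf, argc, argv, ' ');
	return strbuf_detach(&buf, NULL); /* caller must free() */
}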
[PATCH v2 2/2] merge-recursive: avoid showing conflicts with merge branch before HEAD
We want to load unmerged entries from HEAD into the index at stage 2 and from MERGE_HEAD into stage 3. Similarly, folks expect merge conflicts to look like

<<<<<<< HEAD
content from our side
=======
content from their side
>>>>>>> MERGE_HEAD

not

<<<<<<< MERGE_HEAD
content from their side
=======
content from our side
>>>>>>> HEAD

The correct order usually comes naturally and for free, but with renames we often have data in the form {rename_branch, other_branch}, and working relative to the rename first (e.g. for rename/add) is more convenient elsewhere in the code. Address the slight impedance mismatch by having some functions re-call themselves with flipped arguments when the branch order is reversed. Note that setup_rename_conflict_info() has one asymmetry in it, in setting dst_entry1->processed=0 but not doing similarly for dst_entry2->processed. When dealing with rename/rename and similar conflicts, we do not want the processing to happen twice, so the desire to only set one of the entries to unprocessed is intentional. So, while this change modifies which branch's entry will be marked as unprocessed, that dovetails nicely with putting HEAD first so that we get the index stage entries and conflict markers in the right order. Signed-off-by: Elijah Newren --- merge-recursive.c | 33 ++- t/t6036-recursive-corner-cases.sh | 8 2 files changed, 36 insertions(+), 5 deletions(-) diff --git a/merge-recursive.c b/merge-recursive.c index 8a47e54e2f..16980db7f9 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -228,7 +228,26 @@ static inline void setup_rename_conflict_info(enum rename_type rename_type, struct stage_data *src_entry1, struct stage_data *src_entry2) { - struct rename_conflict_info *ci = xcalloc(1, sizeof(struct rename_conflict_info)); + struct rename_conflict_info *ci; + + /* +* When we have two renames involved, it's easiest to get the +* correct things into stage 2 and 3, and to make sure that the +* content merge puts HEAD before the other branch if we just +* ensure that branch1 == o->branch1. So, simply flip arguments +* around if we don't have that. +*/ + if (dst_entry2 && branch1 != o->branch1) { + setup_rename_conflict_info(rename_type, + pair2, pair1, + branch2,branch1, + dst_entry2, dst_entry1, + o, + src_entry2, src_entry1); + return; + } + + ci = xcalloc(1, sizeof(struct rename_conflict_info)); ci->rename_type = rename_type; ci->pair1 = pair1; ci->branch1 = branch1; @@ -1283,6 +1302,18 @@ static int merge_mode_and_contents(struct merge_options *o, const char *branch2, struct merge_file_info *result) { + if (o->branch1 != branch1) { + /* +* It's weird getting a reverse merge with HEAD on the bottom +* side of the conflict markers and the other branch on the +* top. Fix that. 
+*/ + return merge_mode_and_contents(o, one, b, a, + filename, + branch2, branch1, + extra_marker_size, result); + } + result->merge = 0; result->clean = 1; diff --git a/t/t6036-recursive-corner-cases.sh b/t/t6036-recursive-corner-cases.sh index 59e52c5a09..e1cef58f2a 100755 --- a/t/t6036-recursive-corner-cases.sh +++ b/t/t6036-recursive-corner-cases.sh @@ -230,13 +230,13 @@ test_expect_success 'git detects differently handled merges conflict' ' :2:new_a :3:new_a && test_cmp expect actual && - git cat-file -p B:new_a >ours && - git cat-file -p C:new_a >theirs && + git cat-file -p C:new_a >ours && + git cat-file -p B:new_a >theirs && >empty && test_must_fail git merge-file \ - -L "Temporary merge branch 2" \ - -L "" \ -L "Temporary merge branch 1" \ + -L "" \ + -L "Temporary merge branch 2" \ ours empty theirs && sed -e "s/^\([<=>]\)/\1\1\1/" ours >expect && git cat-file -p :1:new_a >actual && -- 2.19.1.280.g0c175526bf
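The flip-and-recurse idiom from both hunks, distilled into a stand-alone sketch (hypothetical types and function, not the patch's code):

struct merge_opts { const char *branch1, *branch2; };

/*
 * Re-call ourselves with swapped arguments when the caller handed us
 * the branches in reversed order; everything after the check can then
 * assume branch1 is always o->branch1 (the HEAD side).
 */
static int merge_ordered(struct merge_opts *o,
			 const char *branch1, const char *branch2)
{
	if (branch1 != o->branch1)
		return merge_ordered(o, branch2, branch1);
	/* ... real work, knowing branch1 == o->branch1 ... */
	return 0;
}

Note the pointer comparison: like the patch, this relies on the branch names being the very same strings stored in the options struct, not merely equal ones.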
[PATCH v2 1/2] merge-recursive: improve auto-merging messages with path collisions
Each individual file involved in a rename could have also been modified on both sides of history, meaning it may need to have content merges. If two such files are renamed into the same location, then on top of the two natural auto-merging messages we also have to two-way merge the result, giving us messages that look like

Auto-merging somefile.c (was somecase.c)
Auto-merging somefile.c (was somefolder.c)
Auto-merging somefile.c

However, despite the fact that I was the one who put the "(was %s)" portions into the messages (and just a few months ago), I was still initially confused when running into a rename/rename(2to1) case and wondered if somefile.c had been merged three times. Update this to instead be:

Auto-merging version of somefile.c from somecase.c
Auto-merging version of somefile.c from somefolder.c
Auto-merging somefile.c

This is an admittedly long set of messages for a single path, but you only get all three messages when dealing with the rare case of a rename/rename(2to1) conflict where both sides of both original files were also modified, in conflicting ways. Signed-off-by: Elijah Newren --- merge-recursive.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/merge-recursive.c b/merge-recursive.c index 5206d6cfb6..8a47e54e2f 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -1674,8 +1674,8 @@ static int handle_rename_rename_2to1(struct merge_options *o, remove_file(o, 1, a->path, o->call_depth || would_lose_untracked(a->path)); remove_file(o, 1, b->path, o->call_depth || would_lose_untracked(b->path)); - path_side_1_desc = xstrfmt("%s (was %s)", path, a->path); - path_side_2_desc = xstrfmt("%s (was %s)", path, b->path); + path_side_1_desc = xstrfmt("version of %s from %s", path, a->path); + path_side_2_desc = xstrfmt("version of %s from %s", path, b->path); if (merge_mode_and_contents(o, a, c1, &ci->ren1_other, path_side_1_desc, o->branch1, o->branch2, &mfi_c1) || merge_mode_and_contents(o, b, &ci->ren2_other, c2, path_side_2_desc, -- 2.19.1.280.g0c175526bf
[PATCH v2 0/2] More merge cleanups
This series adds a few more cleanups on top of en/merge-cleanup. Changes since v1: - Removed two patches that will instead be included in a follow-on series, as suggested by Junio. - Incorporated commit message cleanups (capitalization and indents) made by Junio to the previous round. Elijah Newren (2): merge-recursive: improve auto-merging messages with path collisions merge-recursive: avoid showing conflicts with merge branch before HEAD merge-recursive.c | 37 --- t/t6036-recursive-corner-cases.sh | 8 +++ 2 files changed, 38 insertions(+), 7 deletions(-) -- 2.19.1.280.g0c175526bf
problem with not being able to enforce git content filters
Hi, TL;DR Our open source project dev team has a continuous problem with git content filters, because developers don't always have them configured. We need a way for git to support content filters w/o using user's .gitconfig. Otherwise it leads to an inconsistent behavior and messed up git checkouts. = Full story: We use a version of the nbstripout content filter (https://github.com/kynan/nbstripout), which removes user-specific information from the jupyter notebooks during commit. But the problem would be the same with any one-way clean filter. First the setup: https://github.com/kynan/nbstripout#manual-filter-installation = Set up a git filter using nbstripout as follows: git config filter.nbstripout.clean '/path/to/nbstripout' git config filter.nbstripout.smudge cat git config filter.nbstripout.required true Create a file .gitattributes or .git/info/attributes with: *.ipynb filter=nbstripout Apply the filter for git diff of *.ipynb files: git config diff.ipynb.textconv '/path/to/nbstripout -t' In file .gitattributes or .git/info/attributes add: *.ipynb diff=ipynb = The problem is that it can't be enforced. When it's not enforced, we end up with some devs using it and others not, or, more often, with the same dev sometimes not having it configured. When a person has a stripped-out notebook checked out and another person commits an un-stripped notebook, it leads to: invalid `git status` reports, `git pull` breaks, `git stash` doesn't work, since it tries to stash using the filters, and `git pull' can never succeed because it thinks that it'll overwrite the local changes, but `git diff` returns no changes. So the only solution when this happens is to disable the filters, clean up the mess, re-enable the filters. Many people just make a new clone - ouch! And the biggest problem is that it affects all users who may have the filters enabled, e.g. because they worked on a PR, and not just devs - i.e. the repercussions are much bigger than just a few devs affected. We can't use server-side hooks to enforce this because the project is on github. And the devs honestly try to do their best to remember to configure the filters, but for some reason they disappear for them, don't ask me why, I don't know. This is an open source project team, not a work place. You can see some related complaints here: https://github.com/kynan/nbstripout/issues/65#issuecomment-430346894 https://stackoverflow.com/questions/51883227/git-pull-stash-conflicts-with-a-git-filter and I can find you a whole bunch more if you need more evidence. == Proposed solution: There needs to be a way for a project to define content filters w/o going via user's .gitconfig configuration, since due to git's security it can't be distributed to all users and must be enabled manually by each user, which is just not the right solution in this case. Of course, I'm open to other suggestions that may help this issue. "Tell your developers they must configure the filters" is not it - I tried it for a long time and in vain. If you look at our install instructions: https://github.com/fastai/fastai#developer-install git clone https://github.com/fastai/fastai cd fastai tools/run-after-git-clone It already includes an instruction to run a script which enables the filters, but this doesn't seem to help (and no, it's not a problem with the script). The devs report that the configuration is there for a while, and then suddenly it is not there, I don't know why. 
Perhaps they make a new clone and forget to re-enable the filters, perhaps they disable them to clean up and forget to re-enable them, I can't tell. The bottom line is that it sucks, and I hope that you can help by offering a programmatic solution, rather than recommending creating a police department. Thank you for listening. -- Stas Bekman <'>< <'>< https://stasosphere.com https://chestofbooks.com https://experientialsexlab.com https://stason.org https://stasosphere.com/experience-life/my-books
Re: [PATCH v2 10/13] tests: include detailed trace logs with --write-junit-xml upon failure
Hi Gábor, On Tue, 16 Oct 2018, SZEDER Gábor wrote: > On Tue, Oct 16, 2018 at 03:02:38PM +0200, Johannes Schindelin wrote: > > > So I would suggest to go forward with my proposed strategy for the > > moment, right up until the time when we have had the resources to fix > > t5570, for starters. > > I don't really understand what the occasional failures in t5570 have > to do with the amount of information a CI system should gather about > failures in general. I have seen it plenty of times that too many CI failures essentially render every developer numb. If every 3rd CI run causes a failure, and seemingly every one of these failures indicates a mistake in the regression test, rather than a regression, developers stop paying attention. Which is the exact opposite of what I want to achieve here. Ciao, Dscho
[PATCH 0/3] Fix gc segfault
In 9ac3f0e (pack-objects: fix performance issues on packing large deltas, 2018-07-22), a mutex was introduced that is used to guard the call to set the delta size. This commit added code to initialize it, but at an incorrect spot: in init_threaded_search(), while the call to oe_set_delta_size() (and hence to packing_data_lock()) can happen in the call chain check_object() <- get_object_details() <- prepare_pack() <- cmd_pack_objects(), which is long before the prepare_pack() function calls ll_find_deltas() (which initializes the threaded search). Another tell-tale that the mutex was initialized in an incorrect spot is that the function to initialize it lives in builtin/, while the code that uses the mutex is defined in a libgit.a header file. Let's use a more appropriate function: prepare_packing_data(), which not only lives in libgit.a, but has to be called before the packing_data struct is used that contains that mutex. Johannes Schindelin (3): Fix typo 'detla' -> 'delta' pack-objects (mingw): demonstrate a segmentation fault with large deltas pack-objects (mingw): initialize `packing_data` mutex in the correct spot builtin/pack-objects.c| 1 - pack-objects.c| 3 +++ pack-objects.h| 2 +- t/t5321-pack-large-objects.sh | 32 4 files changed, 36 insertions(+), 2 deletions(-) create mode 100755 t/t5321-pack-large-objects.sh base-commit: 5a0cc8aca797dbd7d2be3b67458ff880ed45cddf Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-47%2Fjamill%2Ffix-gc-segfault-v1 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-47/jamill/fix-gc-segfault-v1 Pull-Request: https://github.com/gitgitgadget/git/pull/47 -- gitgitgadget
[PATCH 2/3] pack-objects (mingw): demonstrate a segmentation fault with large deltas
From: Johannes Schindelin There is a problem in the way 9ac3f0e5b3e4 (pack-objects: fix performance issues on packing large deltas, 2018-07-22) initializes that mutex in the `packing_data` struct. The problem manifests in a segmentation fault on Windows, when a mutex (AKA critical section) is accessed without being initialized. (With pthreads, you apparently do not really have to initialize them?) This was reported in https://github.com/git-for-windows/git/issues/1839. Signed-off-by: Johannes Schindelin --- t/t5321-pack-large-objects.sh | 32 1 file changed, 32 insertions(+) create mode 100755 t/t5321-pack-large-objects.sh diff --git a/t/t5321-pack-large-objects.sh b/t/t5321-pack-large-objects.sh new file mode 100755 index 00..c36c66fbb4 --- /dev/null +++ b/t/t5321-pack-large-objects.sh @@ -0,0 +1,32 @@ +#!/bin/sh +# +# Copyright (c) 2018 Johannes Schindelin +# + +test_description='git pack-object with "large" deltas + +' +. ./test-lib.sh +. "$TEST_DIRECTORY"/lib-pack.sh + +# Two similar-ish objects that we have computed deltas between. +A=01d7713666f4de822776c7622c10f1b07de280dc +B=e68fe8129b546b101aee9510c5328e7f21ca1d18 + +test_expect_success 'setup' ' + clear_packs && + { + pack_header 2 && + pack_obj $A $B && + pack_obj $B + } >ab.pack && + pack_trailer ab.pack && + git index-pack --stdin
[PATCH 3/3] pack-objects (mingw): initialize `packing_data` mutex in the correct spot
From: Johannes Schindelin In 9ac3f0e5b3e4 (pack-objects: fix performance issues on packing large deltas, 2018-07-22), a mutex was introduced that is used to guard the call to set the delta size. This commit even added code to initialize it, but at an incorrect spot: in `init_threaded_search()`, while the call to `oe_set_delta_size()` (and hence to `packing_data_lock()`) can happen in the call chain `check_object()` <- `get_object_details()` <- `prepare_pack()` <- `cmd_pack_objects()`, which is long before the `prepare_pack()` function calls `ll_find_deltas()` (which initializes the threaded search). Another tell-tale that the mutex was initialized in an incorrect spot is that the function to initialize it lives in builtin/, while the code that uses the mutex is defined in a libgit.a header file. Let's use a more appropriate function: `prepare_packing_data()`, which not only lives in libgit.a, but *has* to be called before the `packing_data` struct is used that contains that mutex. This fixes https://github.com/git-for-windows/git/issues/1839. Signed-off-by: Johannes Schindelin --- builtin/pack-objects.c| 1 - pack-objects.c| 3 +++ t/t5321-pack-large-objects.sh | 2 +- 3 files changed, 4 insertions(+), 2 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index e6316d294d..e752cf9c7a 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2363,7 +2363,6 @@ static void init_threaded_search(void) pthread_mutex_init(&cache_mutex, NULL); pthread_mutex_init(&progress_mutex, NULL); pthread_cond_init(&progress_cond, NULL); - pthread_mutex_init(&to_pack.lock, NULL); old_try_to_free_routine = set_try_to_free_routine(try_to_free_from_threads); } diff --git a/pack-objects.c b/pack-objects.c index 7e624c30eb..b6cdbb0166 100644 --- a/pack-objects.c +++ b/pack-objects.c @@ -148,6 +148,9 @@ void prepare_packing_data(struct packing_data *pdata) 1U << OE_SIZE_BITS); pdata->oe_delta_size_limit = git_env_ulong("GIT_TEST_OE_DELTA_SIZE", 1UL << OE_DELTA_SIZE_BITS); +#ifndef NO_PTHREADS + pthread_mutex_init(&pdata->lock, NULL); +#endif } struct object_entry *packlist_alloc(struct packing_data *pdata, diff --git a/t/t5321-pack-large-objects.sh b/t/t5321-pack-large-objects.sh index c36c66fbb4..a75eab87d3 100755 --- a/t/t5321-pack-large-objects.sh +++ b/t/t5321-pack-large-objects.sh @@ -24,7 +24,7 @@ test_expect_success 'setup' ' git index-pack --stdin
[PATCH 1/3] Fix typo 'detla' -> 'delta'
From: Johannes Schindelin Signed-off-by: Johannes Schindelin --- pack-objects.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pack-objects.h b/pack-objects.h index 2ca39cfcfe..86ee93feb4 100644 --- a/pack-objects.h +++ b/pack-objects.h @@ -377,7 +377,7 @@ static inline unsigned long oe_delta_size(struct packing_data *pack, return e->delta_size_; /* -* pack->detla_size[] can't be NULL because oe_set_delta_size() +* pack->delta_size[] can't be NULL because oe_set_delta_size() * must have been called when a new delta is saved with * oe_set_delta(). * If oe_delta() returns NULL (i.e. default state, which means -- gitgitgadget
Re: [PATCH] submodule helper: convert relative URL to absolute URL if needed
Stefan Beller wrote: > Reported-by: Jaewoong Jung > Signed-off-by: Stefan Beller > --- > builtin/submodule--helper.c | 51 - > t/t7400-submodule-basic.sh | 24 + > 2 files changed, 58 insertions(+), 17 deletions(-) Reviewed-by: Jonathan Nieder Thanks for your patient work.
Re: problem with not being able to enforce git content filters
On Tue, Oct 16 2018, Stas Bekman wrote: > When a person has a stripped out notebook checked out, when another > person commits un-stripped out notebook, it leads to: invalid `git > status` reports, `git pull` breaks, `git stash` doesn't work, since it > tries to stash using the filters, and `git pull' can never succeed > because it thinks that it'll overwrite the local changes, but `git diff` > returns no changes. Planting a flag here. Let's get to this later. > So the only solution when this happens is to disable the filters, clean > up the mess, re-enable the filters. Many people just make a new clone - > ouch! > > And the biggest problem is that it affects all users who may have the > filters enabled, e.g. because they worked on a PR, and not just devs - > i.e. the repercussions are much bigger than just a few devs affected. > > We can't use server-side hooks to enforce this because the project is on > github. Ultimately git's a distributed system and we won't ever be able to enforce that users in their local copies use filters, and they might edit stuff e.g. in the GitHub UI directly, or with some other git implementation. So if you have such special needs maybe consider hosting your own setup where you can have pre-receive hooks, or within GitHub e.g. enforce that all things must go through merge requests, and have some robot that checks that the content to be merged has gone through filters before being approved. > And the devs honestly try to do their best to remember to configure the > filters, but for some reason they disappear for them, don't ask me why, > I don't know. This is an open source project team, not a work place. *Picks up flag*. For the purposes of us trying to understand this report it would be really useful to boil what's happening down to some step-by-step reproducible recipe. I.e. with some dummy filter configured & not configured how does "git pull" end up breaking, and in this case you're alluding to git simply forgetting config, how would that happen? > There needs to be a way for a project to define content filters w/o > going via user's .gitconfig configuration, since due to git's security > it can't be distributed to all users and must be enabled manually by > each user, which is just not the right solution in this case. > > Of course, I'm open to other suggestions that may help this issue. > > "Tell your developers they must configure the filters" is not it - I > tried it for a long time and in vain. If you look at our install > instructions: https://github.com/fastai/fastai#developer-install > > git clone https://github.com/fastai/fastai > cd fastai > tools/run-after-git-clone > > It already includes an instruction to run a script which enables the > filters, but this doesn't seem to help (and no, it's not a problem with > the script). The devs report that the configuration is there for a > while, and then suddenly it is not there, I don't know why. Perhaps they > make a new clone and forget to re-enable the filters, perhaps they > disable them to clean up and forget to reenable them, I can't tell. FWIW I tried following these on my Debian install and both on python2 and python3 I get either syntax errors or a combo of that and a missing pathlib from that script. I don't know Python well enough to debug it. Maybe that's part of the issue? > The bottom line it sucks and I hope that you can help with offering a > programmatic solution, rather than recommending creating a police > department. 
I think it would be great to have .gitconfig in-repo support, but a lot of security & UI problems need to be surmounted before that would happen, here's some previous ramblings of mine on that issue: https://public-inbox.org/git/?q=87zi6eakkt.fsf%40evledraar.gmail.com It now occurs to me that a rather minimal proof-of-concept version of that would be: 1. On repository clone, detect if HEAD:.gitconfig exists 2. If it does, and we trust $(git config -f -l) enough, emit some advice() output saying the project suggests setting config so-and-so, and that you could run the following one liner(s) to set it if you agree. 3. Once we have that we could eventually nudge our way towards something like what I suggested in the linked threads above, i.e. consuming some subset of that config directly from the repo's HEAD:.gitconfig
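If someone wanted to prototype steps 1 and 2 inside git itself, the detection part might look roughly like the sketch below. This is purely hypothetical code against git's internal APIs; nothing like suggest_in_repo_config() exists today, and the advice wording is made up:

#include "cache.h"
#include "advice.h"

/*
 * Hypothetical sketch: after clone, see whether the project ships a
 * suggested config in HEAD:.gitconfig and, if so, tell the user about
 * it instead of applying anything automatically.
 */
static void suggest_in_repo_config(void)
{
	struct object_id oid;

	if (get_oid("HEAD:.gitconfig", &oid))
		return; /* project suggests no config */

	advise(_("this project ships a suggested configuration; review it with\n"
		 "  git show HEAD:.gitconfig\n"
		 "and apply only the parts you trust with 'git config'"));
}

Keeping it to advice output sidesteps the trust problem entirely: nothing from the repository is executed or applied without the user reading it first.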
Re: What's cooking in git.git (Oct 2018, #02; Sat, 13)
On 2018.10.12 23:53, Junio C Hamano wrote: > * js/remote-archive-v2 (2018-09-28) 4 commits > (merged to 'next' on 2018-10-12 at 5f34377f60) > + archive: allow archive over HTTP(S) with proto v2 > + archive: implement protocol v2 archive command > + archive: use packet_reader for communications > + archive: follow test standards around assertions > > The original implementation of "git archive --remote" more or less > bypassed the transport layer and did not work over http(s). The > version 2 of the protocol is defined to allow going over http(s) as > well as Git native transport. > > Will merge to 'master'. The first two patches (test cleanup and packet_reader refactor) are OK, but the latter two will break the archive command when an old client attempts to talk to a new server (due to the version advertisement problem noted in [1]). Sorry that I didn't catch that these had made it into next. [1]: https://public-inbox.org/git/20180927223314.ga230...@google.com/#t
[RFC] revision: Add --sticky-default option
Hi, here's a long-overdue update of my proposal from August 29:

  [RFC] revision: Don't let ^ cancel out the default

Does this look more acceptable than my first shot?

Thanks, Andreas

--

Some commands like 'log' default to HEAD if no other revisions are
specified on the command line or otherwise. Currently, excludes (^)
cancel out that default, so when a command only excludes revisions
(e.g., 'git log ^origin/master'), it will produce no output.

With the --sticky-default option, the default becomes more "sticky" and
is no longer canceled out by excludes. This is useful in wrappers that
exclude certain revisions: for example, a simple alias l='git log
--sticky-default ^origin/master' will show the revisions between
origin/master and HEAD when invoked without arguments, and 'l foo' will
show the revisions between origin/master and foo.

Signed-off-by: Andreas Gruenbacher
---
 revision.c     | 31 +++++++++++++++++++++++++++----
 revision.h     |  1 +
 t/t4202-log.sh |  6 ++++++
 3 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/revision.c b/revision.c
index e18bd530e..6c93ec74b 100644
--- a/revision.c
+++ b/revision.c
@@ -1159,6 +1159,25 @@ static void add_rev_cmdline(struct rev_info *revs,
 	info->nr++;
 }
 
+static int has_revisions(struct rev_info *revs)
+{
+	struct rev_cmdline_info *info = &revs->cmdline;
+
+	return info->nr != 0;
+}
+
+static int has_interesting_revisions(struct rev_info *revs)
+{
+	struct rev_cmdline_info *info = &revs->cmdline;
+	unsigned int n;
+
+	for (n = 0; n < info->nr; n++) {
+		if (!(info->rev[n].flags & UNINTERESTING))
+			return 1;
+	}
+	return 0;
+}
+
 static void add_rev_cmdline_list(struct rev_info *revs,
 				 struct commit_list *commit_list,
 				 int whence,
@@ -2132,6 +2151,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **argv,
 	} else if (!strcmp(arg, "--children")) {
 		revs->children.name = "children";
 		revs->limited = 1;
+	} else if (!strcmp(arg, "--sticky-default")) {
+		revs->sticky_default = 1;
 	} else if (!strcmp(arg, "--ignore-missing")) {
 		revs->ignore_missing = 1;
 	} else if (!strcmp(arg, "--exclude-promisor-objects")) {
@@ -2319,7 +2340,8 @@ static void NORETURN diagnose_missing_default(const char *def)
  */
 int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct setup_revision_opt *opt)
 {
-	int i, flags, left, seen_dashdash, got_rev_arg = 0, revarg_opt;
+	int i, flags, left, seen_dashdash, revarg_opt;
+	int cancel_default;
 	struct argv_array prune_data = ARGV_ARRAY_INIT;
 	const char *submodule = NULL;
@@ -2401,8 +2423,6 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct setup_revision_opt *opt)
 			argv_array_pushv(&prune_data, argv + i);
 			break;
 		}
-		else
-			got_rev_arg = 1;
 	}
 
 	if (prune_data.argc) {
@@ -2431,7 +2451,10 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct setup_revision_opt *opt)
 		opt->tweak(revs, opt);
 	if (revs->show_merge)
 		prepare_show_merge(revs);
-	if (revs->def && !revs->pending.nr && !revs->rev_input_given && !got_rev_arg) {
+	cancel_default = revs->sticky_default ?
+		has_interesting_revisions(revs) :
+		has_revisions(revs);
+	if (revs->def && !revs->rev_input_given && !cancel_default) {
 		struct object_id oid;
 		struct object *object;
 		struct object_context oc;
diff --git a/revision.h b/revision.h
index 2b30ac270..570fa1a6d 100644
--- a/revision.h
+++ b/revision.h
@@ -92,6 +92,7 @@ struct rev_info {
 
 	unsigned int early_output;
 
+	unsigned int	sticky_default:1;
 	unsigned int	ignore_missing:1,
 			ignore_missing_links:1;
diff --git a/t/t4202-log.sh b/t/t4202-log.sh
index 153a50615..9517a65da 100755
--- a/t/t4202-log.sh
+++ b/t/t4202-log.sh
@@ -213,6 +213,12 @@ test_expect_success 'git show leaves list of commits as given' '
 	test_cmp expect actual
 '
 
+printf "sixth\nfifth\n" > expect
+test_expect_success '--sticky-default ^' '
+	git log --pretty="tformat:%s" --sticky-default ^HEAD~2 > actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'setup case sensitivity tests' '
 	echo case >one &&
 	test_tick &&
-- 
2.17.2
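To make the intended behavior concrete (the --sticky-default option
exists only in this RFC, not in any released git):

    # Today 'git log ^origin/master' prints nothing, because the
    # exclude cancels the implicit HEAD. With this patch the default
    # sticks:
    git log --oneline --sticky-default ^origin/master      # origin/master..HEAD
    git log --oneline --sticky-default ^origin/master foo  # origin/master..foo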
Re: On overriding make variables from the environment...
Hi, SZEDER Gábor wrote: > Our Makefile has lines like these: > > CFLAGS = -g -O2 -Wall > CC = cc > AR = ar > SPATCH = spatch > > Note the use of '=', not '?='. [...] > I'm not sure what to do. I'm fine with updating our 'ci/' scripts to > explicitly respect CC in the environment (either by running 'make > CC=$CC' or by writing $CC into 'config.mak'). Or I could update our > Makefile to use '?=' for specific variables, but: That's a good question. I don't have a strong opinion myself, so I tend to trust larger projects like Linux to have thought this through more, and they use 'CC = cc' as well. So I'd lean toward the updating 'ci/' scripts approach, to do something like make ${CC:+"CC=$CC"} ... (or the equivalent multi-line construction). That also has the bonus of being explicit. Just my two cents, Jonathan
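Spelled out, the "equivalent multi-line construction" in a ci/ script
might look like this (a sketch; the real scripts may pass other targets
or flags):

    # Only override the Makefile's 'CC = cc' default when the CI
    # environment actually exported a compiler choice.
    if test -n "$CC"
    then
        make CC="$CC"
    else
        make
    fi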
[PATCH] upload-pack: clear flags before each v2 request
Suppose a server has the following commit graph:

     A   B
      \ /
       O

We create a client by cloning A from the server with depth 1, and add
many commits to it (so that future fetches span multiple requests due to
lengthy negotiation). If it then fetches B using protocol v2, the fetch
spanning multiple requests, the resulting packfile does not contain O
even though the client did report that A is shallow.

This is because upload_pack_v2() can be called multiple times while
processing the same session. During the 2nd and all subsequent
invocations, some object flags remain from the previous invocations. In
particular, CLIENT_SHALLOW remains, preventing process_shallow() from
adding client-reported shallows to the "shallows" array, and hence
pack-objects is never informed about these client-reported shallows.

Therefore, teach upload_pack_v2() to clear object flags at the start of
each invocation. (One alternative is to reduce or eliminate usage of
object flags in protocol v2, but that doesn't seem feasible because
almost all 8 flags are used pervasively in v2 code.)

Signed-off-by: Jonathan Tan
---
This was noticed by Arturas Moskvinas in [1]. The reproduction steps
given were to repeat a shallow fetch twice in succession, but I found it
easier to write a more understandable test if I made the 2nd fetch an
ordinary fetch. In any case, I also reran the original reproduction
steps, and the fetch completes without error.

This patch doesn't cover the negotiation issue that I mentioned in my
previous reply [2].

[1] https://public-inbox.org/git/CAGY-PBgsG-T3JY=awszwgmpfx+jdx-a1fcv0s6vr067bsqg...@mail.gmail.com/
[2] https://public-inbox.org/git/20181013004356.257709-1-jonathanta...@google.com/
---
 t/t5702-protocol-v2.sh | 25 +++++++++++++++++++++++++
 upload-pack.c          |  5 +++++
 2 files changed, 30 insertions(+)

diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 88a886975d..70b88385ba 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -429,6 +429,31 @@ test_expect_success 'fetch supports include-tag and tag following' '
 	git -C client cat-file -e $(git -C client rev-parse annotated_tag)
 '
 
+test_expect_success 'upload-pack respects client shallows' '
+	rm -rf server client trace &&
+
+	git init server &&
+	test_commit -C server base &&
+	test_commit -C server client_has &&
+
+	git clone --depth=1 "file://$(pwd)/server" client &&
+
+	# Add extra commits to the client so that the whole fetch takes more
+	# than 1 request (due to negotiation)
+	for i in $(seq 1 32)
+	do
+		test_commit -C client c$i
+	done &&
+
+	git -C server checkout -b newbranch base &&
+	test_commit -C server client_wants &&
+
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client -c protocol.version=2 \
+		fetch origin newbranch &&
+	# Ensure that protocol v2 is used
+	grep "git< version 2" trace
+'
+
 # Test protocol v2 with 'http://' transport
 #
 . "$TEST_DIRECTORY"/lib-httpd.sh
diff --git a/upload-pack.c b/upload-pack.c
index 62a1000f44..de7de1de38 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -37,6 +37,9 @@
 #define CLIENT_SHALLOW	(1u << 18)
 #define HIDDEN_REF	(1u << 19)
 
+#define ALL_FLAGS (THEY_HAVE | OUR_REF | WANTED | COMMON_KNOWN | SHALLOW | \
+		   NOT_SHALLOW | CLIENT_SHALLOW | HIDDEN_REF)
+
 static timestamp_t oldest_have;
 
 static int deepen_relative;
@@ -1393,6 +1396,8 @@ int upload_pack_v2(struct repository *r, struct argv_array *keys,
 	enum fetch_state state = FETCH_PROCESS_ARGS;
 	struct upload_pack_data data;
 
+	clear_object_flags(ALL_FLAGS);
+
 	git_config(upload_pack_config, NULL);
 
 	upload_pack_data_init(&data);
-- 
2.19.0.271.gfe8321ec05.dirty
Re: problem with not being able to enforce git content filters
On 2018-10-16 02:17 PM, Ævar Arnfjörð Bjarmason wrote:
[...]
>> We can't use server-side hooks to enforce this because the project is on
>> github.
>
> Ultimately git's a distributed system and we won't ever be able to
> enforce that users in their local copies use filters, and they might
> edit stuff e.g. in the GitHub UI directly, or with some other git
> implementation.

In this particular case github won't be a problem, since for the problem
to appear the filters have to run on the user's side. Editing directly
in the github UI is not a problem.

Just to give a little bit more context to the issue in the first place:
a jupyter notebook is a json file that contains the source code, the
outputs from executing that code and the unique user environment bits.
We want to "git" the source code but not the rest. Otherwise merging is
hell. Hence the stripping. There are other approaches to solve this
problem, besides stripping, but they all require some kind of
pre-processing before committing the file.

> So if you have such special needs maybe consider hosting your own setup
> where you can have pre-receive hooks, or within GitHub e.g. enforce that
> all things must go through merge requests, and have some robot that
> checks that the content to be merged has gone through filters before
> being approved.

Yes, that would be ideal. But I doubt this is going to happen, I'm just
a contributing developer, not the creator/owner of the project. And as I
said this affects anybody who collaborates on jupyter notebooks, not
just in our project. I think there are several million of them on
github.

> *Picks up flag*. For the purposes of us trying to understand this report
> it would be really useful to boil what's happening down to some
> step-by-step reproducible recipe.
>
> I.e. with some dummy filter configured & not configured how does "git
> pull" end up breaking, and in this case you're alluding to git simply
> forgetting config, how would that happen?

The problem of 'git pull' and 'git status' and 'git stash' is presented
in detail here:
https://stackoverflow.com/questions/51883227/git-pull-stash-conflicts-with-a-git-filter

and more here:
https://github.com/kynan/nbstripout/issues/65
https://github.com/jupyter/nbdime/issues/410#issuecomment-412999758

On the problem of the configuration disappearing I sadly have no input.
It doesn't happen to me, and all I get from others is that it first
works, and then it doesn't. Perhaps it has something to do with some of
them using windows. I don't know, sorry.

>> The bottom line is it sucks, and I hope that you can help by offering a
>> programmatic solution, rather than recommending creating a police
>> department.
>
> I think it would be great to have .gitconfig in-repo support, but a lot
> of security & UI problems need to be surmounted before that would
> happen, here's some previous ramblings of mine on that issue:
> https://public-inbox.org/git/?q=87zi6eakkt.fsf%40evledraar.gmail.com
>
> It now occurs to me that a rather minimal proof-of-concept version of
> that would be:
>
> 1. On repository clone, detect if HEAD:.gitconfig exists
>
> 2. If it does, and we trust $(git config -f <that file> -l) enough,
>    emit some advice() output saying the project suggests setting
>    config so-and-so, and that you could run the following one-liner(s)
>    to set it if you agree.
>
> 3. Once we have that we could eventually nudge our way towards
>    something like what I suggested in the linked threads above,
>    i.e. consuming some subset of that config directly from the repo's
>    HEAD:.gitconfig

I like it, Ævar! I feel doing the check and prompting the user on the
first push/commit after clone would be better. It'd be too easy for the
user to skip that step on git clone. In our particular case we want it
where the problem is introduced, which is on commit, and not on clone.
I hope it makes sense.

-- 
Stas Bekman <'>< <'><
https://stasosphere.com https://chestofbooks.com https://experientialsexlab.com
https://stason.org https://stasosphere.com/experience-life/my-books
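For readers following along, the per-clone setup under discussion looks
roughly like this (assuming the nbstripout tool; the filter name is
illustrative). It is exactly these config lines -- not the tracked
.gitattributes file -- that silently go missing for contributors:

    # Strip notebook outputs on commit; pass files through unchanged
    # on checkout.
    git config filter.nbstripout.clean nbstripout
    git config filter.nbstripout.smudge cat
    echo '*.ipynb filter=nbstripout' >>.gitattributes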
Re: [PATCH v4 9/9] Documentation/config: add odb.<name>.promisorRemote
> 1. Teaching partial clone to attempt to fetch missing objects from
>    multiple remotes instead of only one. This is useful because you
>    can have a server that is nearby and cheaper to serve from (some
>    kind of local cache server) that you make requests to first before
>    falling back to the canonical source of objects.

Quoting the above definition of (1) for reference. I think Jonathan
Nieder has covered the relevant points well - I'll just expand on (1).

> So much for the current setup. For (1), I believe you are proposing to
> still have only one effective <remote>, so it doesn't necessarily
> require modifying the extensions.* configuration. Instead, the idea is
> that when trying to access an object, we would follow one of a list of
> steps:
>
> 1. First, check the local object store. If it's there, we're done.
> 2. Second, try alternates --- maybe the object is in one of those!
> 3. Now, try promisor remotes, one at a time, in user-configured order.
>
> In other words, I think that for (1) all we would need is a new
> configuration
>
> [object]
> 	missingObjectRemote = local-cache-remote
> 	missingObjectRemote = origin
>
> The semantics would be that when trying to access a promised object,
> we attempt to fetch from these remotes one at a time, in the order
> specified. We could require that the remote named in
> extensions.partialClone be one of the listed remotes, without having
> to care where it shows up in the list.

Or allow extensions.partialClone=<remote> wherein <remote> is not in the
missingObjectRemote list, in which case <remote> is tried first, so that
we don't have to reject some configurations.

> That way, we get the benefit (1) without having to change the
> semantics of extensions.partialClone and without having to care about
> the order of sections in the config. What do you think?

Let's define the promisor remotes of a repository as those in
missingObjectRemote or extensions.partialClone (currently, we talk about
"the promisor remote" (singular), defined in extensions.partialClone).

Overall, this seems like a reasonable idea to me, if we keep the
restriction that we can only fetch with filter from a promisor remote.
This allows us to extend the definition of a promisor object in a manner
consistent with the current definition - to say "a promisor object is
one promised by at least one promisor remote" (currently, "a promisor
object is promised by the promisor remote").

In the presence of missingObjectRemote, old versions of Git, when lazily
fetching, would only know to try the extensions.partialClone remote. But
this is safe because existing data wouldn't be clobbered (since we're
not using ideas like adding meaning to the contents of the .promisor
file). Also, other things like fsck and gc still work.
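Concretely, under the proposal a repository would be set up something
like this (object.missingObjectRemote is a proposed key, not one any
released git understands):

    # Try the nearby cache first when fetching a missing object, then
    # fall back to the canonical remote; the --add order is the
    # fallback order.
    git config extensions.partialClone origin
    git config --add object.missingObjectRemote local-cache-remote
    git config --add object.missingObjectRemote origin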
Re: On overriding make variables from the environment...
On Tue, Oct 16, 2018 at 02:54:56PM -0700, Jonathan Nieder wrote:
> SZEDER Gábor wrote:
> > Our Makefile has lines like these:
> >
> >   CFLAGS = -g -O2 -Wall
> >   CC = cc
> >   AR = ar
> >   SPATCH = spatch
> >
> > Note the use of '=', not '?='.
> [...]
> > I'm not sure what to do. I'm fine with updating our 'ci/' scripts to
> > explicitly respect CC in the environment (either by running 'make
> > CC=$CC' or by writing $CC into 'config.mak'). Or I could update our
> > Makefile to use '?=' for specific variables, but:
>
> That's a good question. I don't have a strong opinion myself, so I
> tend to trust larger projects like Linux to have thought this through
> more, and they use 'CC = cc' as well.

I don't think Linux is a good example to follow in this case, see e.g.
6d62c983f7 (Makefile: Change the default compiler from "gcc" to "cc",
2011-12-20) (in short: Git is supposed to be buildable with compilers
other than GCC as well, while Linux not really, so they can hardcode
'CC = gcc').

Also, the projects I have on hand usually have 'CC = gcc' hardcoded in
their Makefiles, too, but those Makefiles were generated by their
./configure script (which was in turn generated by ./autogen.sh...),
and those tend to write CC from the environment into the generated
Makefiles.
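The other alternative mentioned above, writing $CC into config.mak,
could be sketched like this (assuming, as I read the Makefile, that
config.mak is included after the defaults, so plain '=' assignments
there win):

    # Record the CI's compiler choice once; every subsequent bare
    # 'make' then picks it up without command-line overrides.
    if test -n "$CC"
    then
        echo "CC = $CC" >>config.mak
    fi
    make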
[PATCH v4 2/7] test-reach: add run_three_modes method
From: Derrick Stolee

The 'test_three_modes' method assumes we are using the 'test-tool reach'
command for our test. However, we may want to use the data shape of our
commit graph and the three modes (no commit-graph, full commit-graph,
partial commit-graph) for other git commands.

Split test_three_modes into a thin wrapper around a more general
run_three_modes method that executes the given command and compares the
actual output to the expected output.

Signed-off-by: Derrick Stolee
---
 t/t6600-test-reach.sh | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index d139a00d1d..9d65b8b946 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -53,18 +53,22 @@ test_expect_success 'setup' '
 	git config core.commitGraph true
 '
 
-test_three_modes () {
+run_three_modes () {
 	test_when_finished rm -rf .git/objects/info/commit-graph &&
-	test-tool reach $1 <input >actual &&
+	"$@" <input >actual &&
 	test_cmp expect actual &&
 	cp commit-graph-full .git/objects/info/commit-graph &&
-	test-tool reach $1 <input >actual &&
+	"$@" <input >actual &&
 	test_cmp expect actual &&
 	cp commit-graph-half .git/objects/info/commit-graph &&
-	test-tool reach $1 <input >actual &&
+	"$@" <input >actual &&
 	test_cmp expect actual
 }
 
+test_three_modes () {
+	run_three_modes test-tool reach "$@"
+}
+
 test_expect_success 'ref_newer:miss' '
 	cat >input <<-\EOF &&
 	A:commit-5-7
-- 
gitgitgadget
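The payoff shows up in the next patch. A condensed sketch of the kind of
test this enables (the real tests in patch 3/7 spell out the full
expected commit list):

    # Any git command's output can now be verified in all three
    # commit-graph modes with a single assertion.
    git rev-parse commit-6-6 commit-5-6 >expect &&
    run_three_modes git rev-list --topo-order -2 commit-6-6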
[PATCH v4 4/7] revision.c: begin refactoring --topo-order logic
From: Derrick Stolee

When running 'git rev-list --topo-order' and its kin, the topo_order
setting in struct rev_info implies the limited setting. This means that
the following things happen during prepare_revision_walk():

* revs->limited implies we run limit_list() to walk the entire reachable
  set. There are some short-cuts here, such as if we perform a range
  query like 'git rev-list COMPARE..HEAD' and we can stop limit_list()
  when all queued commits are uninteresting.

* revs->topo_order implies we run sort_in_topological_order(). See the
  implementation of that method in commit.c. It implies that the full
  set of commits to order is in the given commit_list.

These two methods imply that a 'git rev-list --topo-order HEAD' command
must walk the entire reachable set of commits _twice_ before returning a
single result.

If we have a commit-graph file with generation numbers computed, then
there is a better way. This patch introduces some necessary logic
redirection when we are in this situation.

In v2.18.0, the commit-graph file contains zero-valued bytes in the
positions where the generation number is stored in v2.19.0 and later.
Thus, we use generation_numbers_enabled() to check if the commit-graph
is available and has non-zero generation numbers.

When setting revs->limited only because revs->topo_order is true, only
do so if generation numbers are not available. There is no reason to use
the new logic as it will behave similarly when all generation numbers
are INFINITY or ZERO.

In prepare_revision_walk(), if we have revs->topo_order but not
revs->limited, then we trigger the new logic. It breaks the logic into
three pieces, to fit with the existing framework:

1. init_topo_walk() fills a new struct topo_walk_info in the rev_info
   struct. We use the presence of this struct as a signal to use the new
   methods during our walk. In this patch, this method simply calls
   limit_list() and sort_in_topological_order(). In the future, this
   method will set up a new data structure to perform that logic
   in-line.

2. next_topo_commit() provides get_revision_1() with the next
   topo-ordered commit in the list. Currently, this simply pops the
   commit from revs->commits.

3. expand_topo_walk() provides get_revision_1() with a way to signal
   walking beyond the latest commit. Currently, this calls
   add_parents_to_list() exactly like the old logic.

While this commit presents method redirection for performing the exact
same logic as before, it allows the next commit to focus only on the new
logic.
Signed-off-by: Derrick Stolee --- revision.c | 42 ++ revision.h | 4 2 files changed, 42 insertions(+), 4 deletions(-) diff --git a/revision.c b/revision.c index e18bd530e4..2dcde8a8ac 100644 --- a/revision.c +++ b/revision.c @@ -25,6 +25,7 @@ #include "worktree.h" #include "argv-array.h" #include "commit-reach.h" +#include "commit-graph.h" volatile show_early_output_fn_t show_early_output; @@ -2454,7 +2455,7 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s if (revs->diffopt.objfind) revs->simplify_history = 0; - if (revs->topo_order) + if (revs->topo_order && !generation_numbers_enabled(the_repository)) revs->limited = 1; if (revs->prune_data.nr) { @@ -2892,6 +2893,33 @@ static int mark_uninteresting(const struct object_id *oid, return 0; } +struct topo_walk_info {}; + +static void init_topo_walk(struct rev_info *revs) +{ + struct topo_walk_info *info; + revs->topo_walk_info = xmalloc(sizeof(struct topo_walk_info)); + info = revs->topo_walk_info; + memset(info, 0, sizeof(struct topo_walk_info)); + + limit_list(revs); + sort_in_topological_order(&revs->commits, revs->sort_order); +} + +static struct commit *next_topo_commit(struct rev_info *revs) +{ + return pop_commit(&revs->commits); +} + +static void expand_topo_walk(struct rev_info *revs, struct commit *commit) +{ + if (add_parents_to_list(revs, commit, &revs->commits, NULL) < 0) { + if (!revs->ignore_missing_links) + die("Failed to traverse parents of commit %s", + oid_to_hex(&commit->object.oid)); + } +} + int prepare_revision_walk(struct rev_info *revs) { int i; @@ -2928,11 +2956,13 @@ int prepare_revision_walk(struct rev_info *revs) commit_list_sort_by_date(&revs->commits); if (revs->no_walk) return 0; - if (revs->limited) + if (revs->limited) { if (limit_list(revs) < 0) return -1; - if (revs->topo_order) - sort_in_topological_order(&revs->commits, revs->sort_order); + if (revs->topo_order) + sort_in_topological_order(&revs->commits, revs->sort_order); + } else if (revs->topo_order) +
[PATCH v4 6/7] revision.c: generation-based topo-order algorithm
From: Derrick Stolee

The current --topo-order algorithm requires walking all reachable
commits up front, topo-sorting them, all before outputting the first
value. This patch introduces a new algorithm which uses stored
generation numbers to incrementally walk in topo-order, outputting
commits as we go. This can dramatically reduce the computation time to
write a fixed number of commits, such as when limiting with "-n <N>" or
filling the first page of a pager.

When running a command like 'git rev-list --topo-order HEAD', Git
performed the following steps:

1. Run limit_list(), which parses all reachable commits, adds them to a
   linked list, and distributes UNINTERESTING flags. If all unprocessed
   commits are UNINTERESTING, then it may terminate without walking all
   reachable commits. This does not occur if we do not specify
   UNINTERESTING commits.

2. Run sort_in_topological_order(), which is an implementation of Kahn's
   algorithm. It first iterates through the entire set of important
   commits and computes the in-degree of each (plus one, as we use
   'zero' as a special value here). Then, we walk the commits in
   priority order, adding them to the priority queue if and only if
   their in-degree is one. As we remove commits from this priority
   queue, we decrement the in-degree of their parents.

3. While we are peeling commits for output, get_revision_1() uses
   pop_commit on the full list of commits computed by
   sort_in_topological_order().

In the new algorithm, these three steps correspond to three different
commit walks. We run these walks simultaneously, and advance each only
as far as necessary to satisfy the requirements of the 'higher order'
walk. We know when we can pause each walk by using generation numbers
from the commit-graph feature.

Recall that the generation number of a commit satisfies:

* If the commit has at least one parent, then the generation number is
  one more than the maximum generation number among its parents.

* If the commit has no parent, then the generation number is one.

There are two special generation numbers:

* GENERATION_NUMBER_INFINITY: this value is 0xFFFFFFFF and indicates
  that the commit is not stored in the commit-graph and the generation
  number was not previously calculated.

* GENERATION_NUMBER_ZERO: this value (0) is a special indicator to say
  that the commit-graph was generated by a version of Git that does not
  compute generation numbers (such as v2.18.0).

Since we use generation_numbers_enabled() before using the new
algorithm, we do not need to worry about GENERATION_NUMBER_ZERO.
However, the existence of GENERATION_NUMBER_INFINITY implies the
following weaker statement than the usual one we expect from generation
numbers:

    If A and B are commits with generation numbers gen(A) and gen(B)
    and gen(A) < gen(B), then A cannot reach B.

Thus, we will walk in each of our stages until the "maximum unexpanded
generation number" is strictly lower than the generation number of a
commit we are about to use. The walks are as follows:

1. EXPLORE: using the explore_queue priority queue (ordered by
   maximizing the generation number), parse each reachable commit until
   all commits in the queue have generation number strictly lower than
   needed. During this walk, update the UNINTERESTING flags as
   necessary.

2. INDEGREE: using the indegree_queue priority queue (ordered by
   maximizing the generation number), add one to the in-degree of each
   parent for each commit that is walked. Since we walk in order of
   decreasing generation number, we know that discovering an in-degree
   value of 0 means the value for that commit was not initialized, so
   should be initialized to two. (Recall that in-degree value "1" is
   what we use to say a commit is ready for output.) As we iterate the
   parents of a commit during this walk, ensure the EXPLORE walk has
   walked beyond their generation numbers.

3. TOPO: using the topo_queue priority queue (ordered based on the
   sort_order given, which could be commit-date, author-date, or typical
   topo-order which treats the queue as a LIFO stack), remove a commit
   from the queue and decrement the in-degree of each parent. If a
   parent has an in-degree of one, then we add it to the topo_queue.
   Before we decrement the in-degree, however, ensure the INDEGREE walk
   has walked beyond that generation number.

The implementations of these walks are in the following methods:

* explore_walk_step and explore_to_depth
* indegree_walk_step and compute_indegrees_to_depth
* next_topo_commit and expand_topo_walk

These methods have some patterns that may seem strange at first, but
they are probably carry-overs from their equivalents in limit_list and
sort_in_topological_order.

One thing that is missing from this implementation is a proper way to
stop walking when the entire queue is UNINTERESTING, so this
implementation is not enabled by comparisons, such as in 'git rev-list
--t
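The INDEGREE and TOPO walks above are Kahn's algorithm split into
incremental pieces. As a toy refresher on the underlying idea, here is
coreutils tsort(1) doing a whole-graph topological sort in one go, which
is what the old code effectively did; the edges are made up, and note
the edge direction is the reverse of git's newest-first output:

    # Each input pair is an edge 'parent child'; tsort prints every
    # parent before any of its children, e.g. O before A and B, and
    # A and B before C.
    printf '%s\n' 'O A' 'O B' 'A C' 'B C' | tsort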
[PATCH v4 3/7] test-reach: add rev-list tests
From: Derrick Stolee The rev-list command is critical to Git's functionality. Ensure it works in the three commit-graph environments constructed in t6600-test-reach.sh. Here are a few important types of rev-list operations: * Basic: git rev-list --topo-order HEAD * Range: git rev-list --topo-order compare..HEAD * Ancestry: git rev-list --topo-order --ancestry-path compare..HEAD * Symmetric Difference: git rev-list --topo-order compare...HEAD Signed-off-by: Derrick Stolee --- t/t6600-test-reach.sh | 84 +++ 1 file changed, 84 insertions(+) diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh index 9d65b8b946..288f703b7b 100755 --- a/t/t6600-test-reach.sh +++ b/t/t6600-test-reach.sh @@ -243,4 +243,88 @@ test_expect_success 'commit_contains:miss' ' test_three_modes commit_contains --tag ' +test_expect_success 'rev-list: basic topo-order' ' + git rev-parse \ + commit-6-6 commit-5-6 commit-4-6 commit-3-6 commit-2-6 commit-1-6 \ + commit-6-5 commit-5-5 commit-4-5 commit-3-5 commit-2-5 commit-1-5 \ + commit-6-4 commit-5-4 commit-4-4 commit-3-4 commit-2-4 commit-1-4 \ + commit-6-3 commit-5-3 commit-4-3 commit-3-3 commit-2-3 commit-1-3 \ + commit-6-2 commit-5-2 commit-4-2 commit-3-2 commit-2-2 commit-1-2 \ + commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 commit-1-1 \ + >expect && + run_three_modes git rev-list --topo-order commit-6-6 +' + +test_expect_success 'rev-list: first-parent topo-order' ' + git rev-parse \ + commit-6-6 \ + commit-6-5 \ + commit-6-4 \ + commit-6-3 \ + commit-6-2 \ + commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 commit-1-1 \ + >expect && + run_three_modes git rev-list --first-parent --topo-order commit-6-6 +' + +test_expect_success 'rev-list: range topo-order' ' + git rev-parse \ + commit-6-6 commit-5-6 commit-4-6 commit-3-6 commit-2-6 commit-1-6 \ + commit-6-5 commit-5-5 commit-4-5 commit-3-5 commit-2-5 commit-1-5 \ + commit-6-4 commit-5-4 commit-4-4 commit-3-4 commit-2-4 commit-1-4 \ + commit-6-3 commit-5-3 commit-4-3 \ + commit-6-2 commit-5-2 commit-4-2 \ + commit-6-1 commit-5-1 commit-4-1 \ + >expect && + run_three_modes git rev-list --topo-order commit-3-3..commit-6-6 +' + +test_expect_success 'rev-list: range topo-order' ' + git rev-parse \ + commit-6-6 commit-5-6 commit-4-6 \ + commit-6-5 commit-5-5 commit-4-5 \ + commit-6-4 commit-5-4 commit-4-4 \ + commit-6-3 commit-5-3 commit-4-3 \ + commit-6-2 commit-5-2 commit-4-2 \ + commit-6-1 commit-5-1 commit-4-1 \ + >expect && + run_three_modes git rev-list --topo-order commit-3-8..commit-6-6 +' + +test_expect_success 'rev-list: first-parent range topo-order' ' + git rev-parse \ + commit-6-6 \ + commit-6-5 \ + commit-6-4 \ + commit-6-3 \ + commit-6-2 \ + commit-6-1 commit-5-1 commit-4-1 \ + >expect && + run_three_modes git rev-list --first-parent --topo-order commit-3-8..commit-6-6 +' + +test_expect_success 'rev-list: ancestry-path topo-order' ' + git rev-parse \ + commit-6-6 commit-5-6 commit-4-6 commit-3-6 \ + commit-6-5 commit-5-5 commit-4-5 commit-3-5 \ + commit-6-4 commit-5-4 commit-4-4 commit-3-4 \ + commit-6-3 commit-5-3 commit-4-3 \ + >expect && + run_three_modes git rev-list --topo-order --ancestry-path commit-3-3..commit-6-6 +' + +test_expect_success 'rev-list: symmetric difference topo-order' ' + git rev-parse \ + commit-6-6 commit-5-6 commit-4-6 \ + commit-6-5 commit-5-5 commit-4-5 \ + commit-6-4 commit-5-4 commit-4-4 \ + commit-6-3 commit-5-3 commit-4-3 \ + commit-6-2 commit-5-2 commit-4-2 \ + commit-6-1 commit-5-1 commit-4-1 \ + commit-3-8 commit-2-8 commit-1-8 \ + commit-3-7 commit-2-7 
commit-1-7 \ + >expect && + run_three_modes git rev-list --topo-order commit-3-8...commit-6-6 +' + test_done -- gitgitgadget
[PATCH v4 1/7] prio-queue: add 'peek' operation
From: Derrick Stolee When consuming a priority queue, it can be convenient to inspect the next object that will be dequeued without actually dequeueing it. Our existing library did not have such a 'peek' operation, so add it as prio_queue_peek(). Add a reference-level comparison in t/helper/test-prio-queue.c so this method is exercised by t0009-prio-queue.sh. Further, add a test that checks the behavior when the compare function is NULL (i.e. the queue becomes a stack). Signed-off-by: Derrick Stolee --- prio-queue.c | 9 + prio-queue.h | 6 ++ t/helper/test-prio-queue.c | 26 ++ t/t0009-prio-queue.sh | 14 ++ 4 files changed, 47 insertions(+), 8 deletions(-) diff --git a/prio-queue.c b/prio-queue.c index a078451872..d3f488cb05 100644 --- a/prio-queue.c +++ b/prio-queue.c @@ -85,3 +85,12 @@ void *prio_queue_get(struct prio_queue *queue) } return result; } + +void *prio_queue_peek(struct prio_queue *queue) +{ + if (!queue->nr) + return NULL; + if (!queue->compare) + return queue->array[queue->nr - 1].data; + return queue->array[0].data; +} diff --git a/prio-queue.h b/prio-queue.h index d030ec9dd6..682e51867a 100644 --- a/prio-queue.h +++ b/prio-queue.h @@ -46,6 +46,12 @@ extern void prio_queue_put(struct prio_queue *, void *thing); */ extern void *prio_queue_get(struct prio_queue *); +/* + * Gain access to the "thing" that would be returned by + * prio_queue_get, but do not remove it from the queue. + */ +extern void *prio_queue_peek(struct prio_queue *); + extern void clear_prio_queue(struct prio_queue *); /* Reverse the LIFO elements */ diff --git a/t/helper/test-prio-queue.c b/t/helper/test-prio-queue.c index 9807b649b1..5bc9c46ea5 100644 --- a/t/helper/test-prio-queue.c +++ b/t/helper/test-prio-queue.c @@ -22,14 +22,24 @@ int cmd__prio_queue(int argc, const char **argv) struct prio_queue pq = { intcmp }; while (*++argv) { - if (!strcmp(*argv, "get")) - show(prio_queue_get(&pq)); - else if (!strcmp(*argv, "dump")) { - int *v; - while ((v = prio_queue_get(&pq))) - show(v); - } - else { + if (!strcmp(*argv, "get")) { + void *peek = prio_queue_peek(&pq); + void *get = prio_queue_get(&pq); + if (peek != get) + BUG("peek and get results do not match"); + show(get); + } else if (!strcmp(*argv, "dump")) { + void *peek; + void *get; + while ((peek = prio_queue_peek(&pq))) { + get = prio_queue_get(&pq); + if (peek != get) + BUG("peek and get results do not match"); + show(get); + } + } else if (!strcmp(*argv, "stack")) { + pq.compare = NULL; + } else { int *v = malloc(sizeof(*v)); *v = atoi(*argv); prio_queue_put(&pq, v); diff --git a/t/t0009-prio-queue.sh b/t/t0009-prio-queue.sh index e56dfce668..3941ad2528 100755 --- a/t/t0009-prio-queue.sh +++ b/t/t0009-prio-queue.sh @@ -47,4 +47,18 @@ test_expect_success 'notice empty queue' ' test_cmp expect actual ' +cat >expect <<'EOF' +3 +2 +6 +4 +5 +1 +8 +EOF +test_expect_success 'stack order' ' + test-tool prio-queue stack 8 1 5 4 6 2 3 dump >actual && + test_cmp expect actual +' + test_done -- gitgitgadget
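A quick way to poke at the new 'stack' mode by hand from a built git
checkout (a sketch; the exact helper path assumes the usual test-tool
build location):

    # With no comparison function the prio-queue degrades to LIFO:
    # pushing 1 2 3 and dumping prints 3, then 2, then 1. The helper
    # also asserts internally that each peek matches the following get.
    t/helper/test-tool prio-queue stack 1 2 3 dump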
[PATCH v4 0/7] Use generation numbers for --topo-order
This patch series performs a decently-sized refactoring of the
revision-walk machinery. Well, "refactoring" is probably the wrong word,
as I don't actually remove the old code. Instead, when we see certain
options in the 'rev_info' struct, we redirect the commit-walk logic to a
new set of methods that distribute the workload differently.

By using generation numbers in the commit-graph, we can significantly
improve 'git log --graph' commands (and the underlying 'git rev-list
--topo-order'). On the Linux repository, I got the following performance
results when comparing to the previous version with or without a
commit-graph:

    Test: git rev-list --topo-order -100 HEAD
    HEAD~1, no commit-graph: 6.80 s
    HEAD~1, w/ commit-graph: 0.77 s
    HEAD,   w/ commit-graph: 0.02 s

    Test: git rev-list --topo-order -100 HEAD -- tools
    HEAD~1, no commit-graph: 9.63 s
    HEAD~1, w/ commit-graph: 6.06 s
    HEAD,   w/ commit-graph: 0.06 s

If you want to read this series but are unfamiliar with the commit-graph
and generation numbers, then I recommend reading
Documentation/technical/commit-graph.txt or a blog post [1] I wrote on
the subject. In particular, the three-part walk described in
"revision.c: refactor basic topo-order logic" is present (but
underexplained) as an animated PNG [2].

Since revision.c is an incredibly important (and old) portion of the
codebase -- and because there are so many orthogonal options in 'struct
rev_info' -- I consider this submission to be "RFC quality". That is, I
am not confident that I am not missing anything, or that my solution is
the best it can be.

I did merge this branch with ds/commit-graph-with-grafts and the
"DO-NOT-MERGE: write and read commit-graph always" commit that computes
a commit-graph with every 'git commit' command. The test suite passed
with that change, available on GitHub [3].

To ensure that I cover at least the cases I think are interesting, I
added tests to t6600-test-reach.sh to verify the walks report the
correct results for the three cases there (no commit-graph, full
commit-graph, and a partial commit-graph so the walk starts at
GENERATION_NUMBER_INFINITY).

One notable case that is not included in this series is the case of a
history comparison such as 'git rev-list --topo-order A..B'. The
existing code in limit_list() has ways to cut the walk short when all
pending commits are UNINTERESTING. Since this code depends on
commit_list instead of the prio_queue we are using here, I chose to
leave it untouched for now. We can revisit it in a separate series
later. Since handle_commit() turns on revs->limited when a commit is
UNINTERESTING, we do not hit the new code in this case. Removing this
'revs->limited = 1;' line yields correct results, but the performance is
worse.

This series was based on ds/reachable, but is now based on 'master' to
not conflict with 182070 "commit: use timestamp_t for author_date_slab".
There is a small conflict with md/filter-trees, because it renamed a
flag in revision.h in the line before I add new flags. Hopefully this
conflict is not too difficult to resolve.

Changes in V3: I added a new patch that updates the tab-alignment for
flags in revision.h before adding new ones (Thanks, Ævar!). Also, I
squashed the recommended changes to run_three_modes and test_three_modes
from Szeder and Junio. Thanks!

Changes in V4: I'm sending a V4 to respond to the feedback so far. Still
looking forward to more on the really big commit!

* Removed the whitespace changes to the flags in revision.c that caused
  merge pain.

* The prio-queue peek function is now covered by tests when in "stack"
  mode.

* The "add_parents_to_list()" function is now renamed to
  "process_parents()".

* Added a new commit that expands test coverage with alternate orders
  and file history (use GIT_TEST_COMMIT_GRAPH to have
  t6012-rev-list-simplify.sh cover the new logic). These tests found a
  problem with author dates (I forgot to record them during the explore
  walk).

* Commit message edits.

Thanks,
-Stolee

[1] https://blogs.msdn.microsoft.com/devops/2018/07/09/supercharging-the-git-commit-graph-iii-generations/
    Supercharging the Git Commit Graph III: Generations and Graph Algorithms

[2] https://msdnshared.blob.core.windows.net/media/2018/06/commit-graph-topo-order-b-a.png
    Animation showing three-part walk

[3] https://github.com/derrickstolee/git/tree/topo-order/test
    A branch containing this series along with commits to compute
    commit-graph in the entire test suite.

Cc: avarab@gmail.com
Cc: szeder@gmail.com

Derrick Stolee (7):
  prio-queue: add 'peek' operation
  test-reach: add run_three_modes method
  test-reach: add rev-list tests
  revision.c: begin refactoring --topo-order logic
  commit/revisions: bookkeeping before refactoring
  revision.c: generation-based topo-order algorithm
  t6012: make rev-list tests more interesting

 commit.c | 11 +-
 commit.h | 8 ++
 object.h
[PATCH v4 7/7] t6012: make rev-list tests more interesting
From: Derrick Stolee

As we are working to rewrite some of the revision-walk machinery, there
could easily be some interesting interactions between the options that
force topological constraints (--topo-order, --date-order, and
--author-date-order) and specifying a path.

Add extra tests to t6012-rev-list-simplify.sh to add coverage of these
interactions. To ensure interesting things occur, alter the repo data
shape to have different orders depending on topo-, date-, or
author-date-order.

When testing using GIT_TEST_COMMIT_GRAPH, this assists in covering the
new logic for topo-order walks using generation numbers.

The extra tests can be added independently.

Signed-off-by: Derrick Stolee
---
 t/t6012-rev-list-simplify.sh | 45 ++++++++++++++++++++++++++++--------
 1 file changed, 36 insertions(+), 9 deletions(-)

diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh
index b5a1190ffe..a10f0df02b 100755
--- a/t/t6012-rev-list-simplify.sh
+++ b/t/t6012-rev-list-simplify.sh
@@ -12,6 +12,22 @@ unnote () {
 	git name-rev --tags --stdin | sed -e "s|$OID_REGEX (tags/\([^)]*\)) |\1 |g"
 }
 
+#
+# Create a test repo with interesting commit graph:
+#
+# A--B--G--H--I--K--L
+#  \  \    /    /
+#   \  \  /    /
+#    C--E---F  J
+#     \_/
+#
+# The commits are laid out from left-to-right starting with
+# the root commit A and terminating at the tip commit L.
+#
+# There are a few places where we adjust the commit date or
+# author date to make the --topo-order, --date-order, and
+# --author-date-order flags produce different output.
+
 test_expect_success setup '
 	echo "Hi there" >file &&
 	echo "initial" >lost &&
@@ -21,10 +37,18 @@ test_expect_success setup '
 
 	git branch other-branch &&
 
+	git symbolic-ref HEAD refs/heads/unrelated &&
+	git rm -f "*" &&
+	echo "Unrelated branch" >side &&
+	git add side &&
+	test_tick && git commit -m "Side root" &&
+	note J &&
+	git checkout master &&
+
 	echo "Hello" >file &&
 	echo "second" >lost &&
 	git add file lost &&
-	test_tick && git commit -m "Modified file and lost" &&
+	test_tick && GIT_AUTHOR_DATE=$(($test_tick + 120)) git commit -m "Modified file and lost" &&
 	note B &&
 
 	git checkout other-branch &&
@@ -63,13 +87,6 @@ test_expect_success setup '
 	test_tick && git commit -a -m "Final change" &&
 	note I &&
 
-	git symbolic-ref HEAD refs/heads/unrelated &&
-	git rm -f "*" &&
-	echo "Unrelated branch" >side &&
-	git add side &&
-	test_tick && git commit -m "Side root" &&
-	note J &&
-	git checkout master &&
 	test_tick && git merge --allow-unrelated-histories -m "Coolest" unrelated &&
 	note K &&
@@ -103,14 +120,24 @@ check_result () {
 	check_outcome success "$@"
 }
 
-check_result 'L K J I H G F E D C B A' --full-history
+check_result 'L K J I H F E D C G B A' --full-history --topo-order
+check_result 'L K I H G F E D C B J A' --full-history
+check_result 'L K I H G F E D C B J A' --full-history --date-order
+check_result 'L K I H G F E D B C J A' --full-history --author-date-order
 check_result 'K I H E C B A' --full-history -- file
 check_result 'K I H E C B A' --full-history --topo-order -- file
 check_result 'K I H E C B A' --full-history --date-order -- file
+check_result 'K I H E B C A' --full-history --author-date-order -- file
 check_result 'I E C B A' --simplify-merges -- file
+check_result 'I E C B A' --simplify-merges --topo-order -- file
+check_result 'I E C B A' --simplify-merges --date-order -- file
+check_result 'I E B C A' --simplify-merges --author-date-order -- file
 check_result 'I B A' -- file
 check_result 'I B A' --topo-order -- file
+check_result 'I B A' --date-order -- file
+check_result 'I B A' --author-date-order -- file
 check_result 'H' --first-parent -- another-file
+check_result 'H' --first-parent --topo-order -- another-file
 check_result 'E C B A' --full-history E -- lost
 
 test_expect_success 'full history simplification without parent' '
-- 
gitgitgadget