Re: ordered string-list considered harmful, was Re: [PATCH v3] Allow aliases that include other aliases
On Fri, Sep 07 2018, Jonathan Nieder wrote:

> Jeff King wrote:
>
>> I don't see any point in generating a sorted list and _then_ making an
>> auxiliary hashmap. My idea was that if you're using a sorted string-list
>> for lookup, then you can replace the whole thing with a hash (inserting
>> as you go, rather than sorting at the end).
>
> What if I'm sorting a string list in preparation for emitting a sorted
> list, and I *also* want to perform lookups in that same list? In
> other words:

If this turns out to be a common use-case perhaps the easiest way to
support that would be to make the hashmap (optionally?) ordered, as Ruby
1.9 did with their hash implementation:
https://www.igvita.com/2009/02/04/ruby-19-internals-ordered-hash/
Re: ordered string-list considered harmful, was Re: [PATCH v3] Allow aliases that include other aliases
Ævar Arnfjörð Bjarmason wrote:
> On Fri, Sep 07 2018, Jonathan Nieder wrote:
>> Jeff King wrote:
>>> I don't see any point in generating a sorted list and _then_ making an
>>> auxiliary hashmap. My idea was that if you're using a sorted string-list
>>> for lookup, then you can replace the whole thing with a hash (inserting
>>> as you go, rather than sorting at the end).
>>
>> What if I'm sorting a string list in preparation for emitting a sorted
>> list, and I *also* want to perform lookups in that same list? In
>> other words:
>
> If this turns out to be a common use-case perhaps the easiest way to
> support that would be to make the hashmap (optionally?) ordered, as Ruby
> 1.9 did with their hash implementation:
> https://www.igvita.com/2009/02/04/ruby-19-internals-ordered-hash/

That's about recording the order of insertion. I'm talking about
something much simpler: sorting an array (as preparation for emitting
it) and binary searching to find an entry in that array.
Re: [PATCH 14/14] format-patch: allow --range-diff to apply to a lone-patch
On Wed, Jul 25, 2018 at 5:07 PM Junio C Hamano wrote:
> Eric Sunshine writes:
>> +	if (cmit_fmt_is_mail(ctx.fmt) && opt->rdiff1) {
>> +		struct diff_queue_struct dq;
>> +
>> +		memcpy(&dq, &diff_queued_diff, sizeof(diff_queued_diff));
>> +		DIFF_QUEUE_CLEAR(&diff_queued_diff);
>> +
>> +		next_commentary_block(opt, NULL);
>> +		fprintf_ln(opt->diffopt.file, "%s", opt->rdiff_title);
>> +		show_range_diff(opt->rdiff1, opt->rdiff2,
>> +				opt->creation_factor, 1, &opt->diffopt);
>> +
>> +		memcpy(&diff_queued_diff, &dq, sizeof(diff_queued_diff));
>> +	}
>> }
>
> This essentially repeats what is already done for "interdiff".

Yes, the two blocks are very similar, although they access different
members of 'rev_info' and call different functions to perform the actual
diff. I explored ways of avoiding the repeated boilerplate (using macros
or passing a function pointer to a driver function containing the
boilerplate), but the end result was uglier and harder to understand due
to the added abstraction. Introducing next_commentary_block()[1] reduced
the repetition a bit.

> Does the global diff_queued_diff get cleaned up when show_interdiff()
> and show_range_diff() return, like diff_flush() does? Otherwise we'd
> be leaking the filepairs accumulated in the diff_queued_diff.

Both show_interdiff() and show_range_diff() call diff_flush(). So, the
"temporary" diff_queued_diff set up here does get cleaned up by
diff_flush(), as far as I understand (though this is my first foray into
the diff-generation code, so I may be missing something). And, the
diff_queued_diff which gets "interrupted" by this excursion into
interdiff/range-diff gets cleaned up normally when the interrupted diff
operation completes.

[1]: https://public-inbox.org/git/20180722095717.17912-6-sunsh...@sunshineco.com/
Re: [PATCH 11/14] format-patch: extend --range-diff to accept revision range
On Wed, Jul 25, 2018 at 4:53 PM Junio C Hamano wrote: > Eric Sunshine writes: > > When submitting a revised a patch series, the --range-diff option embeds > > a range-diff in the cover letter showing changes since the previous > > version of the patch series. The argument to --range-diff is a simple > > revision naming the tip of the previous series [...] > > > > However, it fails if the revision ranges of the old and new versions of > > the series are disjoint. To address this shortcoming, extend > > --range-diff to also accept an explicit revision range for the previous > > series. [...] > > > static void infer_range_diff_ranges(struct strbuf *r1, > > { > > - strbuf_addf(r1, "%s..%s", head_oid, prev); > > - strbuf_addf(r2, "%s..%s", prev, head_oid); > > I thought "git range-diff" also took the three-dot notation as a > short-hand but there is no point using that in this application, > which wants the RHS and the LHS range in separate output variables. Correct. > I actually would feel better to see the second range for the normal > case to be computed exactly the same way, i.e. > > int prev_is_range = strstr(prev, ".."); > > if (prev_is_range) > strbuf_addstr(r1, prev); > else > strbuf_addf(r1, "%s..%s", head_oid, prev); > > if (origin) > strbuf_addf(r2, "%s..%s", oid_to_hex(&origin->object.oid, > head_oid); > else if (prev_is_range) > die("cannot figure out the origin of new series"); > else { > warning("falling back to use '%s' as the origin of new > series", prev); > strbuf_addf(r2, "%s..%s", prev, head_oid); > } > > because origin..HEAD is always the set of the "new" series, no > matter what "prev" the user chooses to compare that series against, > when there is a single "origin". I plan to submit a short series as incremental changes atop es/format-patch-{interdiff,rangediff} to address the few review comments[1] (including my own[2]), so I can make this change, as well. 
[1]: https://public-inbox.org/git/CAPig+cSuYUYSPTuKx08wcmQM-G12_-W2T4BS07fA=6grm1b...@mail.gmail.com/T/#rfaf5a563e81b862bd8ed69232040056ab9a86dd8 [2]: https://public-inbox.org/git/CAPig+cSuYUYSPTuKx08wcmQM-G12_-W2T4BS07fA=6grm1b...@mail.gmail.com/
Re: [PATCH] http-backend: allow empty CONTENT_LENGTH
Max Kirillov writes:

> Actually, another reason for the latest issue was that CONTENT_LENGTH
> is parsed for GET requests at all. It should be parsed only for POST
> requests, or, rather, only for upload-pack and receive-pack requests.

Not really. The layered design of the HTTP protocol means that any
request type can have a non-empty body, but request types for which no
semantics of the body is defined must ignore what is in the body. That
in turn means we need to parse and pay attention to the content length
etc. to find the end of the body, if only to ignore it.

In any case, hopefully we can fix this before the final, as this is a
regression introduced during this cycle?
Re: [PATCH] diff: allow --recurse-submodules as a synonym for --submodule
Jonathan Nieder writes: > When I think about it this way, I suspect that (B) will provide a > better experience than (A), so this diff change doesn't seem like a > step in the wrong direction. OK, that's fair.
Old submodules broken in 2.19rc1 and 2.19rc2
Submodules checked out with older versions of git no longer work in the
latest 2.19 releases. A "git submodule update --recursive" command will
fail for each submodule with a line saying:

    fatal: could not open '/.git' for writing: Is a directory

I checked the release notes so far, and they do not say that old
submodules from git 2.16 and earlier are no longer supported, so I
assume it is a bug?

'Allan
Re: [PATCH 01/11] string_list: print_string_list to use trace_printf
Jeff King writes:

> I guess the question is: is this a thing we would want to make available
> to code to leave in all the time? Or is it just for sticking in
> temporarily for a quick dump?
>
> If the former, then I think it needs the early-return at the least (and
> probably _should_ have the key parameter).
>
> But I get the feeling that you really just want the latter, and are only
> grudgingly being pushed into the former by Junio's suggestion.

Well, if that is the case, I'd change "my suggestion" (although I didn't
mean to suggest anything concrete). If this was needed and/or was useful
only during the development of the remainder of the series, and if this
is known to be a half-hearted change that is not meant to be useful as a
general solution, then let's not hold the main part of the series
hostage to this step; rather, we should drop this change and leave it to
another series that is willing to do it right. It would save both the
author's and the reviewers' time if we did so.
Re: [PATCH 1/2] commit-graph write: add progress output
On Tue, Sep 04 2018, Ævar Arnfjörð Bjarmason wrote:

> Before this change the "commit-graph write" command didn't report any
> progress. On my machine this command takes more than 10 seconds to
> write the graph for linux.git, and around 1m30s on the
> 2015-04-03-1M-git.git[1] test repository, which is a test case for
> larger monorepos.

There's a fun issue with this code that I'll fix, but I thought it was
informative to send a mail about. Because the graph verification happens
in the main "git gc" process (as opposed to everything else, which runs
via external commands), all this progress output gets written to
.git/gc.log. Then the next time we do a "gc --auto" we vomit a couple of
KB of progress bar output at the user, since we spot that the gc.log
isn't empty.
2.19.0.rc2.windows.1: stash fails with dirty submodule
Hi,

(reported at [1] already but reposting here as suggested)

I'm using git with stash and rebase builtins.

$ git --version --build-options
git version 2.19.0.rc2.windows.1
cpu: x86_64
built from commit: 425f414f8e04123eacb5597776d6a8de445a8d8b
sizeof-long: 4
sizeof-size_t: 8

With the following recipe

mkdir test
cd test
git init
echo 1 > file
git add file
git commit file -m "message"
git submodule add ./ mysubmod
git commit -m "Add submodule"
echo 2 > mysubmod/file
git checkout -b mybranch
git rebase -i --autosquash master

I get

Initialized empty Git repository in E:/projekte/test/.git/
[master (root-commit) 97c0108] message
 1 file changed, 1 insertion(+)
 create mode 100644 file
Cloning into 'E:/projekte/test/mysubmod'...
done.
[master 282a50f] Add submodule
 2 files changed, 4 insertions(+)
 create mode 100644 .gitmodules
 create mode 16 mysubmod
Switched to a new branch 'mybranch'
M mysubmod
fatal: Unexpected stash response: ''

and that used to work with older git versions.

Thanks for reading,
Thomas

[1]: https://github.com/git-for-windows/git/issues/1820#issuecomment-419411808
Re: [PATCH 09/11] multi-pack-index: verify object offsets
On 9/6/2018 8:34 PM, Eric Sunshine wrote:
> On Wed, Sep 5, 2018 at 10:46 AM Derrick Stolee via GitGitGadget wrote:
>> Replace the BUG() statement with a die() statement, now that we may
>> hit a bad pack-int-id during a 'verify' command on a corrupt
>> multi-pack-index, and it is covered by a test.
>>
>> Signed-off-by: Derrick Stolee
>> ---
>> diff --git a/midx.c b/midx.c
>> @@ -197,7 +197,7 @@ int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id)
>>  	if (pack_int_id >= m->num_packs)
>> -		BUG("bad pack-int-id");
>> +		die(_("bad pack-int-id"));
>
> For someone trying to diagnose this issue, would it be helpful to know
> (that is, print out) the values of pack_int_id and num_packs?

Can do! Thanks.
Re: [PATCH 1/2] commit-graph write: add progress output
On 9/7/2018 8:40 AM, Ævar Arnfjörð Bjarmason wrote:
> On Tue, Sep 04 2018, Ævar Arnfjörð Bjarmason wrote:
>> Before this change the "commit-graph write" command didn't report any
>> progress. On my machine this command takes more than 10 seconds to
>> write the graph for linux.git, and around 1m30s on the
>> 2015-04-03-1M-git.git[1] test repository, which is a test case for
>> larger monorepos.
>
> There's a fun issue with this code that I'll fix, but thought was
> informative to send a mail about. Because the graph verification
> happens in the main "git gc" process, as opposed to everything else
> via external commands, so all this progress output gets written to
> .git/gc.log. Then next time we do a "gc --auto" we vomit out a couple
> of KB of progress bar output at the user, since spot that the gc.log
> isn't empty.

Good catch! (I do want to clarify that the graph _writing_ happens
during 'git gc' since 'git commit-graph verify' is a different thing.)
Re: [PATCH v3 4/4] read-cache: speed up index load through parallelization
On 9/7/2018 12:16 AM, Torsten Bögershausen wrote: diff --git a/read-cache.c b/read-cache.c index fcc776aaf0..8537a55750 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1941,20 +1941,212 @@ static void *load_index_extensions(void *_data) return NULL; } +/* + * A helper function that will load the specified range of cache entries + * from the memory mapped file and add them to the given index. + */ +static unsigned long load_cache_entry_block(struct index_state *istate, + struct mem_pool *ce_mem_pool, int offset, int nr, void *mmap, + unsigned long start_offset, struct strbuf *previous_name) +{ + int i; + unsigned long src_offset = start_offset; I read an unsigned long here: should that be a size_t instead ? (And probably even everywhere else in this patch) It's a fair question. The pre-patch code had a mix of unsigned long and size_t. Both src_offset and consumed were unsigned long but mmap_size was a size_t. I stuck with that pattern for consistency. While it would be possible to convert everything to size_t as a step to enable index files >4 GB, I have a hard time believing that will be necessary for a very long time and would likely require more substantial changes to enable that to work. 
+ + for (i = offset; i < offset + nr; i++) { + struct ondisk_cache_entry *disk_ce; + struct cache_entry *ce; + unsigned long consumed; + + disk_ce = (struct ondisk_cache_entry *)((char *)mmap + src_offset); + ce = create_from_disk(ce_mem_pool, disk_ce, &consumed, previous_name); + set_index_entry(istate, i, ce); + + src_offset += consumed; + } + return src_offset - start_offset; +} + +static unsigned long load_all_cache_entries(struct index_state *istate, + void *mmap, size_t mmap_size, unsigned long src_offset) +{ + struct strbuf previous_name_buf = STRBUF_INIT, *previous_name; + unsigned long consumed; + + if (istate->version == 4) { + previous_name = &previous_name_buf; + mem_pool_init(&istate->ce_mem_pool, + estimate_cache_size_from_compressed(istate->cache_nr)); + } else { + previous_name = NULL; + mem_pool_init(&istate->ce_mem_pool, + estimate_cache_size(mmap_size, istate->cache_nr)); + } + + consumed = load_cache_entry_block(istate, istate->ce_mem_pool, + 0, istate->cache_nr, mmap, src_offset, previous_name); + strbuf_release(&previous_name_buf); + return consumed; +} + +#ifndef NO_PTHREADS +
Re: ordered string-list considered harmful, was Re: [PATCH v3] Allow aliases that include other aliases
On Thu, Sep 06, 2018 at 11:32:41PM -0700, Jonathan Nieder wrote:
>> I think Stefan pointed out a "case 4" in the other part of the thread:
>> ones where we really care not just about fast lookup, but actual
>> iteration order.
>
> I had assumed that that was the whole point of this data structure.
> Anything else that is using it for lookups should indeed use a hash
> map instead, and I can take my share of blame for missing this kind of
> thing in review.

Keep in mind we didn't have a decent generic hashmap for many years. So
I think string-list got used in its place.

>> I think I like the hashmap way, if the conversion isn't too painful.
>
> If we don't have any callers that actually need the sort-and-lookup
> thing, then yay, let's get rid of it. But I don't actually think of
> this as the hashmap way. It's the get-rid-of-the-unneeded-feature
> way.
>
> In other words, *regardless* of what else we should do, we should
> update any callers that want a hashmap to use a hashmap. Please go
> ahead, even if it doesn't let us simplify the string list API at all.

Great, I think we're on the same page. Thanks!

-Peff
Re: Old submodules broken in 2.19rc1 and 2.19rc2
On Fri, Sep 07, 2018 at 11:52:58AM +0200, Allan Sandfeld Jensen wrote: > Submodules checked out with older versions of git not longer works in the > latest 2.19 releases. A "git submodule update --recursive" command wil fail > for each submodule with a line saying "fatal: could not open > '/.git' for writing> Is a directory. I couldn't reproduce after cloning with v1.6.6.3 (which creates ".git" as a directory in the tree). Is it possible for you to bisect (or perhaps share with us the broken example)? > I checked the release notes so far, and they do not say the old submodules > from git 2.16- were no longer supported, so I assume it is a bug? I don't think it was intentional. +cc Stefan for submodule expertise. -Peff
Re: [PATCH 1/2] commit-graph write: add progress output
On Wed, Sep 05 2018, Derrick Stolee wrote: > On 9/4/2018 6:07 PM, Junio C Hamano wrote: >> Ævar Arnfjörð Bjarmason writes: >> >>> With --stdin-packs we don't show any estimation of how much is left to >>> do. This is because we might be processing more than one pack. We >>> could be less lazy here and show progress, either detect by detecting >>> that we're only processing one pack, or by first looping over the >>> packs to discover how many commits they have. I don't see the point in >> I do not know if there is no point, but if we were to do it, I think >> slurping the list of packs and computing the number of objects is >> not all that bad. > > If you want to do that, I have nothing against it. However, I don't > expect users to use that option directly. That option is used by VFS > for Git to compute the commit-graph in the background after receiving > a pack of commits and trees, but not by 'git gc' which I expect is how > most users will compute commit-graphs. > >>> static void compute_generation_numbers(struct packed_commit_list* commits) >>> { >>> int i; >>> struct commit_list *list = NULL; >>> + struct progress *progress = NULL; >>> + progress = start_progress( >>> + _("Computing commit graph generation numbers"), commits->nr); >>> for (i = 0; i < commits->nr; i++) { >>> + display_progress(progress, i); >>> if (commits->list[i]->generation != GENERATION_NUMBER_INFINITY >>> && >>> commits->list[i]->generation != GENERATION_NUMBER_ZERO) >>> continue; >> I am wondering if the progress call should be moved after this >> conditional continue; would we want to count the entry whose >> generation is already known here? Of course, as we give commits->nr >> as the 100% ceiling, we cannot avoid doing so, but it somehow smells >> wrong. > > If we wanted to be completely right, we would count the commits in the > list that do not have a generation number and report that as the 100% > ceiling. > > Something like the diff below would work. 
I tested it in Linux by > first deleting my commit-graph and running the following: > > stolee@stolee-linux:~/linux$ rm .git/objects/info/commit-graph > stolee@stolee-linux:~/linux$ git rev-parse v4.6 | ~/git/git > commit-graph write --stdin-commits > Annotating commits in commit graph: 1180333, done. > Computing commit graph generation numbers: 100% (590166/590166), done. > stolee@stolee-linux:~/linux$ ~/git/git commit-graph write --reachable > Annotating commits in commit graph: 1564087, done. > Computing commit graph generation numbers: 100% (191590/191590), done. > > -->8-- > > From: Derrick Stolee > Date: Wed, 5 Sep 2018 11:55:42 + > Subject: [PATCH] fixup! commit-graph write: add progress output > > Signed-off-by: Derrick Stolee > --- > commit-graph.c | 15 +++ > 1 file changed, 11 insertions(+), 4 deletions(-) > > diff --git a/commit-graph.c b/commit-graph.c > index 1a02fe019a..b933bc9f00 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -634,14 +634,20 @@ static void close_reachable(struct > packed_oid_list *oids) > > static void compute_generation_numbers(struct packed_commit_list* commits) > { > - int i; > + int i, count_uncomputed = 0; > struct commit_list *list = NULL; > struct progress *progress = NULL; > > + for (i = 0; i < commits->nr; i++) > + if (commits->list[i]->generation == > GENERATION_NUMBER_INFINITY || > + commits->list[i]->generation == GENERATION_NUMBER_ZERO) > + count_uncomputed++; > + > progress = start_progress( > - _("Computing commit graph generation numbers"), > commits->nr); > + _("Computing commit graph generation numbers"), > count_uncomputed); > + count_uncomputed = 0; > + > for (i = 0; i < commits->nr; i++) { > - display_progress(progress, i); > if (commits->list[i]->generation != > GENERATION_NUMBER_INFINITY && > commits->list[i]->generation != GENERATION_NUMBER_ZERO) > continue; > @@ -670,10 +676,11 @@ static void compute_generation_numbers(struct > packed_commit_list* commits) > > if (current->generation > > 
> GENERATION_NUMBER_MAX)
>                         current->generation = GENERATION_NUMBER_MAX;
> +
> +                display_progress(progress, ++count_uncomputed);
>             }
>         }
>     }
> -   display_progress(progress, i);
>     stop_progress(&progress);
> }

One of the things I was trying to do with this series was to make sure
that whenever we run "git gc" there's always some indication, if you set
gc.writeCommitGraph=true, that it's actually doing work. This modifies
that, which I think is actually fine, just something I wanted to note.
I.e. if you run "git commit-graph write" twice in a row, the second time
will have no output.

Unless, that is, your repo is big enough that some of the delayed timers
kick in. So e.g. on git.git we get no output the second time around, but
do get output the first time around, and on linux.git we always get
output. But in the common case people aren't running this in a loop, and
it's useful to see how many new th
Re: Old submodules broken in 2.19rc1 and 2.19rc2
On Freitag, 7. September 2018 17:03:27 CEST Jeff King wrote:
> On Fri, Sep 07, 2018 at 11:52:58AM +0200, Allan Sandfeld Jensen wrote:
>> Submodules checked out with older versions of git not longer works in
>> the latest 2.19 releases. A "git submodule update --recursive" command
>> wil fail for each submodule with a line saying "fatal: could not open
>> '/.git' for writing> Is a directory.
>
> I couldn't reproduce after cloning with v1.6.6.3 (which creates ".git"
> as a directory in the tree). Is it possible for you to bisect (or
> perhaps share with us the broken example)?

I discovered it by using Debian testing, and it is shipping the 2.19rcs
for some reason. The example is just an old checkout of qt5.git with
submodules; it is rather large. I could try bisecting, but I am not sure
I have the time anytime soon. I just wanted to report it to you first in
case you knew of a change that suddenly assumed the new structure.

'Allan
Re: Mailsplit
On Wed, Sep 05, 2018 at 09:17:29PM -0700, Stephen & Linda Smith wrote:
> I thought I would use "git mailsplit" to split an mbox file (which I
> downloaded from public-inbox) so that I could attempt to resurrect a
> patch series from a year ago.
>
> The motivation is that I downloaded the series [1] and applied it to a
> tag from about the time period that the patch was sent out [2].
>
> The "git am -3 patch.mbox" quit 2/3 of the way through. I resolved the
> conflict and ran "git am --continue", which didn't apply the rest of
> the patches in the mbox.
>
> So two questions:
> 1) Why would git version 2.18.0 not appear to continue applying the
> patches?
>
> 2) Where do I find the command "git mailsplit"? The only thing in my
> installed tree is:
>
> $ find /usr/local/ -name '*mailsplit*'
> /usr/local/share/doc/git-doc/git-mailsplit.txt
> /usr/local/share/doc/git-doc/git-mailsplit.html
> /usr/local/share/man/man1/git-mailsplit.1
> /usr/local/libexec/git-core/git-mailsplit

This is the mailsplit command, and it should be executed when running
`git mailsplit`. What does `git --exec-path` return?

Kevin
Re: [PATCH 1/2] commit-graph write: add progress output
On Fri, Sep 07 2018, Ævar Arnfjörð Bjarmason wrote: > On Wed, Sep 05 2018, Derrick Stolee wrote: > >> On 9/4/2018 6:07 PM, Junio C Hamano wrote: >>> Ævar Arnfjörð Bjarmason writes: >>> With --stdin-packs we don't show any estimation of how much is left to do. This is because we might be processing more than one pack. We could be less lazy here and show progress, either detect by detecting that we're only processing one pack, or by first looping over the packs to discover how many commits they have. I don't see the point in >>> I do not know if there is no point, but if we were to do it, I think >>> slurping the list of packs and computing the number of objects is >>> not all that bad. >> >> If you want to do that, I have nothing against it. However, I don't >> expect users to use that option directly. That option is used by VFS >> for Git to compute the commit-graph in the background after receiving >> a pack of commits and trees, but not by 'git gc' which I expect is how >> most users will compute commit-graphs. >> static void compute_generation_numbers(struct packed_commit_list* commits) { int i; struct commit_list *list = NULL; + struct progress *progress = NULL; +progress = start_progress( + _("Computing commit graph generation numbers"), commits->nr); for (i = 0; i < commits->nr; i++) { + display_progress(progress, i); if (commits->list[i]->generation != GENERATION_NUMBER_INFINITY && commits->list[i]->generation != GENERATION_NUMBER_ZERO) continue; >>> I am wondering if the progress call should be moved after this >>> conditional continue; would we want to count the entry whose >>> generation is already known here? Of course, as we give commits->nr >>> as the 100% ceiling, we cannot avoid doing so, but it somehow smells >>> wrong. >> >> If we wanted to be completely right, we would count the commits in the >> list that do not have a generation number and report that as the 100% >> ceiling. >> >> Something like the diff below would work. 
I tested it in Linux by >> first deleting my commit-graph and running the following: >> >> stolee@stolee-linux:~/linux$ rm .git/objects/info/commit-graph >> stolee@stolee-linux:~/linux$ git rev-parse v4.6 | ~/git/git >> commit-graph write --stdin-commits >> Annotating commits in commit graph: 1180333, done. >> Computing commit graph generation numbers: 100% (590166/590166), done. >> stolee@stolee-linux:~/linux$ ~/git/git commit-graph write --reachable >> Annotating commits in commit graph: 1564087, done. >> Computing commit graph generation numbers: 100% (191590/191590), done. >> >> -->8-- >> >> From: Derrick Stolee >> Date: Wed, 5 Sep 2018 11:55:42 + >> Subject: [PATCH] fixup! commit-graph write: add progress output >> >> Signed-off-by: Derrick Stolee >> --- >> commit-graph.c | 15 +++ >> 1 file changed, 11 insertions(+), 4 deletions(-) >> >> diff --git a/commit-graph.c b/commit-graph.c >> index 1a02fe019a..b933bc9f00 100644 >> --- a/commit-graph.c >> +++ b/commit-graph.c >> @@ -634,14 +634,20 @@ static void close_reachable(struct >> packed_oid_list *oids) >> >> static void compute_generation_numbers(struct packed_commit_list* commits) >> { >> - int i; >> + int i, count_uncomputed = 0; >> struct commit_list *list = NULL; >> struct progress *progress = NULL; >> >> + for (i = 0; i < commits->nr; i++) >> + if (commits->list[i]->generation == >> GENERATION_NUMBER_INFINITY || >> + commits->list[i]->generation == GENERATION_NUMBER_ZERO) >> + count_uncomputed++; >> + >> progress = start_progress( >> - _("Computing commit graph generation numbers"), >> commits->nr); >> + _("Computing commit graph generation numbers"), >> count_uncomputed); >> + count_uncomputed = 0; >> + >> for (i = 0; i < commits->nr; i++) { >> - display_progress(progress, i); >> if (commits->list[i]->generation != >> GENERATION_NUMBER_INFINITY && >> commits->list[i]->generation != GENERATION_NUMBER_ZERO) >> continue; >> @@ -670,10 +676,11 @@ static void compute_generation_numbers(struct >> 
packed_commit_list* commits) >> >> if (current->generation > >> GENERATION_NUMBER_MAX) >> current->generation = >> GENERATION_NUMBER_MAX; >> + >> + display_progress(progress, >> ++count_uncomputed); >> } >> } >> } >> - display_progress(progress, i); >> stop_progress(&progress); >> } > > One of the things I was trying to do with this series was to make sure > that whenever we run "git gc" there's always some indication that if you > set gc.writeCommitGraph=true that it's actualy doing work. > > This modifies that, which I think is actually fine, just something I > wanted to note. I.e. if you run "git commit-graph write" twice in a row, > the second time will have no output. > > Unless that is, your repo is big enough that some of the delayed timers > kick in. So e.g. on git.git we get no output the second t
[PATCH] status: show progress bar if refreshing the index takes too long
Refreshing the index is usually very fast, but it can still take a long
time sometimes. A cold cache is one cause, or something else silly (*).
In this case, it's good to show something to let the user know "git
status" is not hanging; it's just busy doing something.

(*) I got called over by my colleague because her "git status" took very
long and looked pretty much like it was hanging. After a bit of strace,
it looked to me like git was trying to rehash every single file, and
this was a big repository, so the process could take minutes. In this
case, I think it was probably because she had copied the repository to a
new place and the stat data did not match anymore, so git fell back to
hashing.

Signed-off-by: Nguyễn Thái Ngọc Duy
---
I need to get this out of my head before I forget. I obviously think
this is a good idea and could be done in more places, even just to deal
with cold caches. The hint about "git status -uno", for example, could
be accompanied by a progress bar for scanning for untracked files...

Another note about rehashing files as part of refresh. We could probably
do better by hashing in parallel. Or perhaps not, because having a big
lock around the object database pretty much kills performance; and if I
remember correctly none of my cores was 100% consumed (i.e. a CPU
bottleneck as an indication that multithreading is a good idea...)

Anyway, that's it! Weekend after a long week! I'll read mails and
respond tomorrow.
 builtin/am.c     |  2 +-
 builtin/commit.c |  6 --
 cache.h          |  7 +--
 preload-index.c  | 44 +++-
 read-cache.c     | 10 ++
 sequencer.c      |  2 +-
 6 files changed, 60 insertions(+), 11 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 5e866d17c7..22a93cfef3 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -2324,7 +2324,7 @@ int cmd_am(int argc, const char **argv, const char *prefix)
 	/* Ensure a valid committer ident can be constructed */
 	git_committer_info(IDENT_STRICT);
 
-	if (read_index_preload(&the_index, NULL) < 0)
+	if (read_index_preload(&the_index, NULL, 0) < 0)
 		die(_("failed to read the index"));
 
 	if (in_progress) {
diff --git a/builtin/commit.c b/builtin/commit.c
index 0d9828e29e..eaf639ece6 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1355,8 +1355,10 @@ int cmd_status(int argc, const char **argv, const char *prefix)
 		     PATHSPEC_PREFER_FULL,
 		     prefix, argv);
 
-	read_cache_preload(&s.pathspec);
-	refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED, &s.pathspec, NULL, NULL);
+	read_index_preload(&the_index, &s.pathspec, REFRESH_PROGRESS);
+	refresh_index(&the_index,
+		      REFRESH_QUIET|REFRESH_UNMERGED|REFRESH_PROGRESS,
+		      &s.pathspec, NULL, NULL);
 
 	if (use_optional_locks())
 		fd = hold_locked_index(&index_lock, 0);
diff --git a/cache.h b/cache.h
index 4d014541ab..35da02be90 100644
--- a/cache.h
+++ b/cache.h
@@ -410,7 +410,7 @@ void validate_cache_entries(const struct index_state *istate);
 #define read_cache() read_index(&the_index)
 #define read_cache_from(path) read_index_from(&the_index, (path), (get_git_dir()))
-#define read_cache_preload(pathspec) read_index_preload(&the_index, (pathspec))
+#define read_cache_preload(pathspec) read_index_preload(&the_index, (pathspec), 0)
 #define is_cache_unborn() is_index_unborn(&the_index)
 #define read_cache_unmerged() read_index_unmerged(&the_index)
 #define discard_cache() discard_index(&the_index)
@@ -659,7 +659,9 @@ extern int daemonize(void);
 /* Initialize and use the cache information */
 struct lock_file;
 extern int read_index(struct index_state *);
-extern int read_index_preload(struct index_state *, const struct pathspec *pathspec);
+extern int read_index_preload(struct index_state *,
+			      const struct pathspec *pathspec,
+			      unsigned int refresh_flags);
 extern int do_read_index(struct index_state *istate, const char *path,
 			 int must_exist); /* for testting only! */
 extern int read_index_from(struct index_state *, const char *path,
@@ -814,6 +816,7 @@ extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st);
 #define REFRESH_IGNORE_MISSING	0x0008	/* ignore non-existent */
 #define REFRESH_IGNORE_SUBMODULES	0x0010	/* ignore submodules */
 #define REFRESH_IN_PORCELAIN	0x0020	/* user friendly output, not "needs update" */
+#define REFRESH_PROGRESS	0x0040	/* show progress bar if stderr is tty */
 extern int refresh_index(struct index_state *, unsigned int flags,
 			 const struct pathspec *pathspec,
 			 char *seen, const char *header_msg);
 extern struct cache_entry *refresh_cache_entry(struct index_state *, struct cache_entry *, unsigned int);
diff --git a/preload-index.c b/preload-index.c
index 71cd2437a3..bf7dbed779 100644
--- a/pre
Re: Temporary git files for the gitdir created on a separate drive in workdir
Hultqvist writes:

> Considering that the gitdir could be located on a different drive than
> the workdir, wouldn't it make more sense to create the temporary files
> in a subdirectory inside the gitdir rather than in the workdir?

I do not think we intend to create temporary files, whose final
destination is somewhere under $GIT_DIR/, in any working tree; rather,
I think we try to create them inside $GIT_DIR (or, if the destination
is a file in a subdirectory of $GIT_DIR, in that same subdirectory).

What you are seeing definitely smells like a bug in the worktree code,
perhaps getting confused by the fact that the full path to these
places looks "unusual" by starting with a single letter followed by a
colon (IOW, this may manifest only in the Windows port).
Re: Old submodules broken in 2.19rc1 and 2.19rc2
On Fri, Sep 7, 2018 at 2:53 AM Allan Sandfeld Jensen wrote:
>
> Submodules checked out with older versions of git no longer work in the
> latest 2.19 releases. A "git submodule update --recursive" command will
> fail for each submodule with a line saying "fatal: could not open
> '/.git' for writing: Is a directory".

Can you run the update again with

    GIT_TRACE=1 git submodule update

and post the output? I have the suspicion that e98317508c0 (submodule:
ensure core.worktree is set after update, 2018-06-18) might be the
offender. Could you try reverting that commit and check as well?

    git clone https://github.com/git/git && cd git
    git revert e98317508c0
    make install # installs to your home dir at ~/bin

and then try again, as well? (though bisection may be more fruitful if
this doesn't pan out)
Re: Mailsplit
On Friday, September 7, 2018 8:15:43 AM MST Kevin Daudt wrote:
> On Wed, Sep 05, 2018 at 09:17:29PM -0700, Stephen & Linda Smith wrote:
>
> This is the mailsplit command, and should be executed when running `git
> mailsplit`. What does git --exec-path return?

The other night when I ran "git mailsplit", I received an unknown
command response. Since then I have updated to 2.19.0 plus my submitted
patches for git commit. With the new build mailsplit is found. I don't
know what was wrong before.
Re: [PATCH 1/2] commit-graph write: add progress output
On Fri, Sep 07, 2018 at 05:23:31PM +0200, Ævar Arnfjörð Bjarmason wrote:

> Hrm, no. I spoke too soon because I was conflating "commit-graph write"
> v.s. "gc". For "gc" we're now with this change just e.g. spending 6
> seconds on 2015-04-03-1M-git displaying nothing, because we're looping
> through the commits and finding that we have no new work.
>
> So I'm on the fence about this, but leaning towards just taking my
> initial approach. I.e. it sucks if you're e.g. testing different "git
> gc" options that we're churning in the background doing nothing, just
> because we're trying to report how many *new* things we added to the
> graph.
>
> After all, the main point IMNSHO is not to show some diagnostic output
> of exactly how much work we're doing, that I have 200 new commits with
> generation numbers or whatever is just useless trivia, but rather to
> not leave the user thinking the command is hanging.

I think there's some precedent for your view of things, too. For
example, "writing objects" counts _all_ of the objects, even though
many of them are just copying bytes straight from disk, and some are
actually generating a delta and/or zlib-deflating content. So it's not
the most precise measurement we could give, but it shows there's
activity, and the "average" movement over many objects tends to be
reasonably smooth.

> So I think I'll just do what I was doing to begin with and change the
> message to "Refreshing commit graph generation numbers" or something
> to indicate that it's a find/verify/compute operation, not just a
> compute operation.

So basically yes, I agree with this. :)

-Peff
Re: [PATCH 01/11] string_list: print_string_list to use trace_printf
On Fri, Sep 7, 2018 at 2:53 AM Junio C Hamano wrote:
>
> Jeff King writes:
>
> > I guess the question is: is this a thing we would want to make available
> > to code to leave in all the time? Or is it just for sticking in
> > temporarily for a quick dump?
> >
> > If the former, then I think it needs the early-return at the least (and
> > probably _should_ have the key parameter).
> >
> > But I get the feeling that you really just want the latter, and are only
> > grudgingly being pushed into the former by Junio's suggestion.

That is not quite the case. I did take all but the one suggestion, as I
think they are good suggestions, and I appreciate it. However I would
think that having the trace key is not useful. (I did not need it for
debugging, but presented an API of what I would have found useful.) If
the key really is that useful, I would just think we can add it on top.

> Well, if that is the case, I'd change "my suggestion" (although I
> didn't mean to suggest anything concrete). If this was needed
> and/or was useful only during the development of the remainder of
> the series, and if this is known to be a half-hearted change that is
> not meant to be useful as a general solution, then let's not take
> the main part of the series hostage to this step; rather, we
> should drop this change and leave it to another series that is
> willing to do it right. It would save both the author's and
> reviewers' time if we did so.

In that case, would we be interested in a separate patch that kills off
that function instead?

Stefan
Re: [PATCH v3 0/4] read-cache: speed up index load through parallelization
Ben Peart writes:

> On further investigation with the previous patch, I noticed that my test
> repos didn't contain the cache tree extension in their index. After doing
> a commit to ensure they existed, I realized that in some instances, the
> time to load the cache tree exceeded the time to load all the cache
> entries in parallel. Because the thread to read the cache tree was
> started last (due to having to parse through all the cache entries first)
> we weren't always getting optimal performance.
>
> To better optimize for this case, I decided to write the EOIE extension
> as suggested by Junio [1] in response to my earlier multithreading patch
> series [2]. This enables me to spin up the thread to load the extensions
> earlier as it no longer has to parse through all the cache entries first.

Hmph. I kinda liked the simplicity of the previous one, but if we need
to start reading the extension sections sooner by eliminating the
overhead to scan the cache entries, perhaps we should bite the bullet
now.

> The big changes in this iteration are:
>
> - add the EOIE extension
> - update the index extension worker thread to start first

I guess I'd need to see the actual patch to find this out, but once we
rely on a new extension, then we could omit scanning the main index
even to partition the work among workers (i.e. like the topic long ago,
you can have a list of pointers into the main index to help
partitioning, plus reset the prefix compression used in v4). I think
you didn't get that far in this round, which is good. If the gain with
EOIE alone (and starting the worker for the extension section early) is
large enough without such a pre-computed work partition table, the
simplicity of this round may give us a good stopping point.

> This patch conflicts with Duy's patch to remove the double memory copy
> and pass in the previous ce instead. The two will need to be
> merged/reconciled once they settle down a bit.

Thanks.
I have a feeling that 67922abb ("read-cache.c: optimize reading index format v4", 2018-09-02) is already 'next'-worthy and ready to be built on, but I'd prefer to hear from Duy to double check.
Re: [PATCH 1/2] commit-graph write: add progress output
On 9/7/2018 1:15 PM, Jeff King wrote:
> On Fri, Sep 07, 2018 at 05:23:31PM +0200, Ævar Arnfjörð Bjarmason wrote:
>> Hrm, no. I spoke too soon because I was conflating "commit-graph write"
>> v.s. "gc". For "gc" we're now with this change just e.g. spending 6
>> seconds on 2015-04-03-1M-git displaying nothing, because we're looping
>> through the commits and finding that we have no new work.
>>
>> So I'm on the fence about this, but leaning towards just taking my
>> initial approach. I.e. it sucks if you're e.g. testing different "git
>> gc" options that we're churning in the background doing nothing, just
>> because we're trying to report how many *new* things we added to the
>> graph.
>>
>> After all, the main point IMNSHO is not to show some diagnostic output
>> of exactly how much work we're doing, that I have 200 new commits with
>> generation numbers or whatever is just useless trivia, but rather to
>> not leave the user thinking the command is hanging.
>
> I think there's some precedent for your view of things, too. For
> example, "writing objects" counts _all_ of the objects, even though
> many of them are just copying bytes straight from disk, and some are
> actually generating a delta and/or zlib-deflating content. So it's not
> the most precise measurement we could give, but it shows there's
> activity, and the "average" movement over many objects tends to be
> reasonably smooth.
>
>> So I think I'll just do what I was doing to begin with and change the
>> message to "Refreshing commit graph generation numbers" or something
>> to indicate that it's a find/verify/compute operation, not just a
>> compute operation.
>
> So basically yes, I agree with this. :)

Same here. Thanks for the discussion.

-Stolee
Re: [PATCH] status: show progress bar if refreshing the index takes too long
On Fri, Sep 7, 2018 at 11:51 AM Nguyễn Thái Ngọc Duy wrote:
> Refreshing the index is usually very fast, but it can still take a
> long time sometimes. Cold cache is one, or something else silly (*).
> In this case, it's good to show something to let the user know "git
> status" is not hanging, it's just busy doing something.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy
> ---
> diff --git a/read-cache.c b/read-cache.c
> @@ -1516,6 +1522,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
> +		if (progress)
> +			display_progress(progress, i);
> @@ -1547,6 +1555,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
> +	if (progress)
> +		stop_progress(&progress);

Nit: Both display_progress() and stop_progress() behave sanely when
'progress' is NULL, so no need for the conditional.
Re: [PATCH v3 2/4] eoie: add End of Index Entry (EOIE) extension
Ben Peart writes:

> The extension consists of:
>
> - 32-bit offset to the end of the index entries
>
> - 160-bit SHA-1 over the extension types and their sizes (but not
>   their contents). E.g. if we have "TREE" extension that is N-bytes
>   long, "REUC" extension that is M-bytes long, followed by "EOIE",
>   then the hash would be:
>
>   SHA-1("TREE" + <binary representation of N> +
>         "REUC" + <binary representation of M>)

I didn't look at the documentation patch in the larger context, but
please make sure that it is clear to the readers that these
fixed-width integer "binary representations" use network byte order.

I briefly wondered if the above should include

    + "EOIE" + <binary representation of its size>

as it is pretty much common file format design to include the header
part of the checksum record (with checksum values padded out with NUL
bytes) when you define a record to hold the checksum of the entire
file. Since this does not protect the contents of each section from
bit-flipping, adding the data for EOIE itself in the sum does not give
us much (iow, what I am adding above is a constant that does not
contribute any entropy).

How big is a typical TREE extension in _your_ work repository housing
Windows sources? I am guessing that replacing SHA-1 with something
faster (as this is not about security but is about disk corruption)
and instead hashing also the contents of these sections would NOT help
all that much in the performance department, as having to page them in
to read through would already consume a non-trivial amount of time,
and that is why you are not hashing the contents.

> +	/*
> +	 * CACHE_EXT_ENDOFINDEXENTRIES must be written as the last entry before the SHA1

s/SHA1/trailing checksum/ or something so that we can withstand
NewHash world order?

> +	 * so that it can be found and processed before all the index entries are
> +	 * read.
> +	 */
> +	if (!strip_extensions && offset && !git_env_bool("GIT_TEST_DISABLE_EOIE", 0)) {
> +		struct strbuf sb = STRBUF_INIT;
> +
> +		write_eoie_extension(&sb, &eoie_c, offset);
> +		err = write_index_ext_header(&c, NULL, newfd, CACHE_EXT_ENDOFINDEXENTRIES, sb.len) < 0
> +			|| ce_write(&c, newfd, sb.buf, sb.len) < 0;
> 		strbuf_release(&sb);
> 		if (err)

OK.

> +#define EOIE_SIZE 24 /* <4-byte offset> + <20-byte hash> */
> +#define EOIE_SIZE_WITH_HEADER (4 + 4 + EOIE_SIZE) /* <4-byte signature> + <4-byte length> + EOIE_SIZE */
> +
> +#ifndef NO_PTHREADS
> +static unsigned long read_eoie_extension(void *mmap, size_t mmap_size)
> +{
> +	/*
> +	 * The end of index entries (EOIE) extension is guaranteed to be last
> +	 * so that it can be found by scanning backwards from the EOF.
> +	 *
> +	 * "EOIE"
> +	 * <4-byte length>
> +	 * <4-byte offset>
> +	 * <20-byte hash>
> +	 */
> +	const char *index, *eoie = (const char *)mmap + mmap_size - GIT_SHA1_RAWSZ - EOIE_SIZE_WITH_HEADER;
> +	uint32_t extsize;
> +	unsigned long offset, src_offset;
> +	unsigned char hash[GIT_MAX_RAWSZ];
> +	git_hash_ctx c;
> +
> +	/* validate the extension signature */
> +	index = eoie;
> +	if (CACHE_EXT(index) != CACHE_EXT_ENDOFINDEXENTRIES)
> +		return 0;
> +	index += sizeof(uint32_t);
> +
> +	/* validate the extension size */
> +	extsize = get_be32(index);
> +	if (extsize != EOIE_SIZE)
> +		return 0;
> +	index += sizeof(uint32_t);

Do we know we have at least 8 bytes to consume to perform the above
two checks, or is that something we need to verify at the beginning of
this function? Better yet, as we know that a correct EOIE with its own
header is 28 bytes long, we probably should abort if mmap_size is
smaller than that.

> +	/*
> +	 * Validate the offset we're going to look for the first extension
> +	 * signature is after the index header and before the eoie extension.
> +	 */
> +	offset = get_be32(index);
> +	if ((const char *)mmap + offset < (const char *)mmap + sizeof(struct cache_header))
> +		return 0;

Claims that the end is before the beginning, which we reject as bogus.
Good.

> +	if ((const char *)mmap + offset >= eoie)
> +		return 0;

Claims that the end is beyond the EOIE marker we should have placed
after it, which we reject as bogus. Good.

> +	index += sizeof(uint32_t);
> +
> +	/*
> +	 * The hash is computed over extension types and their sizes (but not
> +	 * their contents). E.g. if we have "TREE" extension that is N-bytes
> +	 * long, "REUC" extension that is M-bytes long, followed by "EOIE",
> +	 * then the hash would be:
> +	 *
> +	 * SHA-1("TREE" + <binary representation of N> +
> +	 *       "REUC" + <binary representation of M>)
> +	 */
> +	src_offset = offset;
> +	the_hash_algo->init_fn(&c);
> +	while (src_offset < mmap_size - the_hash_algo->rawsz - EOIE_SIZE_WITH_HEADER) {
> +		/* After an array of active_nr index entries,

(Style nit).

> +
[PATCH] config.mak.dev: add -Wformat-security
We currently build cleanly with -Wformat-security, and it's a good
idea to make sure we continue to do so (since calls that trigger the
warning may be security vulnerabilities).

Note that we cannot use the stronger -Wformat-nonliteral, as there are
cases where we are clever with passing around pointers to string
literals. E.g., bisect_rev_setup() takes bad_format and good_format
parameters. These ultimately come from literals, but they still
trigger the warning.

Some of these might be fixable (e.g., by passing flags from which we
locally select a format), and might even be worth fixing (not because
of security, but just because it's an easy mistake to pass the wrong
format). But there are other cases which are likely quite hard to fix
(we actually generate formats in a local buffer in some cases). So
let's punt on that for now and start with -Wformat-security, which is
supposed to catch the most important cases.

Signed-off-by: Jeff King
---
 config.mak.dev | 1 +
 1 file changed, 1 insertion(+)

diff --git a/config.mak.dev b/config.mak.dev
index 9a998149d9..f832752454 100644
--- a/config.mak.dev
+++ b/config.mak.dev
@@ -14,6 +14,7 @@ CFLAGS += -Wpointer-arith
 CFLAGS += -Wstrict-prototypes
 CFLAGS += -Wunused
 CFLAGS += -Wvla
+CFLAGS += -Wformat-security
 
 ifndef COMPILER_FEATURES
 COMPILER_FEATURES := $(shell ./detect-compiler $(CC))
-- 
2.19.0.rc2.594.gc4806cd463
[PATCH 2/2] mingw: fix mingw_open_append to work with named pipes
From: Jeff Hostetler

Signed-off-by: Jeff Hostetler
---
 compat/mingw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 858ca14a57..ef03bbe5d2 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -355,7 +355,7 @@ static int mingw_open_append(wchar_t const *wfilename, int oflags, ...)
 	 * FILE_SHARE_WRITE is required to permit child processes
 	 * to append to the file.
 	 */
-	handle = CreateFileW(wfilename, FILE_APPEND_DATA,
+	handle = CreateFileW(wfilename, FILE_WRITE_DATA | FILE_APPEND_DATA,
 			     FILE_SHARE_WRITE | FILE_SHARE_READ,
 			     NULL, create, FILE_ATTRIBUTE_NORMAL, NULL);
 	if (handle == INVALID_HANDLE_VALUE)
-- 
gitgitgadget
[PATCH 1/2] t0051: test GIT_TRACE to a windows named pipe
From: Jeff Hostetler

Create a test-tool helper to create the server side of a windows named
pipe, wait for a client connection, and copy data written to the pipe
to stdout.

Create t0051 test to route GIT_TRACE output of a command to a named
pipe using the above test-tool helper.

Signed-off-by: Jeff Hostetler
---
 Makefile                           |  1 +
 t/helper/test-tool.c               |  3 ++
 t/helper/test-tool.h               |  3 ++
 t/helper/test-windows-named-pipe.c | 72 ++
 t/t0051-windows-named-pipe.sh      | 17 +++
 5 files changed, 96 insertions(+)
 create mode 100644 t/helper/test-windows-named-pipe.c
 create mode 100755 t/t0051-windows-named-pipe.sh

diff --git a/Makefile b/Makefile
index e4b503d259..b1b934d295 100644
--- a/Makefile
+++ b/Makefile
@@ -731,6 +731,7 @@ TEST_BUILTINS_OBJS += test-submodule-config.o
 TEST_BUILTINS_OBJS += test-subprocess.o
 TEST_BUILTINS_OBJS += test-urlmatch-normalization.o
 TEST_BUILTINS_OBJS += test-wildmatch.o
+TEST_BUILTINS_OBJS += test-windows-named-pipe.o
 TEST_BUILTINS_OBJS += test-write-cache.o
 
 TEST_PROGRAMS_NEED_X += test-dump-fsmonitor
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 805a45de9c..7a9764cd5c 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -41,6 +41,9 @@ static struct test_cmd cmds[] = {
 	{ "subprocess", cmd__subprocess },
 	{ "urlmatch-normalization", cmd__urlmatch_normalization },
 	{ "wildmatch", cmd__wildmatch },
+#ifdef GIT_WINDOWS_NATIVE
+	{ "windows-named-pipe", cmd__windows_named_pipe },
+#endif
 	{ "write-cache", cmd__write_cache },
 };
 
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 7116ddfb94..01c34fe5e7 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -35,6 +35,9 @@ int cmd__submodule_config(int argc, const char **argv);
 int cmd__subprocess(int argc, const char **argv);
 int cmd__urlmatch_normalization(int argc, const char **argv);
 int cmd__wildmatch(int argc, const char **argv);
+#ifdef GIT_WINDOWS_NATIVE
+int cmd__windows_named_pipe(int argc, const char **argv);
+#endif
 int cmd__write_cache(int argc, const char **argv);
 
 #endif
diff --git a/t/helper/test-windows-named-pipe.c b/t/helper/test-windows-named-pipe.c
new file mode 100644
index 00..b4b752b01a
--- /dev/null
+++ b/t/helper/test-windows-named-pipe.c
@@ -0,0 +1,72 @@
+#include "test-tool.h"
+#include "git-compat-util.h"
+#include "strbuf.h"
+
+#ifdef GIT_WINDOWS_NATIVE
+static const char *usage_string = "";
+
+#define TEST_BUFSIZE (4096)
+
+int cmd__windows_named_pipe(int argc, const char **argv)
+{
+	const char *filename;
+	struct strbuf pathname = STRBUF_INIT;
+	int err;
+	HANDLE h;
+	BOOL connected;
+	char buf[TEST_BUFSIZE + 1];
+
+	if (argc < 2)
+		goto print_usage;
+	filename = argv[1];
+	if (strchr(filename, '/') || strchr(filename, '\\'))
+		goto print_usage;
+	strbuf_addf(&pathname, "//./pipe/%s", filename);
+
+	/*
+	 * Create a single instance of the server side of the named pipe.
+	 * This will allow exactly one client instance to connect to it.
+	 */
+	h = CreateNamedPipeA(
+		pathname.buf,
+		PIPE_ACCESS_INBOUND | FILE_FLAG_FIRST_PIPE_INSTANCE,
+		PIPE_TYPE_MESSAGE | PIPE_READMODE_MESSAGE | PIPE_WAIT,
+		PIPE_UNLIMITED_INSTANCES,
+		TEST_BUFSIZE, TEST_BUFSIZE, 0, NULL);
+	if (h == INVALID_HANDLE_VALUE) {
+		err = err_win_to_posix(GetLastError());
+		fprintf(stderr, "CreateNamedPipe failed: %s\n",
+			strerror(err));
+		return err;
+	}
+
+	connected = ConnectNamedPipe(h, NULL)
+		? TRUE
+		: (GetLastError() == ERROR_PIPE_CONNECTED);
+	if (!connected) {
+		err = err_win_to_posix(GetLastError());
+		fprintf(stderr, "ConnectNamedPipe failed: %s\n",
+			strerror(err));
+		CloseHandle(h);
+		return err;
+	}
+
+	while (1) {
+		DWORD nbr;
+		BOOL success = ReadFile(h, buf, TEST_BUFSIZE, &nbr, NULL);
+		if (!success || nbr == 0)
+			break;
+		buf[nbr] = 0;
+
+		write(1, buf, nbr);
+	}
+
+	DisconnectNamedPipe(h);
+	CloseHandle(h);
+	return 0;
+
+print_usage:
+	fprintf(stderr, "usage: %s %s\n", argv[0], usage_string);
+	return 1;
+}
+#endif
diff --git a/t/t0051-windows-named-pipe.sh b/t/t0051-windows-named-pipe.sh
new file mode 100755
index 00..10ac92d225
--- /dev/null
+++ b/t/t0051-windows-named-pipe.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description='Windows named pipes'
+
+. ./test-lib.sh
+
+test_expect_success MINGW 'o_append write to named pipe' '
+	GIT_TRACE="$(pwd)/expect" git status >/dev/null 2>&1 &&
+	{ tes
[PATCH 0/2] Fixup for js/mingw-o-append
The recent mingw O_APPEND change breaks writing to named pipes on
Windows. The first commit adds a new test to confirm the breakage and
the second commit fixes the problem. These could be squashed together,
or we can just keep the fix and omit the test if that would be better.

d641097589 (js/mingw-o-append) mingw: enable atomic O_APPEND

The new mingw_open_append() routine successfully opens the client side
of the named pipe, but the first write() to it fails with EBADF.
Adding FILE_WRITE_DATA corrects the problem.

Signed-off-by: Jeff Hostetler jeffh...@microsoft.com
[jeffh...@microsoft.com]
Cc: j6t@kdbg.org
Cc: johannes.schinde...@gmx.de
Cc: gitster@pobox.com
Cc: p...@peff.net

Jeff Hostetler (2):
  t0051: test GIT_TRACE to a windows named pipe
  mingw: fix mingw_open_append to work with named pipes

 Makefile                           |  1 +
 compat/mingw.c                     |  2 +-
 t/helper/test-tool.c               |  3 ++
 t/helper/test-tool.h               |  3 ++
 t/helper/test-windows-named-pipe.c | 72 ++
 t/t0051-windows-named-pipe.sh      | 17 +++
 6 files changed, 97 insertions(+), 1 deletion(-)
 create mode 100644 t/helper/test-windows-named-pipe.c
 create mode 100755 t/t0051-windows-named-pipe.sh

base-commit: d641097589160eb795127d8dbcb14c877c217b60
Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-35%2Fjeffhostetler%2Ffixup-mingw-o-append-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-35/jeffhostetler/fixup-mingw-o-append-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/35
-- 
gitgitgadget
[PATCH v2 0/2] commit-graph: add progress output
Based on feedback on v1, and the "this is yelling at my users through
gc.log" bug I found.

Range-diff with v1:

1:  e0a09ad641 ! 1:  b2dcfa0f55 commit-graph write: add progress output
    @@ -5,8 +5,8 @@
         Before this change the "commit-graph write" command didn't report any
         progress. On my machine this command takes more than 10 seconds to
         write the graph for linux.git, and around 1m30s on the
        -2015-04-03-1M-git.git[1] test repository, which is a test case for
        -larger monorepos.
        +2015-04-03-1M-git.git[1] test repository (a test case for a large
        +monorepository).
     
         Furthermore, since the gc.writeCommitGraph setting was added in
         d5d5d7b641 ("gc: automatically write commit-graph files", 2018-06-27),
    @@ -19,7 +19,6 @@
         $ git -c gc.writeCommitGraph=true gc
         Enumerating objects: 2821, done.
         [...]
        -Total 2821 (delta 1670), reused 2821 (delta 1670)
         Computing commit graph generation numbers: 100% (867/867), done.
     
         On larger repositories, such as linux.git the delayed progress bar(s)
    @@ -27,6 +26,7 @@
         previously happening, printing nothing while we write the graph:
     
         $ git -c gc.writeCommitGraph=true gc
        +[...]
         Annotating commits in commit graph: 1565573, done.
         Computing commit graph generation numbers: 100% (782484/782484), done.
    @@ -42,20 +42,90 @@
         With --stdin-packs we don't show any estimation of how much is left
         to do. This is because we might be processing more than one pack. We
        -could be less lazy here and show progress, either detect by detecting
        -that we're only processing one pack, or by first looping over the
        -packs to discover how many commits they have. I don't see the point in
        -doing that work. So instead we get (on 2015-04-03-1M-git.git):
        +could be less lazy here and show progress, either by detecting that
        +we're only processing one pack, or by first looping over the packs to
        +discover how many commits they have. I don't see the point in doing
        +that work. So instead we get (on 2015-04-03-1M-git.git):
     
         $ echo pack-.idx | git -c gc.writeCommitGraph=true --exec-path=$PWD commit-graph write --stdin-packs
         Finding commits for commit graph: 13064614, done.
         Annotating commits in commit graph: 3001341, done.
         Computing commit graph generation numbers: 100% (1000447/1000447), done.
     
        +No GC mode uses --stdin-packs. It's what they use at Microsoft to
        +manually compute the generation numbers for their collection of large
        +packs which are never coalesced.
        +
        +The reason we need a "report_progress" variable passed down from "git
        +gc" is so that we don't report this output when we're running in the
        +process "git gc --auto" detaches from the terminal.
        +
        +Since we write the commit graph from the "git gc" process itself (as
        +opposed to what we do with say the "git repack" phase), we'd end up
        +writing the output to .git/gc.log and reporting it to the user next
        +time as part of the "The last gc run reported the following[...]"
        +error, see 329e6e8794 ("gc: save log from daemonized gc --auto and
        +print it next time", 2015-09-19).
        +
        +So we must keep track of whether or not we're running in that
        +daemonized mode, and if so print no progress.
        +
        +See [2] and subsequent replies for a discussion of an approach not
        +taken in compute_generation_numbers(). I.e. we're saying "Computing
        +commit graph generation numbers", even though on an established
        +history we're mostly skipping over all the work we did in the
        +past. This is similar to the white lie we tell in the "Writing
        +objects" phase (not all are objects being written).
        +
        +Always showing progress is considered more important than
        +accuracy. I.e. on a repository like 2015-04-03-1M-git.git we'd hang
        +for 6 seconds with no output on the second "git gc" if no changes were
        +made to any objects in the interim if we'd take the approach in [2].
        +
         1. https://github.com/avar/2015-04-03-1M-git
        +2. (https://public-inbox.org/git/c6960252-c095-fb2b-e0bc-b1e6bb261...@gmail.com/)
     
         Signed-off-by: Ævar Arnfjörð Bjarmason
     
      diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
      --- a/builtin/commit-graph.c
      +++ b/builtin/commit-graph.c
     @@
      	opts.obj_dir = get_object_directory();
      
      	if (opts.reachable) {
     -		write_commit_graph_reachable(opts.obj_dir, opts.append);
     +		write_commit_graph_reachable(opts.obj_dir, opts.append, 1);
      		return 0;
      	}
     
[PATCH v2 2/2] commit-graph verify: add progress output
For the reasons explained in the "commit-graph write: add progress
output" commit leading up to this one, emit progress on "commit-graph
verify".

Since e0fd51e1d7 ("fsck: verify commit-graph", 2018-06-27) "git fsck"
has called this command if core.commitGraph=true, but there's been no
progress output to indicate that anything was different. Now there is
(on my tiny dotfiles.git repository):

    $ git -c core.commitGraph=true -C ~/ fsck
    Checking object directories: 100% (256/256), done.
    Checking objects: 100% (2821/2821), done.
    dangling blob 5b8bbdb9b788ed90459f505b0934619c17cc605b
    Verifying commits in commit graph: 100% (867/867), done.

And on a larger repository, such as the 2015-04-03-1M-git.git test
repository:

    $ time git -c core.commitGraph=true -C ~/g/2015-04-03-1M-git/ commit-graph verify
    Verifying commits in commit graph: 100% (1000447/1000447), done.

    real    0m7.813s
    [...]

Since the "commit-graph verify" subcommand is never called from "git
gc", we don't have to worry about passing some "report_progress"
progress variable around for this codepath.
Signed-off-by: Ævar Arnfjörð Bjarmason
---
 commit-graph.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/commit-graph.c b/commit-graph.c
index 2c5d996194..0bfb8c180e 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -922,6 +922,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
 	int generation_zero = 0;
 	struct hashfile *f;
 	int devnull;
+	struct progress *progress = NULL;
 
 	if (!g) {
 		graph_report("no commit-graph file loaded");
@@ -989,11 +990,14 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
 	if (verify_commit_graph_error & ~VERIFY_COMMIT_GRAPH_ERROR_HASH)
 		return verify_commit_graph_error;
 
+	progress = start_progress("Verifying commits in commit graph",
+				  g->num_commits);
 	for (i = 0; i < g->num_commits; i++) {
 		struct commit *graph_commit, *odb_commit;
 		struct commit_list *graph_parents, *odb_parents;
 		uint32_t max_generation = 0;
 
+		display_progress(progress, i + 1);
 		hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
 
 		graph_commit = lookup_commit(r, &cur_oid);
@@ -1070,6 +1074,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
 				graph_commit->date,
 				odb_commit->date);
 	}
+	stop_progress(&progress);
 
 	return verify_commit_graph_error;
 }
-- 
2.19.0.rc1.350.ge57e33dbd1
[PATCH v2 1/2] commit-graph write: add progress output
Before this change the "commit-graph write" command didn't report any
progress. On my machine this command takes more than 10 seconds to
write the graph for linux.git, and around 1m30s on the
2015-04-03-1M-git.git[1] test repository (a test case for a large
monorepository).

Furthermore, since the gc.writeCommitGraph setting was added in
d5d5d7b641 ("gc: automatically write commit-graph files", 2018-06-27),
there was no indication at all from a "git gc" run that anything was
different. This is why one of the progress bars being added here uses
start_progress() instead of start_delayed_progress(), so that it's
guaranteed to be seen. E.g. on my tiny 867-commit dotfiles.git
repository:

    $ git -c gc.writeCommitGraph=true gc
    Enumerating objects: 2821, done.
    [...]
    Computing commit graph generation numbers: 100% (867/867), done.

On larger repositories, such as linux.git, the delayed progress bar(s)
will kick in, and we'll show what's going on instead of, as was
previously happening, printing nothing while we write the graph:

    $ git -c gc.writeCommitGraph=true gc
    [...]
    Annotating commits in commit graph: 1565573, done.
    Computing commit graph generation numbers: 100% (782484/782484), done.

Note that here we don't show "Finding commits for commit graph". This
is because under "git gc" we seed the search with the commit
references in the repository, and that set is too small to show any
progress, but it would e.g. on a smaller repo such as git.git with
--stdin-commits:

    $ git rev-list --all | git -c gc.writeCommitGraph=true write --stdin-commits
    Finding commits for commit graph: 100% (162576/162576), done.
    Computing commit graph generation numbers: 100% (162576/162576), done.

With --stdin-packs we don't show any estimation of how much is left to
do. This is because we might be processing more than one pack. We
could be less lazy here and show progress, either by detecting that
we're only processing one pack, or by first looping over the packs to
discover how many commits they have.
I don't see the point in doing that work. So instead we get (on 2015-04-03-1M-git.git):

    $ echo pack-.idx | git -c gc.writeCommitGraph=true --exec-path=$PWD commit-graph write --stdin-packs
    Finding commits for commit graph: 13064614, done.
    Annotating commits in commit graph: 3001341, done.
    Computing commit graph generation numbers: 100% (1000447/1000447), done.

No GC mode uses --stdin-packs. It's what they use at Microsoft to manually compute the generation numbers for their collection of large packs which are never coalesced.

The reason we need a "report_progress" variable passed down from "git gc" is so that we don't report this output when "git gc --auto" runs after detaching from the terminal. Since we write the commit graph from the "git gc" process itself (as opposed to what we do with say the "git repack" phase), we'd otherwise end up writing the output to .git/gc.log and reporting it to the user next time as part of the "The last gc run reported the following[...]" error, see 329e6e8794 ("gc: save log from daemonized gc --auto and print it next time", 2015-09-19). So we must keep track of whether or not we're running in that daemonized mode, and if so print no progress.

See [2] and subsequent replies for a discussion of an approach not taken in compute_generation_numbers(). I.e. we're saying "Computing commit graph generation numbers", even though on an established history we're mostly skipping over all the work we did in the past. This is similar to the white lie we tell in the "Writing objects" phase (not all are objects being written). Always showing progress is considered more important than accuracy. I.e. on a repository like 2015-04-03-1M-git.git we'd hang for 6 seconds with no output on the second "git gc" if no changes were made to any objects in the interim if we'd take the approach in [2].

1. https://github.com/avar/2015-04-03-1M-git
2.
(https://public-inbox.org/git/c6960252-c095-fb2b-e0bc-b1e6bb261...@gmail.com/)

Signed-off-by: Ævar Arnfjörð Bjarmason
---
 builtin/commit-graph.c |  5 ++--
 builtin/gc.c           |  3 ++-
 commit-graph.c         | 60 --
 commit-graph.h         |  5 ++--
 4 files changed, 60 insertions(+), 13 deletions(-)

diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 0bf0c48657..bc0fa9ba52 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -151,7 +151,7 @@ static int graph_write(int argc, const char **argv)
 	opts.obj_dir = get_object_directory();

 	if (opts.reachable) {
-		write_commit_graph_reachable(opts.obj_dir, opts.append);
+		write_commit_graph_reachable(opts.obj_dir, opts.append, 1);
 		return 0;
 	}
@@ -171,7 +171,8 @@ static int graph_write(int argc, const char **argv)
 	write_commit_graph(opts.obj_dir, pack_indexes,
Re: [PATCH v3 0/4] read-cache: speed up index load through parallelization
On 9/7/2018 1:21 PM, Junio C Hamano wrote:
> Ben Peart writes:
>
>> On further investigation with the previous patch, I noticed that my
>> test repos didn't contain the cache tree extension in their index.
>> After doing a commit to ensure they existed, I realized that in some
>> instances, the time to load the cache tree exceeded the time to load
>> all the cache entries in parallel. Because the thread to read the
>> cache tree was started last (due to having to parse through all the
>> cache entries first) we weren't always getting optimal performance.
>>
>> To better optimize for this case, I decided to write the EOIE
>> extension as suggested by Junio [1] in response to my earlier
>> multithreading patch series [2]. This enables me to spin up the thread
>> to load the extensions earlier as it no longer has to parse through
>> all the cache entries first.
>
> Hmph. I kinda liked the simplicity of the previous one, but if we need
> to start reading the extension sections sooner by eliminating the
> overhead to scan the cache entries, perhaps we should bite the bullet
> now.

I preferred the simplicity as well but when I was profiling the code and
found out that loading the extensions was most often the last thread to
complete, I took this intermediate step to speed things up.

>> The big changes in this iteration are:
>>
>> - add the EOIE extension
>> - update the index extension worker thread to start first
>
> I guess I'd need to see the actual patch to find this out, but once we
> rely on a new extension, then we could omit scanning the main index
> even to partition the work among workers (i.e. like the topic long ago,
> you can have list of pointers into the main index to help partitioning,
> plus reset the prefix compression used in v4). I think you didn't get
> that far in this round, which is good. If the gain with EOIE alone (and
> starting the worker for the extension section early) is large enough
> without such a pre-computed work partition table, the simplicity of
> this round may give us a good stopping point.

Agreed.
I didn't go that far in this series as it doesn't appear to be
necessary. We could always add that later if it turned out to be worth
the additional complexity.

>> This patch conflicts with Duy's patch to remove the double memory copy
>> and pass in the previous ce instead. The two will need to be
>> merged/reconciled once they settle down a bit.
>
> Thanks. I have a feeling that 67922abb ("read-cache.c: optimize reading
> index format v4", 2018-09-02) is already 'next'-worthy and ready to be
> built on, but I'd prefer to hear from Duy to double check.

I'll take a closer look at what this will entail. It gets more
complicated as we don't actually have a previous expanded cache entry
when starting each thread. I'll see how complex it makes the code and
how much additional performance it gives.
Re: [PATCH 0/2] Fixup for js/mingw-o-append
GitGitGadget botched the CCs when I submitted this. Replying here to add them.

Sorry,
Jeff

https://github.com/gitgitgadget/gitgitgadget/issues/35

On 9/7/2018 2:19 PM, Jeff Hostetler via GitGitGadget wrote:
> The recent mingw O_APPEND change breaks writing to named pipes on
> Windows. The first commit adds a new test to confirm the breakage and
> the second commit fixes the problem. These could be squashed together
> or we can just keep the fix and omit the test if that would be better.
>
> d641097589 (js/mingw-o-append) mingw: enable atomic O_APPEND
>
> The new mingw_open_append() routine successfully opens the client side
> of the named pipe, but the first write() to it fails with EBADF. Adding
> the FILE_WRITE_DATA flag corrects the problem.
>
> Signed-off-by: Jeff Hostetler jeffh...@microsoft.com
> Cc: j6t@kdbg.org
> Cc: johannes.schinde...@gmx.de
> Cc: gitster@pobox.com
> Cc: p...@peff.net
>
> Jeff Hostetler (2):
>   t0051: test GIT_TRACE to a windows named pipe
>   mingw: fix mingw_open_append to work with named pipes
>
>  Makefile                           |  1 +
>  compat/mingw.c                     |  2 +-
>  t/helper/test-tool.c               |  3 ++
>  t/helper/test-tool.h               |  3 ++
>  t/helper/test-windows-named-pipe.c | 72 ++
>  t/t0051-windows-named-pipe.sh      | 17 +++
>  6 files changed, 97 insertions(+), 1 deletion(-)
>  create mode 100644 t/helper/test-windows-named-pipe.c
>  create mode 100755 t/t0051-windows-named-pipe.sh
>
> base-commit: d641097589160eb795127d8dbcb14c877c217b60
> Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-35%2Fjeffhostetler%2Ffixup-mingw-o-append-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-35/jeffhostetler/fixup-mingw-o-append-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/35
Re: [PATCH 2/2] submodule.c: warn about missing submodule git directories
On Wed, Sep 5, 2018 at 12:18 PM Jonathan Nieder wrote:
>
> Hi,
>
> Stefan Beller wrote:
>
> > This is the continuation of f2d48994dc1 (submodule.c: submodule_move_head
> > works with broken submodules, 2017-04-18), which tones down the case of
> > "broken submodule" in case of a missing git directory of the submodule to
> > be only a warning.
> >
> > Signed-off-by: Stefan Beller
> > ---
> >  submodule.c                   | 16 
> >  t/t2013-checkout-submodule.sh |  2 +-
> >  2 files changed, 17 insertions(+), 1 deletion(-)
>
> I don't understand what workflow this is a part of.
>
> If the submodule is missing, shouldn't we make it non-missing instead
> of producing a partial checkout that doesn't build?

No. checkout and friends do not want to touch the network
(unless we are in a partial clone world; that is the user is fully
aware that commands can use the network at totally unexpected
times)

So for that, all we can do is better error messages.

Stefan
Re: [PATCH 2/2] submodule.c: warn about missing submodule git directories
Hi,

Stefan Beller wrote:
> On Wed, Sep 5, 2018 at 12:18 PM Jonathan Nieder wrote:
>> Stefan Beller wrote:

>>> This is the continuation of f2d48994dc1 (submodule.c: submodule_move_head
>>> works with broken submodules, 2017-04-18), which tones down the case of
>>> "broken submodule" in case of a missing git directory of the submodule to
>>> be only a warning.
[...]
>> I don't understand what workflow this is a part of.
>>
>> If the submodule is missing, shouldn't we make it non-missing instead
>> of producing a partial checkout that doesn't build?
>
> No. checkout and friends do not want to touch the network
> (unless we are in a partial clone world; that is the user is fully
> aware that commands can use the network at totally unexpected
> times)
>
> So for that, all we can do is better error messages.

Thanks. This patch doesn't just improve error messages, though, but it
makes the operation report success instead of failing. Isn't that
likely to produce more confusion when I run additional commands
afterward? In other words, instead of

    $ git checkout --recurse-submodules -B master origin/new-fancy-branch
    Branch 'master' set up to track remote branch 'new-fancy-branch' from 'origin'.
    Switched to a new branch 'master'
    warning: Submodule 'new-fancy-submodule' is missing
    $ git status
    [some unclean state]

I would prefer to experience

    $ git checkout --recurse-submodules -B master origin/new-fancy-branch
    fatal: missing submodule 'new-fancy-submodule'
    hint: run "git fetch --recurse-submodules" to fetch it
    $ git status
    [clean state]
    $ git fetch --recurse-submodules
    [...]
    $ git checkout --recurse-submodules -B master origin/new-fancy-branch
    Branch 'master' set up to track remote branch 'new-fancy-branch' from 'origin'.
    Switched to a new branch 'master'
    $ git status
    [clean state]

Thanks,
Jonathan
Re: Old submodules broken in 2.19rc1 and 2.19rc2
Hi,

Allan Sandfeld Jensen wrote:

> I discovered it by using Debian testing, and it is shipping the 2.17rcs for
> some reason.

I believe you mean Debian unstable. Debian testing has 2.18.0.

> The example is just an old checkout of qt5.git with submodules,
> it is rather large.

Do you have a reproduction recipe I can use (starting with a "git clone"
command I would run using an old version of Git, etc)?

> I could try bisecting, but I am not sure I have the time anytime soon, I just
> wanted to report it to you first incase you knew of a change that suddenly
> assumed the new structure.

This is definitely not an intentional change, so more details would be
very welcome.

Thanks,
Jonathan
Re: Old submodules broken in 2.19rc1 and 2.19rc2
Stefan Beller wrote:
> On Fri, Sep 7, 2018 at 2:53 AM Allan Sandfeld Jensen wrote:
>> A "git submodule update --recursive" command wil fail
>> for each submodule with a line saying "fatal: could not open
>> '/.git' for writing: Is a directory.
[...]
> I have the suspicion that e98317508c0 (submodule:
> ensure core.worktree is set after update, 2018-06-18)
> might be the offender.

Oh! That seems likely.

Allan, output from "strace -f git submodule update --init" would also
be interesting.

Jonathan
Re: [PATCH v3 2/4] eoie: add End of Index Entry (EOIE) extension
On 9/7/2018 1:55 PM, Junio C Hamano wrote:
> Ben Peart writes:
>
>> The extension consists of:
>>
>> - 32-bit offset to the end of the index entries
>>
>> - 160-bit SHA-1 over the extension types and their sizes (but not
>>   their contents). E.g. if we have "TREE" extension that is N-bytes
>>   long, "REUC" extension that is M-bytes long, followed by "EOIE",
>>   then the hash would be:
>>
>>   SHA-1("TREE" + + "REUC" + )
>
> I didn't look at the documentation patch in the larger context, but
> please make sure that it is clear to the readers that these fixed
> width integers "binary representations" use network byte order.

At the top of the documentation it says "All binary numbers are in
network byte order" and that is not repeated for any of the other
sections that are documenting the file format.

> I briefly wondered if the above should include
>
>   + "EOIE" +
>
> as it is pretty much common file format design to include the header
> part of the checksum record (with checksum values padded out with NUL
> bytes) when you define a record to hold the checksum of the entire
> file. Since this does not protect the contents of each section from
> bit-flipping, adding the data for EOIE itself in the sum does not give
> us much (iow, what I am adding above is a constant that does not
> contribute any entropy).
>
> How big is a typical TREE extension in _your_ work repository housing
> Windows sources? I am guessing that replacing SHA-1 with something
> faster (as this is not about security but is about disk corruption)
> and instead hash also the contents of these sections would NOT help
> all that much in the performance department, as having to page them in
> to read through would already consume non-trivial amount of time, and
> that is why you are not hashing the contents.

The purpose of the SHA isn't to detect disk corruption (we already have
a SHA for the entire index that can serve that purpose) but to help
ensure that this was actually a valid EOIE extension and not a lucky
random set of bytes.
I had used leading and trailing signature bytes along with the length
and version bytes to validate it was an actual EOIE extension but you
suggested [1] that I use a SHA of the 4 byte extension type + 4 byte
extension length instead so I rewrote it that way.

[1] https://public-inbox.org/git/xmqq1sl017dw@gitster.mtv.corp.google.com/

>> +	/*
>> +	 * CACHE_EXT_ENDOFINDEXENTRIES must be written as the last entry before the SHA1
>
> s/SHA1/trailing checksum/ or something so that we can withstand
> NewHash world order?

I thought about this but in the document elsewhere it refers to it as
"160-bit SHA-1 over the content of the index file before this
checksum." and there are at least a dozen other references to "SHA-1"
so I figured we can fix them all at the same time when we have a
new/better name. :-)

>> +	 * so that it can be found and processed before all the index entries are
>> +	 * read.
>> +	 */
>> +	if (!strip_extensions && offset && !git_env_bool("GIT_TEST_DISABLE_EOIE", 0)) {
>> +		struct strbuf sb = STRBUF_INIT;
>> +
>> +		write_eoie_extension(&sb, &eoie_c, offset);
>> +		err = write_index_ext_header(&c, NULL, newfd, CACHE_EXT_ENDOFINDEXENTRIES, sb.len) < 0
>> +			|| ce_write(&c, newfd, sb.buf, sb.len) < 0;
>> +		strbuf_release(&sb);
>> +		if (err)
>
> OK.

>> +#define EOIE_SIZE 24 /* <4-byte offset> + <20-byte hash> */
>> +#define EOIE_SIZE_WITH_HEADER (4 + 4 + EOIE_SIZE) /* <4-byte signature> + <4-byte length> + EOIE_SIZE */
>> +
>> +#ifndef NO_PTHREADS
>> +static unsigned long read_eoie_extension(void *mmap, size_t mmap_size)
>> +{
>> +	/*
>> +	 * The end of index entries (EOIE) extension is guaranteed to be last
>> +	 * so that it can be found by scanning backwards from the EOF.
>> +	 *
>> +	 * "EOIE"
>> +	 * <4-byte length>
>> +	 * <4-byte offset>
>> +	 * <20-byte hash>
>> +	 */
>> +	const char *index, *eoie = (const char *)mmap + mmap_size - GIT_SHA1_RAWSZ - EOIE_SIZE_WITH_HEADER;
>> +	uint32_t extsize;
>> +	unsigned long offset, src_offset;
>> +	unsigned char hash[GIT_MAX_RAWSZ];
>> +	git_hash_ctx c;
>> +
>> +	/* validate the extension signature */
>> +	index = eoie;
>> +	if (CACHE_EXT(index) != CACHE_EXT_ENDOFINDEXENTRIES)
>> +		return 0;
>> +	index += sizeof(uint32_t);
>> +
>> +	/* validate the extension size */
>> +	extsize = get_be32(index);
>> +	if (extsize != EOIE_SIZE)
>> +		return 0;
>> +	index += sizeof(uint32_t);
>
> Do we know we have at least 8-byte to consume to perform the above two
> checks, or is that something we need to verify at the beginning of
> this function? Better yet, as we know that a correct EOIE with its own
> header is 28-byte long, we probably should abort if mmap_size is
> smaller than that.

I'll add that additional test.

>> +	/*
>> +	 * Validate the offset we're going to look for the first extension
>> +	 * signature is after the index he
Re: [PATCH] status: show progress bar if refreshing the index takes too long
On 9/7/2018 1:38 PM, Eric Sunshine wrote:
> On Fri, Sep 7, 2018 at 11:51 AM Nguyễn Thái Ngọc Duy wrote:
>> Refreshing the index is usually very fast, but it can still take a
>> long time sometimes. Cold cache is one, or something else silly (*).
>> In this case, it's good to show something to let the user know "git
>> status" is not hanging, it's just busy doing something.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy
>> ---
>> diff --git a/read-cache.c b/read-cache.c
>> @@ -1516,6 +1522,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
>> +		if (progress)
>> +			display_progress(progress, i);
>> @@ -1547,6 +1555,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
>> +	if (progress)
>> +		stop_progress(&progress);
>
> Nit: Both display_progress() and stop_progress() behave sanely when
> 'progress' is NULL, so no need for the conditional.

Don't forget this one in preload-index.c:preload_index():

    +	if (pd.progress)
    +		stop_progress(&pd.progress);

I found this extra one by creating the following rules in a Coccinelle
script:

    @@
    expression e;
    @@
    - if (e) { stop_progress(&e); }
    + stop_progress(&e);

    @@
    expression e;
    expression i;
    @@
    - if (e) { display_progress(e, i); }
    + display_progress(e, i);

Not sure if we want to put these in a .cocci script or not.

Thanks,
-Stolee
Re: [PATCH v3 3/4] read-cache: load cache extensions on a worker thread
Ben Peart writes:

> +struct load_index_extensions
> +{
> +#ifndef NO_PTHREADS
> +	pthread_t pthread;
> +#endif
> +	struct index_state *istate;
> +	void *mmap;
> +	size_t mmap_size;
> +	unsigned long src_offset;

If the file format only allows uint32_t on any platform, perhaps this
is better specified as uint32_t? Or if this is offset into a mmap'ed
region of memory, size_t may be more appropriate. Same comment applies
to "extension_offset" we see below (which in turn means the returned
type of read_eoie_extension() function may want to match).

> +	};

Space before '}'??

> +
> +static void *load_index_extensions(void *_data)
> +{
> +	struct load_index_extensions *p = _data;

Perhaps we are being superstitious, but I think our code try to avoid
leading underscore when able, i.e.

    load_index_extensions(void *data_)
    {
    	struct load_index_extensions *p = data;

> +	unsigned long src_offset = p->src_offset;
> +
> +	while (src_offset <= p->mmap_size - the_hash_algo->rawsz - 8) {
> +		/* After an array of active_nr index entries,
> +		 * there can be arbitrary number of extended
> +		 * sections, each of which is prefixed with
> +		 * extension name (4-byte) and section length
> +		 * in 4-byte network byte order.
> +		 */
> +		uint32_t extsize;
> +		memcpy(&extsize, (char *)p->mmap + src_offset + 4, 4);
> +		extsize = ntohl(extsize);

The same "ntohl(), not get_be32()?" question as the one for the
previous step applies here, too. I think the answer is "the original
was written that way" and that is acceptable, but once this series
lands, we may want to review the whole file and see if it is worth
making them consistent with a separate clean-up patch.
I think mmap() and munmap() are the only places that wants p->mmap and
mmap parameters passed around in various callchains to be of type
"void *"---I wonder if it is simpler to use "const char *" throughout
and only cast it to "void *" when necessary (I suspect that there is
nowhere we need to cast to or from "void *" explicitly if we did
so---assignment and argument passing would give us an appropriate cast
for free)?

> +		if (read_index_extension(p->istate,
> +					 (const char *)p->mmap + src_offset,
> +					 (char *)p->mmap + src_offset + 8,
> +					 extsize) < 0) {
> +			munmap(p->mmap, p->mmap_size);
> +			die("index file corrupt");
> +		}
> +		...
> @@ -1907,6 +1951,11 @@ ...
> ...
> +	p.mmap = mmap;
> +	p.mmap_size = mmap_size;
> +
> +#ifndef NO_PTHREADS
> +	nr_threads = git_config_get_index_threads();
> +	if (!nr_threads)
> +		nr_threads = online_cpus();
> +
> +	if (nr_threads >= 2) {
> +		extension_offset = read_eoie_extension(mmap, mmap_size);
> +		if (extension_offset) {
> +			/* create a thread to load the index extensions */
> +			p.src_offset = extension_offset;
> +			if (pthread_create(&p.pthread, NULL, load_index_extensions, &p))
> +				die(_("unable to create load_index_extensions_thread"));
> +		}
> +	}
> +#endif

Makes sense.
Re: [PATCH] status: show progress bar if refreshing the index takes too long
On Fri, Sep 7, 2018 at 4:29 PM Derrick Stolee wrote:
> On 9/7/2018 1:38 PM, Eric Sunshine wrote:
> > On Fri, Sep 7, 2018 at 11:51 AM Nguyễn Thái Ngọc Duy wrote:
> >> +	if (progress)
> >> +		display_progress(progress, i);
> >> +	if (progress)
> >> +		stop_progress(&progress);
> > Nit: Both display_progress() and stop_progress() behave sanely when
> > 'progress' is NULL, so no need for the conditional.
>
> Don't forget this one in preload-index.c:preload_index():
> +	if (pd.progress)
> +		stop_progress(&pd.progress);
>
> I found this extra one by creating the following rules in a Coccinelle script:
> - if (e) { stop_progress(&e); }
> - if (e) { display_progress(e, i); }
> Not sure if we want to put these in a .cocci script or not.

If so, we'd want to match display_throughput() and stop_progress_msg(),
as well.
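The two extra rules Eric suggests would presumably follow the same shape as the ones quoted above; a sketch, assuming stop_progress_msg() takes the message as its second argument and display_throughput() takes a byte count:

```
@@
expression e;
expression n;
@@
- if (e) { display_throughput(e, n); }
+ display_throughput(e, n);

@@
expression e;
expression msg;
@@
- if (e) { stop_progress_msg(&e, msg); }
+ stop_progress_msg(&e, msg);
```

Like the originals, these only fire on call sites where the NULL check guards nothing but the call itself.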
Re: [PATCH v3 2/4] wt-status: rename commitable to committable
"Stephen P. Smith" writes:

> Fix variable spelling error.
>
> Signed-off-by: Stephen P. Smith
> ---

Thanks ;-)
Re: [PATCH v3 0/4] wt-status.c: commitable flag
"Stephen P. Smith" writes:

> A couple of years ago, during a patch review Junio found that the
> commitable bit as implemented in wt-status.c was broken.
>
> Stephen P. Smith (4):
>   Move has_unmerged earlier in the file.
>   wt-status: rename commitable to committable
>   t7501: add test of "commit --dry-run --short"
>   wt-status.c: Set the committable flag in the collect phase.
>
>  builtin/commit.c  | 18 +-
>  t/t7501-commit.sh | 10 --
>  wt-status.c       | 45 +++--
>  wt-status.h       |  2 +-
>  4 files changed, 45 insertions(+), 30 deletions(-)

Thanks.
Re: [PATCH v3 4/4] wt-status.c: Set the committable flag in the collect phase.
"Stephen P. Smith" writes:

> void wt_status_collect(struct wt_status *s)
> {
> +	struct wt_status_state state;
> 	wt_status_collect_changes_worktree(s);
>
> 	if (s->is_initial)
> @@ -746,6 +752,11 @@ void wt_status_collect(struct wt_status *s)
> 	else
> 		wt_status_collect_changes_index(s);
> 	wt_status_collect_untracked(s);
> +
> +	memset(&state, 0, sizeof(state));
> +	wt_status_get_state(&state, s->branch && !strcmp(s->branch, "HEAD"));
> +	if (state.merge_in_progress && !has_unmerged(s))
> +		s->committable = 1;
> }

I do not think this is wrong per-se, but now we have three calls to
wt_status_get_state() in wt-status.c, and it smells (at least to me)
that each of these callsites does so only because their callers do not
give them enough information, forcing them to find the state out for
themselves.

Given that the ideal paradigm to come up with the "work tree status" is
to do the collection followed by printing, and among three callers of
get_state(), two appear in the "printing" side of the callchain [*1*],
I wonder if it makes a better organization to

 - embed struct wt_status_state in struct wt_status

 - make the new call to wt_status_get_state() added above in this
   patch to populate the wt_status_state embedded in 's'

 - change the other two callers of wt_status_get_state() in
   wt_longstatus_print() and wt_porcelain_v2_print_tracking(), both of
   which will receive 's' that has been populated by a previous call
   to wt_status_collect(), so that they do *not* call get_state()
   themselves, but instead use the result recorded in wt_status_state
   embedded in 's', which was populated by wt_status_collect() before
   they got called.

That would bring the resulting code even closer to the ideal, i.e. the
"collect" phase learns _everything_ we need about the current state
that is necessary in order to later show to the user, and the "print"
phase does not do its own separate discovery.

What do you think?

Thanks.
[Reference]

*1* Here are the selected functions and what other functions they call.

    wt_status_collect()
     -> wt_status_collect_changes_initial()
     -> wt_status_collect_changes_index()
     -> wt_status_collect_untracked()
     -> wt_status_get_state()

    wt_longstatus_print()
     -> wt_status_get_state()

    wt_porcelain_v2_print_tracking()
     -> wt_status_get_state()

    wt_status_print()
     -> wt_porcelain_v2_print()
        -> wt_porcelain_v2_print_tracking()
     -> wt_longstatus_print()

    run_status()
     -> wt_status_collect()
     -> wt_status_print()

    cmd_status()
     -> wt_status_collect()
     -> wt_status_print()

    prepare_to_commit(), dry_run_commit()
     -> run_status()

Most notably, wt_status_collect() always happens before
wt_status_print(), which is natural because the former is to collect
information in 's' that is used by the latter to print. And in various
functions wt_status_print() calls indirectly, the two other callers of
wt_status_get_state() appear.
Re: [PATCH v3 4/4] wt-status.c: Set the committable flag in the collect phase.
Junio C Hamano writes:

> "Stephen P. Smith" writes:
>
>> void wt_status_collect(struct wt_status *s)
>> {
>> +	struct wt_status_state state;
>> 	wt_status_collect_changes_worktree(s);
>>
>> 	if (s->is_initial)
>> @@ -746,6 +752,11 @@ void wt_status_collect(struct wt_status *s)
>> 	else
>> 		wt_status_collect_changes_index(s);
>> 	wt_status_collect_untracked(s);
>> +
>> +	memset(&state, 0, sizeof(state));
>> +	wt_status_get_state(&state, s->branch && !strcmp(s->branch, "HEAD"));
>> +	if (state.merge_in_progress && !has_unmerged(s))
>> +		s->committable = 1;
>> }
>
> I do not think this is wrong per-se, but now we have three calls to
> wt_status_get_state() in wt-status.c, and it smells (at least to me)
> that each of these callsites does so only because their callers
> do not give them enough information, forcing them to find the state
> out for themselves.
>
> Given that the ideal paradigm to come up with the "work tree status"
> is to do the collection followed by printing, and among three
> callers of get_state(), two appear in the "printing" side of the
> callchain [*1*], I wonder if it makes a better organization to
>
> - embed struct wt_status_state in struct wt_status
>
> - make the new call to wt_status_get_state() added above in this
>   patch to populate the wt_status_state embedded in 's'
>
> - change the other two callers of wt_status_get_state() in
>   wt_longstatus_print() and wt_porcelain_v2_print_tracking(), both
>   of which will receive 's' that has been populated by a previous
>   call to wt_status_collect(), so that they do *not* call
>   get_state() themselves, but instead use the result recorded in
>   wt_status_state embedded in 's', which was populated by
>   wt_status_collect() before they got called.
>
> That would bring the resulting code even closer to the ideal,
> i.e. the "collect" phase learns _everything_ we need about the
> current state that is necessary in order to later show to the user,
> and the "print" phase does not do its own separate discovery.
>
> What do you think?
>
> Thanks.

Such a "clean-up" may look like this patch:

- add .state field to wt_status to embed a wt_status_state instance

- remove a parameter of type struct wt_status_state from all functions
  where wt_status is already passed; we'd use .state field of the
  wt_status instead

The patch is mostly for illustration of the idea. The result seems to
compile and pass the test suite, but I haven't carefully thought about
what else I may be breaking with this mechanical change. For example, I
noticed that both of the old callsites of wt_status_get_state() have
free() of a few fields in the structure, and I kept the code as close
to the original, but I suspect they should not be freed there in the
functions in the "print" phase, but rather the caller of the "collect"
and "print" should be made responsible for deciding when to dispose the
entire wt_status (and wt_status_state as part of it). This illustration
patch does not address that kind of details (yet).

 wt-status.c | 132 ++--
 wt-status.h |  37 -
 2 files changed, 77 insertions(+), 92 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index c7f76d4758..69f2cbdca9 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -744,18 +744,15 @@ static int has_unmerged(struct wt_status *s)
 void wt_status_collect(struct wt_status *s)
 {
-	struct wt_status_state state;
 	wt_status_collect_changes_worktree(s);
-
 	if (s->is_initial)
 		wt_status_collect_changes_initial(s);
 	else
 		wt_status_collect_changes_index(s);
 	wt_status_collect_untracked(s);

-	memset(&state, 0, sizeof(state));
-	wt_status_get_state(&state, s->branch && !strcmp(s->branch, "HEAD"));
-	if (state.merge_in_progress && !has_unmerged(s))
+	wt_status_get_state(&s->state, s->branch && !strcmp(s->branch, "HEAD"));
+	if (s->state.merge_in_progress && !has_unmerged(s))
 		s->committable = 1;
 }
@@ -1087,8 +1084,7 @@ static void wt_longstatus_print_tracking(struct wt_status *s)
 }

 static void show_merge_in_progress(struct wt_status *s,
-				   struct wt_status_state *state,
-				   const char *color)
+				   const char *color)
 {
 	if (has_unmerged(s)) {
 		status_printf_ln(s, color, _("You have unmerged paths."));
@@ -1109,16 +1105,15 @@ static void show_merge_in_progress(struct wt_status *s,
 }

 static void show_am_in_progress(struct wt_status *s,
-				struct wt_status_state *state,
 				const char *color)
 {
 	status_printf_ln(s, color, _("You are in the middle of an am session."));
-	if (state->am_empty_patch)
+	status_pri
Re: Old submodules broken in 2.19rc1 and 2.19rc2
Stefan Beller wrote:
> On Fri, Sep 7, 2018 at 2:53 AM Allan Sandfeld Jensen wrote:
>> Submodules checked out with older versions of git not longer works in the
>> latest 2.19 releases. A "git submodule update --recursive" command wil fail
>> for each submodule with a line saying "fatal: could not open
>> '/.git' for writing: Is a directory.
[...]
> I have the suspicion that e98317508c0 (submodule:
> ensure core.worktree is set after update, 2018-06-18)
> might be the offender.

I still was not able to reproduce it, but after a bit of staring at the
code, I'm pretty sure I just did something wrong in the reproduction
process. That commit is indeed the offender. It introduces the
following code (rewrapped for clarity) in git-submodule.sh:

    if ! $( git config -f \
	    "$(git rev-parse --git-common-dir)/modules/$name/config" \
	    core.worktree ) 2>/dev/null
    then
	    git submodule--helper connect-gitdir-workingtree "$name" "$sm_path"
    fi

Staring at it for a while, you can see one problem: the 'if ! $(git
config)' should be simply 'if ! git config'. This ends up trying to run
the core.worktree value as a command, which would usually fail.

That brings us into connect_work_tree_and_git_dir, which does

    /* Prepare .git file */
    strbuf_addf(&gitfile_sb, "%s/.git", work_tree_);
    if (safe_create_leading_directories_const(gitfile_sb.buf))
	    die(_("could not create directories for %s"), gitfile_sb.buf);

    /* Prepare config file */
    strbuf_addf(&cfg_sb, "%s/config", git_dir_);
    if (safe_create_leading_directories_const(cfg_sb.buf))
	    die(_("could not create directories for %s"), cfg_sb.buf);

    git_dir = real_pathdup(git_dir_, 1);
    work_tree = real_pathdup(work_tree_, 1);

    /* Write .git file */
    write_file(gitfile_sb.buf, "gitdir: %s",
	       relative_path(git_dir, work_tree, &rel_path));

The write_file runs into .git already existing as a directory, failing
with the message Allan saw.
This would happen in at least two cases:

- if the submodule exists both in .git/modules/ *and* in the worktree
  (due to flipping between Git versions and branches with and without
  the submodule), the above will happen

- likewise if the submodule exists only in the worktree, like for
  Allan.

In "next" there is 74d4731d (submodule--helper: replace
connect-gitdir-workingtree by ensure-core-worktree, 2018-08-13) which
uses robust helpers in C that handle this much better. I think we
should revert e98317508c0 in "master" (for 2.19) and keep making use of
that 'second try' in "next" (for 2.20).

I'll try to pin down a reproduction case and send a revert + testsuite
patch.

Thanks again,
Jonathan
Re: Old submodules broken in 2.19rc1 and 2.19rc2
On Freitag, 7. September 2018 19:08:43 CEST Stefan Beller wrote:
> On Fri, Sep 7, 2018 at 2:53 AM Allan Sandfeld Jensen wrote:
> > Submodules checked out with older versions of git not longer works in the
> > latest 2.19 releases. A "git submodule update --recursive" command wil
> > fail for each submodule with a line saying "fatal: could not open
> > '/.git' for writing: Is a directory.
>
> Can you run the update again with
>
>     GIT_TRACE=1 git submodule update
>
> and post the output?
>
> I have the suspicion that e98317508c0 (submodule:
> ensure core.worktree is set after update, 2018-06-18)
> might be the offender.
>
> Could you try reverting that commit and check as well?
>
>     git clone https://github.com/git/git && cd git
>     git revert e98317508c0
>     make install # installs to you home dir at ~/bin
>
> and then try again, as well?
> (though bisection may be more fruitful if this doesn't pan out)

Okay. I had the issue on my workstation at work which I won't be back
to until friday next week, but I managed to reproduce the exact same
issue on a separate machine running Ubuntu, and a freshly built git
from git master, on another roughly one year old checkout of qt5.git
with submodules

[127] carewolf@twilight% GIT_TRACE=1 ~src/git/git submodule update --recursive  [5.11.2] ~qt5
00:28:32.234453 git.c:659 trace: exec: git-submodule update --recursive
00:28:32.234491 run-command.c:637 trace: run_command: git-submodule update --recursive
00:28:32.240792 git.c:415 trace: built-in: git rev-parse --git-dir
00:28:32.241981 git.c:415 trace: built-in: git rev-parse --git-path objects
00:28:32.242905 git.c:415 trace: built-in: git rev-parse -q --git-dir
00:28:32.244794 git.c:415 trace: built-in: git rev-parse --show-prefix
00:28:32.245699 git.c:415 trace: built-in: git rev-parse --show-toplevel
00:28:32.247181 git.c:415 trace: built-in: git submodule--helper update-clone
00:28:32.247463 run-command.c:1553 run_processes_parallel: preparing to run up to 1 tasks
00:28:32.247764 run-command.c:1585 run_processes_parallel: done
00:28:32.248626 git.c:415 trace: built-in: git submodule--helper name qt3d
00:28:32.249751 git.c:415 trace: built-in: git config submodule.qt3d.update
00:28:32.250608 git.c:415 trace: built-in: git submodule--helper relative-path qt3d
00:28:32.251399 git.c:415 trace: built-in: git rev-parse --local-env-vars
00:28:32.252120 git.c:415 trace: built-in: git rev-parse --verify HEAD
00:28:32.253096 git.c:415 trace: built-in: git rev-parse --git-common-dir
00:28:32.253982 git.c:415 trace: built-in: git config -f .git/modules/qt3d/config core.worktree
00:28:32.254771 git.c:415 trace: built-in: git submodule--helper connect-gitdir-workingtree qt3d qt3d
fatal: could not open 'qt3d/.git' for writing: Is a directory
00:28:32.255930 git.c:415 trace: built-in: git submodule--helper relative-path qt3d/
00:28:32.256712 git.c:415 trace: built-in: git rev-parse --local-env-vars
00:28:32.257878 git.c:415 trace: built-in: git submodule--helper update-clone --recursive-prefix qt3d/
00:28:32.258510 run-command.c:1553 run_processes_parallel: preparing to run up to 1 tasks
00:28:32.258544 run-command.c:1585 run_processes_parallel: done
00:28:32.260356 git.c:415 trace: built-in: git submodule--helper name qtactiveqt
00:28:32.261475 git.c:415 trace: built-in: git config submodule.qtactiveqt.update
00:28:32.262321 git.c:415 trace: built-in: git submodule--helper relative-path qtactiveqt
00:28:32.263259 git.c:415 trace: built-in: git rev-parse --local-env-vars
00:28:32.263987 git.c:415 trace: built-in: git rev-parse --verify HEAD
00:28:32.264929 git.c:415 trace: built-in: git rev-parse --git-common-dir
00:28:32.265775 git.c:415 trace: built-in: git config -f .git/modules/qtactiveqt/config core.worktree
00:28:32.266563 git.c:415 trace: built-in: git submodule--helper connect-gitdir-workingtree qtactiveqt qtactiveqt
fatal: could not open 'qtactiveqt/.git' for writing: Is a directory
00:28:32.267595 git.c:415 trace: built-in: git submodule--helper relative-path qtactiveqt/
00:28:32.268310 git.c:415 trace: built-in: git rev-parse --local-env-vars
00:28:32.269481 git.c:415 trace: built-in: git submodule--helper update-clone --recursive-prefix qtactiveqt/
00:28:32.269626 run-command.c:1553 run_processes_parallel: preparing to run up to 1 tasks
00:28:32.269652 run-command.c:1585 run_processes_parallel: done
00:28:32.270893 git.c:415 trace: built-in: git submodule--helper
[RFC PATCH v4 3/3] t0014: Introduce alias testing suite
Introduce a testing suite that is dedicated to aliases.
For now, check only if nested aliases work and if looping
aliases are detected successfully.

The looping aliases check for mixed execution is there but
expected to fail because there is no check in place yet.

Signed-off-by: Tim Schumacher
---
Those are the tests that I've come up with. It consists of tests
for nested aliases and looping aliases, both with internal calls
and external calls.

Unfortunately I don't have a fix for the last one yet, so I
marked it as expect_failure. The problem is that the test suite
is waiting a full minute until it aborts the running command
(which I guess should not take that long, as it blocks the whole
test suite for that span of time).

Should I try to decrease the timeout or should I remove that
test completely until I manage to get external calls fixed?

As a last thing, is there any better way to use single quotes
than to write '"'"'? It isn't that bad, but it is hard to read,
especially for bash newcomers.

 t/t0014-alias.sh | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)
 create mode 100755 t/t0014-alias.sh

diff --git a/t/t0014-alias.sh b/t/t0014-alias.sh
new file mode 100755
index 0..6c1e34694
--- /dev/null
+++ b/t/t0014-alias.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+test_description='git command aliasing'
+
+. ./test-lib.sh
+
+test_expect_success 'setup environment' '
+	git init
+'
+
+test_expect_success 'nested aliases - internal execution' '
+	git config alias.nested-internal-1 nested-internal-2 &&
+	git config alias.nested-internal-2 status
+'
+
+test_expect_success 'nested aliases - mixed execution' '
+	git config alias.nested-external-1 "!git nested-external-2" &&
+	git config alias.nested-external-2 status
+'
+
+test_expect_success 'looping aliases - internal execution' '
+	git config alias.loop-internal-1 loop-internal-2 &&
+	git config alias.loop-internal-2 loop-internal-3 &&
+	git config alias.loop-internal-3 loop-internal-2 &&
+	test_must_fail git loop-internal-1 2>output &&
+	grep -q "fatal: alias loop detected: expansion of '"'"'loop-internal-1'"'"' does not terminate" output &&
+	rm output
+'
+
+test_expect_failure 'looping aliases - mixed execution' '
+	git config alias.loop-mixed-1 loop-mixed-2 &&
+	git config alias.loop-mixed-2 "!git loop-mixed-1" &&
+	test_must_fail git loop-mixed-1 2>output &&
+	grep -q "fatal: alias loop detected: expansion of '"'"'loop-mixed-1'"'"' does not terminate" output &&
+	rm output
+'
+
+test_done
-- 
2.19.0.rc2.1.g4c98b8d69.dirty
[RFC PATCH v4 2/3] Show the call history when an alias is looping
Just printing the command that the user entered is not
particularly helpful when trying to find the alias that
causes the loop.

Print the history of substituted commands to help the user
find the offending alias. Mark the entrypoint of the loop
with "<==" and the last command (which looped back to the
entrypoint) with "==>".

Signed-off-by: Tim Schumacher
---
I now went with Peff's suggested code and I added in an arrow
that points away from the last command (which caused the loop).

A "full" arrow (i.e. starts at the last command, goes upwards
and ends at the entrypoint) would be more obvious/better, but
adding much more code just for having a vertical line wasn't
worth it for me.

 git.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/git.c b/git.c
index 15727c17f..a20eb4fa1 100644
--- a/git.c
+++ b/git.c
@@ -675,6 +675,7 @@ static int run_argv(int *argcp, const char ***argv)
 {
 	int done_alias = 0;
 	struct string_list cmd_list = STRING_LIST_INIT_NODUP;
+	struct string_list_item *seen;
 
 	while (1) {
 		/*
@@ -692,9 +693,21 @@ static int run_argv(int *argcp, const char ***argv)
 		/* .. then try the external ones */
 		execv_dashed_external(*argv);
 
-		if (unsorted_string_list_has_string(&cmd_list, *argv[0])) {
+		seen = unsorted_string_list_lookup(&cmd_list, *argv[0]);
+		if (seen) {
+			int i;
+			struct strbuf sb = STRBUF_INIT;
+			for (i = 0; i < cmd_list.nr; i++) {
+				struct string_list_item *item = &cmd_list.items[i];
+
+				strbuf_addf(&sb, "\n %s", item->string);
+				if (item == seen)
+					strbuf_addstr(&sb, " <==");
+				else if (i == cmd_list.nr - 1)
+					strbuf_addstr(&sb, " ==>");
+			}
 			die(_("alias loop detected: expansion of '%s' does"
-			      " not terminate"), cmd_list.items[0].string);
+			      " not terminate:%s"), cmd_list.items[0].string, sb.buf);
 		}
 
 		string_list_append(&cmd_list, *argv[0]);
-- 
2.19.0.rc2.1.g4c98b8d69.dirty
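The marker logic in the loop above can be shown as a stand-alone sketch
(the alias names and the print_history helper are made up for
illustration; the real code builds the listing into a strbuf before
die()):

```shell
#!/bin/sh
# Print a recorded expansion history, marking the entry the loop
# re-enters with "<==" and the final command (which looped back)
# with "==>" -- an "else if", exactly as in the C version.
print_history() {
	seen=$1; shift		# the command that is about to repeat
	total=$#
	i=0
	for cmd
	do
		i=$((i + 1))
		mark=
		if test "$cmd" = "$seen"
		then
			mark=" <=="
		elif test $i -eq $total
		then
			mark=" ==>"
		fi
		printf '  %s%s\n' "$cmd" "$mark"
	done
}

# First argument: the repeating command; the rest: the history so far.
print_history loop-mixed-1 loop-mixed-1 loop-mixed-2 loop-mixed-3
# prints:
#   loop-mixed-1 <==
#   loop-mixed-2
#   loop-mixed-3 ==>
```

Note that because the "<==" check wins over the last-item check, a
two-entry loop whose last command equals the repeating one gets only
the "<==" marker, matching the C code's else-if ordering.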
[RFC PATCH v4 1/3] Add support for nested aliases
Aliases can only contain non-alias git commands and their
arguments, not other user-defined aliases. Resolving further
(nested) aliases is prevented by breaking the loop after the
first alias was processed. Git then fails with a
command-not-found error.

Allow resolving nested aliases by not breaking the loop in
run_argv() after the first alias was processed. Instead,
continue the loop until `handle_alias()` fails, which means that
there are no further aliases that can be processed. Prevent
looping aliases by storing substituted commands in `cmd_list`
and checking if a command has been substituted previously.

While we're at it, fix a styling issue just below the added
code.

Signed-off-by: Tim Schumacher
---
Changes since v3:
- Print the command that the user entered instead of the command
  which caused the loop (and a nicer, more explanatory error message)
- Use unsorted_string_list_has_string() instead of the sorted version
- Fix a code style issue just below the modified code
- done_alias is a simple boolean again (instead of a counter)

 git.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/git.c b/git.c
index c27c38738..15727c17f 100644
--- a/git.c
+++ b/git.c
@@ -674,6 +674,7 @@ static void execv_dashed_external(const char **argv)
 static int run_argv(int *argcp, const char ***argv)
 {
 	int done_alias = 0;
+	struct string_list cmd_list = STRING_LIST_INIT_NODUP;
 
 	while (1) {
 		/*
@@ -691,17 +692,25 @@ static int run_argv(int *argcp, const char ***argv)
 		/* .. then try the external ones */
 		execv_dashed_external(*argv);
 
-		/* It could be an alias -- this works around the insanity
+		if (unsorted_string_list_has_string(&cmd_list, *argv[0])) {
+			die(_("alias loop detected: expansion of '%s' does"
+			      " not terminate"), cmd_list.items[0].string);
+		}
+
+		string_list_append(&cmd_list, *argv[0]);
+
+		/*
+		 * It could be an alias -- this works around the insanity
 		 * of overriding "git log" with "git show" by having
 		 * alias.log = show
 		 */
-		if (done_alias)
-			break;
 		if (!handle_alias(argcp, argv))
 			break;
 		done_alias = 1;
 	}
 
+	string_list_clear(&cmd_list, 0);
+
 	return done_alias;
 }
-- 
2.19.0.rc2.1.g4c98b8d69.dirty
Re: Old submodules broken in 2.19rc1 and 2.19rc2
> I think we
> should revert e98317508c0 in "master" (for 2.19) and keep making use
> of that 'second try' in "next" (for 2.20).

Actually I'd rather revert the whole topic leading up to
7e25437d35a (Merge branch 'sb/submodule-core-worktree', 2018-07-18)
as the last patch in there doesn't work well without e98317508c0 IIRC.

And having only the first patch would bring an inconsistent state as
then different commands behave differently w.r.t. setting
core.worktree.

So for this release we'd

    git revert 984cd77ddbf0 e98317508c 4fa4f90ccd85

and then can

    git cherry-pick 4fa4f90ccd8 984cd77ddbf0

on top of sb/submodule-update-in-c, as that re-instates the behavior
going forward.

Thoughts?

Thanks,
Stefan
[PATCH] t5551-http-fetch-smart.sh: sort cookies before comparing
With curl-7.61.1 cookies are sorted by creation-time¹. Sort the output
used in the 'cookies stored in http.cookiefile when http.savecookies
set' test before comparing it to the expected cookies.

¹ https://github.com/curl/curl/commit/e2ef8d6fa ("cookies: support
  creation-time attribute for cookies", 2018-08-28)

Signed-off-by: Todd Zullinger
---
[Resending with the list in Cc; sorry for spamming you, Junio, Jeff,
and Gábor.]

The in-development version of Fedora updated to the recently
released curl-7.61.1 in the past few days. This isn't
breakage from the 2.19.0 cycle, but if the fix looks good to
everyone it would be nice to include it. That way other
distributions and users who update git and curl to the most
recent releases won't run into this test failure.

I tested this against Fedora 30 (curl-7.61.1) as well as
previous releases from RHEL/CentOS 6/7 (7.19.7/7.29.0) and
Fedora 27/28/29 (7.55.1/7.59.0/7.61.0).

The verbose output is:

expecting success:
	git config http.cookiefile cookies.txt &&
	git config http.savecookies true &&
	git ls-remote $HTTPD_URL/smart_cookies/repo.git master &&
	tail -3 cookies.txt >cookies_tail.txt &&
	test_cmp expect_cookies.txt cookies_tail.txt

++ git config http.cookiefile cookies.txt
++ git config http.savecookies true
++ git ls-remote http://127.0.0.1:5551/smart_cookies/repo.git master
7ae89caac6c721f16555e981eaeed64abc165c5d	refs/heads/master
263207bb5fbfbefbdf1c9c3fa4ae5d9663323217	refs/namespaces/ns/refs/heads/master
++ tail -3 cookies.txt
++ test_cmp expect_cookies.txt cookies_tail.txt
++ diff -u expect_cookies.txt cookies_tail.txt
--- expect_cookies.txt	2018-09-07 07:29:05.231532462 +
+++ cookies_tail.txt	2018-09-07 07:29:05.306532366 +
@@ -1,3 +1,3 @@
-127.0.0.1	FALSE	/smart_cookies/	FALSE	0	othername	othervalue
 127.0.0.1	FALSE	/smart_cookies/repo.git/info/	FALSE	0	name	value
+127.0.0.1	FALSE	/smart_cookies/	FALSE	0	othername	othervalue
error: last command exited with $?=1
not ok 22 - cookies stored in http.cookiefile when http.savecookies set

 t/t5551-http-fetch-smart.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 771f36f9ff..538656bfef 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -215,7 +215,7 @@ test_expect_success 'cookies stored in http.cookiefile when http.savecookies set
 	git config http.cookiefile cookies.txt &&
 	git config http.savecookies true &&
 	git ls-remote $HTTPD_URL/smart_cookies/repo.git master &&
-	tail -3 cookies.txt >cookies_tail.txt &&
+	tail -3 cookies.txt | sort >cookies_tail.txt &&
 	test_cmp expect_cookies.txt cookies_tail.txt
 '
-- 
2.19.0.rc2
Re: [RFC PATCH v4 3/3] t0014: Introduce alias testing suite
On Fri, Sep 7, 2018 at 6:44 PM Tim Schumacher wrote:
> Introduce a testing suite that is dedicated to aliases.
> For now, check only if nested aliases work and if looping
> aliases are detected successfully.
>
> The looping aliases check for mixed execution is there but
> expected to fail because there is no check in place yet.
>
> Signed-off-by: Tim Schumacher
> ---
> Unfortunately I don't have a fix for the last one yet, so I
> marked it as expect_failure. The problem is that the test suite
> is waiting a full minute until it aborts the running command
> (which I guess should not take that long, as it blocks the whole
> test suite for that span of time).
>
> Should I try to decrease the timeout or should I remove that
> test completely until I manage to get external calls fixed?

Perhaps just comment out that test for now and add a comment above it
explaining why it's commented out.

> As a last thing, is there any better way to use single quotes
> than to write '"'"'? It isn't that bad, but it is hard to read,
> especially for bash newcomers.

You should backslash-escape the quotes ("foo \'bar\' baz"); however,
in this case it would make sense to use regexes with 'grep' to check
that you got the expected error message rather than reproducing the
message literally here in the script. More below.

> diff --git a/t/t0014-alias.sh b/t/t0014-alias.sh
> @@ -0,0 +1,38 @@
> +#!/bin/sh
> +
> +test_description='git command aliasing'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup environment' '
> +	git init
> +'

"git init" is invoked automatically by the test framework, so no need
for this test. You can drop it.

> +test_expect_success 'nested aliases - internal execution' '
> +	git config alias.nested-internal-1 nested-internal-2 &&
> +	git config alias.nested-internal-2 status
> +'

This isn't actually testing anything, is it? It's setting up the
aliases but never actually invoking them. I would have expected the
next line to actually run a command ("git nested-internal-1") and the
line after that to check that you got the expected output (whatever
"git status" would emit).

Output from "git status" isn't necessarily the easiest to test,
though, so perhaps pick a different Git command for testing (something
for which the result can be very easily checked -- maybe "git rm" or
such).

> +test_expect_success 'nested aliases - mixed execution' '
> +	git config alias.nested-external-1 "!git nested-external-2" &&
> +	git config alias.nested-external-2 status
> +'

Same observation.

> +test_expect_success 'looping aliases - internal execution' '
> +	git config alias.loop-internal-1 loop-internal-2 &&
> +	git config alias.loop-internal-2 loop-internal-3 &&
> +	git config alias.loop-internal-3 loop-internal-2 &&
> +	test_must_fail git loop-internal-1 2>output &&
> +	grep -q "fatal: alias loop detected: expansion of '"'"'loop-internal-1'"'"' does not terminate" output &&

Don't bother using -q with 'grep'. Output is hidden already by the
test framework in normal mode, and not hidden when running in verbose
mode. And, the output of 'grep' might be helpful when debugging the
test if something goes wrong.

As noted above, you can use a regex to match the expected error rather
than exactly duplicating the text of the message. Finally, use
'test_i18ngrep' instead of 'grep' in order to play nice with
localization.

> +	rm output

Tests don't normally bother cleaning up their output files like this
since such output can be helpful when debugging the test if something
goes wrong. (You'd want to use test_when_finished to cleanup anyhow,
but you don't need it in this case.)

> +'
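On the quoting question raised above, two equivalent ways to get a
literal single quote into a string that itself lives inside a
single-quoted shell word (a stand-alone sketch, not tied to the test
framework; the alias name is just an example):

```shell
#!/bin/sh
# 1) The '"'"' dance: close the single-quoted string, add a
#    double-quoted single quote, and reopen. Hard to read, but POSIX.
msg='expansion of '"'"'loop-internal-1'"'"' does not terminate'
echo "$msg"

# 2) Stash the quote character in a variable once and interpolate it.
sq="'"
echo "expansion of ${sq}loop-internal-1${sq} does not terminate"

# Both print: expansion of 'loop-internal-1' does not terminate
```

The variable trick reads much better inside long test bodies, at the
cost of one extra setup line.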
Re: [PATCH] t5551-http-fetch-smart.sh: sort cookies before comparing
On Fri, Sep 07, 2018 at 07:22:05PM -0400, Todd Zullinger wrote:
> With curl-7.61.1 cookies are sorted by creation-time¹. Sort the output
> used in the 'cookies stored in http.cookiefile when http.savecookies
> set' test before comparing it to the expected cookies.
>
> ¹ https://github.com/curl/curl/commit/e2ef8d6fa ("cookies: support
>   creation-time attribute for cookies", 2018-08-28)

According to that commit message, the creation-time sort is only for
cookies of the same length. But it's not clear to me if that just
means on-the-wire, or whether curl always stores by creation-time in
the cookie file.

Either way, though, I guess it wouldn't matter for us as long as we
choose some arbitrary re-ordering for what curl produces (i.e., the
output of `sort`) and then make sure our "expect" output is in the
same order. Which is basically what your patch does. One question,
though:

> diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
> index 771f36f9ff..538656bfef 100755
> --- a/t/t5551-http-fetch-smart.sh
> +++ b/t/t5551-http-fetch-smart.sh
> @@ -215,7 +215,7 @@ test_expect_success 'cookies stored in http.cookiefile when http.savecookies set
> 	git config http.cookiefile cookies.txt &&
> 	git config http.savecookies true &&
> 	git ls-remote $HTTPD_URL/smart_cookies/repo.git master &&
> -	tail -3 cookies.txt >cookies_tail.txt &&
> +	tail -3 cookies.txt | sort >cookies_tail.txt &&
> 	test_cmp expect_cookies.txt cookies_tail.txt
> '

We pick the bottom 3 before sorting. How do we know those are the
three we want to see?

...Ah, OK. The lines we are skipping are not actually cookies at all,
but just header cruft. I wonder if:

    grep "^[^#]" cookies.txt

would be a better way of doing that, but that is certainly not
something new. So this fix looks fine. It might be worth a comment
above the creation of expect_cookies.txt to mention it must be in
sorted order (of course anybody modifying it would see a test
failure).

> The in-development version of Fedora updated to the recently
> released curl-7.61.1 in the past few days. This isn't
> breakage from the 2.19.0 cycle, but if the fix looks good to
> everyone it would be nice to include it. That way other
> distributions and users who update git and curl to the most
> recent releases won't run into this test failure.
>
> I tested this against Fedora 30 (curl-7.61.1) as well as
> previous releases from RHEL/CentOS 6/7 (7.19.7/7.29.0) and
> Fedora 27/28/29 (7.55.1/7.59.0/7.61.0).

You're pretty late in the 2.19 cycle, since the release is tentatively
scheduled for Sunday. Though since this is just touching the test
script, and since it looks Obviously Correct, I'm not opposed.

-Peff
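The difference between the two selection approaches discussed above is
easy to see on a made-up cookie file (the paths and values here are
placeholders, not the ones from the t5551 test):

```shell
#!/bin/sh
# A Netscape-format cookie file: '#' comment lines, possibly a blank
# separator line, then one tab-separated line per cookie.
cat >cookies.txt <<\EOF
# Netscape HTTP Cookie File
# This file was generated by libcurl!

127.0.0.1	FALSE	/repo/	FALSE	0	name	value
127.0.0.1	FALSE	/	FALSE	0	othername	othervalue
EOF

# Position-dependent: grabs the last 3 lines, which here includes the
# blank separator line, and silently breaks if the header grows.
echo "== tail -3:"
tail -3 cookies.txt

# Content-dependent: keeps exactly the lines that start with a
# non-'#' character, i.e. the cookies, regardless of header length.
echo "== grep:"
grep "^[^#]" cookies.txt

rm cookies.txt
```

Note that a blank line does not match `^[^#]` either (it has no first
character), so the grep form yields only real cookie lines.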
Re: [PATCH v3 4/4] wt-status.c: Set the committable flag in the collect phase.
On Friday, September 7, 2018 3:31:55 PM MST you wrote:
> Junio C Hamano writes:
> The patch is mostly for illustration of the idea.
>
> The result seems to compile and pass the test suite, but I haven't
> carefully thought about what else I may be breaking with this
> mechanical change. For example, I noticed that both of the old
> callsites of wt_status_get_state() have free() of a few fiedls in
> the structure, and I kept the code as close to the original, but I
> suspect they should not be freed there in the functions in the
> "print" phase, but rather the caller of the "collect" and "print"
> should be made responsible for deciding when to dispose the entire
> wt_status (and wt_status_state as part of it). This illustration
> patch does not address that kind of details (yet).

If we use this as the basis of a follow-on patch, how do I handle
credit? You obviously wrote this patch and I did not. So how are the
mechanics of that normally done?

Thanks for the patch; I will work with it.

sps
[PATCH] Revert "Merge branch 'sb/submodule-core-worktree'" (was Re: Old submodules broken in 2.19rc1 and 2.19rc2)
Subject: Revert "Merge branch 'sb/submodule-core-worktree'"

This reverts commit 7e25437d35a70791b345872af202eabfb3e1a8bc, reversing
changes made to 00624d608cc69bd62801c93e74d1ea7a7ddd6598.

v2.19.0-rc0~165^2~1 (submodule: ensure core.worktree is set after
update, 2018-06-18) assumes an "absorbed" submodule layout, where the
submodule's Git directory is in the superproject's .git/modules/
directory and .git in the submodule worktree is a .git file pointing
there. In particular, it uses $GIT_DIR/modules/$name to find the
submodule to find out whether it already has core.worktree set, and it
uses connect_work_tree_and_git_dir if not, resulting in

	fatal: could not open sub/.git for writing

The context behind that patch: v2.19.0-rc0~165^2~2 (submodule: unset
core.worktree if no working tree is present, 2018-06-12) unsets
core.worktree when running commands like
"git checkout --recurse-submodules" to switch to a branch without the
submodule. If a user then uses "git checkout --no-recurse-submodules"
to switch back to a branch with the submodule and runs "git submodule
update", this patch is needed to ensure that commands using the
submodule directly are aware of the path to the worktree.

It is late in the release cycle, so revert the whole 3-patch series.
We can try again later for 2.20.

Reported-by: Allan Sandfeld Jensen
Helped-by: Stefan Beller
Signed-off-by: Jonathan Nieder
---
Stefan Beller wrote:
> Jonathan Nieder wrote:

>> I think we
>> should revert e98317508c0 in "master" (for 2.19) and keep making use
>> of that 'second try' in "next" (for 2.20).
>
> Actually I'd rather revert the whole topic leading up to
> 7e25437d35a (Merge branch 'sb/submodule-core-worktree', 2018-07-18)
> as the last patch in there doesn't work well without e98317508c0 IIRC.
>
> And having only the first patch would bring an inconsistent state as
> then different commands behave differently w.r.t. setting core.worktree.

Like this (generated using "git revert -m1")?
builtin/submodule--helper.c | 26 -- git-submodule.sh| 5 - submodule.c | 14 -- submodule.h | 2 -- t/lib-submodule-update.sh | 5 ++--- t/t7400-submodule-basic.sh | 5 - 6 files changed, 2 insertions(+), 55 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index b56028ba9d..f6fb8991f3 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -1123,8 +1123,6 @@ static void deinit_submodule(const char *path, const char *prefix, if (!(flags & OPT_QUIET)) printf(format, displaypath); - submodule_unset_core_worktree(sub); - strbuf_release(&sb_rm); } @@ -2005,29 +2003,6 @@ static int check_name(int argc, const char **argv, const char *prefix) return 0; } -static int connect_gitdir_workingtree(int argc, const char **argv, const char *prefix) -{ - struct strbuf sb = STRBUF_INIT; - const char *name, *path; - char *sm_gitdir; - - if (argc != 3) - BUG("submodule--helper connect-gitdir-workingtree "); - - name = argv[1]; - path = argv[2]; - - strbuf_addf(&sb, "%s/modules/%s", get_git_dir(), name); - sm_gitdir = absolute_pathdup(sb.buf); - - connect_work_tree_and_git_dir(path, sm_gitdir, 0); - - strbuf_release(&sb); - free(sm_gitdir); - - return 0; -} - #define SUPPORT_SUPER_PREFIX (1<<0) struct cmd_struct { @@ -2041,7 +2016,6 @@ static struct cmd_struct commands[] = { {"name", module_name, 0}, {"clone", module_clone, 0}, {"update-clone", update_clone, 0}, - {"connect-gitdir-workingtree", connect_gitdir_workingtree, 0}, {"relative-path", resolve_relative_path, 0}, {"resolve-relative-url", resolve_relative_url, 0}, {"resolve-relative-url-test", resolve_relative_url_test, 0}, diff --git a/git-submodule.sh b/git-submodule.sh index f7fd80345c..1cb2c0a31b 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -580,11 +580,6 @@ cmd_update() die "$(eval_gettext "Unable to find current \${remote_name}/\${branch} revision in submodule path '\$sm_path'")" fi - if ! 
$(git config -f "$(git rev-parse --git-common-dir)/modules/$name/config" core.worktree) 2>/dev/null - then - git submodule--helper connect-gitdir-workingtree "$name" "$sm_path" - fi - if test "$subsha1" != "$sha1" || test -n "$force" then subforce=$force diff --git a/submodule.c b/submodule.c index 50cbf5f13e..a2b266fbfa 100644 --- a/submodule.c +++ b/submodule.c @@ -1534,18 +1534,6 @@ int bad_to_remove_submodule(const char *path, unsigned flags) return ret; } -void submodule_unset_core_worktree(const struct submodule *sub) -{ - char *config_pa
Re: [PATCH v2] http-backend: allow empty CONTENT_LENGTH
Hi,

Max Kirillov wrote:
> According to RFC3875, empty environment variable is equivalent to unset,
> and for CONTENT_LENGTH it should mean zero body to read.
>
> However, unset CONTENT_LENGTH is also used for chunked encoding to indicate
> reading until EOF. At least, the test "large fetch-pack requests can be split
> across POSTs" from t5551 starts faliing, if unset or empty CONTENT_LENGTH is
> treated as zero length body. So keep the existing behavior as much as
> possible.
>
> Add a test for the case.
>
> Reported-By: Jelmer Vernooij
> Signed-off-by: Max Kirillov
> ---
> Added the "reported-by" and explained inline the reason to keep existing
> behavior

Lovely, thanks. To me, "keep the existing behavior as much as
possible" isn't comforting because it doesn't tell me *which* existing
behavior. Fortunately the patch itself is comforting: it makes us
treat "" the same way as unset, which is exactly what the RFC
requires.

So I'm happy with this version. Thanks for your patient work.

Reviewed-by: Jonathan Nieder
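In shell terms, the rule the review settles on looks like this — a
conceptual sketch of the "empty equals unset" reading of RFC 3875, not
the actual http-backend code (which is C); describe_body is a made-up
helper:

```shell
#!/bin/sh
# Treat an empty CONTENT_LENGTH exactly like an unset one: in both
# cases no length was given, so the body is read until EOF (which is
# what the chunked-encoding case relies on).
describe_body() {
	if test -z "${CONTENT_LENGTH}"
	then
		echo "no length given: read body until EOF"
	else
		echo "read exactly $CONTENT_LENGTH bytes"
	fi
}

( unset CONTENT_LENGTH; describe_body )	# unset
( CONTENT_LENGTH=""; describe_body )	# empty -> same as unset
( CONTENT_LENGTH=1234; describe_body )
```

The first two calls take the same branch, which is the property the
t5551 "large fetch-pack requests can be split across POSTs" test
depends on.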
Re: 2.19.0.rc2.windows.1: stash fails with dirty submodule
Hi,

Thomas Braun wrote:
> I'm using git with stash and rebase builtins.
>
> $ git --version --build-options
>
> git version 2.19.0.rc2.windows.1
[...]
> mkdir test
> cd test
> git init
> echo 1 > file
> git add file
> git commit file -m "message"
> git submodule add ./ mysubmod
> git commit -m "Add submodule"
> echo 2 > mysubmod/file
> git checkout -b mybranch
> git rebase -i --autosquash master
[...]
> fatal: Unexpected stash response: ''
>
> and that used to work with older git versions.

Thanks for reporting. I'm cc-ing Dscho, who has been looking for
reports of issues with the new experimental stash and rebase code[2].

Thanks,
Jonathan

> [1]: https://github.com/git-for-windows/git/issues/1820#issuecomment-419411808

[2] https://public-inbox.org/git/nycvar.qro.7.76.6.1808241320540...@tvgsbejvaqbjf.bet/
Re: [PATCH] Revert "Merge branch 'sb/submodule-core-worktree'" (was Re: Old submodules broken in 2.19rc1 and 2.19rc2)
Jonathan Nieder writes: > It is late in the release cycle, so revert the whole 3-patch series. > We can try again later for 2.20. > > Reported-by: Allan Sandfeld Jensen > Helped-by: Stefan Beller > Signed-off-by: Jonathan Nieder > --- > Stefan Beller wrote: >> Jonathan Nieder wrote: > >>> I think we >>> should revert e98317508c0 in "master" (for 2.19) and keep making use >>> of that 'second try' in "next" (for 2.20). >> >> Actually I'd rather revert the whole topic leading up to >> 7e25437d35a (Merge branch 'sb/submodule-core-worktree', 2018-07-18) >> as the last patch in there doesn't work well without e98317508c0 IIRC. >> >> And having only the first patch would bring an inconsistent state as >> then different commands behave differently w.r.t. setting core.worktree. > > Like this (generated using "git revert -m1)? OK. Thanks for taking care of it. > > builtin/submodule--helper.c | 26 -- > git-submodule.sh| 5 - > submodule.c | 14 -- > submodule.h | 2 -- > t/lib-submodule-update.sh | 5 ++--- > t/t7400-submodule-basic.sh | 5 - > 6 files changed, 2 insertions(+), 55 deletions(-) > > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c > index b56028ba9d..f6fb8991f3 100644 > --- a/builtin/submodule--helper.c > +++ b/builtin/submodule--helper.c > @@ -1123,8 +1123,6 @@ static void deinit_submodule(const char *path, const > char *prefix, > if (!(flags & OPT_QUIET)) > printf(format, displaypath); > > - submodule_unset_core_worktree(sub); > - > strbuf_release(&sb_rm); > } > > @@ -2005,29 +2003,6 @@ static int check_name(int argc, const char **argv, > const char *prefix) > return 0; > } > > -static int connect_gitdir_workingtree(int argc, const char **argv, const > char *prefix) > -{ > - struct strbuf sb = STRBUF_INIT; > - const char *name, *path; > - char *sm_gitdir; > - > - if (argc != 3) > - BUG("submodule--helper connect-gitdir-workingtree > "); > - > - name = argv[1]; > - path = argv[2]; > - > - strbuf_addf(&sb, "%s/modules/%s", get_git_dir(), name); 
> - sm_gitdir = absolute_pathdup(sb.buf); > - > - connect_work_tree_and_git_dir(path, sm_gitdir, 0); > - > - strbuf_release(&sb); > - free(sm_gitdir); > - > - return 0; > -} > - > #define SUPPORT_SUPER_PREFIX (1<<0) > > struct cmd_struct { > @@ -2041,7 +2016,6 @@ static struct cmd_struct commands[] = { > {"name", module_name, 0}, > {"clone", module_clone, 0}, > {"update-clone", update_clone, 0}, > - {"connect-gitdir-workingtree", connect_gitdir_workingtree, 0}, > {"relative-path", resolve_relative_path, 0}, > {"resolve-relative-url", resolve_relative_url, 0}, > {"resolve-relative-url-test", resolve_relative_url_test, 0}, > diff --git a/git-submodule.sh b/git-submodule.sh > index f7fd80345c..1cb2c0a31b 100755 > --- a/git-submodule.sh > +++ b/git-submodule.sh > @@ -580,11 +580,6 @@ cmd_update() > die "$(eval_gettext "Unable to find current > \${remote_name}/\${branch} revision in submodule path '\$sm_path'")" > fi > > - if ! $(git config -f "$(git rev-parse > --git-common-dir)/modules/$name/config" core.worktree) 2>/dev/null > - then > - git submodule--helper connect-gitdir-workingtree > "$name" "$sm_path" > - fi > - > if test "$subsha1" != "$sha1" || test -n "$force" > then > subforce=$force > diff --git a/submodule.c b/submodule.c > index 50cbf5f13e..a2b266fbfa 100644 > --- a/submodule.c > +++ b/submodule.c > @@ -1534,18 +1534,6 @@ int bad_to_remove_submodule(const char *path, unsigned > flags) > return ret; > } > > -void submodule_unset_core_worktree(const struct submodule *sub) > -{ > - char *config_path = xstrfmt("%s/modules/%s/config", > - get_git_common_dir(), sub->name); > - > - if (git_config_set_in_file_gently(config_path, "core.worktree", NULL)) > - warning(_("Could not unset core.worktree setting in submodule > '%s'"), > - sub->path); > - > - free(config_path); > -} > - > static const char *get_super_prefix_or_empty(void) > { > const char *s = get_super_prefix(); > @@ -1711,8 +1699,6 @@ int submodule_move_head(const char *path, > > if (is_empty_dir(path)) 
> rmdir_or_warn(path); > - > - submodule_unset_core_worktree(sub); > } > } > out: > diff --git a/submodule.h b/submodule.h > index 7d476cefa7..e452919aa4 100644 > --- a/submodule.h > +++ b/submodule.h > @@ -127,8 +127,6 @@ int submodule_move_head(const char *path, > const char *new_head, > unsigned flags); > > -void submodule_unset_core_worktree(const struct submodule *sub);
Re: [PATCH] t5551-http-fetch-smart.sh: sort cookies before comparing
Jeff King wrote: > On Fri, Sep 07, 2018 at 07:22:05PM -0400, Todd Zullinger wrote: > >> With curl-7.61.1 cookies are sorted by creation-time¹. Sort the output >> used in the 'cookies stored in http.cookiefile when http.savecookies >> set' test before comparing it to the expected cookies. >> >> ¹ https://github.com/curl/curl/commit/e2ef8d6fa ("cookies: support >> creation-time attribute for cookies", 2018-08-28) > > According to that commit message, the creation-time sort is only for > cookies of the same length. But it's not clear to me if that just means > on-the-wire, and curl always stores by creation-time in the cookie file. Yeah, I didn't dig into the curl code deeply to try to understand it. I did test with only the curl package downgraded to 7.61.0 to confirm the test worked without sorting. And I saw that the curl commit updated existing tests to sort the test data. > Either way, though, I guess it wouldn't matter for us as long as we > choose some arbitrary re-ordering for what curl produces (i.e., the > output of `sort`) and then make sure our "expect" output is in the same > order. Which is basically what your patch does. One question, though: > >> diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh >> index 771f36f9ff..538656bfef 100755 >> --- a/t/t5551-http-fetch-smart.sh >> +++ b/t/t5551-http-fetch-smart.sh >> @@ -215,7 +215,7 @@ test_expect_success 'cookies stored in http.cookiefile >> when http.savecookies set >> git config http.cookiefile cookies.txt && >> git config http.savecookies true && >> git ls-remote $HTTPD_URL/smart_cookies/repo.git master && >> -tail -3 cookies.txt >cookies_tail.txt && >> +tail -3 cookies.txt | sort >cookies_tail.txt && >> test_cmp expect_cookies.txt cookies_tail.txt >> ' > > We pick the bottom 3 before sorting. How do we know those are the three > we want to see? > > ...Ah, OK. The lines we are skipping are not actually cookies at all, > but just header cruft.
I wonder if: > > grep "^[^#]" cookies.txt > > would be a better way of doing that, but that is certainly not something > new. > > So this fix looks fine. It might be worth a comment above the creation > of expect_cookies.txt to mention it must be in sorted order (of course > anybody modifying it would see a test failure). I thought about running the expect_cookies.txt file through sort as well. That would ensure that both files were using the same sorting. Whether that's needed on any platform now, I don't know. Maybe that would be a useful way to keep future edits to the expect_cookies.txt file from tripping up the person making them? I thought there might be a test function to sort the output, but I was (incorrectly) thinking of check_access_log() which Gábor added in e8b3b2e275 ("t/lib-httpd: avoid occasional failures when checking access.log", 2018-07-12). Perhaps it would be useful to have a test_cmp_sorted() to do the simple dance of sorting both the actual & expected output. I haven't looked through the tests to see how often such a function might be useful.
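For what it's worth, that hypothetical test_cmp_sorted() might look something like the sketch below. In git's test suite it would presumably call test_cmp; plain diff -u stands in for it here so the sketch is self-contained.

```shell
# Sketch of the hypothetical test_cmp_sorted() helper: sort both the
# expected and actual files so the comparison does not depend on the
# order in which curl (or anything else) happens to emit lines.
test_cmp_sorted () {
	sort <"$1" >"$1.sorted" &&
	sort <"$2" >"$2.sorted" &&
	diff -u "$1.sorted" "$2.sorted"
}
```

The cookie test could then say `test_cmp_sorted expect_cookies.txt cookies_tail.txt`, and future edits to either file would not need to keep them in any particular order.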
It's hard to say whether there are many other users/packagers who might upgrade both git and curl and run into this. So it may not be worth even a small risk of making the change at this point. On the other hand, the change only affects one test and may be safe enough to apply. I'll leave that choice in the capable hands of our maintainer and the good folks here. Thanks for a thoughtful review, as always. -- Todd ~~ That which seems the height of absurdity in one generation often becomes the height of wisdom in the next. -- John Stuart Mill (1806-1873)
Re: [PATCH] http-backend: allow empty CONTENT_LENGTH
On Fri, Sep 07, 2018 at 02:49:23AM -0700, Junio C Hamano wrote: > Max Kirillov writes: > >> Actually, another reason for the latest issue was that CONTENT_LENGTH >> is parsed for GET requests at all. It should be parsed only for POST >> requests, or, rather, only for upload-pack and receive-pack requests. > > Not really. The layered design of the HTTP protocol means that any > request type can have non-empty body, but request types for which > no semantics of the body is defined must ignore what is in the body, > which in turn means we need to parse and pay attention to the > content length etc. to find the end of the body, if only to ignore > it. I don't think it is git's job to police web server implementations, especially considering that there is a gap between the letter of the RFC and actual behavior. Anyway, it only runs the check for the "*/info/refs" GET request, which ends up in get_info_refs(). Other GET requests do not check CONTENT_LENGTH. Also, the version of the service which is started from get_info_refs() does not consume input (I think, actually, the "--stateless-rpc" argument is not needed there). > In any case, hopefully we can fix this before the final, as this is > a regression introduced during this cycle? Yes, I'm working on it.
[PATCH v3] http-backend: allow empty CONTENT_LENGTH
Before 817f7dc223, the CONTENT_LENGTH variable was never considered; http-backend just read the request body from standard input until EOF whenever it, or a command started by it, needed it. Then it was discovered that some HTTP servers do not close standard input, instead expecting CGI scripts to obey CONTENT_LENGTH. In 817f7dc223, the behavior was changed to consider the CONTENT_LENGTH variable when it is set. The case of unset CONTENT_LENGTH was kept to mean "read until EOF", which is not compliant with RFC 3875 (which treats it as an empty body) but is used in practice when a client submits a big request with chunked encoding. The case of an empty CONTENT_LENGTH slipped through these conditions. Apparently it is used for GET requests, and RFC 3875 does specify that it also means an empty body. The current implementation, however, fails to parse it and aborts the request. Fix the case of an empty CONTENT_LENGTH to also be treated as "read until EOF". It does not actually matter what it means, because the body is never read anyway; it just should not cause a parse error. Add a test for the case. Reported-By: Jelmer Vernooij Signed-off-by: Max Kirillov --- Provided a more thorough message; also fixed the test (it did not actually test the error before). There will be more versions later, at least the one suggested by Jeff. PS: did I write v2? It should be v3, of course!
http-backend.c | 2 +- t/t5562-http-backend-content-length.sh | 11 +++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/http-backend.c b/http-backend.c index e88d29f62b..a1230d7ead 100644 --- a/http-backend.c +++ b/http-backend.c @@ -353,7 +353,7 @@ static ssize_t get_content_length(void) ssize_t val = -1; const char *str = getenv("CONTENT_LENGTH"); - if (str && !git_parse_ssize_t(str, &val)) + if (str && *str && !git_parse_ssize_t(str, &val)) die("failed to parse CONTENT_LENGTH: %s", str); return val; } diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh index 057dcb85d6..b28c3c4765 100755 --- a/t/t5562-http-backend-content-length.sh +++ b/t/t5562-http-backend-content-length.sh @@ -152,4 +152,15 @@ test_expect_success 'CONTENT_LENGTH overflow ssite_t' ' grep "fatal:.*CONTENT_LENGTH" err ' +test_expect_success 'empty CONTENT_LENGTH' ' + env \ + QUERY_STRING="/repo.git/info/refs?service=git-receive-pack" \ + PATH_TRANSLATED="$PWD"/.git/info/refs \ + GIT_HTTP_EXPORT_ALL=TRUE \ + REQUEST_METHOD=GET \ + CONTENT_LENGTH="" \ + git http-backend act.out 2>act.err && + verify_http_result "200 OK" +' + test_done -- 2.17.0.1185.g782057d875
[PATCH v2] http-backend: allow empty CONTENT_LENGTH
Before 817f7dc223, the CONTENT_LENGTH variable was never considered; http-backend just read the request body from standard input until EOF whenever it, or a command started by it, needed it. Then it was discovered that some HTTP servers do not close standard input, instead expecting CGI scripts to obey CONTENT_LENGTH. In 817f7dc223, the behavior was changed to consider the CONTENT_LENGTH variable when it is set. The case of unset CONTENT_LENGTH was kept to mean "read until EOF", which is not compliant with RFC 3875 (which treats it as an empty body) but is used in practice when a client submits a big request with chunked encoding. The case of an empty CONTENT_LENGTH slipped through these conditions. Apparently it is used for GET requests, and RFC 3875 does specify that it also means an empty body. The current implementation, however, fails to parse it and aborts the request. Fix the case of an empty CONTENT_LENGTH to also be treated as "read until EOF". It does not actually matter what it means, because the body is never read anyway; it just should not cause a parse error. Add a test for the case.
Reported-By: Jelmer Vernooij Signed-off-by: Max Kirillov --- Provided a more thorough message; also fixed the test (it did not actually test the error before). There will be more versions later, at least the one suggested by Jeff. http-backend.c | 2 +- t/t5562-http-backend-content-length.sh | 11 +++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/http-backend.c b/http-backend.c index e88d29f62b..a1230d7ead 100644 --- a/http-backend.c +++ b/http-backend.c @@ -353,7 +353,7 @@ static ssize_t get_content_length(void) ssize_t val = -1; const char *str = getenv("CONTENT_LENGTH"); - if (str && !git_parse_ssize_t(str, &val)) + if (str && *str && !git_parse_ssize_t(str, &val)) die("failed to parse CONTENT_LENGTH: %s", str); return val; } diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh index 057dcb85d6..b28c3c4765 100755 --- a/t/t5562-http-backend-content-length.sh +++ b/t/t5562-http-backend-content-length.sh @@ -152,4 +152,15 @@ test_expect_success 'CONTENT_LENGTH overflow ssite_t' ' grep "fatal:.*CONTENT_LENGTH" err ' +test_expect_success 'empty CONTENT_LENGTH' ' + env \ + QUERY_STRING="/repo.git/info/refs?service=git-receive-pack" \ + PATH_TRANSLATED="$PWD"/.git/info/refs \ + GIT_HTTP_EXPORT_ALL=TRUE \ + REQUEST_METHOD=GET \ + CONTENT_LENGTH="" \ + git http-backend act.out 2>act.err && + verify_http_result "200 OK" +' + test_done -- 2.17.0.1185.g782057d875
Re: Temporary git files for the gitdir created on a separate drive in workdir
On Fri, Sep 7, 2018 at 6:48 PM Junio C Hamano wrote: > > Hultqvist writes: > > > Considering that the gitdir could be located on a different drive than > > the workdir wouldn't it make more sense to create the temporary files > > in a subdirectory inside the gitdir rather than in the workdir? > > I do not think we intend to create temporary files, whose final > destination is somewhere under $GIT_DIR/, in any working tree; > rather, I think we try to create them inside $GIT_DIR (or possibly > if the destination is a file in a subdirectory of $GIT_DIR, then in > the same subdirectory). What you are seeing definitely smells like > a bug in the worktree code, perhaps getting confused by the fact > that the full path to these places looks "unusual" by starting with a > single letter followed by a colon (IOW, this may manifest only in > the Windows port). I agree. Auditing the setup code did not reveal anything, though. Our code should recognize these unusual Windows paths as absolute, and while I spotted an incorrect use of '/' (instead of is_dir_sep) it does not explain the problem here. Hultqvist, if you set the environment variable GIT_TRACE_SETUP to 1 and run "git status" in G:\Test1, what does it say? -- Duy
Re: [PATCH v3 2/4] eoie: add End of Index Entry (EOIE) extension
On Fri, 7 Sep 2018 at 22:24, Ben Peart wrote: > > Ben Peart writes: > >> - 160-bit SHA-1 over the extension types and their sizes (but not > >> their contents). E.g. if we have "TREE" extension that is N-bytes > >> long, "REUC" extension that is M-bytes long, followed by "EOIE", > >> then the hash would be: > The purpose of the SHA isn't to detect disk corruption (we already have > a SHA for the entire index that can serve that purpose) but to help > ensure that this was actually a valid EOIE extension and not a lucky > random set of bytes. [...] > >> +#define EOIE_SIZE 24 /* <4-byte offset> + <20-byte hash> */ > >> +the_hash_algo->init_fn(&c); > >> +while (src_offset < mmap_size - the_hash_algo->rawsz - > >> EOIE_SIZE_WITH_HEADER) { [...] > >> +the_hash_algo->final_fn(hash, &c); > >> +if (hashcmp(hash, (unsigned char *)index)) > >> +return 0; > >> + > >> +/* Validate that the extension offsets returned us back to the eoie > >> extension. */ > >> +if (src_offset != mmap_size - the_hash_algo->rawsz - > >> EOIE_SIZE_WITH_HEADER) > >> +return 0; Besides the issue you and Junio discussed with "should we document this as being SHA-1 or NewHash" (or "the hash algo"), it seems to me that this implementation is living somewhere between using SHA-1 and "the hash algo". The hashing uses `the_hash_algo`, but the hash size is hardcoded at 20 bytes. Maybe it all works out, e.g., so that when someone (brian) merges a NewHash and runs the testsuite, this will fail consistently and in a safe way. But I wonder if it would be too hard to avoid the hardcoded 24 already now. Martin
Re: [PATCH 0/4] un-breaking pack-objects with bitmaps
On Sat, Sep 01 2018, Jeff King wrote: > On Fri, Aug 31, 2018 at 06:55:58PM -0400, Jeff King wrote: > >> On Fri, Aug 31, 2018 at 05:23:17PM +0200, Ævar Arnfjörð Bjarmason wrote: >> >> > On Tue, Aug 21 2018, Jeff King wrote: >> > >> > > +int bitmap_has_sha1_in_uninteresting(struct bitmap_index *bitmap_git, >> > > + const unsigned char *sha1) >> > > +{ >> > > +int pos; >> > > + >> > > +if (!bitmap_git) >> > > +return 0; /* no bitmap loaded */ >> > > +if (!bitmap_git->result) >> > > +BUG("failed to perform bitmap walk before querying"); >> > >> > Some part of what calls this completely breaks pushing from the "next" >> > branch when you have local bitmaps (we *really* should have some tests >> > for this...). >> >> Yikes, thanks for reporting. I agree we need better tests here. > > OK, here is the fix. Since the problem is in 'next', this is done as a > patch on top of jk/pack-delta-reuse-with-bitmap. But since we're set to > rewind 'next' post-release anyway, we could squash it directly into > 30cdc33fba from the original series. That would help later bisections > from running into it, which may be worth it as it's a pretty severe > breakage. Or maybe not: Junio: Just a reminder that next is still broken with this, and I see e.g. the Debian "experimental" package has the bug but not the fix at this point. I'm just reverting jk/pack-delta-reuse-with-bitmap out of next when building my own package of git, but I think this really should be fixed in that branch, either by merging the fix down or by reverting the original series out of next. I think just merging the fix down makes sense, but I have no strong opinion on it.