Re: ordered string-list considered harmful, was Re: [PATCH v3] Allow aliases that include other aliases
On Fri, Sep 07 2018, Jonathan Nieder wrote:

> Jeff King wrote:
>
>> I don't see any point in generating a sorted list and _then_ making an
>> auxiliary hashmap. My idea was that if you're using a sorted string-list
>> for lookup, then you can replace the whole thing with a hash (inserting
>> as you go, rather than sorting at the end).
>
> What if I'm sorting a string list in preparation for emitting a sorted
> list, and I *also* want to perform lookups in that same list? In
> other words:

If this turns out to be a common use-case perhaps the easiest way to
support that would be to make the hashmap (optionally?) ordered, as Ruby
1.9 did with their hash implementation:
https://www.igvita.com/2009/02/04/ruby-19-internals-ordered-hash/
Re: ordered string-list considered harmful, was Re: [PATCH v3] Allow aliases that include other aliases
Ævar Arnfjörð Bjarmason wrote:
> On Fri, Sep 07 2018, Jonathan Nieder wrote:
>> Jeff King wrote:
>>> I don't see any point in generating a sorted list and _then_ making an
>>> auxiliary hashmap. My idea was that if you're using a sorted string-list
>>> for lookup, then you can replace the whole thing with a hash (inserting
>>> as you go, rather than sorting at the end).
>>
>> What if I'm sorting a string list in preparation for emitting a sorted
>> list, and I *also* want to perform lookups in that same list? In
>> other words:
>
> If this turns out to be a common use-case perhaps the easiest way to
> support that would be to make the hashmap (optionally?) ordered, as Ruby
> 1.9 did with their hash implementation:
> https://www.igvita.com/2009/02/04/ruby-19-internals-ordered-hash/

That's about recording the order of insertion. I'm talking about
something much simpler: sorting an array (as preparation for emitting
it) and binary searching to find an entry in that array.
Re: [PATCH 14/14] format-patch: allow --range-diff to apply to a lone-patch
On Wed, Jul 25, 2018 at 5:07 PM Junio C Hamano wrote:
> Eric Sunshine writes:
>> +	if (cmit_fmt_is_mail(ctx.fmt) && opt->rdiff1) {
>> +		struct diff_queue_struct dq;
>> +
>> +		memcpy(&dq, &diff_queued_diff, sizeof(diff_queued_diff));
>> +		DIFF_QUEUE_CLEAR(&diff_queued_diff);
>> +
>> +		next_commentary_block(opt, NULL);
>> +		fprintf_ln(opt->diffopt.file, "%s", opt->rdiff_title);
>> +		show_range_diff(opt->rdiff1, opt->rdiff2,
>> +				opt->creation_factor, 1, &opt->diffopt);
>> +
>> +		memcpy(&diff_queued_diff, &dq, sizeof(diff_queued_diff));
>> +	}
>> }
>
> This essentially repeats what is already done for "interdiff".

Yes, the two blocks are very similar, although they access different
members of 'rev_info' and call different functions to perform the actual
diff. I explored ways of avoiding the repeated boilerplate (using macros
or passing a function pointer to a driver function containing the
boilerplate), but the end result was uglier and harder to understand due
to the added abstraction. Introducing next_commentary_block()[1] reduced
the repetition a bit.

> Does the global diff_queued_diff get cleaned up when show_interdiff()
> and show_range_diff() return, like diff_flush() does? Otherwise we'd
> be leaking the filepairs accumulated in the diff_queued_diff.

Both show_interdiff() and show_range_diff() call diff_flush(). So, the
"temporary" diff_queued_diff set up here does get cleaned up by
diff_flush(), as far as I understand (though this is my first foray into
the diff-generation code, so I may be missing something). And, the
diff_queued_diff which gets "interrupted" by this excursion into
interdiff/range-diff gets cleaned up normally when the interrupted diff
operation completes.

[1]: https://public-inbox.org/git/20180722095717.17912-6-sunsh...@sunshineco.com/
Re: [PATCH 11/14] format-patch: extend --range-diff to accept revision range
On Wed, Jul 25, 2018 at 4:53 PM Junio C Hamano wrote: > Eric Sunshine writes: > > When submitting a revised a patch series, the --range-diff option embeds > > a range-diff in the cover letter showing changes since the previous > > version of the patch series. The argument to --range-diff is a simple > > revision naming the tip of the previous series [...] > > > > However, it fails if the revision ranges of the old and new versions of > > the series are disjoint. To address this shortcoming, extend > > --range-diff to also accept an explicit revision range for the previous > > series. [...] > > > static void infer_range_diff_ranges(struct strbuf *r1, > > { > > - strbuf_addf(r1, "%s..%s", head_oid, prev); > > - strbuf_addf(r2, "%s..%s", prev, head_oid); > > I thought "git range-diff" also took the three-dot notation as a > short-hand but there is no point using that in this application, > which wants the RHS and the LHS range in separate output variables. Correct. > I actually would feel better to see the second range for the normal > case to be computed exactly the same way, i.e. > > int prev_is_range = strstr(prev, ".."); > > if (prev_is_range) > strbuf_addstr(r1, prev); > else > strbuf_addf(r1, "%s..%s", head_oid, prev); > > if (origin) > strbuf_addf(r2, "%s..%s", oid_to_hex(&origin->object.oid, > head_oid); > else if (prev_is_range) > die("cannot figure out the origin of new series"); > else { > warning("falling back to use '%s' as the origin of new > series", prev); > strbuf_addf(r2, "%s..%s", prev, head_oid); > } > > because origin..HEAD is always the set of the "new" series, no > matter what "prev" the user chooses to compare that series against, > when there is a single "origin". I plan to submit a short series as incremental changes atop es/format-patch-{interdiff,rangediff} to address the few review comments[1] (including my own[2]), so I can make this change, as well. 
[1]: https://public-inbox.org/git/CAPig+cSuYUYSPTuKx08wcmQM-G12_-W2T4BS07fA=6grm1b...@mail.gmail.com/T/#rfaf5a563e81b862bd8ed69232040056ab9a86dd8 [2]: https://public-inbox.org/git/CAPig+cSuYUYSPTuKx08wcmQM-G12_-W2T4BS07fA=6grm1b...@mail.gmail.com/
Re: [PATCH] http-backend: allow empty CONTENT_LENGTH
Max Kirillov writes:

> Actually, another reason for the latest issue was that CONTENT_LENGTH
> is parsed for GET requests at all. It should be parsed only for POST
> requests, or, rather, only for upload-pack and receive-pack requests.

Not really. The layered design of the HTTP protocol means that any
request type can have a non-empty body, but request types for which no
semantics of the body is defined must ignore what is in the body. That
in turn means we need to parse and pay attention to the content length
etc. to find the end of the body, if only to ignore it.

In any case, hopefully we can fix this before the final, as this is a
regression introduced during this cycle?
Re: [PATCH] diff: allow --recurse-submodules as a synonym for --submodule
Jonathan Nieder writes: > When I think about it this way, I suspect that (B) will provide a > better experience than (A), so this diff change doesn't seem like a > step in the wrong direction. OK, that's fair.
Old submodules broken in 2.19rc1 and 2.19rc2
Submodules checked out with older versions of git no longer work in the
latest 2.19 releases. A "git submodule update --recursive" command will
fail for each submodule with a line saying:

    fatal: could not open '/.git' for writing: Is a directory

I checked the release notes so far, and they do not say that old
submodules from git 2.16 and earlier are no longer supported, so I
assume it is a bug?

'Allan
Re: [PATCH 01/11] string_list: print_string_list to use trace_printf
Jeff King writes:

> I guess the question is: is this a thing we would want to make available
> to code to leave in all the time? Or is it just for sticking in
> temporarily for a quick dump?
>
> If the former, then I think it needs the early-return at the least (and
> probably _should_ have the key parameter).
>
> But I get the feeling that you really just want the latter, and are only
> grudgingly being pushed into the former by Junio's suggestion.

Well, if that is the case, I'd change "my suggestion" (although I didn't
mean to suggest anything concrete). If this was needed and/or was useful
only during the development of the remainder of the series, and if this
is known to be a half-hearted change that is not meant to be useful as a
general solution, then let's not hold the main part of the series
hostage to this step; rather, we should drop this change and leave it to
another series that is willing to do it right. It would save both the
author's and the reviewers' time if we did so.
Re: [PATCH 1/2] commit-graph write: add progress output
On Tue, Sep 04 2018, Ævar Arnfjörð Bjarmason wrote:

> Before this change the "commit-graph write" command didn't report any
> progress. On my machine this command takes more than 10 seconds to
> write the graph for linux.git, and around 1m30s on the
> 2015-04-03-1M-git.git[1] test repository, which is a test case for
> larger monorepos.

There's a fun issue with this code that I'll fix, but I thought it was
informative to send a mail about. Because the graph verification happens
in the main "git gc" process (as opposed to everything else, which runs
via external commands), all this progress output gets written to
.git/gc.log. Then the next time we do a "gc --auto" we vomit a couple of
KB of progress bar output at the user, since we spot that the gc.log
isn't empty.
2.19.0.rc2.windows.1: stash fails with dirty submodule
Hi,

(reported at [1] already but reposting here as suggested)

I'm using git with stash and rebase builtins.

$ git --version --build-options
git version 2.19.0.rc2.windows.1
cpu: x86_64
built from commit: 425f414f8e04123eacb5597776d6a8de445a8d8b
sizeof-long: 4
sizeof-size_t: 8

With the following recipe

mkdir test
cd test
git init
echo 1 > file
git add file
git commit file -m "message"
git submodule add ./ mysubmod
git commit -m "Add submodule"
echo 2 > mysubmod/file
git checkout -b mybranch
git rebase -i --autosquash master

I get

Initialized empty Git repository in E:/projekte/test/.git/
[master (root-commit) 97c0108] message
 1 file changed, 1 insertion(+)
 create mode 100644 file
Cloning into 'E:/projekte/test/mysubmod'...
done.
[master 282a50f] Add submodule
 2 files changed, 4 insertions(+)
 create mode 100644 .gitmodules
 create mode 16 mysubmod
Switched to a new branch 'mybranch'
M mysubmod
fatal: Unexpected stash response: ''

and that used to work with older git versions.

Thanks for reading,
Thomas

[1]: https://github.com/git-for-windows/git/issues/1820#issuecomment-419411808
Re: [PATCH 09/11] multi-pack-index: verify object offsets
On 9/6/2018 8:34 PM, Eric Sunshine wrote:
> On Wed, Sep 5, 2018 at 10:46 AM Derrick Stolee via GitGitGadget wrote:
>> Replace the BUG() statement with a die() statement, now that we may
>> hit a bad pack-int-id during a 'verify' command on a corrupt
>> multi-pack-index, and it is covered by a test.
>>
>> Signed-off-by: Derrick Stolee
>> ---
>> diff --git a/midx.c b/midx.c
>> @@ -197,7 +197,7 @@ int prepare_midx_pack(struct multi_pack_index *m, uint32_t pack_int_id)
>>  	if (pack_int_id >= m->num_packs)
>> -		BUG("bad pack-int-id");
>> +		die(_("bad pack-int-id"));
>
> For someone trying to diagnose this issue, would it be helpful to know
> (that is, print out) the values of pack_int_id and num_packs?

Can do! Thanks.
Re: [PATCH 1/2] commit-graph write: add progress output
On 9/7/2018 8:40 AM, Ævar Arnfjörð Bjarmason wrote:
> On Tue, Sep 04 2018, Ævar Arnfjörð Bjarmason wrote:
>> Before this change the "commit-graph write" command didn't report any
>> progress. On my machine this command takes more than 10 seconds to
>> write the graph for linux.git, and around 1m30s on the
>> 2015-04-03-1M-git.git[1] test repository, which is a test case for
>> larger monorepos.
>
> There's a fun issue with this code that I'll fix, but thought was
> informative to send a mail about. Because the graph verification
> happens in the main "git gc" process, as opposed to everything else
> via external commands, so all this progress output gets written to
> .git/gc.log. Then next time we do a "gc --auto" we vomit out a couple
> of KB of progress bar output at the user, since spot that the gc.log
> isn't empty.

Good catch! (I do want to clarify that the graph _writing_ happens
during 'git gc' since 'git commit-graph verify' is a different thing.)
Re: [PATCH v3 4/4] read-cache: speed up index load through parallelization
On 9/7/2018 12:16 AM, Torsten Bögershausen wrote: diff --git a/read-cache.c b/read-cache.c index fcc776aaf0..8537a55750 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1941,20 +1941,212 @@ static void *load_index_extensions(void *_data) return NULL; } +/* + * A helper function that will load the specified range of cache entries + * from the memory mapped file and add them to the given index. + */ +static unsigned long load_cache_entry_block(struct index_state *istate, + struct mem_pool *ce_mem_pool, int offset, int nr, void *mmap, + unsigned long start_offset, struct strbuf *previous_name) +{ + int i; + unsigned long src_offset = start_offset; I read an unsigned long here: should that be a size_t instead ? (And probably even everywhere else in this patch) It's a fair question. The pre-patch code had a mix of unsigned long and size_t. Both src_offset and consumed were unsigned long but mmap_size was a size_t. I stuck with that pattern for consistency. While it would be possible to convert everything to size_t as a step to enable index files >4 GB, I have a hard time believing that will be necessary for a very long time and would likely require more substantial changes to enable that to work. 
+ + for (i = offset; i < offset + nr; i++) { + struct ondisk_cache_entry *disk_ce; + struct cache_entry *ce; + unsigned long consumed; + + disk_ce = (struct ondisk_cache_entry *)((char *)mmap + src_offset); + ce = create_from_disk(ce_mem_pool, disk_ce, &consumed, previous_name); + set_index_entry(istate, i, ce); + + src_offset += consumed; + } + return src_offset - start_offset; +} + +static unsigned long load_all_cache_entries(struct index_state *istate, + void *mmap, size_t mmap_size, unsigned long src_offset) +{ + struct strbuf previous_name_buf = STRBUF_INIT, *previous_name; + unsigned long consumed; + + if (istate->version == 4) { + previous_name = &previous_name_buf; + mem_pool_init(&istate->ce_mem_pool, + estimate_cache_size_from_compressed(istate->cache_nr)); + } else { + previous_name = NULL; + mem_pool_init(&istate->ce_mem_pool, + estimate_cache_size(mmap_size, istate->cache_nr)); + } + + consumed = load_cache_entry_block(istate, istate->ce_mem_pool, + 0, istate->cache_nr, mmap, src_offset, previous_name); + strbuf_release(&previous_name_buf); + return consumed; +} + +#ifndef NO_PTHREADS +
Re: ordered string-list considered harmful, was Re: [PATCH v3] Allow aliases that include other aliases
On Thu, Sep 06, 2018 at 11:32:41PM -0700, Jonathan Nieder wrote:
>> I think Stefan pointed out a "case 4" in the other part of the thread:
>> ones where we really care not just about fast lookup, but actual
>> iteration order.
>
> I had assumed that that was the whole point of this data structure.
> Anything else that is using it for lookups should indeed use a hash
> map instead, and I can take my share of blame for missing this kind of
> thing in review.

Keep in mind we didn't have a decent generic hashmap for many years. So
I think string-list got used in its place.

>> I think I like the hashmap way, if the conversion isn't too painful.
>
> If we don't have any callers that actually need the sort-and-lookup
> thing, then yay, let's get rid of it. But I don't actually think of
> this as the hashmap way. It's the get-rid-of-the-unneeded-feature
> way.
>
> In other words, *regardless* of what else we should do, we should
> update any callers that want a hashmap to use a hashmap. Please go
> ahead, even if it doesn't let us simplify the string list API at all.

Great, I think we're on the same page. Thanks!

-Peff
Re: Old submodules broken in 2.19rc1 and 2.19rc2
On Fri, Sep 07, 2018 at 11:52:58AM +0200, Allan Sandfeld Jensen wrote: > Submodules checked out with older versions of git not longer works in the > latest 2.19 releases. A "git submodule update --recursive" command wil fail > for each submodule with a line saying "fatal: could not open > '/.git' for writing> Is a directory. I couldn't reproduce after cloning with v1.6.6.3 (which creates ".git" as a directory in the tree). Is it possible for you to bisect (or perhaps share with us the broken example)? > I checked the release notes so far, and they do not say the old submodules > from git 2.16- were no longer supported, so I assume it is a bug? I don't think it was intentional. +cc Stefan for submodule expertise. -Peff
Re: [PATCH 1/2] commit-graph write: add progress output
On Wed, Sep 05 2018, Derrick Stolee wrote: > On 9/4/2018 6:07 PM, Junio C Hamano wrote: >> Ævar Arnfjörð Bjarmason writes: >> >>> With --stdin-packs we don't show any estimation of how much is left to >>> do. This is because we might be processing more than one pack. We >>> could be less lazy here and show progress, either detect by detecting >>> that we're only processing one pack, or by first looping over the >>> packs to discover how many commits they have. I don't see the point in >> I do not know if there is no point, but if we were to do it, I think >> slurping the list of packs and computing the number of objects is >> not all that bad. > > If you want to do that, I have nothing against it. However, I don't > expect users to use that option directly. That option is used by VFS > for Git to compute the commit-graph in the background after receiving > a pack of commits and trees, but not by 'git gc' which I expect is how > most users will compute commit-graphs. > >>> static void compute_generation_numbers(struct packed_commit_list* commits) >>> { >>> int i; >>> struct commit_list *list = NULL; >>> + struct progress *progress = NULL; >>> + progress = start_progress( >>> + _("Computing commit graph generation numbers"), commits->nr); >>> for (i = 0; i < commits->nr; i++) { >>> + display_progress(progress, i); >>> if (commits->list[i]->generation != GENERATION_NUMBER_INFINITY >>> && >>> commits->list[i]->generation != GENERATION_NUMBER_ZERO) >>> continue; >> I am wondering if the progress call should be moved after this >> conditional continue; would we want to count the entry whose >> generation is already known here? Of course, as we give commits->nr >> as the 100% ceiling, we cannot avoid doing so, but it somehow smells >> wrong. > > If we wanted to be completely right, we would count the commits in the > list that do not have a generation number and report that as the 100% > ceiling. > > Something like the diff below would work. 
I tested it in Linux by > first deleting my commit-graph and running the following: > > stolee@stolee-linux:~/linux$ rm .git/objects/info/commit-graph > stolee@stolee-linux:~/linux$ git rev-parse v4.6 | ~/git/git > commit-graph write --stdin-commits > Annotating commits in commit graph: 1180333, done. > Computing commit graph generation numbers: 100% (590166/590166), done. > stolee@stolee-linux:~/linux$ ~/git/git commit-graph write --reachable > Annotating commits in commit graph: 1564087, done. > Computing commit graph generation numbers: 100% (191590/191590), done. > > -->8-- > > From: Derrick Stolee > Date: Wed, 5 Sep 2018 11:55:42 + > Subject: [PATCH] fixup! commit-graph write: add progress output > > Signed-off-by: Derrick Stolee > --- > commit-graph.c | 15 +++ > 1 file changed, 11 insertions(+), 4 deletions(-) > > diff --git a/commit-graph.c b/commit-graph.c > index 1a02fe019a..b933bc9f00 100644 > --- a/commit-graph.c > +++ b/commit-graph.c > @@ -634,14 +634,20 @@ static void close_reachable(struct > packed_oid_list *oids) > > static void compute_generation_numbers(struct packed_commit_list* commits) > { > - int i; > + int i, count_uncomputed = 0; > struct commit_list *list = NULL; > struct progress *progress = NULL; > > + for (i = 0; i < commits->nr; i++) > + if (commits->list[i]->generation == > GENERATION_NUMBER_INFINITY || > + commits->list[i]->generation == GENERATION_NUMBER_ZERO) > + count_uncomputed++; > + > progress = start_progress( > - _("Computing commit graph generation numbers"), > commits->nr); > + _("Computing commit graph generation numbers"), > count_uncomputed); > + count_uncomputed = 0; > + > for (i = 0; i < commits->nr; i++) { > - display_progress(progress, i); > if (commits->list[i]->generation != > GENERATION_NUMBER_INFINITY && > commits->list[i]->generation != GENERATION_NUMBER_ZERO) > continue; > @@ -670,10 +676,11 @@ static void compute_generation_numbers(struct > packed_commit_list* commits) > > if (current->generation > > 
> GENERATION_NUMBER_MAX)
>                         current->generation = GENERATION_NUMBER_MAX;
> +
> +                display_progress(progress, ++count_uncomputed);
>             }
>         }
>     }
> -   display_progress(progress, i);
>     stop_progress(&progress);
> }

One of the things I was trying to do with this series was to make sure
that whenever we run "git gc" there's always some indication, if you set
gc.writeCommitGraph=true, that it's actually doing work. This modifies
that, which I think is actually fine, just something I wanted to note.
I.e. if you run "git commit-graph write" twice in a row, the second time
will have no output.

Unless, that is, your repo is big enough that some of the delayed timers
kick in. So e.g. on git.git we get no output the second time around, but
do get output the first time around, and on linux.git we always get
output. But in the common case people aren't running this in a loop, and
it's useful to see how many new th
Re: Old submodules broken in 2.19rc1 and 2.19rc2
On Freitag, 7. September 2018 17:03:27 CEST Jeff King wrote:
> On Fri, Sep 07, 2018 at 11:52:58AM +0200, Allan Sandfeld Jensen wrote:
>> Submodules checked out with older versions of git not longer works in
>> the latest 2.19 releases. A "git submodule update --recursive" command
>> wil fail for each submodule with a line saying "fatal: could not open
>> '/.git' for writing> Is a directory.
>
> I couldn't reproduce after cloning with v1.6.6.3 (which creates ".git"
> as a directory in the tree). Is it possible for you to bisect (or
> perhaps share with us the broken example)?

I discovered it by using Debian testing, and it is shipping the 2.19rcs
for some reason. The example is just an old checkout of qt5.git with
submodules; it is rather large. I could try bisecting, but I am not sure
I have the time anytime soon. I just wanted to report it to you first in
case you knew of a change that suddenly assumed the new structure.

'Allan
Re: Mailsplit
On Wed, Sep 05, 2018 at 09:17:29PM -0700, Stephen & Linda Smith wrote:
> I thought I would use "git mailsplit" to split an mbox file (which I
> downloaded from public-inbox) so that I could attempt to resurrect a
> patch series from a year ago.
>
> The motivation is that I downloaded the series [1] and applied it to a
> tag from about the time period that the patch was sent out [2].
>
> The "git am -3 patch.mbox" quit 2/3 of the way through. I resolved the
> conflict and ran "git am --continue", which didn't apply the rest of
> the patches in the mbox.
>
> So two questions:
> 1) Why would git version 2.18.0 not appear to continue applying the
> patches?
>
> 2) Where do I find the command "git mailsplit"? The only thing in my
> installed tree is:
>
> $ find /usr/local/ -name '*mailsplit*'
> /usr/local/share/doc/git-doc/git-mailsplit.txt
> /usr/local/share/doc/git-doc/git-mailsplit.html
> /usr/local/share/man/man1/git-mailsplit.1
> /usr/local/libexec/git-core/git-mailsplit

This is the mailsplit command, and it should be executed when running
`git mailsplit`. What does `git --exec-path` return?

Kevin
Re: [PATCH 1/2] commit-graph write: add progress output
On Fri, Sep 07 2018, Ævar Arnfjörð Bjarmason wrote: > On Wed, Sep 05 2018, Derrick Stolee wrote: > >> On 9/4/2018 6:07 PM, Junio C Hamano wrote: >>> Ævar Arnfjörð Bjarmason writes: >>> With --stdin-packs we don't show any estimation of how much is left to do. This is because we might be processing more than one pack. We could be less lazy here and show progress, either detect by detecting that we're only processing one pack, or by first looping over the packs to discover how many commits they have. I don't see the point in >>> I do not know if there is no point, but if we were to do it, I think >>> slurping the list of packs and computing the number of objects is >>> not all that bad. >> >> If you want to do that, I have nothing against it. However, I don't >> expect users to use that option directly. That option is used by VFS >> for Git to compute the commit-graph in the background after receiving >> a pack of commits and trees, but not by 'git gc' which I expect is how >> most users will compute commit-graphs. >> static void compute_generation_numbers(struct packed_commit_list* commits) { int i; struct commit_list *list = NULL; + struct progress *progress = NULL; +progress = start_progress( + _("Computing commit graph generation numbers"), commits->nr); for (i = 0; i < commits->nr; i++) { + display_progress(progress, i); if (commits->list[i]->generation != GENERATION_NUMBER_INFINITY && commits->list[i]->generation != GENERATION_NUMBER_ZERO) continue; >>> I am wondering if the progress call should be moved after this >>> conditional continue; would we want to count the entry whose >>> generation is already known here? Of course, as we give commits->nr >>> as the 100% ceiling, we cannot avoid doing so, but it somehow smells >>> wrong. >> >> If we wanted to be completely right, we would count the commits in the >> list that do not have a generation number and report that as the 100% >> ceiling. >> >> Something like the diff below would work. 
I tested it in Linux by >> first deleting my commit-graph and running the following: >> >> stolee@stolee-linux:~/linux$ rm .git/objects/info/commit-graph >> stolee@stolee-linux:~/linux$ git rev-parse v4.6 | ~/git/git >> commit-graph write --stdin-commits >> Annotating commits in commit graph: 1180333, done. >> Computing commit graph generation numbers: 100% (590166/590166), done. >> stolee@stolee-linux:~/linux$ ~/git/git commit-graph write --reachable >> Annotating commits in commit graph: 1564087, done. >> Computing commit graph generation numbers: 100% (191590/191590), done. >> >> -->8-- >> >> From: Derrick Stolee >> Date: Wed, 5 Sep 2018 11:55:42 + >> Subject: [PATCH] fixup! commit-graph write: add progress output >> >> Signed-off-by: Derrick Stolee >> --- >> commit-graph.c | 15 +++ >> 1 file changed, 11 insertions(+), 4 deletions(-) >> >> diff --git a/commit-graph.c b/commit-graph.c >> index 1a02fe019a..b933bc9f00 100644 >> --- a/commit-graph.c >> +++ b/commit-graph.c >> @@ -634,14 +634,20 @@ static void close_reachable(struct >> packed_oid_list *oids) >> >> static void compute_generation_numbers(struct packed_commit_list* commits) >> { >> - int i; >> + int i, count_uncomputed = 0; >> struct commit_list *list = NULL; >> struct progress *progress = NULL; >> >> + for (i = 0; i < commits->nr; i++) >> + if (commits->list[i]->generation == >> GENERATION_NUMBER_INFINITY || >> + commits->list[i]->generation == GENERATION_NUMBER_ZERO) >> + count_uncomputed++; >> + >> progress = start_progress( >> - _("Computing commit graph generation numbers"), >> commits->nr); >> + _("Computing commit graph generation numbers"), >> count_uncomputed); >> + count_uncomputed = 0; >> + >> for (i = 0; i < commits->nr; i++) { >> - display_progress(progress, i); >> if (commits->list[i]->generation != >> GENERATION_NUMBER_INFINITY && >> commits->list[i]->generation != GENERATION_NUMBER_ZERO) >> continue; >> @@ -670,10 +676,11 @@ static void compute_generation_numbers(struct >> 
packed_commit_list* commits) >> >> if (current->generation > >> GENERATION_NUMBER_MAX) >> current->generation = >> GENERATION_NUMBER_MAX; >> + >> + display_progress(progress, >> ++count_uncomputed); >> } >> } >> } >> - display_progress(progress, i); >> stop_progress(&progress); >> } > > One of the things I was trying to do with this series was to make sure > that whenever we run "git gc" there's always some indication that if you > set gc.writeCommitGraph=true that it's actualy doing work. > > This modifies that, which I think is actually fine, just something I > wanted to note. I.e. if you run "git commit-graph write" twice in a row, > the second time will have no output. > > Unless that is, your repo is big enough that some of the delayed timers > kick in. So e.g. on git.git we get no output the second t
[PATCH] status: show progress bar if refreshing the index takes too long
Refreshing the index is usually very fast, but it can still take a long
time sometimes. A cold cache is one cause, or something else silly (*).
In this case, it's good to show something to let the user know "git
status" is not hanging; it's just busy doing something.

(*) I got called over by my colleague because her "git status" took very
long and looked pretty much like it was hanging. After a bit of strace,
it looked to me like git was trying to rehash every single file, and
this was a big repository, so the process could take minutes. In this
case, I think it was probably because she had copied the repository to a
new place and the stat data did not match anymore, so git fell back to
hashing.

Signed-off-by: Nguyễn Thái Ngọc Duy
---
I need to get this out of my head before I forget. I obviously think
this is a good idea and could be done in more places, even just to deal
with cold caches. The hint about "git status -uno", for example, could
be accompanied by a progress bar for scanning for untracked files...

Another note about rehashing files as part of refresh. We could probably
do better by hashing in parallel. Or perhaps not, because having a big
lock around the object database pretty much kills performance; and if I
remember correctly none of my cores was 100% consumed (i.e. a CPU
bottleneck as an indication that multithreading is a good idea...)

Anyway, that's it! Weekend after a long week! I'll read mails and
respond tomorrow.
 builtin/am.c     |  2 +-
 builtin/commit.c |  6 --
 cache.h          |  7 +--
 preload-index.c  | 44 +++-
 read-cache.c     | 10 ++
 sequencer.c      |  2 +-
 6 files changed, 60 insertions(+), 11 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index 5e866d17c7..22a93cfef3 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -2324,7 +2324,7 @@ int cmd_am(int argc, const char **argv, const char *prefix)
 	/* Ensure a valid committer ident can be constructed */
 	git_committer_info(IDENT_STRICT);
 
-	if (read_index_preload(&the_index, NULL) < 0)
+	if (read_index_preload(&the_index, NULL, 0) < 0)
 		die(_("failed to read the index"));
 
 	if (in_progress) {
diff --git a/builtin/commit.c b/builtin/commit.c
index 0d9828e29e..eaf639ece6 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1355,8 +1355,10 @@ int cmd_status(int argc, const char **argv, const char *prefix)
 		     PATHSPEC_PREFER_FULL,
 		     prefix, argv);
 
-	read_cache_preload(&s.pathspec);
-	refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED, &s.pathspec, NULL, NULL);
+	read_index_preload(&the_index, &s.pathspec, REFRESH_PROGRESS);
+	refresh_index(&the_index,
+		      REFRESH_QUIET|REFRESH_UNMERGED|REFRESH_PROGRESS,
+		      &s.pathspec, NULL, NULL);
 
 	if (use_optional_locks())
 		fd = hold_locked_index(&index_lock, 0);
diff --git a/cache.h b/cache.h
index 4d014541ab..35da02be90 100644
--- a/cache.h
+++ b/cache.h
@@ -410,7 +410,7 @@ void validate_cache_entries(const struct index_state *istate);
 #define read_cache() read_index(&the_index)
 #define read_cache_from(path) read_index_from(&the_index, (path), (get_git_dir()))
-#define read_cache_preload(pathspec) read_index_preload(&the_index, (pathspec))
+#define read_cache_preload(pathspec) read_index_preload(&the_index, (pathspec), 0)
 #define is_cache_unborn() is_index_unborn(&the_index)
 #define read_cache_unmerged() read_index_unmerged(&the_index)
 #define discard_cache() discard_index(&the_index)
@@ -659,7 +659,9 @@ extern int daemonize(void);
 /* Initialize and use the cache information */
 struct lock_file;
 extern int read_index(struct index_state *);
-extern int read_index_preload(struct index_state *, const struct pathspec *pathspec);
+extern int read_index_preload(struct index_state *,
+			      const struct pathspec *pathspec,
+			      unsigned int refresh_flags);
 extern int do_read_index(struct index_state *istate, const char *path,
 			 int must_exist); /* for testting only! */
 extern int read_index_from(struct index_state *, const char *path,
@@ -814,6 +816,7 @@ extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st);
 #define REFRESH_IGNORE_MISSING	0x0008	/* ignore non-existent */
 #define REFRESH_IGNORE_SUBMODULES	0x0010	/* ignore submodules */
 #define REFRESH_IN_PORCELAIN	0x0020	/* user friendly output, not "needs update" */
+#define REFRESH_PROGRESS	0x0040	/* show progress bar if stderr is tty */
 extern int refresh_index(struct index_state *, unsigned int flags,
 			 const struct pathspec *pathspec,
 			 char *seen, const char *header_msg);
 extern struct cache_entry *refresh_cache_entry(struct index_state *, struct cache_entry *, unsigned int);
diff --git a/preload-index.c b/preload-index.c
index 71cd2437a3..bf7dbed779 100644
--- a/pre
Re: Temporary git files for the gitdir created on a separate drive in workdir
Hultqvist writes:

> Considering that the gitdir could be located on a different drive than
> the workdir, wouldn't it make more sense to create the temporary files
> in a subdirectory inside the gitdir rather than in the workdir?

I do not think we intend to create temporary files, whose final
destination is somewhere under $GIT_DIR/, in any working tree; rather,
I think we try to create them inside $GIT_DIR (or, if the destination
is a file in a subdirectory of $GIT_DIR, in that same subdirectory).

What you are seeing definitely smells like a bug in the worktree code,
perhaps getting confused by the fact that the full path to these
places looks "unusual" by starting with a single letter followed by a
colon (IOW, this may manifest only in the Windows port).
Re: Old submodules broken in 2.19rc1 and 2.19rc2
On Fri, Sep 7, 2018 at 2:53 AM Allan Sandfeld Jensen wrote:
>
> Submodules checked out with older versions of git no longer work in the
> latest 2.19 releases. A "git submodule update --recursive" command will
> fail for each submodule with a line saying "fatal: could not open
> '/.git' for writing: Is a directory".

Can you run the update again with

    GIT_TRACE=1 git submodule update

and post the output? I have the suspicion that e98317508c0 (submodule:
ensure core.worktree is set after update, 2018-06-18) might be the
offender. Could you try reverting that commit and check as well?

    git clone https://github.com/git/git && cd git
    git revert e98317508c0
    make install # installs to your home dir at ~/bin

and then try again, as well? (though bisection may be more fruitful if
this doesn't pan out)
Re: Mailsplit
On Friday, September 7, 2018 8:15:43 AM MST Kevin Daudt wrote:
> On Wed, Sep 05, 2018 at 09:17:29PM -0700, Stephen & Linda Smith wrote:
>
> This is the mailsplit command, and should be executed when running `git
> mailsplit`. What does git --exec-path return?

The other night when I ran "git mailsplit", I received an unknown
command response. Since then I have updated to 2.19.0 plus my submitted
patches for git commit. With the new build mailsplit is found. I don't
know what was wrong before.
Re: [PATCH 1/2] commit-graph write: add progress output
On Fri, Sep 07, 2018 at 05:23:31PM +0200, Ævar Arnfjörð Bjarmason wrote:

> Hrm, no. I spoke too soon because I was conflating "commit-graph write"
> v.s. "gc". For "gc" we're now with this change just e.g. spending 6
> seconds on 2015-04-03-1M-git displaying nothing, because we're looping
> through the commits and finding that we have no new work.
>
> So I'm on the fence about this, but leaning towards just taking my
> initial approach. I.e. it sucks if you're e.g. testing different "git
> gc" options that we're churning in the background doing nothing, just
> because we're trying to report how many *new* things we added to the
> graph.
>
> After all, the main point IMNSHO is not to show some diagnostic output
> of exactly how much work we're doing, that I have 200 new commits with
> generation numbers or whatever is just useless trivia, but rather to
> not leave the user thinking the command is hanging.

I think there's some precedent for your view of things, too. For
example, "writing objects" counts _all_ of the objects, even though
many of them are just copying bytes straight from disk, and some are
actually generating a delta and/or zlib-deflating content. So it's not
the most precise measurement we could give, but it shows there's
activity, and the "average" movement over many objects tends to be
reasonably smooth.

> So I think I'll just do what I was doing to begin with and change the
> message to "Refreshing commit graph generation numbers" or something
> to indicate that it's a find/verify/compute operation, not just a
> compute operation.

So basically yes, I agree with this. :)

-Peff
Re: [PATCH 01/11] string_list: print_string_list to use trace_printf
On Fri, Sep 7, 2018 at 2:53 AM Junio C Hamano wrote:
>
> Jeff King writes:
>
> > I guess the question is: is this a thing we would want to make available
> > to code to leave in all the time? Or is it just for sticking in
> > temporarily for a quick dump?
> >
> > If the former, then I think it needs the early-return at the least (and
> > probably _should_ have the key parameter).
> >
> > But I get the feeling that you really just want the latter, and are only
> > grudgingly being pushed into the former by Junio's suggestion.

That is not quite the case. I did take all but the one suggestion, as I
think they are good suggestions, and I appreciate it. However I would
think that having the trace key is not useful. (I did not need it for
debugging, but presented an API of what I would have found useful.) If
the key really is that useful, I would just think we can add it on top.

> Well, if that is the case, I'd change "my suggestion" (although I
> didn't mean to suggest anything concrete). If this was needed
> and/or was useful only during the development of the remainder of
> the series, and if this is known to be a half-hearted change that is
> not meant to be useful as a general solution, then let's not take
> the main part of the series hostage to this step; rather, we
> should drop this change and leave it to another series that is
> willing to do it right. It would save both the author's and
> reviewers' time if we did so.

In that case, would we be interested in a separate patch that kills off
that function instead?

Stefan
Re: [PATCH v3 0/4] read-cache: speed up index load through parallelization
Ben Peart writes:

> On further investigation with the previous patch, I noticed that my test
> repos didn't contain the cache tree extension in their index. After doing
> a commit to ensure they existed, I realized that in some instances, the
> time to load the cache tree exceeded the time to load all the cache
> entries in parallel. Because the thread to read the cache tree was
> started last (due to having to parse through all the cache entries first)
> we weren't always getting optimal performance.
>
> To better optimize for this case, I decided to write the EOIE extension
> as suggested by Junio [1] in response to my earlier multithreading patch
> series [2]. This enables me to spin up the thread to load the extensions
> earlier as it no longer has to parse through all the cache entries first.

Hmph. I kinda liked the simplicity of the previous one, but if we need
to start reading the extension sections sooner by eliminating the
overhead to scan the cache entries, perhaps we should bite the bullet
now.

> The big changes in this iteration are:
>
> - add the EOIE extension
> - update the index extension worker thread to start first

I guess I'd need to see the actual patch to find this out, but once we
rely on a new extension, then we could omit scanning the main index
even to partition the work among workers (i.e. like the topic long ago,
you can have a list of pointers into the main index to help
partitioning, plus reset the prefix compression used in v4). I think
you didn't get that far in this round, which is good. If the gain with
EOIE alone (and starting the worker for the extension section early) is
large enough without such a pre-computed work partition table, the
simplicity of this round may give us a good stopping point.

> This patch conflicts with Duy's patch to remove the double memory copy
> and pass in the previous ce instead. The two will need to be
> merged/reconciled once they settle down a bit.

Thanks.
I have a feeling that 67922abb ("read-cache.c: optimize reading index format v4", 2018-09-02) is already 'next'-worthy and ready to be built on, but I'd prefer to hear from Duy to double check.
Re: [PATCH 1/2] commit-graph write: add progress output
On 9/7/2018 1:15 PM, Jeff King wrote:
> On Fri, Sep 07, 2018 at 05:23:31PM +0200, Ævar Arnfjörð Bjarmason wrote:
>> Hrm, no. I spoke too soon because I was conflating "commit-graph write"
>> v.s. "gc". For "gc" we're now with this change just e.g. spending 6
>> seconds on 2015-04-03-1M-git displaying nothing, because we're looping
>> through the commits and finding that we have no new work.
>>
>> So I'm on the fence about this, but leaning towards just taking my
>> initial approach. I.e. it sucks if you're e.g. testing different "git
>> gc" options that we're churning in the background doing nothing, just
>> because we're trying to report how many *new* things we added to the
>> graph.
>>
>> After all, the main point IMNSHO is not to show some diagnostic output
>> of exactly how much work we're doing, that I have 200 new commits with
>> generation numbers or whatever is just useless trivia, but rather to
>> not leave the user thinking the command is hanging.
>
> I think there's some precedent for your view of things, too. For
> example, "writing objects" counts _all_ of the objects, even though
> many of them are just copying bytes straight from disk, and some are
> actually generating a delta and/or zlib-deflating content. So it's not
> the most precise measurement we could give, but it shows there's
> activity, and the "average" movement over many objects tends to be
> reasonably smooth.
>
>> So I think I'll just do what I was doing to begin with and change the
>> message to "Refreshing commit graph generation numbers" or something
>> to indicate that it's a find/verify/compute operation, not just a
>> compute operation.
>
> So basically yes, I agree with this. :)

Same here. Thanks for the discussion.

-Stolee
Re: [PATCH] status: show progress bar if refreshing the index takes too long
On Fri, Sep 7, 2018 at 11:51 AM Nguyễn Thái Ngọc Duy wrote:
> Refreshing the index is usually very fast, but it can still take a
> long time sometimes. Cold cache is one, or something else silly (*).
> In this case, it's good to show something to let the user know "git
> status" is not hanging, it's just busy doing something.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy
> ---
> diff --git a/read-cache.c b/read-cache.c
> @@ -1516,6 +1522,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
> +		if (progress)
> +			display_progress(progress, i);
> @@ -1547,6 +1555,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
> +	if (progress)
> +		stop_progress(&progress);

Nit: Both display_progress() and stop_progress() behave sanely when
'progress' is NULL, so no need for the conditional.
Re: [PATCH v3 2/4] eoie: add End of Index Entry (EOIE) extension
Ben Peart writes:

> The extension consists of:
>
> - 32-bit offset to the end of the index entries
>
> - 160-bit SHA-1 over the extension types and their sizes (but not
>   their contents). E.g. if we have "TREE" extension that is N-bytes
>   long, "REUC" extension that is M-bytes long, followed by "EOIE",
>   then the hash would be:
>
>   SHA-1("TREE" + <binary representation of N> +
>         "REUC" + <binary representation of M>)

I didn't look at the documentation patch in the larger context, but
please make sure that it is clear to the readers that these
fixed-width integer "binary representations" use network byte order.

I briefly wondered if the above should include

    + "EOIE" + <binary representation of its size>

as it is pretty much common file format design to include the header
part of the checksum record (with checksum values padded out with NUL
bytes) when you define a record to hold the checksum of the entire
file. Since this does not protect the contents of each section from
bit-flipping, adding the data for EOIE itself in the sum does not give
us much (iow, what I am adding above is a constant that does not
contribute any entropy).

How big is a typical TREE extension in _your_ work repository housing
Windows sources? I am guessing that replacing SHA-1 with something
faster (as this is not about security but is about disk corruption)
and instead hashing also the contents of these sections would NOT help
all that much in the performance department, as having to page them in
to read through would already consume a non-trivial amount of time,
and that is why you are not hashing the contents.

> +	/*
> +	 * CACHE_EXT_ENDOFINDEXENTRIES must be written as the last entry before the SHA1

s/SHA1/trailing checksum/ or something so that we can withstand
NewHash world order?

> +	 * so that it can be found and processed before all the index entries are
> +	 * read.
> +	 */
> +	if (!strip_extensions && offset && !git_env_bool("GIT_TEST_DISABLE_EOIE", 0)) {
> +		struct strbuf sb = STRBUF_INIT;
> +
> +		write_eoie_extension(&sb, &eoie_c, offset);
> +		err = write_index_ext_header(&c, NULL, newfd, CACHE_EXT_ENDOFINDEXENTRIES, sb.len) < 0
> +			|| ce_write(&c, newfd, sb.buf, sb.len) < 0;
> 		strbuf_release(&sb);
> 		if (err)

OK.

> +#define EOIE_SIZE 24 /* <4-byte offset> + <20-byte hash> */
> +#define EOIE_SIZE_WITH_HEADER (4 + 4 + EOIE_SIZE) /* <4-byte signature> + <4-byte length> + EOIE_SIZE */
> +
> +#ifndef NO_PTHREADS
> +static unsigned long read_eoie_extension(void *mmap, size_t mmap_size)
> +{
> +	/*
> +	 * The end of index entries (EOIE) extension is guaranteed to be last
> +	 * so that it can be found by scanning backwards from the EOF.
> +	 *
> +	 * "EOIE"
> +	 * <4-byte length>
> +	 * <4-byte offset>
> +	 * <20-byte hash>
> +	 */
> +	const char *index, *eoie = (const char *)mmap + mmap_size - GIT_SHA1_RAWSZ - EOIE_SIZE_WITH_HEADER;
> +	uint32_t extsize;
> +	unsigned long offset, src_offset;
> +	unsigned char hash[GIT_MAX_RAWSZ];
> +	git_hash_ctx c;
> +
> +	/* validate the extension signature */
> +	index = eoie;
> +	if (CACHE_EXT(index) != CACHE_EXT_ENDOFINDEXENTRIES)
> +		return 0;
> +	index += sizeof(uint32_t);
> +
> +	/* validate the extension size */
> +	extsize = get_be32(index);
> +	if (extsize != EOIE_SIZE)
> +		return 0;
> +	index += sizeof(uint32_t);

Do we know we have at least 8 bytes to consume to perform the above
two checks, or is that something we need to verify at the beginning of
this function? Better yet, as we know that a correct EOIE with its own
header is 28 bytes long, we probably should abort if mmap_size is
smaller than that.

> +	/*
> +	 * Validate the offset we're going to look for the first extension
> +	 * signature is after the index header and before the eoie extension.
> +	 */
> +	offset = get_be32(index);
> +	if ((const char *)mmap + offset < (const char *)mmap + sizeof(struct cache_header))
> +		return 0;

Claims that the end is before the beginning, which we reject as bogus.
Good.

> +	if ((const char *)mmap + offset >= eoie)
> +		return 0;

Claims that the end is beyond the EOIE marker we should have placed
after it, which we reject as bogus. Good.

> +	index += sizeof(uint32_t);
> +
> +	/*
> +	 * The hash is computed over extension types and their sizes (but not
> +	 * their contents). E.g. if we have "TREE" extension that is N-bytes
> +	 * long, "REUC" extension that is M-bytes long, followed by "EOIE",
> +	 * then the hash would be:
> +	 *
> +	 * SHA-1("TREE" + <binary representation of N> +
> +	 *       "REUC" + <binary representation of M>)
> +	 */
> +	src_offset = offset;
> +	the_hash_algo->init_fn(&c);
> +	while (src_offset < mmap_size - the_hash_algo->rawsz - EOIE_SIZE_WITH_HEADER) {
> +		/* After an array of active_nr index entries,

(Style nit).

> +
[PATCH] config.mak.dev: add -Wformat-security
We currently build cleanly with -Wformat-security, and it's a good
idea to make sure we continue to do so (since calls that trigger the
warning may be security vulnerabilities).

Note that we cannot use the stronger -Wformat-nonliteral, as there are
cases where we are clever with passing around pointers to string
literals. E.g., bisect_rev_setup() takes bad_format and good_format
parameters. These ultimately come from literals, but they still
trigger the warning.

Some of these might be fixable (e.g., by passing flags from which we
locally select a format), and might even be worth fixing (not because
of security, but just because it's an easy mistake to pass the wrong
format). But there are other cases which are likely quite hard to fix
(we actually generate formats in a local buffer in some cases). So
let's punt on that for now and start with -Wformat-security, which is
supposed to catch the most important cases.

Signed-off-by: Jeff King
---
 config.mak.dev | 1 +
 1 file changed, 1 insertion(+)

diff --git a/config.mak.dev b/config.mak.dev
index 9a998149d9..f832752454 100644
--- a/config.mak.dev
+++ b/config.mak.dev
@@ -14,6 +14,7 @@ CFLAGS += -Wpointer-arith
 CFLAGS += -Wstrict-prototypes
 CFLAGS += -Wunused
 CFLAGS += -Wvla
+CFLAGS += -Wformat-security
 
 ifndef COMPILER_FEATURES
 COMPILER_FEATURES := $(shell ./detect-compiler $(CC))
-- 
2.19.0.rc2.594.gc4806cd463
[PATCH 2/2] mingw: fix mingw_open_append to work with named pipes
From: Jeff Hostetler

Signed-off-by: Jeff Hostetler
---
 compat/mingw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compat/mingw.c b/compat/mingw.c
index 858ca14a57..ef03bbe5d2 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -355,7 +355,7 @@ static int mingw_open_append(wchar_t const *wfilename, int oflags, ...)
 	 * FILE_SHARE_WRITE is required to permit child processes
 	 * to append to the file.
 	 */
-	handle = CreateFileW(wfilename, FILE_APPEND_DATA,
+	handle = CreateFileW(wfilename, FILE_WRITE_DATA | FILE_APPEND_DATA,
 			     FILE_SHARE_WRITE | FILE_SHARE_READ,
 			     NULL, create, FILE_ATTRIBUTE_NORMAL, NULL);
 	if (handle == INVALID_HANDLE_VALUE)
-- 
gitgitgadget
[PATCH 1/2] t0051: test GIT_TRACE to a windows named pipe
From: Jeff Hostetler

Create a test-tool helper to create the server side of a windows named
pipe, wait for a client connection, and copy data written to the pipe
to stdout.

Create t0051 test to route GIT_TRACE output of a command to a named
pipe using the above test-tool helper.

Signed-off-by: Jeff Hostetler
---
 Makefile                           |  1 +
 t/helper/test-tool.c               |  3 ++
 t/helper/test-tool.h               |  3 ++
 t/helper/test-windows-named-pipe.c | 72 ++
 t/t0051-windows-named-pipe.sh      | 17 +++
 5 files changed, 96 insertions(+)
 create mode 100644 t/helper/test-windows-named-pipe.c
 create mode 100755 t/t0051-windows-named-pipe.sh

diff --git a/Makefile b/Makefile
index e4b503d259..b1b934d295 100644
--- a/Makefile
+++ b/Makefile
@@ -731,6 +731,7 @@ TEST_BUILTINS_OBJS += test-submodule-config.o
 TEST_BUILTINS_OBJS += test-subprocess.o
 TEST_BUILTINS_OBJS += test-urlmatch-normalization.o
 TEST_BUILTINS_OBJS += test-wildmatch.o
+TEST_BUILTINS_OBJS += test-windows-named-pipe.o
 TEST_BUILTINS_OBJS += test-write-cache.o
 
 TEST_PROGRAMS_NEED_X += test-dump-fsmonitor
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 805a45de9c..7a9764cd5c 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -41,6 +41,9 @@ static struct test_cmd cmds[] = {
 	{ "subprocess", cmd__subprocess },
 	{ "urlmatch-normalization", cmd__urlmatch_normalization },
 	{ "wildmatch", cmd__wildmatch },
+#ifdef GIT_WINDOWS_NATIVE
+	{ "windows-named-pipe", cmd__windows_named_pipe },
+#endif
 	{ "write-cache", cmd__write_cache },
 };
 
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 7116ddfb94..01c34fe5e7 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -35,6 +35,9 @@ int cmd__submodule_config(int argc, const char **argv);
 int cmd__subprocess(int argc, const char **argv);
 int cmd__urlmatch_normalization(int argc, const char **argv);
 int cmd__wildmatch(int argc, const char **argv);
+#ifdef GIT_WINDOWS_NATIVE
+int cmd__windows_named_pipe(int argc, const char **argv);
+#endif
 int cmd__write_cache(int argc, const char **argv);
 
 #endif
diff --git a/t/helper/test-windows-named-pipe.c b/t/helper/test-windows-named-pipe.c
new file mode 100644
index 00..b4b752b01a
--- /dev/null
+++ b/t/helper/test-windows-named-pipe.c
@@ -0,0 +1,72 @@
+#include "test-tool.h"
+#include "git-compat-util.h"
+#include "strbuf.h"
+
+#ifdef GIT_WINDOWS_NATIVE
+static const char *usage_string = "";
+
+#define TEST_BUFSIZE (4096)
+
+int cmd__windows_named_pipe(int argc, const char **argv)
+{
+	const char *filename;
+	struct strbuf pathname = STRBUF_INIT;
+	int err;
+	HANDLE h;
+	BOOL connected;
+	char buf[TEST_BUFSIZE + 1];
+
+	if (argc < 2)
+		goto print_usage;
+	filename = argv[1];
+	if (strchr(filename, '/') || strchr(filename, '\\'))
+		goto print_usage;
+	strbuf_addf(&pathname, "//./pipe/%s", filename);
+
+	/*
+	 * Create a single instance of the server side of the named pipe.
+	 * This will allow exactly one client instance to connect to it.
+	 */
+	h = CreateNamedPipeA(
+		pathname.buf,
+		PIPE_ACCESS_INBOUND | FILE_FLAG_FIRST_PIPE_INSTANCE,
+		PIPE_TYPE_MESSAGE | PIPE_READMODE_MESSAGE | PIPE_WAIT,
+		PIPE_UNLIMITED_INSTANCES,
+		TEST_BUFSIZE, TEST_BUFSIZE, 0, NULL);
+	if (h == INVALID_HANDLE_VALUE) {
+		err = err_win_to_posix(GetLastError());
+		fprintf(stderr, "CreateNamedPipe failed: %s\n",
+			strerror(err));
+		return err;
+	}
+
+	connected = ConnectNamedPipe(h, NULL)
+		? TRUE
+		: (GetLastError() == ERROR_PIPE_CONNECTED);
+	if (!connected) {
+		err = err_win_to_posix(GetLastError());
+		fprintf(stderr, "ConnectNamedPipe failed: %s\n",
+			strerror(err));
+		CloseHandle(h);
+		return err;
+	}
+
+	while (1) {
+		DWORD nbr;
+		BOOL success = ReadFile(h, buf, TEST_BUFSIZE, &nbr, NULL);
+		if (!success || nbr == 0)
+			break;
+		buf[nbr] = 0;
+
+		write(1, buf, nbr);
+	}
+
+	DisconnectNamedPipe(h);
+	CloseHandle(h);
+	return 0;
+
+print_usage:
+	fprintf(stderr, "usage: %s %s\n", argv[0], usage_string);
+	return 1;
+}
+#endif
diff --git a/t/t0051-windows-named-pipe.sh b/t/t0051-windows-named-pipe.sh
new file mode 100755
index 00..10ac92d225
--- /dev/null
+++ b/t/t0051-windows-named-pipe.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+
+test_description='Windows named pipes'
+
+. ./test-lib.sh
+
+test_expect_success MINGW 'o_append write to named pipe' '
+	GIT_TRACE="$(pwd)/expect" git status >/dev/null 2>&1 &&
+	{ tes
[PATCH 0/2] Fixup for js/mingw-o-append
The recent mingw O_APPEND change breaks writing to named pipes on
Windows. The first commit adds a new test to confirm the breakage and
the second commit fixes the problem. These could be squashed together,
or we can just keep the fix and omit the test if that would be better.

d641097589 (js/mingw-o-append) mingw: enable atomic O_APPEND

The new mingw_open_append() routine successfully opens the client side
of the named pipe, but the first write() to it fails with EBADF.
Adding FILE_WRITE_DATA corrects the problem.

Signed-off-by: Jeff Hostetler jeffh...@microsoft.com
[jeffh...@microsoft.com]
Cc: j6t@kdbg.org
Cc: johannes.schinde...@gmx.de
Cc: gitster@pobox.com
Cc: p...@peff.net

Jeff Hostetler (2):
  t0051: test GIT_TRACE to a windows named pipe
  mingw: fix mingw_open_append to work with named pipes

 Makefile                           |  1 +
 compat/mingw.c                     |  2 +-
 t/helper/test-tool.c               |  3 ++
 t/helper/test-tool.h               |  3 ++
 t/helper/test-windows-named-pipe.c | 72 ++
 t/t0051-windows-named-pipe.sh      | 17 +++
 6 files changed, 97 insertions(+), 1 deletion(-)
 create mode 100644 t/helper/test-windows-named-pipe.c
 create mode 100755 t/t0051-windows-named-pipe.sh

base-commit: d641097589160eb795127d8dbcb14c877c217b60
Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-35%2Fjeffhostetler%2Ffixup-mingw-o-append-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-35/jeffhostetler/fixup-mingw-o-append-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/35
-- 
gitgitgadget
[PATCH v2 0/2] commit-graph: add progress output
Based on feedback on v1, and the "this is yelling at my users through
gc.log" bug I found.

Range-diff with v1:

1:  e0a09ad641 ! 1:  b2dcfa0f55 commit-graph write: add progress output
    @@ -5,8 +5,8 @@
         Before this change the "commit-graph write" command didn't report any
         progress. On my machine this command takes more than 10 seconds to
         write the graph for linux.git, and around 1m30s on the
        -2015-04-03-1M-git.git[1] test repository, which is a test case for
        -larger monorepos.
        +2015-04-03-1M-git.git[1] test repository (a test case for a large
        +monorepository).
     
         Furthermore, since the gc.writeCommitGraph setting was added in
         d5d5d7b641 ("gc: automatically write commit-graph files", 2018-06-27),
    @@ -19,7 +19,6 @@
         $ git -c gc.writeCommitGraph=true gc
         Enumerating objects: 2821, done.
         [...]
        -Total 2821 (delta 1670), reused 2821 (delta 1670)
         Computing commit graph generation numbers: 100% (867/867), done.
     
         On larger repositories, such as linux.git the delayed progress bar(s)
    @@ -27,6 +26,7 @@
         previously happening, printing nothing while we write the graph:
     
         $ git -c gc.writeCommitGraph=true gc
        +[...]
         Annotating commits in commit graph: 1565573, done.
         Computing commit graph generation numbers: 100% (782484/782484), done.
    @@ -42,20 +42,90 @@
         With --stdin-packs we don't show any estimation of how much is left
         to do. This is because we might be processing more than one pack. We
        -could be less lazy here and show progress, either detect by detecting
        -that we're only processing one pack, or by first looping over the
        -packs to discover how many commits they have. I don't see the point in
        -doing that work. So instead we get (on 2015-04-03-1M-git.git):
        +could be less lazy here and show progress, either by detecting that
        +we're only processing one pack, or by first looping over the packs to
        +discover how many commits they have. I don't see the point in doing
        +that work. So instead we get (on 2015-04-03-1M-git.git):
     
         $ echo pack-.idx | git -c gc.writeCommitGraph=true --exec-path=$PWD commit-graph write --stdin-packs
         Finding commits for commit graph: 13064614, done.
         Annotating commits in commit graph: 3001341, done.
         Computing commit graph generation numbers: 100% (1000447/1000447), done.
     
        +No GC mode uses --stdin-packs. It's what they use at Microsoft to
        +manually compute the generation numbers for their collection of large
        +packs which are never coalesced.
        +
        +The reason we need a "report_progress" variable passed down from "git
        +gc" is so that we don't report this output when we're running in the
        +process "git gc --auto" detaches from the terminal.
        +
        +Since we write the commit graph from the "git gc" process itself (as
        +opposed to what we do with say the "git repack" phase), we'd end up
        +writing the output to .git/gc.log and reporting it to the user next
        +time as part of the "The last gc run reported the following[...]"
        +error, see 329e6e8794 ("gc: save log from daemonized gc --auto and
        +print it next time", 2015-09-19).
        +
        +So we must keep track of whether or not we're running in that
        +daemonized mode, and if so print no progress.
        +
        +See [2] and subsequent replies for a discussion of an approach not
        +taken in compute_generation_numbers(). I.e. we're saying "Computing
        +commit graph generation numbers", even though on an established
        +history we're mostly skipping over all the work we did in the
        +past. This is similar to the white lie we tell in the "Writing
        +objects" phase (not all are objects being written).
        +
        +Always showing progress is considered more important than
        +accuracy. I.e. on a repository like 2015-04-03-1M-git.git we'd hang
        +for 6 seconds with no output on the second "git gc" if no changes were
        +made to any objects in the interim if we'd take the approach in [2].
        +
         1. https://github.com/avar/2015-04-03-1M-git
        +2. (https://public-inbox.org/git/c6960252-c095-fb2b-e0bc-b1e6bb261...@gmail.com/)
     
         Signed-off-by: Ævar Arnfjörð Bjarmason
     
      diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
      --- a/builtin/commit-graph.c
      +++ b/builtin/commit-graph.c
     @@
      	opts.obj_dir = get_object_directory();
      
      	if (opts.reachable) {
     -		write_commit_graph_reachable(opts.obj_dir, opts.append);
     +		write_commit_graph_reachable(opts.obj_dir, opts.append, 1);
      		return 0;
      	}
     
[PATCH v2 2/2] commit-graph verify: add progress output
For the reasons explained in the "commit-graph write: add progress
output" commit leading up to this one, emit progress on "commit-graph
verify".

Since e0fd51e1d7 ("fsck: verify commit-graph", 2018-06-27) "git fsck"
has called this command if core.commitGraph=true, but there's been no
progress output to indicate that anything was different. Now there is
(on my tiny dotfiles.git repository):

    $ git -c core.commitGraph=true -C ~/ fsck
    Checking object directories: 100% (256/256), done.
    Checking objects: 100% (2821/2821), done.
    dangling blob 5b8bbdb9b788ed90459f505b0934619c17cc605b
    Verifying commits in commit graph: 100% (867/867), done.

And on a larger repository, such as the 2015-04-03-1M-git.git test
repository:

    $ time git -c core.commitGraph=true -C ~/g/2015-04-03-1M-git/ commit-graph verify
    Verifying commits in commit graph: 100% (1000447/1000447), done.

    real    0m7.813s
    [...]

Since the "commit-graph verify" subcommand is never called from "git
gc", we don't have to worry about passing some "report_progress"
progress variable around for this codepath.
Signed-off-by: Ævar Arnfjörð Bjarmason
---
 commit-graph.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/commit-graph.c b/commit-graph.c
index 2c5d996194..0bfb8c180e 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -922,6 +922,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
 	int generation_zero = 0;
 	struct hashfile *f;
 	int devnull;
+	struct progress *progress = NULL;
 
 	if (!g) {
 		graph_report("no commit-graph file loaded");
@@ -989,11 +990,14 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
 	if (verify_commit_graph_error & ~VERIFY_COMMIT_GRAPH_ERROR_HASH)
 		return verify_commit_graph_error;
 
+	progress = start_progress("Verifying commits in commit graph",
+				  g->num_commits);
 	for (i = 0; i < g->num_commits; i++) {
 		struct commit *graph_commit, *odb_commit;
 		struct commit_list *graph_parents, *odb_parents;
 		uint32_t max_generation = 0;
 
+		display_progress(progress, i + 1);
 		hashcpy(cur_oid.hash, g->chunk_oid_lookup + g->hash_len * i);
 
 		graph_commit = lookup_commit(r, &cur_oid);
@@ -1070,6 +1074,7 @@ int verify_commit_graph(struct repository *r, struct commit_graph *g)
 				graph_commit->date,
 				odb_commit->date);
 	}
+	stop_progress(&progress);
 
 	return verify_commit_graph_error;
 }
-- 
2.19.0.rc1.350.ge57e33dbd1
[PATCH v2 1/2] commit-graph write: add progress output
Before this change the "commit-graph write" command didn't report any
progress. On my machine this command takes more than 10 seconds to
write the graph for linux.git, and around 1m30s on the
2015-04-03-1M-git.git[1] test repository (a test case for a large
monorepository).

Furthermore, since the gc.writeCommitGraph setting was added in
d5d5d7b641 ("gc: automatically write commit-graph files", 2018-06-27),
there was no indication at all from a "git gc" run that anything was
different. This is why one of the progress bars being added here uses
start_progress() instead of start_delayed_progress(), so that it's
guaranteed to be seen. E.g. on my tiny 867-commit dotfiles.git
repository:

    $ git -c gc.writeCommitGraph=true gc
    Enumerating objects: 2821, done.
    [...]
    Computing commit graph generation numbers: 100% (867/867), done.

On larger repositories, such as linux.git, the delayed progress bar(s)
will kick in, and we'll show what's going on instead of, as was
previously happening, printing nothing while we write the graph:

    $ git -c gc.writeCommitGraph=true gc
    [...]
    Annotating commits in commit graph: 1565573, done.
    Computing commit graph generation numbers: 100% (782484/782484), done.

Note that here we don't show "Finding commits for commit graph". This
is because under "git gc" we seed the search with the commit
references in the repository, and that set is too small to show any
progress, but it would e.g. on a smaller repo such as git.git with
--stdin-commits:

    $ git rev-list --all | git -c gc.writeCommitGraph=true write --stdin-commits
    Finding commits for commit graph: 100% (162576/162576), done.
    Computing commit graph generation numbers: 100% (162576/162576), done.

With --stdin-packs we don't show any estimation of how much is left to
do. This is because we might be processing more than one pack. We
could be less lazy here and show progress, either by detecting that
we're only processing one pack, or by first looping over the packs to
discover how many commits they have.
I don't see the point in doing that work. So instead we get (on 2015-04-03-1M-git.git):

    $ echo pack-.idx | git -c gc.writeCommitGraph=true --exec-path=$PWD commit-graph write --stdin-packs
    Finding commits for commit graph: 13064614, done.
    Annotating commits in commit graph: 3001341, done.
    Computing commit graph generation numbers: 100% (1000447/1000447), done.

No GC mode uses --stdin-packs. It's what they use at Microsoft to manually compute the generation numbers for their collection of large packs which are never coalesced.

The reason we need a "report_progress" variable passed down from "git gc" is so that we don't report this output when "git gc --auto" runs after detaching from the terminal. Since we write the commit graph from the "git gc" process itself (as opposed to what we do with say the "git repack" phase), we'd otherwise end up writing the output to .git/gc.log and reporting it to the user next time as part of the "The last gc run reported the following[...]" error, see 329e6e8794 ("gc: save log from daemonized gc --auto and print it next time", 2015-09-19). So we must keep track of whether or not we're running in that daemonized mode, and if so print no progress.

See [2] and subsequent replies for a discussion of an approach not taken in compute_generation_numbers(). I.e. we're saying "Computing commit graph generation numbers", even though on an established history we're mostly skipping over all the work we did in the past. This is similar to the white lie we tell in the "Writing objects" phase (not all are objects being written). Always showing progress is considered more important than accuracy. I.e. on a repository like 2015-04-03-1M-git.git we'd hang for 6 seconds with no output on the second "git gc" if no changes were made to any objects in the interim if we'd take the approach in [2].

1. https://github.com/avar/2015-04-03-1M-git
2.
(https://public-inbox.org/git/c6960252-c095-fb2b-e0bc-b1e6bb261...@gmail.com/)

Signed-off-by: Ævar Arnfjörð Bjarmason
---
 builtin/commit-graph.c |  5 ++--
 builtin/gc.c           |  3 ++-
 commit-graph.c         | 60 --
 commit-graph.h         |  5 ++--
 4 files changed, 60 insertions(+), 13 deletions(-)

diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 0bf0c48657..bc0fa9ba52 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -151,7 +151,7 @@ static int graph_write(int argc, const char **argv)
 	opts.obj_dir = get_object_directory();

 	if (opts.reachable) {
-		write_commit_graph_reachable(opts.obj_dir, opts.append);
+		write_commit_graph_reachable(opts.obj_dir, opts.append, 1);
 		return 0;
 	}
@@ -171,7 +171,8 @@ static int graph_write(int argc, const char **argv)
 	write_commit_graph(opts.obj_dir, pack_indexes,
Re: [PATCH v3 0/4] read-cache: speed up index load through parallelization
On 9/7/2018 1:21 PM, Junio C Hamano wrote:
> Ben Peart writes:
>
>> On further investigation with the previous patch, I noticed that my
>> test repos didn't contain the cache tree extension in their index.
>> After doing a commit to ensure they existed, I realized that in some
>> instances, the time to load the cache tree exceeded the time to load
>> all the cache entries in parallel. Because the thread to read the
>> cache tree was started last (due to having to parse through all the
>> cache entries first) we weren't always getting optimal performance.
>>
>> To better optimize for this case, I decided to write the EOIE
>> extension as suggested by Junio [1] in response to my earlier
>> multithreading patch series [2]. This enables me to spin up the thread
>> to load the extensions earlier as it no longer has to parse through
>> all the cache entries first.
>
> Hmph. I kinda liked the simplicity of the previous one, but if we need
> to start reading the extension sections sooner by eliminating the
> overhead to scan the cache entries, perhaps we should bite the bullet
> now.

I preferred the simplicity as well but when I was profiling the code and
found out that loading the extensions was most often the last thread to
complete, I took this intermediate step to speed things up.

>> The big changes in this iteration are:
>>
>> - add the EOIE extension
>> - update the index extension worker thread to start first
>
> I guess I'd need to see the actual patch to find this out, but once we
> rely on a new extension, then we could omit scanning the main index
> even to partition the work among workers (i.e. like the topic long ago,
> you can have list of pointers into the main index to help partitioning,
> plus reset the prefix compression used in v4). I think you didn't get
> that far in this round, which is good. If the gain with EOIE alone (and
> starting the worker for the extension section early) is large enough
> without such a pre-computed work partition table, the simplicity of
> this round may give us a good stopping point.

Agreed.
I didn't go that far in this series as it doesn't appear to be
necessary. We could always add that later if it turned out to be worth
the additional complexity.

>> This patch conflicts with Duy's patch to remove the double memory copy
>> and pass in the previous ce instead. The two will need to be
>> merged/reconciled once they settle down a bit.
>
> Thanks. I have a feeling that 67922abb ("read-cache.c: optimize reading
> index format v4", 2018-09-02) is already 'next'-worthy and ready to be
> built on, but I'd prefer to hear from Duy to double check.

I'll take a closer look at what this will entail. It gets more
complicated as we don't actually have a previous expanded cache entry
when starting each thread. I'll see how complex it makes the code and
how much additional performance it gives.
Re: [PATCH 0/2] Fixup for js/mingw-o-append
GitGitGadget botched the CCs when I submitted this. Replying here to add them.

Sorry,
Jeff

https://github.com/gitgitgadget/gitgitgadget/issues/35

On 9/7/2018 2:19 PM, Jeff Hostetler via GitGitGadget wrote:
> The recent mingw O_APPEND change breaks writing to named pipes on
> Windows. The first commit adds a new test to confirm the breakage and
> the second commit fixes the problem. These could be squashed together
> or we can just keep the fix and omit the test if that would be better.
>
> d641097589 (js/mingw-o-append) mingw: enable atomic O_APPEND
>
> The new mingw_open_append() routine successfully opens the client side
> of the named pipe, but the first write() to it fails with EBADF. Adding
> the FILE_WRITE_DATA flag corrects the problem.
>
> Signed-off-by: Jeff Hostetler jeffh...@microsoft.com
> Cc: j6t@kdbg.org
> Cc: johannes.schinde...@gmx.de
> Cc: gitster@pobox.com
> Cc: p...@peff.net
>
> Jeff Hostetler (2):
>   t0051: test GIT_TRACE to a windows named pipe
>   mingw: fix mingw_open_append to work with named pipes
>
>  Makefile                           |  1 +
>  compat/mingw.c                     |  2 +-
>  t/helper/test-tool.c               |  3 ++
>  t/helper/test-tool.h               |  3 ++
>  t/helper/test-windows-named-pipe.c | 72 ++
>  t/t0051-windows-named-pipe.sh      | 17 +++
>  6 files changed, 97 insertions(+), 1 deletion(-)
>  create mode 100644 t/helper/test-windows-named-pipe.c
>  create mode 100755 t/t0051-windows-named-pipe.sh
>
> base-commit: d641097589160eb795127d8dbcb14c877c217b60
> Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-35%2Fjeffhostetler%2Ffixup-mingw-o-append-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-35/jeffhostetler/fixup-mingw-o-append-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/35
Re: [PATCH 2/2] submodule.c: warn about missing submodule git directories
On Wed, Sep 5, 2018 at 12:18 PM Jonathan Nieder wrote:
>
> Hi,
>
> Stefan Beller wrote:
>
> > This is the continuation of f2d48994dc1 (submodule.c: submodule_move_head
> > works with broken submodules, 2017-04-18), which tones down the case of
> > "broken submodule" in case of a missing git directory of the submodule to
> > be only a warning.
> >
> > Signed-off-by: Stefan Beller
> > ---
> >  submodule.c                   | 16 
> >  t/t2013-checkout-submodule.sh |  2 +-
> >  2 files changed, 17 insertions(+), 1 deletion(-)
>
> I don't understand what workflow this is a part of.
>
> If the submodule is missing, shouldn't we make it non-missing instead
> of producing a partial checkout that doesn't build?

No. checkout and friends do not want to touch the network
(unless we are in a partial clone world; that is the user is fully
aware that commands can use the network at totally unexpected
times)

So for that, all we can do is better error messages.

Stefan
Re: [PATCH 2/2] submodule.c: warn about missing submodule git directories
Hi,

Stefan Beller wrote:
> On Wed, Sep 5, 2018 at 12:18 PM Jonathan Nieder wrote:
>> Stefan Beller wrote:

>>> This is the continuation of f2d48994dc1 (submodule.c: submodule_move_head
>>> works with broken submodules, 2017-04-18), which tones down the case of
>>> "broken submodule" in case of a missing git directory of the submodule to
>>> be only a warning.
[...]
>> I don't understand what workflow this is a part of.
>>
>> If the submodule is missing, shouldn't we make it non-missing instead
>> of producing a partial checkout that doesn't build?
>
> No. checkout and friends do not want to touch the network
> (unless we are in a partial clone world; that is the user is fully
> aware that commands can use the network at totally unexpected
> times)
>
> So for that, all we can do is better error messages.

Thanks. This patch doesn't just improve error messages, though, but it
makes the operation report success instead of failing. Isn't that
likely to produce more confusion when I run additional commands
afterward? In other words, instead of

    $ git checkout --recurse-submodules -B master origin/new-fancy-branch
    Branch 'master' set up to track remote branch 'new-fancy-branch' from 'origin'.
    Switched to a new branch 'master'
    warning: Submodule 'new-fancy-submodule' is missing
    $ git status
    [some unclean state]

I would prefer to experience

    $ git checkout --recurse-submodules -B master origin/new-fancy-branch
    fatal: missing submodule 'new-fancy-submodule'
    hint: run "git fetch --recurse-submodules" to fetch it
    $ git status
    [clean state]
    $ git fetch --recurse-submodules
    [...]
    $ git checkout --recurse-submodules -B master origin/new-fancy-branch
    Branch 'master' set up to track remote branch 'new-fancy-branch' from 'origin'.
    Switched to a new branch 'master'
    $ git status
    [clean state]

Thanks,
Jonathan
Re: Old submodules broken in 2.19rc1 and 2.19rc2
Hi,

Allan Sandfeld Jensen wrote:

> I discovered it by using Debian testing, and it is shipping the 2.17rcs for
> some reason.

I believe you mean Debian unstable. Debian testing has 2.18.0.

> The example is just an old checkout of qt5.git with submodules,
> it is rather large.

Do you have a reproduction recipe I can use (starting with a "git clone"
command I would run using an old version of Git, etc)?

> I could try bisecting, but I am not sure I have the time anytime soon, I just
> wanted to report it to you first incase you knew of a change that suddenly
> assumed the new structure.

This is definitely not an intentional change, so more details would be
very welcome.

Thanks,
Jonathan
Re: Old submodules broken in 2.19rc1 and 2.19rc2
Stefan Beller wrote:
> On Fri, Sep 7, 2018 at 2:53 AM Allan Sandfeld Jensen wrote:
>> A "git submodule update --recursive" command wil fail
>> for each submodule with a line saying "fatal: could not open
>> '/.git' for writing: Is a directory.
[...]
> I have the suspicion that e98317508c0 (submodule:
> ensure core.worktree is set after update, 2018-06-18)
> might be the offender.

Oh! That seems likely.

Allan, output from "strace -f git submodule update --init" would also
be interesting.

Jonathan
Re: [PATCH v3 2/4] eoie: add End of Index Entry (EOIE) extension
On 9/7/2018 1:55 PM, Junio C Hamano wrote:
> Ben Peart writes:
>
>> The extension consists of:
>>
>> - 32-bit offset to the end of the index entries
>>
>> - 160-bit SHA-1 over the extension types and their sizes (but not
>>   their contents). E.g. if we have "TREE" extension that is N-bytes
>>   long, "REUC" extension that is M-bytes long, followed by "EOIE",
>>   then the hash would be:
>>
>>   SHA-1("TREE" + + "REUC" + )
>
> I didn't look at the documentation patch in the larger context, but
> please make sure that it is clear to the readers that these fixed
> width integers "binary representations" use network byte order.

At the top of the documentation it says "All binary numbers are in
network byte order" and that is not repeated for any of the other
sections that are documenting the file format.

> I briefly wondered if the above should include
>
>   + "EOIE" +
>
> as it is pretty much common file format design to include the header
> part of the checksum record (with checksum values padded out with NUL
> bytes) when you define a record to hold the checksum of the entire
> file. Since this does not protect the contents of each section from
> bit-flipping, adding the data for EOIE itself in the sum does not give
> us much (iow, what I am adding above is a constant that does not
> contribute any entropy).
>
> How big is a typical TREE extension in _your_ work repository housing
> Windows sources? I am guessing that replacing SHA-1 with something
> faster (as this is not about security but is about disk corruption)
> and instead hash also the contents of these sections would NOT help
> all that much in the performance department, as having to page them in
> to read through would already consume non-trivial amount of time, and
> that is why you are not hashing the contents.

The purpose of the SHA isn't to detect disk corruption (we already have
a SHA for the entire index that can serve that purpose) but to help
ensure that this was actually a valid EOIE extension and not a lucky
random set of bytes.
I had used leading and trailing signature bytes along with the length
and version bytes to validate it was an actual EOIE extension but you
suggested [1] that I use a SHA of the 4 byte extension type + 4 byte
extension length instead so I rewrote it that way.

[1] https://public-inbox.org/git/xmqq1sl017dw@gitster.mtv.corp.google.com/

>> +	/*
>> +	 * CACHE_EXT_ENDOFINDEXENTRIES must be written as the last entry before the SHA1
>
> s/SHA1/trailing checksum/ or something so that we can withstand
> NewHash world order?

I thought about this but in the document elsewhere it refers to it as
"160-bit SHA-1 over the content of the index file before this
checksum." and there are at least a dozen other references to "SHA-1"
so I figured we can fix them all at the same time when we have a
new/better name. :-)

>> +	 * so that it can be found and processed before all the index entries are
>> +	 * read.
>> +	 */
>> +	if (!strip_extensions && offset && !git_env_bool("GIT_TEST_DISABLE_EOIE", 0)) {
>> +		struct strbuf sb = STRBUF_INIT;
>> +
>> +		write_eoie_extension(&sb, &eoie_c, offset);
>> +		err = write_index_ext_header(&c, NULL, newfd, CACHE_EXT_ENDOFINDEXENTRIES, sb.len) < 0
>> +			|| ce_write(&c, newfd, sb.buf, sb.len) < 0;
>> +		strbuf_release(&sb);
>> +		if (err)
>
> OK.

>> +#define EOIE_SIZE 24 /* <4-byte offset> + <20-byte hash> */
>> +#define EOIE_SIZE_WITH_HEADER (4 + 4 + EOIE_SIZE) /* <4-byte signature> + <4-byte length> + EOIE_SIZE */
>> +
>> +#ifndef NO_PTHREADS
>> +static unsigned long read_eoie_extension(void *mmap, size_t mmap_size)
>> +{
>> +	/*
>> +	 * The end of index entries (EOIE) extension is guaranteed to be last
>> +	 * so that it can be found by scanning backwards from the EOF.
>> +	 *
>> +	 * "EOIE"
>> +	 * <4-byte length>
>> +	 * <4-byte offset>
>> +	 * <20-byte hash>
>> +	 */
>> +	const char *index, *eoie = (const char *)mmap + mmap_size - GIT_SHA1_RAWSZ - EOIE_SIZE_WITH_HEADER;
>> +	uint32_t extsize;
>> +	unsigned long offset, src_offset;
>> +	unsigned char hash[GIT_MAX_RAWSZ];
>> +	git_hash_ctx c;
>> +
>> +	/* validate the extension signature */
>> +	index = eoie;
>> +	if (CACHE_EXT(index) != CACHE_EXT_ENDOFINDEXENTRIES)
>> +		return 0;
>> +	index += sizeof(uint32_t);
>> +
>> +	/* validate the extension size */
>> +	extsize = get_be32(index);
>> +	if (extsize != EOIE_SIZE)
>> +		return 0;
>> +	index += sizeof(uint32_t);
>
> Do we know we have at least 8-byte to consume to perform the above two
> checks, or is that something we need to verify at the beginning of
> this function? Better yet, as we know that a correct EOIE with its own
> header is 28-byte long, we probably should abort if mmap_size is
> smaller than that.

I'll add that additional test.

>> +	/*
>> +	 * Validate the offset we're going to look for the first extension
>> +	 * signature is after the index he
Re: [PATCH] status: show progress bar if refreshing the index takes too long
On 9/7/2018 1:38 PM, Eric Sunshine wrote:
> On Fri, Sep 7, 2018 at 11:51 AM Nguyễn Thái Ngọc Duy wrote:
>> Refreshing the index is usually very fast, but it can still take a
>> long time sometimes. Cold cache is one, or something else silly (*).
>> In this case, it's good to show something to let the user know "git
>> status" is not hanging, it's just busy doing something.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy
>> ---
>> diff --git a/read-cache.c b/read-cache.c
>> @@ -1516,6 +1522,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
>> +		if (progress)
>> +			display_progress(progress, i);
>> @@ -1547,6 +1555,8 @@ int refresh_index(struct index_state *istate, unsigned int flags,
>> +	if (progress)
>> +		stop_progress(&progress);
>
> Nit: Both display_progress() and stop_progress() behave sanely when
> 'progress' is NULL, so no need for the conditional.

Don't forget this one in preload-index.c:preload_index():

    +	if (pd.progress)
    +		stop_progress(&pd.progress);

I found this extra one by creating the following rules in a Coccinelle
script:

    @@
    expression e;
    @@
    - if (e) { stop_progress(&e); }
    + stop_progress(&e);

    @@
    expression e;
    expression i;
    @@
    - if (e) { display_progress(e, i); }
    + display_progress(e, i);

Not sure if we want to put these in a .cocci script or not.

Thanks,
-Stolee
Re: [PATCH v3 3/4] read-cache: load cache extensions on a worker thread
Ben Peart writes:

> +struct load_index_extensions
> +{
> +#ifndef NO_PTHREADS
> +	pthread_t pthread;
> +#endif
> +	struct index_state *istate;
> +	void *mmap;
> +	size_t mmap_size;
> +	unsigned long src_offset;

If the file format only allows uint32_t on any platform, perhaps this
is better specified as uint32_t? Or if this is offset into a mmap'ed
region of memory, size_t may be more appropriate. Same comment applies
to "extension_offset" we see below (which in turn means the returned
type of read_eoie_extension() function may want to match).

> +	};

Space before '}'??

> +
> +static void *load_index_extensions(void *_data)
> +{
> +	struct load_index_extensions *p = _data;

Perhaps we are being superstitious, but I think our code try to avoid
leading underscore when able, i.e.

    load_index_extensions(void *data_)
    {
    	struct load_index_extensions *p = data;

> +	unsigned long src_offset = p->src_offset;
> +
> +	while (src_offset <= p->mmap_size - the_hash_algo->rawsz - 8) {
> +		/* After an array of active_nr index entries,
> +		 * there can be arbitrary number of extended
> +		 * sections, each of which is prefixed with
> +		 * extension name (4-byte) and section length
> +		 * in 4-byte network byte order.
> +		 */
> +		uint32_t extsize;
> +		memcpy(&extsize, (char *)p->mmap + src_offset + 4, 4);
> +		extsize = ntohl(extsize);

The same "ntohl(), not get_be32()?" question as the one for the
previous step applies here, too. I think the answer is "the original
was written that way" and that is acceptable, but once this series
lands, we may want to review the whole file and see if it is worth
making them consistent with a separate clean-up patch.
I think mmap() and munmap() are the only places that wants p->mmap and
mmap parameters passed around in various callchains to be of type
"void *"---I wonder if it is simpler to use "const char *" throughout
and only cast it to "void *" when necessary (I suspect that there is
nowhere we need to cast to or from "void *" explicitly if we did
so---assignment and argument passing would give us an appropriate cast
for free)?

> +		if (read_index_extension(p->istate,
> +					 (const char *)p->mmap + src_offset,
> +					 (char *)p->mmap + src_offset + 8,
> +					 extsize) < 0) {
> +			munmap(p->mmap, p->mmap_size);
> +			die("index file corrupt");
> +		}
> +		...
> @@ -1907,6 +1951,11 @@ ...
> ...
> +	p.mmap = mmap;
> +	p.mmap_size = mmap_size;
> +
> +#ifndef NO_PTHREADS
> +	nr_threads = git_config_get_index_threads();
> +	if (!nr_threads)
> +		nr_threads = online_cpus();
> +
> +	if (nr_threads >= 2) {
> +		extension_offset = read_eoie_extension(mmap, mmap_size);
> +		if (extension_offset) {
> +			/* create a thread to load the index extensions */
> +			p.src_offset = extension_offset;
> +			if (pthread_create(&p.pthread, NULL, load_index_extensions, &p))
> +				die(_("unable to create load_index_extensions_thread"));
> +		}
> +	}
> +#endif

Makes sense.
Re: [PATCH] status: show progress bar if refreshing the index takes too long
On Fri, Sep 7, 2018 at 4:29 PM Derrick Stolee wrote:
> On 9/7/2018 1:38 PM, Eric Sunshine wrote:
> > On Fri, Sep 7, 2018 at 11:51 AM Nguyễn Thái Ngọc Duy wrote:
> >> +	if (progress)
> >> +		display_progress(progress, i);
> >> +	if (progress)
> >> +		stop_progress(&progress);
> > Nit: Both display_progress() and stop_progress() behave sanely when
> > 'progress' is NULL, so no need for the conditional.
>
> Don't forget this one in preload-index.c:preload_index():
> +	if (pd.progress)
> +		stop_progress(&pd.progress);
>
> I found this extra one by creating the following rules in a Coccinelle script:
> - if (e) { stop_progress(&e); }
> - if (e) { display_progress(e, i); }
> Not sure if we want to put these in a .cocci script or not.

If so, we'd want to match display_throughput() and stop_progress_msg(),
as well.
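The two extra rules Eric suggests would presumably follow the same shape as the ones quoted above; a sketch, assuming stop_progress_msg() takes the message as its second argument and display_throughput() takes a byte count:

```
@@
expression e;
expression n;
@@
- if (e) { display_throughput(e, n); }
+ display_throughput(e, n);

@@
expression e;
expression msg;
@@
- if (e) { stop_progress_msg(&e, msg); }
+ stop_progress_msg(&e, msg);
```

Like the originals, these only fire on call sites where the NULL check guards nothing but the call itself.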
Re: [PATCH v3 2/4] wt-status: rename commitable to committable
"Stephen P. Smith" writes:

> Fix variable spelling error.
>
> Signed-off-by: Stephen P. Smith
> ---

Thanks ;-)
Re: [PATCH v3 0/4] wt-status.c: commitable flag
"Stephen P. Smith" writes:

> A couple of years ago, during a patch review Junio found that the
> commitable bit as implemented in wt-status.c was broken.
>
> Stephen P. Smith (4):
>   Move has_unmerged earlier in the file.
>   wt-status: rename commitable to committable
>   t7501: add test of "commit --dry-run --short"
>   wt-status.c: Set the committable flag in the collect phase.
>
>  builtin/commit.c  | 18 +-
>  t/t7501-commit.sh | 10 --
>  wt-status.c       | 45 +++--
>  wt-status.h       |  2 +-
>  4 files changed, 45 insertions(+), 30 deletions(-)

Thanks.
Re: [PATCH v3 4/4] wt-status.c: Set the committable flag in the collect phase.
"Stephen P. Smith" writes:

> void wt_status_collect(struct wt_status *s)
> {
> +	struct wt_status_state state;
> 	wt_status_collect_changes_worktree(s);
>
> 	if (s->is_initial)
> @@ -746,6 +752,11 @@ void wt_status_collect(struct wt_status *s)
> 	else
> 		wt_status_collect_changes_index(s);
> 	wt_status_collect_untracked(s);
> +
> +	memset(&state, 0, sizeof(state));
> +	wt_status_get_state(&state, s->branch && !strcmp(s->branch, "HEAD"));
> +	if (state.merge_in_progress && !has_unmerged(s))
> +		s->committable = 1;
> }

I do not think this is wrong per-se, but now we have three calls to
wt_status_get_state() in wt-status.c, and it smells (at least to me)
that each of these callsites does so only because their callers do not
give them enough information, forcing them to find the state out for
themselves.

Given that the ideal paradigm to come up with the "work tree status" is
to do the collection followed by printing, and among three callers of
get_state(), two appear in the "printing" side of the callchain [*1*],
I wonder if it makes a better organization to

 - embed struct wt_status_state in struct wt_status

 - make the new call to wt_status_get_state() added above in this
   patch to populate the wt_status_state embedded in 's'

 - change the other two callers of wt_status_get_state() in
   wt_longstatus_print() and wt_porcelain_v2_print_tracking(), both of
   which will receive 's' that has been populated by a previous call
   to wt_status_collect(), so that they do *not* call get_state()
   themselves, but instead use the result recorded in wt_status_state
   embedded in 's', which was populated by wt_status_collect() before
   they got called.

That would bring the resulting code even closer to the ideal, i.e. the
"collect" phase learns _everything_ we need about the current state
that is necessary in order to later show to the user, and the "print"
phase does not do its own separate discovery.

What do you think?

Thanks.
[Reference]

*1* Here are the selected functions and what other functions they call.

    wt_status_collect()
     -> wt_status_collect_changes_initial()
     -> wt_status_collect_changes_index()
     -> wt_status_collect_untracked()
     -> wt_status_get_state()

    wt_longstatus_print()
     -> wt_status_get_state()

    wt_porcelain_v2_print_tracking()
     -> wt_status_get_state()

    wt_status_print()
     -> wt_porcelain_v2_print()
        -> wt_porcelain_v2_print_tracking()
     -> wt_longstatus_print()

    run_status()
     -> wt_status_collect()
     -> wt_status_print()

    cmd_status()
     -> wt_status_collect()
     -> wt_status_print()

    prepare_to_commit(), dry_run_commit()
     -> run_status()

Most notably, wt_status_collect() always happens before
wt_status_print(), which is natural because the former is to collect
information in 's' that is used by the latter to print. And in various
functions wt_status_print() calls indirectly, the two other callers of
wt_status_get_state() appear.
Re: [PATCH v3 4/4] wt-status.c: Set the committable flag in the collect phase.
Junio C Hamano writes:

> "Stephen P. Smith" writes:
>
>> void wt_status_collect(struct wt_status *s)
>> {
>> +	struct wt_status_state state;
>> 	wt_status_collect_changes_worktree(s);
>>
>> 	if (s->is_initial)
>> @@ -746,6 +752,11 @@ void wt_status_collect(struct wt_status *s)
>> 	else
>> 		wt_status_collect_changes_index(s);
>> 	wt_status_collect_untracked(s);
>> +
>> +	memset(&state, 0, sizeof(state));
>> +	wt_status_get_state(&state, s->branch && !strcmp(s->branch, "HEAD"));
>> +	if (state.merge_in_progress && !has_unmerged(s))
>> +		s->committable = 1;
>> }
>
> I do not think this is wrong per-se, but now we have three calls to
> wt_status_get_state() in wt-status.c, and it smells (at least to me)
> that each of these callsites does so only because their callers
> do not give them enough information, forcing them to find the state
> out for themselves.
>
> Given that the ideal paradigm to come up with the "work tree status"
> is to do the collection followed by printing, and among three
> callers of get_state(), two appear in the "printing" side of the
> callchain [*1*], I wonder if it makes a better organization to
>
> - embed struct wt_status_state in struct wt_status
>
> - make the new call to wt_status_get_state() added above in this
>   patch to populate the wt_status_state embedded in 's'
>
> - change the other two callers of wt_status_get_state() in
>   wt_longstatus_print() and wt_porcelain_v2_print_tracking(), both
>   of which will receive 's' that has been populated by a previous
>   call to wt_status_collect(), so that they do *not* call
>   get_state() themselves, but instead use the result recorded in
>   wt_status_state embedded in 's', which was populated by
>   wt_status_collect() before they got called.
>
> That would bring the resulting code even closer to the ideal,
> i.e. the "collect" phase learns _everything_ we need about the
> current state that is necessary in order to later show to the user,
> and the "print" phase does not do its own separate discovery.
>
> What do you think?
>
> Thanks.

Such a "clean-up" may look like this patch:

- add .state field to wt_status to embed a wt_status_state instance

- remove a parameter of type struct wt_status_state from all functions
  where wt_status is already passed; we'd use .state field of the
  wt_status instead

The patch is mostly for illustration of the idea. The result seems to
compile and pass the test suite, but I haven't carefully thought about
what else I may be breaking with this mechanical change. For example, I
noticed that both of the old callsites of wt_status_get_state() have
free() of a few fields in the structure, and I kept the code as close
to the original, but I suspect they should not be freed there in the
functions in the "print" phase, but rather the caller of the "collect"
and "print" should be made responsible for deciding when to dispose the
entire wt_status (and wt_status_state as part of it). This illustration
patch does not address that kind of details (yet).

 wt-status.c | 132 ++--
 wt-status.h |  37 -
 2 files changed, 77 insertions(+), 92 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index c7f76d4758..69f2cbdca9 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -744,18 +744,15 @@ static int has_unmerged(struct wt_status *s)
 void wt_status_collect(struct wt_status *s)
 {
-	struct wt_status_state state;
 	wt_status_collect_changes_worktree(s);
-
 	if (s->is_initial)
 		wt_status_collect_changes_initial(s);
 	else
 		wt_status_collect_changes_index(s);
 	wt_status_collect_untracked(s);

-	memset(&state, 0, sizeof(state));
-	wt_status_get_state(&state, s->branch && !strcmp(s->branch, "HEAD"));
-	if (state.merge_in_progress && !has_unmerged(s))
+	wt_status_get_state(&s->state, s->branch && !strcmp(s->branch, "HEAD"));
+	if (s->state.merge_in_progress && !has_unmerged(s))
 		s->committable = 1;
 }
@@ -1087,8 +1084,7 @@ static void wt_longstatus_print_tracking(struct wt_status *s)
 }

 static void show_merge_in_progress(struct wt_status *s,
-				   struct wt_status_state *state,
-				   const char *color)
+				   const char *color)
 {
 	if (has_unmerged(s)) {
 		status_printf_ln(s, color, _("You have unmerged paths."));
@@ -1109,16 +1105,15 @@ static void show_merge_in_progress(struct wt_status *s,
 }

 static void show_am_in_progress(struct wt_status *s,
-				struct wt_status_state *state,
 				const char *color)
 {
 	status_printf_ln(s, color, _("You are in the middle of an am session."));
-	if (state->am_empty_patch)
+	status_pri
Re: Old submodules broken in 2.19rc1 and 2.19rc2
Stefan Beller wrote:
> On Fri, Sep 7, 2018 at 2:53 AM Allan Sandfeld Jensen wrote:
>> Submodules checked out with older versions of git not longer works in the
>> latest 2.19 releases. A "git submodule update --recursive" command wil fail
>> for each submodule with a line saying "fatal: could not open
>> '/.git' for writing: Is a directory.
[...]
> I have the suspicion that e98317508c0 (submodule:
> ensure core.worktree is set after update, 2018-06-18)
> might be the offender.

I still was not able to reproduce it, but after a bit of staring at the
code, I'm pretty sure I just did something wrong in the reproduction
process. That commit is indeed the offender. It introduces the
following code (rewrapped for clarity) in git-submodule.sh:

    if ! $( git config -f \
	    "$(git rev-parse --git-common-dir)/modules/$name/config" \
	    core.worktree ) 2>/dev/null
    then
	    git submodule--helper connect-gitdir-workingtree "$name" "$sm_path"
    fi

Staring at it for a while, you can see one problem: the 'if ! $(git
config)' should be simply 'if ! git config'. This ends up trying to run
the core.worktree value as a command, which would usually fail.

That brings us into connect_work_tree_and_git_dir, which does

    /* Prepare .git file */
    strbuf_addf(&gitfile_sb, "%s/.git", work_tree_);
    if (safe_create_leading_directories_const(gitfile_sb.buf))
	    die(_("could not create directories for %s"), gitfile_sb.buf);

    /* Prepare config file */
    strbuf_addf(&cfg_sb, "%s/config", git_dir_);
    if (safe_create_leading_directories_const(cfg_sb.buf))
	    die(_("could not create directories for %s"), cfg_sb.buf);

    git_dir = real_pathdup(git_dir_, 1);
    work_tree = real_pathdup(work_tree_, 1);

    /* Write .git file */
    write_file(gitfile_sb.buf, "gitdir: %s",
	       relative_path(git_dir, work_tree, &rel_path));

The write_file runs into .git already existing as a directory, failing
with the message Allan saw.
This would happen in at least two cases:

- if the submodule exists both in .git/modules/ *and* in the worktree
  (due to flipping between Git versions and branches with and without
  the submodule), the above will happen

- likewise if the submodule exists only in the worktree, like for
  Allan.

In "next" there is 74d4731d (submodule--helper: replace
connect-gitdir-workingtree by ensure-core-worktree, 2018-08-13) which
uses robust helpers in C that handle this much better. I think we
should revert e98317508c0 in "master" (for 2.19) and keep making use of
that 'second try' in "next" (for 2.20).

I'll try to pin down a reproduction case and send a revert + testsuite
patch.

Thanks again,
Jonathan
Re: Old submodules broken in 2.19rc1 and 2.19rc2
On Freitag, 7. September 2018 19:08:43 CEST Stefan Beller wrote:
> On Fri, Sep 7, 2018 at 2:53 AM Allan Sandfeld Jensen wrote:
> > Submodules checked out with older versions of git not longer works in the
> > latest 2.19 releases. A "git submodule update --recursive" command wil
> > fail for each submodule with a line saying "fatal: could not open
> > '/.git' for writing: Is a directory.
>
> Can you run the update again with
>
>     GIT_TRACE=1 git submodule update
>
> and post the output?
>
> I have the suspicion that e98317508c0 (submodule:
> ensure core.worktree is set after update, 2018-06-18)
> might be the offender.
>
> Could you try reverting that commit and check as well?
>
>     git clone https://github.com/git/git && cd git
>     git revert e98317508c0
>     make install # installs to you home dir at ~/bin
>
> and then try again, as well?
> (though bisection may be more fruitful if this doesn't pan out)

Okay. I had the issue on my workstation at work which I won't be back
to until friday next week, but I managed to reproduce the exact same
issue on a separate machine running Ubuntu, and a freshly built git
from git master, on another roughly one year old checkout of qt5.git
with submodules

[127] carewolf@twilight% GIT_TRACE=1 ~src/git/git submodule update --recursive  [5.11.2] ~qt5
00:28:32.234453 git.c:659 trace: exec: git-submodule update --recursive
00:28:32.234491 run-command.c:637 trace: run_command: git-submodule update --recursive
00:28:32.240792 git.c:415 trace: built-in: git rev-parse --git-dir
00:28:32.241981 git.c:415 trace: built-in: git rev-parse --git-path objects
00:28:32.242905 git.c:415 trace: built-in: git rev-parse -q --git-dir
00:28:32.244794 git.c:415 trace: built-in: git rev-parse --show-prefix
00:28:32.245699 git.c:415 trace: built-in: git rev-parse --show-toplevel
00:28:32.247181 git.c:415 trace: built-in: git submodule--helper update-clone
00:28:32.247463 run-command.c:1553 run_processes_parallel: preparing to run up to 1 tasks
00:28:32.247764 run-command.c:1585 run_processes_parallel: done
00:28:32.248626 git.c:415 trace: built-in: git submodule--helper name qt3d
00:28:32.249751 git.c:415 trace: built-in: git config submodule.qt3d.update
00:28:32.250608 git.c:415 trace: built-in: git submodule--helper relative-path qt3d
00:28:32.251399 git.c:415 trace: built-in: git rev-parse --local-env-vars
00:28:32.252120 git.c:415 trace: built-in: git rev-parse --verify HEAD
00:28:32.253096 git.c:415 trace: built-in: git rev-parse --git-common-dir
00:28:32.253982 git.c:415 trace: built-in: git config -f .git/modules/qt3d/config core.worktree
00:28:32.254771 git.c:415 trace: built-in: git submodule--helper connect-gitdir-workingtree qt3d qt3d
fatal: could not open 'qt3d/.git' for writing: Is a directory
00:28:32.255930 git.c:415 trace: built-in: git submodule--helper relative-path qt3d/
00:28:32.256712 git.c:415 trace: built-in: git rev-parse --local-env-vars
00:28:32.257878 git.c:415 trace: built-in: git submodule--helper update-clone --recursive-prefix qt3d/
00:28:32.258510 run-command.c:1553 run_processes_parallel: preparing to run up to 1 tasks
00:28:32.258544 run-command.c:1585 run_processes_parallel: done
00:28:32.260356 git.c:415 trace: built-in: git submodule--helper name qtactiveqt
00:28:32.261475 git.c:415 trace: built-in: git config submodule.qtactiveqt.update
00:28:32.262321 git.c:415 trace: built-in: git submodule--helper relative-path qtactiveqt
00:28:32.263259 git.c:415 trace: built-in: git rev-parse --local-env-vars
00:28:32.263987 git.c:415 trace: built-in: git rev-parse --verify HEAD
00:28:32.264929 git.c:415 trace: built-in: git rev-parse --git-common-dir
00:28:32.265775 git.c:415 trace: built-in: git config -f .git/modules/qtactiveqt/config core.worktree
00:28:32.266563 git.c:415 trace: built-in: git submodule--helper connect-gitdir-workingtree qtactiveqt qtactiveqt
fatal: could not open 'qtactiveqt/.git' for writing: Is a directory
00:28:32.267595 git.c:415 trace: built-in: git submodule--helper relative-path qtactiveqt/
00:28:32.268310 git.c:415 trace: built-in: git rev-parse --local-env-vars
00:28:32.269481 git.c:415 trace: built-in: git submodule--helper update-clone --recursive-prefix qtactiveqt/
00:28:32.269626 run-command.c:1553 run_processes_parallel: preparing to run up to 1 tasks
00:28:32.269652 run-command.c:1585 run_processes_parallel: done
00:28:32.270893 git.c:415 trace: built-in: git submodule--helper
[RFC PATCH v4 3/3] t0014: Introduce alias testing suite
Introduce a testing suite that is dedicated to aliases.
For now, check only if nested aliases work and if looping
aliases are detected successfully.

The looping aliases check for mixed execution is there but
expected to fail because there is no check in place yet.

Signed-off-by: Tim Schumacher
---
Those are the tests that I've come up with. It consists of tests
for nested aliases and looping aliases, both with internal calls
and external calls.

Unfortunately I don't have a fix for the last one yet, so I
marked it as expect_failure. The problem is that the test suite
is waiting a full minute until it aborts the running command
(which I guess should not take that long, as it blocks the whole
test suite for that span of time).

Should I try to decrease the timeout or should I remove that
test completely until I manage to get external calls fixed?

As a last thing, is there any better way to use single quotes
than to write '"'"'? It isn't that bad, but it is hard to read,
especially for bash newcomers.

 t/t0014-alias.sh | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)
 create mode 100755 t/t0014-alias.sh

diff --git a/t/t0014-alias.sh b/t/t0014-alias.sh
new file mode 100755
index 0..6c1e34694
--- /dev/null
+++ b/t/t0014-alias.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+test_description='git command aliasing'
+
+. ./test-lib.sh
+
+test_expect_success 'setup environment' '
+	git init
+'
+
+test_expect_success 'nested aliases - internal execution' '
+	git config alias.nested-internal-1 nested-internal-2 &&
+	git config alias.nested-internal-2 status
+'
+
+test_expect_success 'nested aliases - mixed execution' '
+	git config alias.nested-external-1 "!git nested-external-2" &&
+	git config alias.nested-external-2 status
+'
+
+test_expect_success 'looping aliases - internal execution' '
+	git config alias.loop-internal-1 loop-internal-2 &&
+	git config alias.loop-internal-2 loop-internal-3 &&
+	git config alias.loop-internal-3 loop-internal-2 &&
+	test_must_fail git loop-internal-1 2>output &&
+	grep -q "fatal: alias loop detected: expansion of '"'"'loop-internal-1'"'"' does not terminate" output &&
+	rm output
+'
+
+test_expect_failure 'looping aliases - mixed execution' '
+	git config alias.loop-mixed-1 loop-mixed-2 &&
+	git config alias.loop-mixed-2 "!git loop-mixed-1" &&
+	test_must_fail git loop-mixed-1 2>output &&
+	grep -q "fatal: alias loop detected: expansion of '"'"'loop-mixed-1'"'"' does not terminate" output &&
+	rm output
+'
+
+test_done
-- 
2.19.0.rc2.1.g4c98b8d69.dirty
[RFC PATCH v4 2/3] Show the call history when an alias is looping
Just printing the command that the user entered is not
particularly helpful when trying to find the alias that
causes the loop.

Print the history of substituted commands to help the user
find the offending alias. Mark the entrypoint of the loop
with "<==" and the last command (which looped back to the
entrypoint) with "==>".

Signed-off-by: Tim Schumacher
---
I now went with Peff's suggested code and I added in an arrow
that points away from the last command (which caused the loop).

A "full" arrow (i.e. starts at the last command, goes upwards
and ends at the entrypoint) would be more obvious/better, but
adding much more code just for having a vertical line wasn't
worth it for me.

 git.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/git.c b/git.c
index 15727c17f..a20eb4fa1 100644
--- a/git.c
+++ b/git.c
@@ -675,6 +675,7 @@ static int run_argv(int *argcp, const char ***argv)
 {
 	int done_alias = 0;
 	struct string_list cmd_list = STRING_LIST_INIT_NODUP;
+	struct string_list_item *seen;
 
 	while (1) {
 		/*
@@ -692,9 +693,21 @@ static int run_argv(int *argcp, const char ***argv)
 		/* .. then try the external ones */
 		execv_dashed_external(*argv);
 
-		if (unsorted_string_list_has_string(&cmd_list, *argv[0])) {
+		seen = unsorted_string_list_lookup(&cmd_list, *argv[0]);
+		if (seen) {
+			int i;
+			struct strbuf sb = STRBUF_INIT;
+			for (i = 0; i < cmd_list.nr; i++) {
+				struct string_list_item *item = &cmd_list.items[i];
+
+				strbuf_addf(&sb, "\n %s", item->string);
+				if (item == seen)
+					strbuf_addstr(&sb, " <==");
+				else if (i == cmd_list.nr - 1)
+					strbuf_addstr(&sb, " ==>");
+			}
 			die(_("alias loop detected: expansion of '%s' does"
-			      " not terminate"), cmd_list.items[0].string);
+			      " not terminate:%s"), cmd_list.items[0].string, sb.buf);
 		}
 
 		string_list_append(&cmd_list, *argv[0]);
-- 
2.19.0.rc2.1.g4c98b8d69.dirty
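The marker logic in the loop above can be shown as a stand-alone sketch
(the alias names and the print_history helper are made up for
illustration; the real code builds the listing into a strbuf before
die()):

```shell
#!/bin/sh
# Print a recorded expansion history, marking the entry the loop
# re-enters with "<==" and the final command (which looped back)
# with "==>" -- an "else if", exactly as in the C version.
print_history() {
	seen=$1; shift		# the command that is about to repeat
	total=$#
	i=0
	for cmd
	do
		i=$((i + 1))
		mark=
		if test "$cmd" = "$seen"
		then
			mark=" <=="
		elif test $i -eq $total
		then
			mark=" ==>"
		fi
		printf '  %s%s\n' "$cmd" "$mark"
	done
}

# First argument: the repeating command; the rest: the history so far.
print_history loop-mixed-1 loop-mixed-1 loop-mixed-2 loop-mixed-3
# prints:
#   loop-mixed-1 <==
#   loop-mixed-2
#   loop-mixed-3 ==>
```

Note that because the "<==" check wins over the last-item check, a
two-entry loop whose last command equals the repeating one gets only
the "<==" marker, matching the C code's else-if ordering.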
[RFC PATCH v4 1/3] Add support for nested aliases
Aliases can only contain non-alias git commands and their
arguments, not other user-defined aliases. Resolving further
(nested) aliases is prevented by breaking the loop after the
first alias was processed. Git then fails with a
command-not-found error.

Allow resolving nested aliases by not breaking the loop in
run_argv() after the first alias was processed. Instead,
continue the loop until `handle_alias()` fails, which means that
there are no further aliases that can be processed. Prevent
looping aliases by storing substituted commands in `cmd_list`
and checking if a command has been substituted previously.

While we're at it, fix a styling issue just below the added
code.

Signed-off-by: Tim Schumacher
---
Changes since v3:
- Print the command that the user entered instead of the command
  which caused the loop (and a nicer, more explanatory error message)
- Use unsorted_string_list_has_string() instead of the sorted version
- Fix a code style issue just below the modified code
- done_alias is a simple boolean again (instead of a counter)

 git.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/git.c b/git.c
index c27c38738..15727c17f 100644
--- a/git.c
+++ b/git.c
@@ -674,6 +674,7 @@ static void execv_dashed_external(const char **argv)
 static int run_argv(int *argcp, const char ***argv)
 {
 	int done_alias = 0;
+	struct string_list cmd_list = STRING_LIST_INIT_NODUP;
 
 	while (1) {
 		/*
@@ -691,17 +692,25 @@ static int run_argv(int *argcp, const char ***argv)
 		/* .. then try the external ones */
 		execv_dashed_external(*argv);
 
-		/* It could be an alias -- this works around the insanity
+		if (unsorted_string_list_has_string(&cmd_list, *argv[0])) {
+			die(_("alias loop detected: expansion of '%s' does"
+			      " not terminate"), cmd_list.items[0].string);
+		}
+
+		string_list_append(&cmd_list, *argv[0]);
+
+		/*
+		 * It could be an alias -- this works around the insanity
 		 * of overriding "git log" with "git show" by having
 		 * alias.log = show
 		 */
-		if (done_alias)
-			break;
 		if (!handle_alias(argcp, argv))
 			break;
 		done_alias = 1;
 	}
 
+	string_list_clear(&cmd_list, 0);
+
 	return done_alias;
 }
-- 
2.19.0.rc2.1.g4c98b8d69.dirty
Re: Old submodules broken in 2.19rc1 and 2.19rc2
> I think we
> should revert e98317508c0 in "master" (for 2.19) and keep making use
> of that 'second try' in "next" (for 2.20).

Actually I'd rather revert the whole topic leading up to
7e25437d35a (Merge branch 'sb/submodule-core-worktree', 2018-07-18)
as the last patch in there doesn't work well without e98317508c0 IIRC.

And having only the first patch would bring an inconsistent state as
then different commands behave differently w.r.t. setting
core.worktree.

So for this release we'd

    git revert 984cd77ddbf0 e98317508c 4fa4f90ccd85

and then can

    git cherry-pick 4fa4f90ccd8 984cd77ddbf0

on top of sb/submodule-update-in-c, as that re-instates the behavior
going forward.

Thoughts?

Thanks,
Stefan
[PATCH] t5551-http-fetch-smart.sh: sort cookies before comparing
With curl-7.61.1 cookies are sorted by creation-time¹. Sort the output
used in the 'cookies stored in http.cookiefile when http.savecookies
set' test before comparing it to the expected cookies.

¹ https://github.com/curl/curl/commit/e2ef8d6fa ("cookies: support
  creation-time attribute for cookies", 2018-08-28)

Signed-off-by: Todd Zullinger
---
[Resending with the list in Cc; sorry for spamming you, Junio, Jeff,
and Gábor.]

The in-development version of Fedora updated to the recently
released curl-7.61.1 in the past few days. This isn't
breakage from the 2.19.0 cycle, but if the fix looks good to
everyone it would be nice to include it. That way other
distributions and users who update git and curl to the most
recent releases won't run into this test failure.

I tested this against Fedora 30 (curl-7.61.1) as well as
previous releases from RHEL/CentOS 6/7 (7.19.7/7.29.0) and
Fedora 27/28/29 (7.55.1/7.59.0/7.61.0).

The verbose output is:

expecting success:
	git config http.cookiefile cookies.txt &&
	git config http.savecookies true &&
	git ls-remote $HTTPD_URL/smart_cookies/repo.git master &&
	tail -3 cookies.txt >cookies_tail.txt &&
	test_cmp expect_cookies.txt cookies_tail.txt

++ git config http.cookiefile cookies.txt
++ git config http.savecookies true
++ git ls-remote http://127.0.0.1:5551/smart_cookies/repo.git master
7ae89caac6c721f16555e981eaeed64abc165c5d	refs/heads/master
263207bb5fbfbefbdf1c9c3fa4ae5d9663323217	refs/namespaces/ns/refs/heads/master
++ tail -3 cookies.txt
++ test_cmp expect_cookies.txt cookies_tail.txt
++ diff -u expect_cookies.txt cookies_tail.txt
--- expect_cookies.txt	2018-09-07 07:29:05.231532462 +
+++ cookies_tail.txt	2018-09-07 07:29:05.306532366 +
@@ -1,3 +1,3 @@
-127.0.0.1	FALSE	/smart_cookies/	FALSE	0	othername	othervalue
 127.0.0.1	FALSE	/smart_cookies/repo.git/info/	FALSE	0	name	value
+127.0.0.1	FALSE	/smart_cookies/	FALSE	0	othername	othervalue
error: last command exited with $?=1
not ok 22 - cookies stored in http.cookiefile when http.savecookies set

 t/t5551-http-fetch-smart.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 771f36f9ff..538656bfef 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -215,7 +215,7 @@ test_expect_success 'cookies stored in http.cookiefile when http.savecookies set
 	git config http.cookiefile cookies.txt &&
 	git config http.savecookies true &&
 	git ls-remote $HTTPD_URL/smart_cookies/repo.git master &&
-	tail -3 cookies.txt >cookies_tail.txt &&
+	tail -3 cookies.txt | sort >cookies_tail.txt &&
 	test_cmp expect_cookies.txt cookies_tail.txt
 '
-- 
2.19.0.rc2
Re: [RFC PATCH v4 3/3] t0014: Introduce alias testing suite
On Fri, Sep 7, 2018 at 6:44 PM Tim Schumacher wrote:
> Introduce a testing suite that is dedicated to aliases.
> For now, check only if nested aliases work and if looping
> aliases are detected successfully.
>
> The looping aliases check for mixed execution is there but
> expected to fail because there is no check in place yet.
>
> Signed-off-by: Tim Schumacher
> ---
> Unfortunately I don't have a fix for the last one yet, so I
> marked it as expect_failure. The problem is that the test suite
> is waiting a full minute until it aborts the running command
> (which I guess should not take that long, as it blocks the whole
> test suite for that span of time).
>
> Should I try to decrease the timeout or should I remove that
> test completely until I manage to get external calls fixed?

Perhaps just comment out that test for now and add a comment above it
explaining why it's commented out.

> As a last thing, is there any better way to use single quotes
> than to write '"'"'? It isn't that bad, but it is hard to read,
> especially for bash newcomers.

You should backslash-escape the quotes ("foo \'bar\' baz"); however,
in this case it would make sense to use regexes with 'grep' to check
that you got the expected error message rather than reproducing the
message literally here in the script. More below.

> diff --git a/t/t0014-alias.sh b/t/t0014-alias.sh
> @@ -0,0 +1,38 @@
> +#!/bin/sh
> +
> +test_description='git command aliasing'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup environment' '
> +	git init
> +'

"git init" is invoked automatically by the test framework, so no need
for this test. You can drop it.

> +test_expect_success 'nested aliases - internal execution' '
> +	git config alias.nested-internal-1 nested-internal-2 &&
> +	git config alias.nested-internal-2 status
> +'

This isn't actually testing anything, is it? It's setting up the
aliases but never actually invoking them. I would have expected the
next line to actually run a command ("git nested-internal-1") and the
line after that to check that you got the expected output (whatever
"git status" would emit).

Output from "git status" isn't necessarily the easiest to test,
though, so perhaps pick a different Git command for testing (something
for which the result can be very easily checked -- maybe "git rm" or
such).

> +test_expect_success 'nested aliases - mixed execution' '
> +	git config alias.nested-external-1 "!git nested-external-2" &&
> +	git config alias.nested-external-2 status
> +'

Same observation.

> +test_expect_success 'looping aliases - internal execution' '
> +	git config alias.loop-internal-1 loop-internal-2 &&
> +	git config alias.loop-internal-2 loop-internal-3 &&
> +	git config alias.loop-internal-3 loop-internal-2 &&
> +	test_must_fail git loop-internal-1 2>output &&
> +	grep -q "fatal: alias loop detected: expansion of '"'"'loop-internal-1'"'"' does not terminate" output &&

Don't bother using -q with 'grep'. Output is hidden already by the
test framework in normal mode, and not hidden when running in verbose
mode. And, the output of 'grep' might be helpful when debugging the
test if something goes wrong.

As noted above, you can use a regex to match the expected error rather
than exactly duplicating the text of the message. Finally, use
'test_i18ngrep' instead of 'grep' in order to play nice with
localization.

> +	rm output

Tests don't normally bother cleaning up their output files like this
since such output can be helpful when debugging the test if something
goes wrong. (You'd want to use test_when_finished to cleanup anyhow,
but you don't need it in this case.)

> +'
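On the quoting question raised above, two equivalent ways to get a
literal single quote into a string that itself lives inside a
single-quoted shell word (a stand-alone sketch, not tied to the test
framework; the alias name is just an example):

```shell
#!/bin/sh
# 1) The '"'"' dance: close the single-quoted string, add a
#    double-quoted single quote, and reopen. Hard to read, but POSIX.
msg='expansion of '"'"'loop-internal-1'"'"' does not terminate'
echo "$msg"

# 2) Stash the quote character in a variable once and interpolate it.
sq="'"
echo "expansion of ${sq}loop-internal-1${sq} does not terminate"

# Both print: expansion of 'loop-internal-1' does not terminate
```

The variable trick reads much better inside long test bodies, at the
cost of one extra setup line.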
Re: [PATCH] t5551-http-fetch-smart.sh: sort cookies before comparing
On Fri, Sep 07, 2018 at 07:22:05PM -0400, Todd Zullinger wrote:
> With curl-7.61.1 cookies are sorted by creation-time¹. Sort the output
> used in the 'cookies stored in http.cookiefile when http.savecookies
> set' test before comparing it to the expected cookies.
>
> ¹ https://github.com/curl/curl/commit/e2ef8d6fa ("cookies: support
>   creation-time attribute for cookies", 2018-08-28)

According to that commit message, the creation-time sort is only for
cookies of the same length. But it's not clear to me if that just
means on-the-wire, or whether curl always stores by creation-time in
the cookie file.

Either way, though, I guess it wouldn't matter for us as long as we
choose some arbitrary re-ordering for what curl produces (i.e., the
output of `sort`) and then make sure our "expect" output is in the
same order. Which is basically what your patch does. One question,
though:

> diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
> index 771f36f9ff..538656bfef 100755
> --- a/t/t5551-http-fetch-smart.sh
> +++ b/t/t5551-http-fetch-smart.sh
> @@ -215,7 +215,7 @@ test_expect_success 'cookies stored in http.cookiefile when http.savecookies set
> 	git config http.cookiefile cookies.txt &&
> 	git config http.savecookies true &&
> 	git ls-remote $HTTPD_URL/smart_cookies/repo.git master &&
> -	tail -3 cookies.txt >cookies_tail.txt &&
> +	tail -3 cookies.txt | sort >cookies_tail.txt &&
> 	test_cmp expect_cookies.txt cookies_tail.txt
> '

We pick the bottom 3 before sorting. How do we know those are the
three we want to see?

...Ah, OK. The lines we are skipping are not actually cookies at all,
but just header cruft. I wonder if:

    grep "^[^#]" cookies.txt

would be a better way of doing that, but that is certainly not
something new. So this fix looks fine. It might be worth a comment
above the creation of expect_cookies.txt to mention it must be in
sorted order (of course anybody modifying it would see a test
failure).

> The in-development version of Fedora updated to the recently
> released curl-7.61.1 in the past few days. This isn't
> breakage from the 2.19.0 cycle, but if the fix looks good to
> everyone it would be nice to include it. That way other
> distributions and users who update git and curl to the most
> recent releases won't run into this test failure.
>
> I tested this against Fedora 30 (curl-7.61.1) as well as
> previous releases from RHEL/CentOS 6/7 (7.19.7/7.29.0) and
> Fedora 27/28/29 (7.55.1/7.59.0/7.61.0).

You're pretty late in the 2.19 cycle, since the release is tentatively
scheduled for Sunday. Though since this is just touching the test
script, and since it looks Obviously Correct, I'm not opposed.

-Peff
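The difference between the two selection approaches discussed above is
easy to see on a made-up cookie file (the paths and values here are
placeholders, not the ones from the t5551 test):

```shell
#!/bin/sh
# A Netscape-format cookie file: '#' comment lines, possibly a blank
# separator line, then one tab-separated line per cookie.
cat >cookies.txt <<\EOF
# Netscape HTTP Cookie File
# This file was generated by libcurl!

127.0.0.1	FALSE	/repo/	FALSE	0	name	value
127.0.0.1	FALSE	/	FALSE	0	othername	othervalue
EOF

# Position-dependent: grabs the last 3 lines, which here includes the
# blank separator line, and silently breaks if the header grows.
echo "== tail -3:"
tail -3 cookies.txt

# Content-dependent: keeps exactly the lines that start with a
# non-'#' character, i.e. the cookies, regardless of header length.
echo "== grep:"
grep "^[^#]" cookies.txt

rm cookies.txt
```

Note that a blank line does not match `^[^#]` either (it has no first
character), so the grep form yields only real cookie lines.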
Re: [PATCH v3 4/4] wt-status.c: Set the committable flag in the collect phase.
On Friday, September 7, 2018 3:31:55 PM MST you wrote:
> Junio C Hamano writes:
> The patch is mostly for illustration of the idea.
>
> The result seems to compile and pass the test suite, but I haven't
> carefully thought about what else I may be breaking with this
> mechanical change. For example, I noticed that both of the old
> callsites of wt_status_get_state() have free() of a few fiedls in
> the structure, and I kept the code as close to the original, but I
> suspect they should not be freed there in the functions in the
> "print" phase, but rather the caller of the "collect" and "print"
> should be made responsible for deciding when to dispose the entire
> wt_status (and wt_status_state as part of it). This illustration
> patch does not address that kind of details (yet).

If we use this as the basis of a follow-on patch, how do I handle
credit? You obviously wrote this patch and I did not. So how are the
mechanics of that normally done?

Thanks for the patch; I will work with it.

sps
[PATCH] Revert "Merge branch 'sb/submodule-core-worktree'" (was Re: Old submodules broken in 2.19rc1 and 2.19rc2)
Subject: Revert "Merge branch 'sb/submodule-core-worktree'"

This reverts commit 7e25437d35a70791b345872af202eabfb3e1a8bc, reversing
changes made to 00624d608cc69bd62801c93e74d1ea7a7ddd6598.

v2.19.0-rc0~165^2~1 (submodule: ensure core.worktree is set after
update, 2018-06-18) assumes an "absorbed" submodule layout, where the
submodule's Git directory is in the superproject's .git/modules/
directory and .git in the submodule worktree is a .git file pointing
there. In particular, it uses $GIT_DIR/modules/$name to find the
submodule to find out whether it already has core.worktree set, and it
uses connect_work_tree_and_git_dir if not, resulting in

	fatal: could not open sub/.git for writing

The context behind that patch: v2.19.0-rc0~165^2~2 (submodule: unset
core.worktree if no working tree is present, 2018-06-12) unsets
core.worktree when running commands like
"git checkout --recurse-submodules" to switch to a branch without the
submodule. If a user then uses "git checkout --no-recurse-submodules"
to switch back to a branch with the submodule and runs "git submodule
update", this patch is needed to ensure that commands using the
submodule directly are aware of the path to the worktree.

It is late in the release cycle, so revert the whole 3-patch series.
We can try again later for 2.20.

Reported-by: Allan Sandfeld Jensen
Helped-by: Stefan Beller
Signed-off-by: Jonathan Nieder
---
Stefan Beller wrote:
> Jonathan Nieder wrote:

>> I think we
>> should revert e98317508c0 in "master" (for 2.19) and keep making use
>> of that 'second try' in "next" (for 2.20).
>
> Actually I'd rather revert the whole topic leading up to
> 7e25437d35a (Merge branch 'sb/submodule-core-worktree', 2018-07-18)
> as the last patch in there doesn't work well without e98317508c0 IIRC.
>
> And having only the first patch would bring an inconsistent state as
> then different commands behave differently w.r.t. setting core.worktree.

Like this (generated using "git revert -m1")?
builtin/submodule--helper.c | 26 -- git-submodule.sh| 5 - submodule.c | 14 -- submodule.h | 2 -- t/lib-submodule-update.sh | 5 ++--- t/t7400-submodule-basic.sh | 5 - 6 files changed, 2 insertions(+), 55 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index b56028ba9d..f6fb8991f3 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -1123,8 +1123,6 @@ static void deinit_submodule(const char *path, const char *prefix, if (!(flags & OPT_QUIET)) printf(format, displaypath); - submodule_unset_core_worktree(sub); - strbuf_release(&sb_rm); } @@ -2005,29 +2003,6 @@ static int check_name(int argc, const char **argv, const char *prefix) return 0; } -static int connect_gitdir_workingtree(int argc, const char **argv, const char *prefix) -{ - struct strbuf sb = STRBUF_INIT; - const char *name, *path; - char *sm_gitdir; - - if (argc != 3) - BUG("submodule--helper connect-gitdir-workingtree "); - - name = argv[1]; - path = argv[2]; - - strbuf_addf(&sb, "%s/modules/%s", get_git_dir(), name); - sm_gitdir = absolute_pathdup(sb.buf); - - connect_work_tree_and_git_dir(path, sm_gitdir, 0); - - strbuf_release(&sb); - free(sm_gitdir); - - return 0; -} - #define SUPPORT_SUPER_PREFIX (1<<0) struct cmd_struct { @@ -2041,7 +2016,6 @@ static struct cmd_struct commands[] = { {"name", module_name, 0}, {"clone", module_clone, 0}, {"update-clone", update_clone, 0}, - {"connect-gitdir-workingtree", connect_gitdir_workingtree, 0}, {"relative-path", resolve_relative_path, 0}, {"resolve-relative-url", resolve_relative_url, 0}, {"resolve-relative-url-test", resolve_relative_url_test, 0}, diff --git a/git-submodule.sh b/git-submodule.sh index f7fd80345c..1cb2c0a31b 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -580,11 +580,6 @@ cmd_update() die "$(eval_gettext "Unable to find current \${remote_name}/\${branch} revision in submodule path '\$sm_path'")" fi - if ! 
$(git config -f "$(git rev-parse --git-common-dir)/modules/$name/config" core.worktree) 2>/dev/null - then - git submodule--helper connect-gitdir-workingtree "$name" "$sm_path" - fi - if test "$subsha1" != "$sha1" || test -n "$force" then subforce=$force diff --git a/submodule.c b/submodule.c index 50cbf5f13e..a2b266fbfa 100644 --- a/submodule.c +++ b/submodule.c @@ -1534,18 +1534,6 @@ int bad_to_remove_submodule(const char *path, unsigned flags) return ret; } -void submodule_unset_core_worktree(const struct submodule *sub) -{ - char *config_pa
Re: [PATCH v2] http-backend: allow empty CONTENT_LENGTH
Hi,

Max Kirillov wrote:
> According to RFC3875, empty environment variable is equivalent to unset,
> and for CONTENT_LENGTH it should mean zero body to read.
>
> However, unset CONTENT_LENGTH is also used for chunked encoding to indicate
> reading until EOF. At least, the test "large fetch-pack requests can be split
> across POSTs" from t5551 starts faliing, if unset or empty CONTENT_LENGTH is
> treated as zero length body. So keep the existing behavior as much as
> possible.
>
> Add a test for the case.
>
> Reported-By: Jelmer Vernooij
> Signed-off-by: Max Kirillov
> ---
> Added the "reported-by" and explained inline the reason to keep existing
> behavior

Lovely, thanks. To me, "keep the existing behavior as much as
possible" isn't comforting because it doesn't tell me *which* existing
behavior. Fortunately the patch itself is comforting: it makes us
treat "" the same way as unset, which is exactly what the RFC
requires.

So I'm happy with this version. Thanks for your patient work.

Reviewed-by: Jonathan Nieder
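In shell terms, the rule the review settles on looks like this — a
conceptual sketch of the "empty equals unset" reading of RFC 3875, not
the actual http-backend code (which is C); describe_body is a made-up
helper:

```shell
#!/bin/sh
# Treat an empty CONTENT_LENGTH exactly like an unset one: in both
# cases no length was given, so the body is read until EOF (which is
# what the chunked-encoding case relies on).
describe_body() {
	if test -z "${CONTENT_LENGTH}"
	then
		echo "no length given: read body until EOF"
	else
		echo "read exactly $CONTENT_LENGTH bytes"
	fi
}

( unset CONTENT_LENGTH; describe_body )	# unset
( CONTENT_LENGTH=""; describe_body )	# empty -> same as unset
( CONTENT_LENGTH=1234; describe_body )
```

The first two calls take the same branch, which is the property the
t5551 "large fetch-pack requests can be split across POSTs" test
depends on.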
Re: 2.19.0.rc2.windows.1: stash fails with dirty submodule
Hi,

Thomas Braun wrote:
> I'm using git with stash and rebase builtins.
>
> $ git --version --build-options
>
> git version 2.19.0.rc2.windows.1
[...]
> mkdir test
> cd test
> git init
> echo 1 > file
> git add file
> git commit file -m "message"
> git submodule add ./ mysubmod
> git commit -m "Add submodule"
> echo 2 > mysubmod/file
> git checkout -b mybranch
> git rebase -i --autosquash master
[...]
> fatal: Unexpected stash response: ''
>
> and that used to work with older git versions.

Thanks for reporting. I'm cc-ing Dscho, who has been looking for
reports of issues with the new experimental stash and rebase code[2].

Thanks,
Jonathan

> [1]: https://github.com/git-for-windows/git/issues/1820#issuecomment-419411808

[2] https://public-inbox.org/git/nycvar.qro.7.76.6.1808241320540...@tvgsbejvaqbjf.bet/
Re: [PATCH] Revert "Merge branch 'sb/submodule-core-worktree'" (was Re: Old submodules broken in 2.19rc1 and 2.19rc2)
Jonathan Nieder writes: > It is late in the release cycle, so revert the whole 3-patch series. > We can try again later for 2.20. > > Reported-by: Allan Sandfeld Jensen > Helped-by: Stefan Beller > Signed-off-by: Jonathan Nieder > --- > Stefan Beller wrote: >> Jonathan Nieder wrote: > >>> I think we >>> should revert e98317508c0 in "master" (for 2.19) and keep making use >>> of that 'second try' in "next" (for 2.20). >> >> Actually I'd rather revert the whole topic leading up to >> 7e25437d35a (Merge branch 'sb/submodule-core-worktree', 2018-07-18) >> as the last patch in there doesn't work well without e98317508c0 IIRC. >> >> And having only the first patch would bring an inconsistent state as >> then different commands behave differently w.r.t. setting core.worktree. > > Like this (generated using "git revert -m1)? OK. Thanks for taking care of it. > > builtin/submodule--helper.c | 26 -- > git-submodule.sh| 5 - > submodule.c | 14 -- > submodule.h | 2 -- > t/lib-submodule-update.sh | 5 ++--- > t/t7400-submodule-basic.sh | 5 - > 6 files changed, 2 insertions(+), 55 deletions(-) > > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c > index b56028ba9d..f6fb8991f3 100644 > --- a/builtin/submodule--helper.c > +++ b/builtin/submodule--helper.c > @@ -1123,8 +1123,6 @@ static void deinit_submodule(const char *path, const > char *prefix, > if (!(flags & OPT_QUIET)) > printf(format, displaypath); > > - submodule_unset_core_worktree(sub); > - > strbuf_release(&sb_rm); > } > > @@ -2005,29 +2003,6 @@ static int check_name(int argc, const char **argv, > const char *prefix) > return 0; > } > > -static int connect_gitdir_workingtree(int argc, const char **argv, const > char *prefix) > -{ > - struct strbuf sb = STRBUF_INIT; > - const char *name, *path; > - char *sm_gitdir; > - > - if (argc != 3) > - BUG("submodule--helper connect-gitdir-workingtree > "); > - > - name = argv[1]; > - path = argv[2]; > - > - strbuf_addf(&sb, "%s/modules/%s", get_git_dir(), name); 
> - sm_gitdir = absolute_pathdup(sb.buf); > - > - connect_work_tree_and_git_dir(path, sm_gitdir, 0); > - > - strbuf_release(&sb); > - free(sm_gitdir); > - > - return 0; > -} > - > #define SUPPORT_SUPER_PREFIX (1<<0) > > struct cmd_struct { > @@ -2041,7 +2016,6 @@ static struct cmd_struct commands[] = { > {"name", module_name, 0}, > {"clone", module_clone, 0}, > {"update-clone", update_clone, 0}, > - {"connect-gitdir-workingtree", connect_gitdir_workingtree, 0}, > {"relative-path", resolve_relative_path, 0}, > {"resolve-relative-url", resolve_relative_url, 0}, > {"resolve-relative-url-test", resolve_relative_url_test, 0}, > diff --git a/git-submodule.sh b/git-submodule.sh > index f7fd80345c..1cb2c0a31b 100755 > --- a/git-submodule.sh > +++ b/git-submodule.sh > @@ -580,11 +580,6 @@ cmd_update() > die "$(eval_gettext "Unable to find current > \${remote_name}/\${branch} revision in submodule path '\$sm_path'")" > fi > > - if ! $(git config -f "$(git rev-parse > --git-common-dir)/modules/$name/config" core.worktree) 2>/dev/null > - then > - git submodule--helper connect-gitdir-workingtree > "$name" "$sm_path" > - fi > - > if test "$subsha1" != "$sha1" || test -n "$force" > then > subforce=$force > diff --git a/submodule.c b/submodule.c > index 50cbf5f13e..a2b266fbfa 100644 > --- a/submodule.c > +++ b/submodule.c > @@ -1534,18 +1534,6 @@ int bad_to_remove_submodule(const char *path, unsigned > flags) > return ret; > } > > -void submodule_unset_core_worktree(const struct submodule *sub) > -{ > - char *config_path = xstrfmt("%s/modules/%s/config", > - get_git_common_dir(), sub->name); > - > - if (git_config_set_in_file_gently(config_path, "core.worktree", NULL)) > - warning(_("Could not unset core.worktree setting in submodule > '%s'"), > - sub->path); > - > - free(config_path); > -} > - > static const char *get_super_prefix_or_empty(void) > { > const char *s = get_super_prefix(); > @@ -1711,8 +1699,6 @@ int submodule_move_head(const char *path, > > if (is_empty_dir(path)) 
> rmdir_or_warn(path); > - > - submodule_unset_core_worktree(sub); > } > } > out: > diff --git a/submodule.h b/submodule.h > index 7d476cefa7..e452919aa4 100644 > --- a/submodule.h > +++ b/submodule.h > @@ -127,8 +127,6 @@ int submodule_move_head(const char *path, > const char *new_head, > unsigned flags); > > -void submodule_unset_core_worktree(const struct submodule *sub);
Re: [PATCH] t5551-http-fetch-smart.sh: sort cookies before comparing
Jeff King wrote: > On Fri, Sep 07, 2018 at 07:22:05PM -0400, Todd Zullinger wrote: > >> With curl-7.61.1 cookies are sorted by creation-time¹. Sort the output >> used in the 'cookies stored in http.cookiefile when http.savecookies >> set' test before comparing it to the expected cookies. >> >> ¹ https://github.com/curl/curl/commit/e2ef8d6fa ("cookies: support >> creation-time attribute for cookies", 2018-08-28) > > According to that commit message, the creation-time sort is only for > cookies of the same length. But it's not clear to me if that just means > on-the-wire, and curl always stores by creation-time in the cookie file. Yeah, I didn't dig into the curl code deeply to try to understand it. I did test with only the curl package downgraded to 7.61.0 to confirm the test worked without sorting. And I saw that the curl commit updated existing tests to sort the test data. > Either way, though, I guess it wouldn't matter for us as long as we > choose some arbitrary re-ordering for what curl produces (i.e., the > output of `sort`) and then make sure our "expect" output is in the same > order. Which is basically what your patch does. One question, though: > >> diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh >> index 771f36f9ff..538656bfef 100755 >> --- a/t/t5551-http-fetch-smart.sh >> +++ b/t/t5551-http-fetch-smart.sh >> @@ -215,7 +215,7 @@ test_expect_success 'cookies stored in http.cookiefile >> when http.savecookies set >> git config http.cookiefile cookies.txt && >> git config http.savecookies true && >> git ls-remote $HTTPD_URL/smart_cookies/repo.git master && >> -tail -3 cookies.txt >cookies_tail.txt && >> +tail -3 cookies.txt | sort >cookies_tail.txt && >> test_cmp expect_cookies.txt cookies_tail.txt >> ' > > We pick the bottom 3 before sorting. How do we know those are the three > we want to see? > > ...Ah, OK. The lines we are skipping are not actually cookies at all, > but just header cruft.
I wonder if: > > grep "^[^#]" cookies.txt > > would be a better way of doing that, but that is certainly not something > new. > > So this fix looks fine. It might be worth a comment above the creation > of expect_cookies.txt to mention it must be in sorted order (of course > anybody modifying it would see a test failure). I thought about running the expect_cookies.txt file through sort as well. That would ensure that both files were using the same sorting. Whether that's needed on any platform now, I don't know. Maybe that would be a useful way to keep future edits to the expect_cookies.txt file from tripping up the person making them? I thought there might be a test function to sort the output, but I was (incorrectly) thinking of check_access_log() which Gábor added in e8b3b2e275 ("t/lib-httpd: avoid occasional failures when checking access.log", 2018-07-12). Perhaps it would be useful to have a test_cmp_sorted() to do the simple dance of sorting both the actual & expected output. I haven't looked through the tests to see how often such a function might be useful.
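For what it's worth, that hypothetical test_cmp_sorted() might look something like the sketch below. In git's test suite it would presumably call test_cmp; plain diff -u stands in for it here so the sketch is self-contained.

```shell
# Sketch of the hypothetical test_cmp_sorted() helper: sort both the
# expected and actual files so the comparison does not depend on the
# order in which curl (or anything else) happens to emit lines.
test_cmp_sorted () {
	sort <"$1" >"$1.sorted" &&
	sort <"$2" >"$2.sorted" &&
	diff -u "$1.sorted" "$2.sorted"
}
```

The cookie test could then say `test_cmp_sorted expect_cookies.txt cookies_tail.txt`, and future edits to either file would not need to keep them in any particular order.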
It's hard to say whether there are many other users/packagers who might upgrade both git and curl and run into this. So it may not be worth even a small risk of making the change at this point. On the other hand, the change only affects one test and may be safe enough to apply. I'll leave that choice in the capable hands of our maintainer and the good folks here. Thanks for a thoughtful review, as always. -- Todd ~~ That which seems the height of absurdity in one generation often becomes the height of wisdom in the next. -- John Stuart Mill (1806-1873)
Re: [PATCH] http-backend: allow empty CONTENT_LENGTH
On Fri, Sep 07, 2018 at 02:49:23AM -0700, Junio C Hamano wrote: > Max Kirillov writes: > >> Actually, another reason for the latest issue was that CONTENT_LENGTH >> is parsed for GET requests at all. It should be parsed only for POST >> requests, or, rather, only for upload-pack and receive-pack requests. > > Not really. The layered design of the HTTP protocol means that any > request type can have non-empty body, but request types for which > no semantics of the body is defined must ignore what is in the body, > which in turn means we need to parse and pay attention to the > content length etc. to find the end of the body, if only to ignore > it. I don't think it is git's job to police web server implementations, especially considering that there is a gap between the letter of the RFC and actual behavior. Anyway, it only runs the check for the "*/info/refs" GET request, which ends up in get_info_refs(). Other GET requests do not check CONTENT_LENGTH. Also, the version of the service which is started from get_info_refs() does not consume input (I think, actually, the "--stateless-rpc" argument is not needed there). > In any case, hopefully we can fix this before the final, as this is > a regression introduced during this cycle? Yes, I'm working on it.
[PATCH v3] http-backend: allow empty CONTENT_LENGTH
Before 817f7dc223, the CONTENT_LENGTH variable was never considered; http-backend just read the request body from standard input until EOF whenever it, or a command started by it, needed it. Then it was discovered that some HTTP servers do not close standard input, instead expecting CGI scripts to obey CONTENT_LENGTH. In 817f7dc223, the behavior was changed to consider the CONTENT_LENGTH variable when it is set. The case of unset CONTENT_LENGTH was kept to mean "read until EOF", which is not compliant with RFC 3875 (which treats it as an empty body) but is used in practice when a client submits a big request with chunked encoding. The case of an empty CONTENT_LENGTH slipped through these conditions. Apparently it is used for GET requests, and RFC 3875 does specify that it also means an empty body. The current implementation, however, fails to parse it and aborts the request. Fix the case of an empty CONTENT_LENGTH to also be treated as "read until EOF". It does not actually matter what it means, because the body is never read anyway; it just should not cause a parse error. Add a test for the case. Reported-By: Jelmer Vernooij Signed-off-by: Max Kirillov --- Provided a more thorough message; also fixed the test (it did not actually test the error before). There will be more versions later, at least the one suggested by Jeff. PS: did I write v2? It should be v3, of course!
http-backend.c | 2 +- t/t5562-http-backend-content-length.sh | 11 +++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/http-backend.c b/http-backend.c index e88d29f62b..a1230d7ead 100644 --- a/http-backend.c +++ b/http-backend.c @@ -353,7 +353,7 @@ static ssize_t get_content_length(void) ssize_t val = -1; const char *str = getenv("CONTENT_LENGTH"); - if (str && !git_parse_ssize_t(str, &val)) + if (str && *str && !git_parse_ssize_t(str, &val)) die("failed to parse CONTENT_LENGTH: %s", str); return val; } diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh index 057dcb85d6..b28c3c4765 100755 --- a/t/t5562-http-backend-content-length.sh +++ b/t/t5562-http-backend-content-length.sh @@ -152,4 +152,15 @@ test_expect_success 'CONTENT_LENGTH overflow ssite_t' ' grep "fatal:.*CONTENT_LENGTH" err ' +test_expect_success 'empty CONTENT_LENGTH' ' + env \ + QUERY_STRING="/repo.git/info/refs?service=git-receive-pack" \ + PATH_TRANSLATED="$PWD"/.git/info/refs \ + GIT_HTTP_EXPORT_ALL=TRUE \ + REQUEST_METHOD=GET \ + CONTENT_LENGTH="" \ + git http-backend act.out 2>act.err && + verify_http_result "200 OK" +' + test_done -- 2.17.0.1185.g782057d875
[PATCH v2] http-backend: allow empty CONTENT_LENGTH
Before 817f7dc223, the CONTENT_LENGTH variable was never considered; http-backend just read the request body from standard input until EOF whenever it, or a command started by it, needed it. Then it was discovered that some HTTP servers do not close standard input, instead expecting CGI scripts to obey CONTENT_LENGTH. In 817f7dc223, the behavior was changed to consider the CONTENT_LENGTH variable when it is set. The case of unset CONTENT_LENGTH was kept to mean "read until EOF", which is not compliant with RFC 3875 (which treats it as an empty body) but is used in practice when a client submits a big request with chunked encoding. The case of an empty CONTENT_LENGTH slipped through these conditions. Apparently it is used for GET requests, and RFC 3875 does specify that it also means an empty body. The current implementation, however, fails to parse it and aborts the request. Fix the case of an empty CONTENT_LENGTH to also be treated as "read until EOF". It does not actually matter what it means, because the body is never read anyway; it just should not cause a parse error. Add a test for the case.
Reported-By: Jelmer Vernooij Signed-off-by: Max Kirillov --- Provided a more thorough message; also fixed the test (it did not actually test the error before). There will be more versions later, at least the one suggested by Jeff. http-backend.c | 2 +- t/t5562-http-backend-content-length.sh | 11 +++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/http-backend.c b/http-backend.c index e88d29f62b..a1230d7ead 100644 --- a/http-backend.c +++ b/http-backend.c @@ -353,7 +353,7 @@ static ssize_t get_content_length(void) ssize_t val = -1; const char *str = getenv("CONTENT_LENGTH"); - if (str && !git_parse_ssize_t(str, &val)) + if (str && *str && !git_parse_ssize_t(str, &val)) die("failed to parse CONTENT_LENGTH: %s", str); return val; } diff --git a/t/t5562-http-backend-content-length.sh b/t/t5562-http-backend-content-length.sh index 057dcb85d6..b28c3c4765 100755 --- a/t/t5562-http-backend-content-length.sh +++ b/t/t5562-http-backend-content-length.sh @@ -152,4 +152,15 @@ test_expect_success 'CONTENT_LENGTH overflow ssite_t' ' grep "fatal:.*CONTENT_LENGTH" err ' +test_expect_success 'empty CONTENT_LENGTH' ' + env \ + QUERY_STRING="/repo.git/info/refs?service=git-receive-pack" \ + PATH_TRANSLATED="$PWD"/.git/info/refs \ + GIT_HTTP_EXPORT_ALL=TRUE \ + REQUEST_METHOD=GET \ + CONTENT_LENGTH="" \ + git http-backend act.out 2>act.err && + verify_http_result "200 OK" +' + test_done -- 2.17.0.1185.g782057d875
Re: Temporary git files for the gitdir created on a separate drive in workdir
On Fri, Sep 7, 2018 at 6:48 PM Junio C Hamano wrote: > > Hultqvist writes: > > > Considering that the gitdir could be located on a different drive than > > the workdir wouldn't it make more sense to create the temporary files > > in a subdirectory inside the gitdir rather than in the workdir? > > I do not think we intend to create temporary files, whose final > destination is somewhere under $GIT_DIR/, in any working tree; > rather, I think we try to create them inside $GIT_DIR (or possibly > if the destination is a file in a subdirectory of $GIT_DIR, then in > the same subdirectory). What you are seeing definitely smells like > a bug in the worktree code, perhaps getting confused by the fact > that the full path to these places looks "unusual" by starting with a > single letter followed by a colon (IOW, this may manifest only in > the Windows port). I agree. Auditing the setup code did not reveal anything, though. Our code should recognize these unusual Windows paths as absolute, and while I spotted an incorrect use of '/' (instead of is_dir_sep) it does not explain the problem here. Hultqvist, if you set the environment variable GIT_TRACE_SETUP to 1 and run "git status" in G:\Test1, what does it say? -- Duy
Re: [PATCH v3 2/4] eoie: add End of Index Entry (EOIE) extension
On Fri, 7 Sep 2018 at 22:24, Ben Peart wrote: > > Ben Peart writes: > >> - 160-bit SHA-1 over the extension types and their sizes (but not > >> their contents). E.g. if we have "TREE" extension that is N-bytes > >> long, "REUC" extension that is M-bytes long, followed by "EOIE", > >> then the hash would be: > The purpose of the SHA isn't to detect disk corruption (we already have > a SHA for the entire index that can serve that purpose) but to help > ensure that this was actually a valid EOIE extension and not a lucky > random set of bytes. [...] > >> +#define EOIE_SIZE 24 /* <4-byte offset> + <20-byte hash> */ > >> +the_hash_algo->init_fn(&c); > >> +while (src_offset < mmap_size - the_hash_algo->rawsz - > >> EOIE_SIZE_WITH_HEADER) { [...] > >> +the_hash_algo->final_fn(hash, &c); > >> +if (hashcmp(hash, (unsigned char *)index)) > >> +return 0; > >> + > >> +/* Validate that the extension offsets returned us back to the eoie > >> extension. */ > >> +if (src_offset != mmap_size - the_hash_algo->rawsz - > >> EOIE_SIZE_WITH_HEADER) > >> +return 0; Besides the issue you and Junio discussed with "should we document this as being SHA-1 or NewHash" (or "the hash algo"), it seems to me that this implementation is living somewhere between using SHA-1 and "the hash algo". The hashing uses `the_hash_algo`, but the hash size is hardcoded at 20 bytes. Maybe it all works out, e.g., so that when someone (brian) merges a NewHash and runs the testsuite, this will fail consistently and in a safe way. But I wonder if it would be too hard to avoid the hardcoded 24 already now. Martin
Re: [PATCH 0/4] un-breaking pack-objects with bitmaps
On Sat, Sep 01 2018, Jeff King wrote: > On Fri, Aug 31, 2018 at 06:55:58PM -0400, Jeff King wrote: > >> On Fri, Aug 31, 2018 at 05:23:17PM +0200, Ævar Arnfjörð Bjarmason wrote: >> >> > On Tue, Aug 21 2018, Jeff King wrote: >> > >> > > +int bitmap_has_sha1_in_uninteresting(struct bitmap_index *bitmap_git, >> > > + const unsigned char *sha1) >> > > +{ >> > > +int pos; >> > > + >> > > +if (!bitmap_git) >> > > +return 0; /* no bitmap loaded */ >> > > +if (!bitmap_git->result) >> > > +BUG("failed to perform bitmap walk before querying"); >> > >> > Some part of what calls this completely breaks pushing from the "next" >> > branch when you have local bitmaps (we *really* should have some tests >> > for this...). >> >> Yikes, thanks for reporting. I agree we need better tests here. > > OK, here is the fix. Since the problem is in 'next', this is done as a > patch on top of jk/pack-delta-reuse-with-bitmap. But since we're set to > rewind 'next' post-release anyway, we could squash it directly into > 30cdc33fba from the original series. That would help later bisections > from running into it, which may be worth it as it's a pretty severe > breakage. Or maybe not: Junio: Just a reminder that next is still broken with this, and I see e.g. the Debian "experimental" package has the bug but not the fix at this point. I'm just reverting jk/pack-delta-reuse-with-bitmap out of next when building my own package of git, but I think this really should be fixed in that branch, either by merging the fix down or by reverting the original series out of next. I think just merging the fix down makes sense, but I have no strong opinion on it.