Re: pack bitmap woes on Windows

2014-02-13 Thread Johannes Sixt
Am 2/12/2014 17:48, schrieb Jeff King:
> On Wed, Feb 12, 2014 at 03:49:24PM +0100, Erik Faye-Lund wrote:
> 
>> It seems the author of a201c20b41a2f0725977bcb89a2a66135d776ba2
>> assumes __BYTE_ORDER was guaranteed to always be defined, probably
>> fooled by the following check:
>>
>> #if !defined(__BYTE_ORDER)
>> # error "Cannot determine endianness"
>> #endif
>>
>> However, that check is only performed if we're NOT building with GCC
>> on x86/x64 nor MSVC (which I don't think defined __BYTE_ORDER either)
>>
>> The following would make __BYTE_ORDER a guarantee, and show that MinGW
>> does not define it:
> 
> Yes, that is the problem. Sorry, this got brought up earlier[1], but the
> discussion went on and I did not notice that Brian's patch did not get
> applied.
> 
> After re-reading the discussion, I think the patch below is the best
> solution. Can you confirm that it fixes the problem for you?

Yes, it fixes the problem! t5310-pack-bitmaps passes all tests with the patch.

Thanks,
-- Hannes

> 
> -Peff
> 
> [1] http://thread.gmane.org/gmane.comp.version-control.git/241278
> 
> -- >8 --
> Subject: ewah: unconditionally ntohll ewah data
> 
> Commit a201c20 tried to optimize out a loop like:
> 
>   for (i = 0; i < len; i++)
> data[i] = ntohll(data[i]);
> 
> in the big-endian case, because we know that ntohll is a
> noop, and we do not need to pay the cost of the loop at all.
> However, it mistakenly assumed that __BYTE_ORDER was always
> defined, whereas it may not be on systems which do not
> define it by default, and where we did not need to define it
> to set up the ntohll macro. This includes OS X and Windows.
> 
> We could muck with the ordering in compat/bswap.h to make
> sure it is defined unconditionally, but it is simpler to
> still to just execute the loop unconditionally. That avoids
> the application code knowing anything about these magic
> macros, and lets it depend only on having ntohll defined.
> 
> And since the resulting loop looks like (on a big-endian
> system):
> 
>   for (i = 0; i < len; i++)
> data[i] = data[i];
> 
> any decent compiler can probably optimize it out.
> 
> Original report and analysis by Brian Gernhardt.
> 
> Signed-off-by: Jeff King 
> ---
>  ewah/ewah_io.c | 10 +++---
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
> index 4a7fae6..f7f700e 100644
> --- a/ewah/ewah_io.c
> +++ b/ewah/ewah_io.c
> @@ -113,6 +113,7 @@ int ewah_serialize(struct ewah_bitmap *self, int fd)
>  int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len)
>  {
>   uint8_t *ptr = map;
> + size_t i;
>  
>   self->bit_size = get_be32(ptr);
>   ptr += sizeof(uint32_t);
> @@ -135,13 +136,8 @@ int ewah_read_mmap(struct ewah_bitmap *self, void *map, 
> size_t len)
>   memcpy(self->buffer, ptr, self->buffer_size * sizeof(uint64_t));
>   ptr += self->buffer_size * sizeof(uint64_t);
>  
> -#if __BYTE_ORDER != __BIG_ENDIAN
> - {
> - size_t i;
> - for (i = 0; i < self->buffer_size; ++i)
> - self->buffer[i] = ntohll(self->buffer[i]);
> - }
> -#endif
> + for (i = 0; i < self->buffer_size; ++i)
> + self->buffer[i] = ntohll(self->buffer[i]);
>  
>   self->rlw = self->buffer + get_be32(ptr);
>  
> 
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tree-walk: be more specific about corrupt tree errors

2014-02-13 Thread Michael Haggerty
On 02/12/2014 08:49 PM, Jeff King wrote:
> When the tree-walker runs into an error, it just calls
> die(), and the message is always "corrupt tree file".
> However, we are actually covering several cases here; let's
> give the user a hint about what happened.
> 
> Let's also avoid using the word "corrupt", which makes it
> seem like the data bit-rotted on disk. Our sha1 check would
> already have found that. These errors are ones of data that
> is malformed in the first place.
> 
> Signed-off-by: Jeff King 
> ---
> Michael and I have been looking off-list at some bogus trees (created by
> a non-git.git implementation). git-fsck unhelpfully dies during the
> tree-walk:
> 
>   $ git fsck
>   Checking object directories: 100% (256/256), done.
>   fatal: corrupt tree file
> 
> I think in the long run we want to either teach fsck to avoid the
> regular tree-walker or to set a special "continue as much as you can"
> flag. That will let us keep going to find more errors, do our usual fsck
> error checks (which are more strict), and especially report on _which_
> object was broken (we can't do that here because we are deep in the call
> stack and may not even have a real object yet).
> 
> This patch at least gives us slightly more specific error messages (both
> for fsck and for other commands). And it may provide a first step in
> clarity if we follow the "continue as much as you can" path.
> 
>  tree-walk.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/tree-walk.c b/tree-walk.c
> index 79dba1d..d53b084 100644
> --- a/tree-walk.c
> +++ b/tree-walk.c
> @@ -28,11 +28,13 @@ static void decode_tree_entry(struct tree_desc *desc, 
> const char *buf, unsigned
>   unsigned int mode, len;
>  
>   if (size < 24 || buf[size - 21])
> - die("corrupt tree file");
> + die("truncated tree file");
>  

I suggest splitting this into two separate checks, because the first
boolean definitely indicates a truncated file, whereas the second
boolean could indicate malformedness that is not caused by truncation.

Another tiny point: I suppose that the number "24" comes from

A one-digit mode
SP
A one-character filename
NUL
20-byte SHA1

But given that you are detecting zero-length filenames a few lines
later, I think it makes logical sense to admit for that possibility
here, by reducing 24 -> 23.  (I realize that it doesn't change the end
result, because the only syntactically correct situation with length=23
would be a doubly-broken line that has a one-digit mode *and* a
zero-length filename, and it's arbitrary which of the forms of
brokenness we report in such a case.)

>   path = get_mode(buf, &mode);
> - if (!path || !*path)
> - die("corrupt tree file");
> + if (!path)
> + die("malformed mode in tree entry");
> + if (!*path)
> + die("empty filename in tree entry");
>   len = strlen(path) + 1;
>  
>   /* Initialize the descriptor entry */
> @@ -81,7 +83,7 @@ void update_tree_entry(struct tree_desc *desc)
>   unsigned long len = end - (const unsigned char *)buf;
>  
>   if (size < len)
> - die("corrupt tree file");
> + die("truncated tree file");
>   buf = end;
>   size -= len;
>   desc->buffer = buf;
> 

Otherwise, I think this is a nice improvement.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-13 Thread Johannes Sixt
Am 2/12/2014 20:30, schrieb Stefan Zager:
> On Wed, Feb 12, 2014 at 11:22 AM, Karsten Blees  
> wrote:
>> Am 12.02.2014 19:37, schrieb Erik Faye-Lund:
>>> ReOpenFile, that's fantastic. Thanks a lot!
>>
>> ...but should be loaded dynamically via GetProcAddress, or are we ready to 
>> drop XP support?
> 
> Right, that is an issue.  From our perspective, it's well past time to
> drop XP support.

Not from mine.

-- Hannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-13 Thread David Kastrup
Mike Hommey  writes:

> On Wed, Feb 12, 2014 at 08:15:24PM +0100, David Kastrup wrote:
>> Stefan Zager  writes:
>> 
>> > On Wed, Feb 12, 2014 at 10:50 AM, David Kastrup  wrote:
>> >
>> >> Really, give the above patch a try.  I am taking longer to finish it
>> >> than anticipated (with a lot due to procrastination but that is,
>> >> unfortunately, a large part of my workflow), and it's cutting into my
>> >> "paychecks" (voluntary donations which to a good degree depend on timely
>> >> and nontrivial progress reports for my freely available work on GNU
>> >> LilyPond).
>> >
>> > I will give that a try.  How much of a performance improvement have
>> > you clocked?
>> 
>> Depends on file type and size.  With large files with lots of small
>> changes, performance improvements get more impressive.
>> 
>> Some ugly real-world examples are the Emacs repository, src/xdisp.c
>> (performance improvement about a factor of 3), a large file in the style
>> of /usr/share/dict/words clocking in at a factor of about 5.
>> 
>> Again, that's with an SSD and ext4 filesystem on GNU/Linux, and there
>> are no improvements in system time (I/O) except for patch 4 of the
>> series which helps perhaps 20% or so.
>> 
>> So the benefits of the patch will come into play mostly for big, bad
>> files on Windows: other than that, the I/O time is likely to be the
>> dominant player anyway.
>
> How much fragmentation does that add to the files, though?

Uh, git-blame is a read-only operation.  It does not add fragmentation
to any file.  The patch will add a diff of probably a few dozen hunks to
builtin/blame.c.  Do you call that "fragmentation"?  It is small enough
that I expect even

git blame builtin/blame.c

to be faster than before.  But that interpretation of your question
probably tries to make too much sense out of what is just nonsense in
the given context.

-- 
David Kastrup
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-13 Thread David Kastrup
David Kastrup  writes:

> Again, that's with an SSD and ext4 filesystem on GNU/Linux, and there
> are no improvements in system time (I/O) except for patch 4 of the
> series which helps perhaps 20% or so.
>
> So the benefits of the patch will come into play mostly for big, bad
> files on Windows: other than that, the I/O time is likely to be the
> dominant player anyway.
>
> If you have benchmarked the stuff, for annoying cases expect I/O time
> to go down maybe 10-20%, and user time to drop by a factor of 4.
> Under GNU/Linux, that makes for a significant overall improvement.  On
> Windows, the payback is likely quite less because of the worse I/O
> performance.  Pity.

But of course, you can significantly reduce the relevant file
open/close/search times by running

git gc --aggressive

While this does not actually help with performance in GNU/Linux (though
with file space), dealing with few but compressed files under Windows is
likely a reasonably big win since the uncompression happens in user
space and cannot be bungled by Microsoft (apart from bad memory
management strategies).

-- 
David Kastrup
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFH] hackday and GSoC topic suggestions

2014-02-13 Thread Jeff King
On Tue, Feb 11, 2014 at 11:38:09AM -0800, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > Jeff King  writes:
> >
> >>  - Branch rename breaks local downstream branches
> >>http://article.gmane.org/gmane.comp.version-control.git/241228
> >
> > If you have a branch B that builds on A, if you are renaming A to C,
> > you may want B to automatically set to build on C in some cases, and
> > in other cases your renaming A is done explicitly in order to severe
> > the tie between A and B and setting the latter to build on C can be
> > a bad thing---after all, the user's intention may be to create a
> > branch A starting at some commit immediately after this rename so
> > that B will keep building on that updated A.
> >
> > So I am not sure if this is a bug.
> 
> Having said that, the current behaviour of leaving B half-configured
> to build on a missing branch is undesirable. If we were to change
> this so that any branch B that used to build on branch A being
> renamed to build on the branch under the new name C, the user may
> have to do an extra "--set-upstream-to A" on B after recreating A if
> this was done to save away the current state of A to C and then keep
> building B on an updated A, so we may have to give _some_ clue what
> we are doing behind their back when we rename, e.g.
> 
>   $ git branch -m A C
> warning: branch B is set to build on C now.
> 
> or something.

Yeah, I agree. I think most of the time people would want the dependency
to move with the branch, and the key is being clear about it so the user
can override. I don't think there is a problem with overriding with
`--set-upstream-to` after the fact, rather than giving a special option
to the move operation.

There was a team working on this at the hackday, and they seemed to make
reasonable progress, but I haven't seen the final output yet. Most teams
did not finish their projects, or if they did, I didn't get a chance to
coach them through the patch submission process. I'll hopefully be
revisiting those in the next week or two as they finish them up offline.

My goal is to get them all posting to the list themselves, but I fear I
may have to pick up the pieces in some cases.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-13 Thread David Kastrup
Johannes Sixt  writes:

> Am 2/12/2014 20:30, schrieb Stefan Zager:
>> On Wed, Feb 12, 2014 at 11:22 AM, Karsten Blees  
>> wrote:
>>> Am 12.02.2014 19:37, schrieb Erik Faye-Lund:
 ReOpenFile, that's fantastic. Thanks a lot!
>>>
>>> ...but should be loaded dynamically via GetProcAddress, or are we
>>> ready to drop XP support?
>> 
>> Right, that is an issue.  From our perspective, it's well past time to
>> drop XP support.
>
> Not from mine.

XP has not even reached end of life yet.  As a point of comparison,
there are tensions on the Emacs developer list several times a decade
because some people suggest it might be time to drop the MSDOS port
and/or the associated restriction of having filenames be unique in the
8+3 naming scheme.

-- 
David Kastrup
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFH] hackday and GSoC topic suggestions

2014-02-13 Thread Jeff King
On Tue, Feb 11, 2014 at 10:57:22AM -0800, Junio C Hamano wrote:

> >  - git stash doesn't use --index as default
> >http://article.gmane.org/gmane.comp.version-control.git/235892
> 
> I tend to think "git stash" was designed to work this way from day
> one.

Right. The thing that bothers me is the data loss of:

  echo 1 >foo
  git add foo
  echo 2 >foo
  git stash
  git stash pop

The content in "1" is stashed, but gets lost on the apply, and our pop
drops the stash.

> >  - using git commit-tree with "-F -" adds trailing newlines
> >http://article.gmane.org/gmane.comp.version-control.git/236583
> 
> According to the initial commit that introduced this option, it
> deliberately calls strbuf_complete_line() to make sure we do not
> leave an incomplete line at the end.

Yeah, I think this one is a non-bug. I hadn't read it carefully but the
"trailing newlines" made me think it was adding multiple extra blank
lines. But it looks like it is just the normal strbuf_complete_line, and
that is the right thing to be doing (it has _never_ been encouraged to
have a commit message that does not have a newline at all, and
I think stripspace has enforced that from day one).

> Many of the other items in your "bugs" section seem to be worth
> fixing.

As I mentioned elsewhere, I've yet to see the full results of the
hackday. Hopefully soon. :)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Process of updating a local branch from another with tracking

2014-02-13 Thread Chris Packham
On 13/02/14 10:55, Robert Dailey wrote:
> I have the following alias defined:
> sync = "!f() { cbr=$(git rev-parse --abbrev-ref HEAD); git co $1 &&
> git pull && git co $cbr && git rebase $1; }; f"
> 
> The goal is to basically update a local branch which tracks a branch
> on a remote, and then rebase my local topic branch onto that updated
> local branch.

Generally speaking there is little benefit in manually keeping a local
branch that just tracks a remote one because you already have a "remote
tracking branch" which gets updated whenever you fetch from that remote.
So you could be doing something like

  git fetch origin # or whatever other remotes you may have configured
  git rebase origin/master # or whatever branch you're tracking

You can also tell git this kind of information when you create your
topic branch and git will to this all for you.

  git checkout -b topic1 origin/master
  git pull -r # -r makes pull run fetch + rebase instead of pull + merge

If you already have a topic branch and you want to tell it what remote
branch (upstream) to track you can do that too.

  git checkout topic
  git branch -u origin/master
  git pull -r

> I do this instead of just rebasing onto origin/master. Example:
> 
> git checkout master
> git pull --rebase origin master
> git checkout topic1
> git rebase master
> 
> I could do this instead but not sure if it is recommended:
> 
> git checkout topic1
> git fetch
> git rebase origin/master

I do it all the time. Actually I set the upstream and I can just use
'git pull -r'.

> Any thoughts on the alias I created? I'm a Windows user and still new
> to linux stuff, so if it sucks or could be simplified just let me
> know.

Aliases are fine to save yourself some typing. In this case I think you
can just set the upstream branch and use 'git pull -r'. There is a
config option to make this the default but you probably want to be
comfortable with the difference between merging and rebasing before you
set that.

> And as a secondary question, just curious what people think about
> rebasing onto a remote tracking branch. When I do merges I usually
> refer to the remote branch, but during rebase I use the local branch
> instead, but I don't know if there is any functional reason to not
> skip the local branch and go straight to the remote tracking branch
> (it saves a step).

Branches are branches. You might use multiple local branches to separate
dependent topics that you're working on. You might do the same with
remote branches if you have topics based on other peoples work.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFH] hackday and GSoC topic suggestions

2014-02-13 Thread Jeff King
On Thu, Feb 06, 2014 at 10:51:54AM +0100, Matthieu Moy wrote:

> > Some of Matthieu's students worked on it a few years ago but didn't finish.
> 
> Right. There was still quite some work to do, but this is most likely
> too small for a GSoC project. But that could be a part of it. I'm not
> sure how google welcomes GSoC projects made of multiple small tasks, but
> my experience with students is that it's much better than a single (too)
> big task, and I think that was the general feeling on this list when we
> discussed it last year.

I think Google leaves it up to us to decide. I'd be OK with a project
made of multiple small tasks, as I think it would be an interesting
experiment.  I'd rather not do all of them like that, though. And bonus
points if they are on a theme that will let the student use the ramp-up
time from one for another.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git blame: "Not Committed Yet" with clean WD

2014-02-13 Thread Ephrim Khong

Hi,

for files that contain windows line endings in a repository with 
core.autocrlf=input, git blame will show lines as "Not Committed Yet", 
even though they were not modified.


Example:

--
git init
git config core.autocrlf false
echo "foo" > a
unix2dos a
git add a
git commit -m "initial commit"
git config core.autocrlf input
git status
git blame a
--

Output:

--
Reinitialized existing Git repository in /.../testblame2/.git/
unix2dos: converting file a to DOS format ...
On branch master
nothing to commit, working directory clean
On branch master
nothing to commit, working directory clean
 (Not Committed Yet 2014-02-13 10:02:43 +0100 1) foo
--

Is there an easy way to work around this; is this desired behaviour or 
mor a bug?


Thanks - Eph
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Git GSoC 2014

2014-02-13 Thread Jeff King
The Google Summer of Code application process is upon us. We have about
34 hours until the deadline (2014-02-14T19:00 UTC) . That's not very
much time, but I know some people have been thinking about projects for
a while, so I have hope that we can put together an ideas page.

What we need immediately is:

  - somebody to be the backup admin (I am assuming I'll be the primary
admin, but as always, if somebody else wants to...)

  - ideas ideas ideas

The email I sent out last week generated some response, but what I
really need are people to write up full ideas as they would be read
by students (and by GSoC personnel who are reading our application).

As mentioned in a related thread, I'd really like to avoid just
reposting proposals that didn't get any traction in previous years.

Tomorrow I'll try to look through some of the links that have been
posted, or collect stuff from the list. But I really need other
people to contribute, or it's going to be a lame list.

I've put the application draft here:

  http://git.github.io/SoC-2014-Org-Application.html

You can make suggestions in this thread, or edit it by cloning:

  https://github.com/git/git.github.io

and post a patch, or send a pull request, or I should be able to give
people push access to the repo. It's a bit of an experiment, in the same
realm as the wiki we've used in the past.

The idea page is there, too, but there's nothing good on it yet:

  http://git.github.io/SoC-2014-Ideas.html

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFH] hackday and GSoC topic suggestions

2014-02-13 Thread Christian Couder
On Thu, Feb 13, 2014 at 9:50 AM, Jeff King  wrote:
> On Thu, Feb 06, 2014 at 10:51:54AM +0100, Matthieu Moy wrote:
>
>> > Some of Matthieu's students worked on it a few years ago but didn't finish.
>>
>> Right. There was still quite some work to do, but this is most likely
>> too small for a GSoC project. But that could be a part of it. I'm not
>> sure how google welcomes GSoC projects made of multiple small tasks, but
>> my experience with students is that it's much better than a single (too)
>> big task, and I think that was the general feeling on this list when we
>> discussed it last year.
>
> I think Google leaves it up to us to decide. I'd be OK with a project
> made of multiple small tasks, as I think it would be an interesting
> experiment.  I'd rather not do all of them like that, though. And bonus
> points if they are on a theme that will let the student use the ramp-up
> time from one for another.

Yeah, a student working on the "git bisect fix/unfixed" feature, could
fix git bisect testing too many merge bases, and if there is still
time work on moving more code from shell to C.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-13 Thread Mike Hommey
On Thu, Feb 13, 2014 at 07:04:02AM +0100, David Kastrup wrote:
> Mike Hommey  writes:
> 
> > On Wed, Feb 12, 2014 at 08:15:24PM +0100, David Kastrup wrote:
> >> Stefan Zager  writes:
> >> 
> >> > On Wed, Feb 12, 2014 at 10:50 AM, David Kastrup  wrote:
> >> >
> >> >> Really, give the above patch a try.  I am taking longer to finish it
> >> >> than anticipated (with a lot due to procrastination but that is,
> >> >> unfortunately, a large part of my workflow), and it's cutting into my
> >> >> "paychecks" (voluntary donations which to a good degree depend on timely
> >> >> and nontrivial progress reports for my freely available work on GNU
> >> >> LilyPond).
> >> >
> >> > I will give that a try.  How much of a performance improvement have
> >> > you clocked?
> >> 
> >> Depends on file type and size.  With large files with lots of small
> >> changes, performance improvements get more impressive.
> >> 
> >> Some ugly real-world examples are the Emacs repository, src/xdisp.c
> >> (performance improvement about a factor of 3), a large file in the style
> >> of /usr/share/dict/words clocking in at a factor of about 5.
> >> 
> >> Again, that's with an SSD and ext4 filesystem on GNU/Linux, and there
> >> are no improvements in system time (I/O) except for patch 4 of the
> >> series which helps perhaps 20% or so.
> >> 
> >> So the benefits of the patch will come into play mostly for big, bad
> >> files on Windows: other than that, the I/O time is likely to be the
> >> dominant player anyway.
> >
> > How much fragmentation does that add to the files, though?
> 
> Uh, git-blame is a read-only operation.  It does not add fragmentation
> to any file.  The patch will add a diff of probably a few dozen hunks to
> builtin/blame.c.  Do you call that "fragmentation"?  It is small enough
> that I expect even
> 
> git blame builtin/blame.c
> 
> to be faster than before.  But that interpretation of your question
> probably tries to make too much sense out of what is just nonsense in
> the given context.

Sorry, I thought you were talking about write operations, not reads.

Mike
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-13 Thread Mike Hommey
On Thu, Feb 13, 2014 at 06:34:39PM +0900, Mike Hommey wrote:
> On Thu, Feb 13, 2014 at 07:04:02AM +0100, David Kastrup wrote:
> > Mike Hommey  writes:
> > 
> > > On Wed, Feb 12, 2014 at 08:15:24PM +0100, David Kastrup wrote:
> > >> Stefan Zager  writes:
> > >> 
> > >> > On Wed, Feb 12, 2014 at 10:50 AM, David Kastrup  wrote:
> > >> >
> > >> >> Really, give the above patch a try.  I am taking longer to finish it
> > >> >> than anticipated (with a lot due to procrastination but that is,
> > >> >> unfortunately, a large part of my workflow), and it's cutting into my
> > >> >> "paychecks" (voluntary donations which to a good degree depend on 
> > >> >> timely
> > >> >> and nontrivial progress reports for my freely available work on GNU
> > >> >> LilyPond).
> > >> >
> > >> > I will give that a try.  How much of a performance improvement have
> > >> > you clocked?
> > >> 
> > >> Depends on file type and size.  With large files with lots of small
> > >> changes, performance improvements get more impressive.
> > >> 
> > >> Some ugly real-world examples are the Emacs repository, src/xdisp.c
> > >> (performance improvement about a factor of 3), a large file in the style
> > >> of /usr/share/dict/words clocking in at a factor of about 5.
> > >> 
> > >> Again, that's with an SSD and ext4 filesystem on GNU/Linux, and there
> > >> are no improvements in system time (I/O) except for patch 4 of the
> > >> series which helps perhaps 20% or so.
> > >> 
> > >> So the benefits of the patch will come into play mostly for big, bad
> > >> files on Windows: other than that, the I/O time is likely to be the
> > >> dominant player anyway.
> > >
> > > How much fragmentation does that add to the files, though?
> > 
> > Uh, git-blame is a read-only operation.  It does not add fragmentation
> > to any file.  The patch will add a diff of probably a few dozen hunks to
> > builtin/blame.c.  Do you call that "fragmentation"?  It is small enough
> > that I expect even
> > 
> > git blame builtin/blame.c
> > 
> > to be faster than before.  But that interpretation of your question
> > probably tries to make too much sense out of what is just nonsense in
> > the given context.
> 
> Sorry, I thought you were talking about write operations, not reads.

Specifically, I thought you were talking about git checkout.

Mike
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFH] hackday and GSoC topic suggestions

2014-02-13 Thread David Kastrup
Christian Couder  writes:

> On Thu, Feb 13, 2014 at 9:50 AM, Jeff King  wrote:
>
>> I think Google leaves it up to us to decide. I'd be OK with a project
>> made of multiple small tasks, as I think it would be an interesting
>> experiment.  I'd rather not do all of them like that, though. And
>> bonus points if they are on a theme that will let the student use the
>> ramp-up time from one for another.
>
> Yeah, a student working on the "git bisect fix/unfixed" feature, could
> fix git bisect testing too many merge bases, and if there is still
> time work on moving more code from shell to C.

In the context of programming tasks, "if there is still time" is a prime
candidate for the successful application of branch prediction.

-- 
David Kastrup
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] release notes: typo fixes

2014-02-13 Thread Michael J Gruber
Signed-off-by: Michael J Gruber 
---
Just a few things I spotted while trying to keep myself informed :)

 Documentation/RelNotes/1.9.txt | 36 ++--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/Documentation/RelNotes/1.9.txt b/Documentation/RelNotes/1.9.txt
index 43c7b68..a0cabcd 100644
--- a/Documentation/RelNotes/1.9.txt
+++ b/Documentation/RelNotes/1.9.txt
@@ -7,19 +7,19 @@ Backward compatibility notes
 "git submodule foreach $cmd $args" used to treat "$cmd $args" the same
 way "ssh" did, concatenating them into a single string and letting the
 shell unquote. Careless users who forget to sufficiently quote $args
-gets their argument split at $IFS whitespaces by the shell, and got
+get their argument split at $IFS whitespaces by the shell, and got
 unexpected results due to this. Starting from this release, the
 command line is passed directly to the shell, if it has an argument.
 
 Read-only support for experimental loose-object format, in which users
-could optionally choose to write in their loose objects for a short
-while between v1.4.3 to v1.5.3 era, has been dropped.
+could optionally choose to write their loose objects for a short
+while between v1.4.3 and v1.5.3 era, has been dropped.
 
-The meanings of "--tags" option to "git fetch" has changed; the
-command fetches tags _in addition to_ what are fetched by the same
+The meanings of the "--tags" option to "git fetch" has changed; the
+command fetches tags _in addition to_ what is fetched by the same
 command line without the option.
 
-The way "git push $there $what" interprets $what part given on the
+The way "git push $there $what" interprets the $what part given on the
 command line, when it does not have a colon that explicitly tells us
 what ref at the $there repository is to be updated, has been enhanced.
 
@@ -96,7 +96,7 @@ UI, Workflows & Features
primarily because the codepaths involved were not carefully vetted
and we did not bother supporting such usage. This release attempts
to allow object transfer out of a shallowly-cloned repository in a
-   more controlled way (i.e. the receiver become a shallow repository
+   more controlled way (i.e. the receiver becomes a shallow repository
with a truncated history).
 
  * Just like we give a reasonable default for "less" via the LESS
@@ -107,12 +107,12 @@ UI, Workflows & Features
hierarchies, whose variables are predominantly three-level, were
not completed by hitting a  in bash and zsh completions.
 
- * Fetching 'frotz' branch with "git fetch", while 'frotz/nitfol'
+ * Fetching a 'frotz' branch with "git fetch", while a 'frotz/nitfol'
remote-tracking branch from an earlier fetch was still there, would
error out, primarily because the command was not told that it is
allowed to lose any information on our side.  "git fetch --prune"
-   now can be used to remove 'frotz/nitfol' to make room to fetch and
-   store 'frotz' remote-tracking branch.
+   now can be used to remove 'frotz/nitfol' to make room for fetching and
+   storing the 'frotz' remote-tracking branch.
 
  * "diff.orderfile=" configuration variable can be used to
pretend as if the "-O" option were given from the command
@@ -218,7 +218,7 @@ track are contained in this release (see the maintenance 
releases' notes
 for details).
 
  * The pathspec matching code, while comparing two trees (e.g. "git
-   diff A B -- path1 path2") was too agrresive and failed to match
+   diff A B -- path1 path2") was too aggressive and failed to match
some paths when multiple pathspecs were involved.
(merge e4ddb05 as/tree-walk-fix-aggressive-short-cut later to maint).
 
@@ -227,11 +227,11 @@ for details).
(merge b861e23 sb/repack-in-c later to maint).
 
  * An earlier update in v1.8.4.x to "git rev-list --objects" with
-   negative ref had performance regression.
+   negative ref had a performance regression.
(merge 200abe7 jk/mark-edges-uninteresting later to maint).
 
  * A recent update to "git send-email" broke platforms where
-   /etc/ssl/certs/ directory exists, but it cannot used as SSL_ca_path
+   /etc/ssl/certs/ directory exists but cannot be used as SSL_ca_path
(e.g. Fedora rawhide).
(merge 01645b7 rk/send-email-ssl-cert later to maint).
 
@@ -275,8 +275,8 @@ for details).
names.
(merge 82246b7 nd/daemon-informative-errors-typofix later to maint).
 
- * There is no reason to have a hardcoded upper limit of the number of
-   parents for an octopus merge, created via the graft mechanism, but
+ * There is no reason to have a hardcoded upper limit for the number of
+   parents of an octopus merge, created via the graft mechanism, but
there was.
(merge e228c17 js/lift-parent-count-limit later to maint).
 
@@ -306,8 +306,8 @@ for details).
  * When we figure out how many file descriptors to allocate for
keeping packfiles open, a system with non-working getrlimit() could
cause us to die(), but because we make t

[no subject]

2014-02-13 Thread Ryan Wright


I CAN HELP YOU WITH THE LOAN YOU NEED AT 3% INTEREST RATE. CONTACT US 
AT:-ryanwrightfinance...@gmail.com

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] t5537: move http tests out to t5539

2014-02-13 Thread Nguyễn Thái Ngọc Duy
start_httpd is supposed to be at the beginning of the test file, not
the middle of it. The "test_seq" line in "no shallow lines.." test is
updated to compensate missing refs that are there in t5537, but not in
the new t5539.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 On Thu, Feb 13, 2014 at 8:22 AM, Duy Nguyen  wrote:
 > On Thu, Feb 13, 2014 at 5:12 AM, Jeff King  wrote:
 >> lib-httpd was never designed to be included from anywhere except the
 >> beginning of the file. But that wouldn't be right for t5537, because it
 >> wants to run some of the tests, even if apache setup fails. The right
 >> way to do it is probably to have lib-httpd do all of its work in a lazy
 >> prereq. I don't know how clunky that will end up, though; it might be
 >> simpler to just move the shallow http test into one of the http-fetch
 >> scripts.
 >
 > I'll move it out later.
 
 Here it is, on top of nd/http-fetch-shallow-fix because the new test
 in t5537 is picky and a simple merge resolution wouldn't do it.

 t/t5537-fetch-shallow.sh   | 57 ---
 t/t5539-fetch-http-shallow.sh (new +x) | 82 ++
 2 files changed, 82 insertions(+), 57 deletions(-)

diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
index 098f220..3ae9092 100755
--- a/t/t5537-fetch-shallow.sh
+++ b/t/t5537-fetch-shallow.sh
@@ -173,61 +173,4 @@ EOF
)
 '
 
-if test -n "$NO_CURL" -o -z "$GIT_TEST_HTTPD"; then
-   say 'skipping remaining tests, git built without http support'
-   test_done
-fi
-
-. "$TEST_DIRECTORY"/lib-httpd.sh
-start_httpd
-
-test_expect_success 'clone http repository' '
-   git clone --bare --no-local shallow 
"$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
-   git clone $HTTPD_URL/smart/repo.git clone &&
-   (
-   cd clone &&
-   git fsck &&
-   git log --format=%s origin/master >actual &&
-   cat  done" ../trace
-   )
-'
-
-stop_httpd
 test_done
diff --git a/t/t5539-fetch-http-shallow.sh b/t/t5539-fetch-http-shallow.sh
new file mode 100755
index 000..94553e1
--- /dev/null
+++ b/t/t5539-fetch-http-shallow.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+
+test_description='fetch/clone from a shallow clone over http'
+
+. ./test-lib.sh
+
+if test -n "$NO_CURL"; then
+   skip_all='skipping test, git built without http support'
+   test_done
+fi
+
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+commit() {
+   echo "$1" >tracked &&
+   git add tracked &&
+   git commit -m "$1"
+}
+
+test_expect_success 'setup shallow clone' '
+   commit 1 &&
+   commit 2 &&
+   commit 3 &&
+   commit 4 &&
+   commit 5 &&
+   commit 6 &&
+   commit 7 &&
+   git clone --no-local --depth=5 .git shallow &&
+   git config --global transfer.fsckObjects true
+'
+
+test_expect_success 'clone http repository' '
+   git clone --bare --no-local shallow 
"$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
+   git clone $HTTPD_URL/smart/repo.git clone &&
+   (
+   cd clone &&
+   git fsck &&
+   git log --format=%s origin/master >actual &&
+   cat 

Re: [PATCH 11/11] tree-diff: reuse base str(buf) memory on sub-tree recursion

2014-02-13 Thread Kirill Smelkov
On Fri, Feb 07, 2014 at 09:48:52PM +0400, Kirill Smelkov wrote:
> instead of allocating it all the time for every subtree in
> __diff_tree_sha1, let's allocate it once in diff_tree_sha1, and then all
> callee just use it in stacking style, without memory allocations.
> 
> This should be faster, and for me this change gives the following
> slight speedups for `git log --raw --no-abbrev --no-renames`
> 
> navy.gitlinux.git v3.10..v3.11
> 
> before  0.547s  1.791s
> after   0.541s  1.777s
> speedup 1.1%0.8%

The timings above was done with

`git log --raw --no-abbrev --no-renames --format='%H'`
^

Please change them to correct timings:


navy.gitlinux.git v3.10..v3.11

before  0.618s  1.903s
after   0.611s  1.889s
speedup 1.1%0.7%



Thanks,
Kirill
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] Multiparent diff tree-walker + combine-diff speedup

2014-02-13 Thread Kirill Smelkov
Here go combine-diff speedup patches in form of first reworking diff
tree-walker to work in general case - when a commit have several parents, not
only one - we are traversing all 1+nparent trees in parallel.

Then we are taking advantage of the new diff tree-walker for speeding up
combine-diff, which for linux.git results in ~14 times speedup.

I understand v1.9.0 is going to be released first, but wanted to finally send
the patches, so that people could start reviewing them.

Please apply on top of ks/tree-diff-more and thanks beforehand,

Kirill



Kirill Smelkov (2):
  tree-diff: rework diff_tree() to generate diffs for multiparent cases as well
  combine-diff: speed it up, by using multiparent diff tree-walker directly

 combine-diff.c |  85 +-
 diff.c |   2 +
 diff.h |  10 ++
 tree-diff.c| 501 +
 4 files changed, 529 insertions(+), 69 deletions(-)

-- 
1.9.rc1.181.g641f458

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] combine-diff: speed it up, by using multiparent diff tree-walker directly

2014-02-13 Thread Kirill Smelkov
As was recently shown (c839f1bd "combine-diff: optimize
combine_diff_path sets intersection"), combine-diff runs very slowly. In
that commit we optimized paths sets intersection, but that accounted
only for ~ 25% of the slowness, and as my tracing showed, for linux.git
v3.10..v3.11, for merges a lot of time is spent computing
diff(commit,commit^2) just to only then intersect that huge diff to
almost small set of files from diff(commit,commit^1).

In previous commit, we described the problem in more details, and
reworked the diff tree-walker to be general one - i.e. to work in
multiple parent case too. Now is the time to take advantage of it for
finding paths for combine diff.

The implementation is straightforward - if we know, we can get generated
diff paths directly, and at present that means no diff filtering or
rename/copy detection was requested(*), we can call multiparent tree-walker
directly and get ready paths.

(*) because e.g. at present, all diffcore transformations work on
diff_filepair queues, but in the future, that limitation can be
lifted, if filters would operate directly on combine_diff_paths.

Timings for `git log --raw --no-abbrev --no-renames` without `-c` ("git log")
and with `-c` ("git log -c") and with `-c --merges` ("git log -c --merges")
before and after the patch are as follows:

linux.git v3.10..v3.11

log log -c log -c --merges

before  1.9s16.4s  15.2s
after   1.9s 2.4s   1.1s

The result stayed the same.

Signed-off-by: Kirill Smelkov 
---
 combine-diff.c | 85 ++
 diff.c |  1 +
 2 files changed, 81 insertions(+), 5 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 1732dfd..ddf7495 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1303,7 +1303,7 @@ static const char *path_path(void *obj)
 
 
 /* find set of paths that every parent touches */
-static struct combine_diff_path *find_paths(const unsigned char *sha1,
+static struct combine_diff_path *find_paths_generic(const unsigned char *sha1,
const struct sha1_array *parents, struct diff_options *opt)
 {
struct combine_diff_path *paths = NULL;
@@ -1316,6 +1316,7 @@ static struct combine_diff_path *find_paths(const 
unsigned char *sha1,
/* tell diff_tree to emit paths in sorted (=tree) order */
opt->orderfile = NULL;
 
+   /* D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn)  (wrt paths) */
for (i = 0; i < num_parent; i++) {
/*
 * show stat against the first parent even when doing
@@ -1346,6 +1347,35 @@ static struct combine_diff_path *find_paths(const 
unsigned char *sha1,
 }
 
 
+/*
+ * find set of paths that everybody touches, assuming diff is run without
+ * rename/copy detection, etc, comparing all trees simultaneously (= faster).
+ */
+static struct combine_diff_path *find_paths_multitree(
+   const unsigned char *sha1, const struct sha1_array *parents,
+   struct diff_options *opt)
+{
+   int i, nparent = parents->nr;
+   const unsigned char **parents_sha1;
+   struct combine_diff_path paths_head;
+   struct strbuf base;
+
+   parents_sha1 = xmalloc(nparent * sizeof(parents_sha1[0]));
+   for (i = 0; i < nparent; i++)
+   parents_sha1[i] = parents->sha1[i];
+
+   /* fake list head, so worker can assume it is non-NULL */
+   paths_head.next = NULL;
+
+   strbuf_init(&base, PATH_MAX);
+   diff_tree_paths(&paths_head, sha1, parents_sha1, nparent, &base, opt);
+
+   strbuf_release(&base);
+   free(parents_sha1);
+   return paths_head.next;
+}
+
+
 void diff_tree_combined(const unsigned char *sha1,
const struct sha1_array *parents,
int dense,
@@ -1355,6 +1385,7 @@ void diff_tree_combined(const unsigned char *sha1,
struct diff_options diffopts;
struct combine_diff_path *p, *paths;
int i, num_paths, needsep, show_log_first, num_parent = parents->nr;
+   int need_generic_pathscan;
 
/* nothing to do, if no parents */
if (!num_parent)
@@ -1377,11 +1408,55 @@ void diff_tree_combined(const unsigned char *sha1,
 
/* find set of paths that everybody touches
 *
-* NOTE find_paths() also handles --stat, as it computes
-* diff(sha1,parent_i) for all i to do the job, specifically
-* for parent0.
+* NOTE
+*
+* Diffcore transformations are bound to diff_filespec and logic
+* comparing two entries - i.e. they do not apply directly to combine
+* diff.
+*
+* If some of such transformations is requested - we launch generic
+* path scanning, which works significantly slower compared to
+* simultaneous all-trees-in-one-go scan in find_paths_multitree().
+*
+* TODO some of the filters could be ported to work on
+* combine_diff_paths - i.e. all

[PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well

2014-02-13 Thread Kirill Smelkov
Previously diff_tree(), which is now named __diff_tree_sha1(), was
generating diff_filepair(s) for two trees t1 and t2, and that was
usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes
a commit introduces.

In Git, however, we have fundamentally built flexibility in that a
commit can have many parents - 1 for a plain commit, 2 for a simple merge,
but also more than 2 for merging several heads at once.

For merges there is a so called combine-diff, which shows diff, a merge
introduces by itself, omitting changes done by any parent. That works
through first finding paths, that are different to all parents, and then
showing generalized diff, with separate columns for +/- for each parent.
The code lives in combine-diff.c .

There is an impedance mismatch, however, in that a commit could
generally have any number of parents, and that while diffing trees, we
divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there
is no special casing for multiple parents commits in e.g.
revision-walker .

That impedance mismatch *hurts* *performance* *badly* for generating
combined diffs - in c839f1bd (combine-diff: optimize combine_diff_path
sets intersection) I've already removed some slowness from it, but from
the timings provided there, it could be seen, that combined diffs still
cost more than an order of magnitude more cpu time, compared to diff for
usual commits, and that would only be an optimistic estimate, if we take
into account that for e.g. linux.git there is only one merge for several
dozens of plain commits.

That slowness comes from the fact that currently, while generating
combined diff, a lot of time is spent computing diff(commit,commit^2)
just to only then intersect that huge diff to almost small set of files
from diff(commit,commit^1).

That's because at present, to compute combine-diff, for first finding
paths, that "every parent touches", we use the following combine-diff
property/definition:

D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn)  (w.r.t. paths)

where

D(A,P1...Pn) is combined diff between commit A, and parents Pi

and

D(A,Pi) is usual two-tree diff Pi..A

So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n
1-parent diffs and intersecting results will be slow.

And usually, for linux.git and other topic-based workflows, that
D(A,P2) is huge, because, if merge-base of A and P2, is several dozens
of merges (from A, via first parent) below, that D(A,P2) will be diffing
sum of merges from several subsystems to 1 subsystem.

The solution is to avoid computing n 1-parent diffs, and to find
changed-to-all-parents paths via scanning A's and all Pi's trees
simultaneously, at each step comparing their entries, and based on that
comparison, populate paths result, and deduce we could *skip*
*recursing* into subdirectories, if at least for 1 parent, sha1 of that
dir tree is the same as in A. That would save us from doing significant
amount of needless work.

Such approach is very similar to what diff_tree() does, only there we
deal with scanning only 2 trees simultaneously, and for n+1 tree, the
logic is a bit more complex:

D(A,X1...Xn) calculation scheme
---

D(A,X1...Xn) = D(A,X1) ^ ... ^ D(A,Xn)   (regarding resulting paths set)

 D(A,Xj) - diff between A..Xj
 D(A,X1...Xn)- combined diff from A to parents X1,...,Xn

We start from all trees, which are sorted, and compare their entries in
lock-step:

  A X1   Xn
  - --
 |a|   |x1| |xn|
 |-|   |--| ... |--|  i = argmin(x1...xn)
 | |   |  | |  |
 |-|   |--| |--|
 |.|   |. | |. |
  . ..
  . ..

at any time there could be 3 cases:

 1)  a < xi;
 2)  a > xi;
 3)  a = xi.

Schematic deduction of what every case means, and what to do, follows:

1)  a < xi  ->  ∀j a ∉ Xj  ->  "+a" ∈ D(A,Xj)  ->  D += "+a";  a↓

2)  a > xi

2.1) ∃j: xj > xi  ->  "-xi" ∉ D(A,Xj)  ->  D += ø;  ∀ xk=xi  xk↓
2.2) ∀j  xj = xi  ->  xj ∉ A  ->  "-xj" ∈ D(A,Xj)  ->  D += "-xi";  ∀j xj↓

3)  a = xi

3.1) ∃j: xj > xi  ->  "+a" ∈ D(A,Xj)  ->  only xk=xi remains to investigate
3.2) xj = xi  ->  investigate δ(a,xj)
 |
 |
 v

3.1+3.2) looking at δ(a,xk) ∀k: xk=xi - if all != ø  ->

  ⎧δ(a,xk)  - if xk=xi
 ->  D += ⎨
  ⎩"+a" - if xk>xi

in any case a↓  ∀ xk=xi  xk↓

~

For comparison, here is how diff_tree() works:

D(A,B) calculation scheme
-

A B
- -
   |a|   |b|a < b   ->  a ∉ B   ->   D(A,B) +=  +aa↓
   |-|   |-|a > b   ->  b ∉ A   ->   D(A,B) +=  -bb↓
   | |   | |a = b   ->  investigate δ(a,b)a↓ b↓
   |-|   |-|
   |.|   |.|
. .
. .



This patch generalizes diff tree-walker to work with arbitrary number of
parents as described above - i.e. now there is a resulting tree t, and
some parents tree

Re: [PATCH 00/11] More preparatory work for multiparent tree-walker

2014-02-13 Thread Kirill Smelkov
On Wed, Feb 12, 2014 at 09:25:51AM -0800, Junio C Hamano wrote:
> Junio C Hamano  writes:
> 
> > Kirill Smelkov  writes:
> >
> >> Sorry for the confusion. Could you please do the following:
> >>
> >> Patches should be applied over to ks/tree-diff-walk
> >> (74aa4a18). Before applying the patches, please cherry-pick
> >>
> >> c90483d9(tree-diff: no need to manually verify that there is no
> >>  mode change for a path)
> >>
> >> ef4f0928(tree-diff: no need to pass match to
> >>  skip_uninteresting())
> >>
> >> into that branch, and drop them from ks/combine-diff - we'll have one
> >> branch reworking the diff tree-walker, and the other taking advantage of
> >> it for combine-diff.
> >
> > As long as that does not lose changes to tests and clean-ups, I'm
> > fine with that direction.  For example, I do not know if you want to
> > lose e3f62d12 (diffcore-order: export generic ordering interface,
> > 2014-01-20), which is part of the combine-diff topic.
> 
> Ahh, sorry, I misread the "drop" as "salvage these two and drop the
> rest".  The new series does apply cleanly on a commit in master..pu
> that has both ks/tree-diff-walk and ks/combine-diff, and the latter
> is not yet in 'next' so we are free to reorganize.
> 
> Let me flip the latter topic around, also queue these updates and
> push the result out on 'pu'.
> 
> Thanks.

Thank you. As we've managed to apply this to pu, I've send the final
speedup patches. Please review them as time permits.

Thanks beforehand,
Kirill
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/11] tree-diff: consolidate code for emitting diffs and recursion in one place

2014-02-13 Thread Junio C Hamano
Kirill Smelkov  writes:

> +static void show_path(struct strbuf *base, struct diff_options *opt,
> +   struct tree_desc *t1, struct tree_desc *t2)
>  {
>   unsigned mode;
>   const char *path;
> - const unsigned char *sha1 = tree_entry_extract(desc, &path, &mode);
> - int pathlen = tree_entry_len(&desc->entry);
> + const unsigned char *sha1;
> + int pathlen;
>   int old_baselen = base->len;
> + int isdir, recurse = 0, emitthis = 1;
> +
> + /* at least something has to be valid */
> + assert(t1 || t2);
> +
> + if (t2) {
> + /* path present in resulting tree */
> + sha1 = tree_entry_extract(t2, &path, &mode);
> + pathlen = tree_entry_len(&t2->entry);
> + isdir = S_ISDIR(mode);
> + }
> + else {
> + /* a path was removed - take path from parent. Also take
> +  * mode from parent, to decide on recursion.
> +  */
> + tree_entry_extract(t1, &path, &mode);
> + pathlen = tree_entry_len(&t1->entry);
> +
> + isdir = S_ISDIR(mode);
> + sha1 = NULL;
> + mode = 0;
> + }
> +
> + if (DIFF_OPT_TST(opt, RECURSIVE) && isdir) {
> + recurse = 1;
> + emitthis = DIFF_OPT_TST(opt, TREE_IN_RECURSIVE);
> + }
>  
>   strbuf_add(base, path, pathlen);
> - if (DIFF_OPT_TST(opt, RECURSIVE) && S_ISDIR(mode)) {
> - if (DIFF_OPT_TST(opt, TREE_IN_RECURSIVE))
> - opt->add_remove(opt, *prefix, mode, sha1, 1, base->buf, 
> 0);
>  
> + if (emitthis)
> + emit_diff(opt, base, t1, t2);
> +
> + if (recurse) {
>   strbuf_addch(base, '/');
> - diff_tree_sha1(*prefix == '-' ? sha1 : NULL,
> -*prefix == '+' ? sha1 : NULL, base->buf, opt);
> - } else
> - opt->add_remove(opt, prefix[0], mode, sha1, 1, base->buf, 0);
> + diff_tree_sha1(t1 ? t1->entry.sha1 : NULL,
> +t2 ? t2->entry.sha1 : NULL, base->buf, opt);
> + }


After this step, "sha1" is assigned but never gets used.  Please
double-check the fix-up I queued in the series before merging it to
'pu'.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/11] tree-diff: consolidate code for emitting diffs and recursion in one place

2014-02-13 Thread Kirill Smelkov
On Thu, Feb 13, 2014 at 09:43:27AM -0800, Junio C Hamano wrote:
> Kirill Smelkov  writes:
> 
> > +static void show_path(struct strbuf *base, struct diff_options *opt,
> > + struct tree_desc *t1, struct tree_desc *t2)
> >  {
> > unsigned mode;
> > const char *path;
> > -   const unsigned char *sha1 = tree_entry_extract(desc, &path, &mode);
> > -   int pathlen = tree_entry_len(&desc->entry);
> > +   const unsigned char *sha1;
> > +   int pathlen;
> > int old_baselen = base->len;
> > +   int isdir, recurse = 0, emitthis = 1;
> > +
> > +   /* at least something has to be valid */
> > +   assert(t1 || t2);
> > +
> > +   if (t2) {
> > +   /* path present in resulting tree */
> > +   sha1 = tree_entry_extract(t2, &path, &mode);
> > +   pathlen = tree_entry_len(&t2->entry);
> > +   isdir = S_ISDIR(mode);
> > +   }
> > +   else {
> > +   /* a path was removed - take path from parent. Also take
> > +* mode from parent, to decide on recursion.
> > +*/
> > +   tree_entry_extract(t1, &path, &mode);
> > +   pathlen = tree_entry_len(&t1->entry);
> > +
> > +   isdir = S_ISDIR(mode);
> > +   sha1 = NULL;
> > +   mode = 0;
> > +   }
> > +
> > +   if (DIFF_OPT_TST(opt, RECURSIVE) && isdir) {
> > +   recurse = 1;
> > +   emitthis = DIFF_OPT_TST(opt, TREE_IN_RECURSIVE);
> > +   }
> >  
> > strbuf_add(base, path, pathlen);
> > -   if (DIFF_OPT_TST(opt, RECURSIVE) && S_ISDIR(mode)) {
> > -   if (DIFF_OPT_TST(opt, TREE_IN_RECURSIVE))
> > -   opt->add_remove(opt, *prefix, mode, sha1, 1, base->buf, 
> > 0);
> >  
> > +   if (emitthis)
> > +   emit_diff(opt, base, t1, t2);
> > +
> > +   if (recurse) {
> > strbuf_addch(base, '/');
> > -   diff_tree_sha1(*prefix == '-' ? sha1 : NULL,
> > -  *prefix == '+' ? sha1 : NULL, base->buf, opt);
> > -   } else
> > -   opt->add_remove(opt, prefix[0], mode, sha1, 1, base->buf, 0);
> > +   diff_tree_sha1(t1 ? t1->entry.sha1 : NULL,
> > +  t2 ? t2->entry.sha1 : NULL, base->buf, opt);
> > +   }
> 
> 
> After this step, "sha1" is assigned but never gets used.  Please
> double-check the fix-up I queued in the series before merging it to
> 'pu'.

Your fixup is correct - it was my overlook when preparing the patch -
sha1 is needed for later patch (multiparent diff tree-walker), but I've
mistakenly left it here.

The two interesting patches I've sent you today, are already adjusted to
this correction - it is safe to squash the fixup in.

Thanks for noticing,
Kirill
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] for-each-ref: add option to omit newlines

2014-02-13 Thread Øystein Walle
Even when having specified a format string for-each-ref still prints a
newline after printing each ref according to the format. This breaks
splitting the output on newlines, for example.

Signed-off-by: Øystein Walle 
---
I was somewhat surprised by this behaviour; I expected to be in full control of
the output but I could not get rid of the newlines.

As far as I'm *personally* concerned the line "putchar('\n');" could just as
well be removed completely and an '\n' appended to the default format string.
But since this command (and this behaviour) as been around since 2006 I assume
that it's in Git's best interest to not change the default behaviour. Hence the
introduction of this option. Although I was taken aback by this behaviour, is
it patch-worthy? The fix isn't very pretty.

On to the patch itself: I contemplated putting '\n' in the default format and
removing it if -n was given, which would get rid of the need to pass an exta
argument to show_ref(). But that means we would need to *insert it* when a
format is given and -n is not...

I've changed how the default format is assigned so that we can check if it's
NULL after option parsing to spit out an error. Alternatively we could just
allow it.

 Documentation/git-for-each-ref.txt |  3 +++
 builtin/for-each-ref.c | 20 +++-
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt 
b/Documentation/git-for-each-ref.txt
index 4240875..805914a 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -62,6 +62,9 @@ OPTIONS
the specified host language.  This is meant to produce
a scriptlet that can directly be `eval`ed.
 
+-n::
+--no-newlines::
+   Do not print a newline after printing each formatted ref.
 
 FIELD NAMES
 ---
diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 51798b4..32d6b12 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -998,7 +998,8 @@ static void emit(const char *cp, const char *ep)
}
 }
 
-static void show_ref(struct refinfo *info, const char *format, int quote_style)
+static void show_ref(struct refinfo *info, const char *format, int quote_style,
+int no_newlines)
 {
const char *cp, *sp, *ep;
 
@@ -1023,7 +1024,8 @@ static void show_ref(struct refinfo *info, const char 
*format, int quote_style)
resetv.s = color;
print_value(&resetv, quote_style);
}
-   putchar('\n');
+   if (!no_newlines)
+   putchar('\n');
 }
 
 static struct ref_sort *default_sort(void)
@@ -1067,9 +1069,9 @@ static char const * const for_each_ref_usage[] = {
 int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 {
int i, num_refs;
-   const char *format = "%(objectname) %(objecttype)\t%(refname)";
+   const char *format = NULL;
struct ref_sort *sort = NULL, **sort_tail = &sort;
-   int maxcount = 0, quote_style = 0;
+   int maxcount = 0, quote_style = 0, no_newlines = 0;
struct refinfo **refs;
struct grab_ref_cbdata cbdata;
 
@@ -1082,6 +1084,8 @@ int cmd_for_each_ref(int argc, const char **argv, const 
char *prefix)
N_("quote placeholders suitably for python"), 
QUOTE_PYTHON),
OPT_BIT(0 , "tcl",  "e_style,
N_("quote placeholders suitably for tcl"), QUOTE_TCL),
+   OPT_BOOL('n' , "no-newlines",  &no_newlines,
+   N_("do not output a newline after each ref")),
 
OPT_GROUP(""),
OPT_INTEGER( 0 , "count", &maxcount, N_("show only  matched 
refs")),
@@ -1100,6 +1104,12 @@ int cmd_for_each_ref(int argc, const char **argv, const 
char *prefix)
error("more than one quoting style?");
usage_with_options(for_each_ref_usage, opts);
}
+   if (no_newlines && !format) {
+   error("--no-newlines without --format does not make sense");
+   usage_with_options(for_each_ref_usage, opts);
+   }
+   if (!format)
+   format = "%(objectname) %(objecttype)\t%(refname)";
if (verify_format(format))
usage_with_options(for_each_ref_usage, opts);
 
@@ -1120,6 +1130,6 @@ int cmd_for_each_ref(int argc, const char **argv, const 
char *prefix)
if (!maxcount || num_refs < maxcount)
maxcount = num_refs;
for (i = 0; i < maxcount; i++)
-   show_ref(refs[i], format, quote_style);
+   show_ref(refs[i], format, quote_style, no_newlines);
return 0;
 }
-- 
1.8.5

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] for-each-ref: add option to omit newlines

2014-02-13 Thread Øystein Walle
Øystein Walle  gmail.com> writes:

> 
> splitting the output on newlines, for example.
> 

Ugh, I mean on null bytes, naturally.

Øsse.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Why is there no execution of the pre-commit hook while rebasing?

2014-02-13 Thread Matthias Beyer
Hi,

I'm wondering why there is no execution of the pre-commit hook when
rebasing.

When I do a rebase of N commits (say, 3 or so) onto another branch,
each of the rebased commits gets "recommitted" to the branch, am I
right?

If yes, wouldn't it make sense to run the pre-commit hook after each
applied commit?

Just wondering...

-- 
Mit freundlichen Grüßen,
Kind regards,
Matthias Beyer

Proudly sent with mutt.
Happily signed with gnupg.


pgpjBB2ibZKuU.pgp
Description: PGP signature


Re: Make the git codebase thread-safe

2014-02-13 Thread Stefan Zager
On Thu, Feb 13, 2014 at 12:27 AM, Johannes Sixt  wrote:
> Am 2/12/2014 20:30, schrieb Stefan Zager:
>> On Wed, Feb 12, 2014 at 11:22 AM, Karsten Blees  
>> wrote:
>>> Am 12.02.2014 19:37, schrieb Erik Faye-Lund:
 ReOpenFile, that's fantastic. Thanks a lot!
>>>
>>> ...but should be loaded dynamically via GetProcAddress, or are we ready to 
>>> drop XP support?
>>
>> Right, that is an issue.  From our perspective, it's well past time to
>> drop XP support.
>
> Not from mine.

All this really means is that the build config will test WIN_VER, and
there will need to be an additional binary distribution of msysgit for
newer Windows.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-13 Thread Zachary Turner
Karsten Blees  gmail.com> writes:

> 
> Am 12.02.2014 19:37, schrieb Erik Faye-Lund:
> > On Wed, Feb 12, 2014 at 7:34 PM, Stefan Zager  google.com> 
wrote:
> >> On Wed, Feb 12, 2014 at 10:27 AM, Erik Faye-Lund  
gmail.com> wrote:
> >>> On Wed, Feb 12, 2014 at 7:20 PM, Stefan Zager  google.com> 
wrote:
> 
>  I don't want to steal the thunder of my coworker, who wrote the
>  implementation.  He plans to submit it upstream soon-ish.  It relies
>  on using the lpOverlapped argument to ReadFile(), with some 
additional
>  tomfoolery to make sure that the implicit position pointer for the
>  file descriptor doesn't get modified.
> >>>
> >>> Is the code available somewhere? I'm especially interested in the
> >>> "additional tomfoolery to make sure that the implicit position pointer
> >>> for the file descriptor doesn't get modified"-part, as this was what I
> >>> ended up butting my head into when trying to do this myself.
> >>
> >> https://chromium-review.googlesource.com/#/c/186104/
> > 
> > ReOpenFile, that's fantastic. Thanks a lot!
> 
> ...but should be loaded dynamically via GetProcAddress, or are we ready to 
drop XP support?
> 
> 

Original patch author here.  In trying to prepare this patch to use 
GetProcAddress to load dynamically, I've run into a bit of a snag.  
NO_THREAD_SAFE_PREAD is a compile-time flag, which will be incompatible with 
any attempt to make this a runtime decision a la LoadLibrary / 
GetProcAddress.  On XP, we would need to fallback to the single-threaded 
path, and on Vista+ we would use the thread-able path, and obviously this 
decision could not be made until runtime.

If MinGW were the only configuration using NO_THREAD_SAFE_PREAD, I would 
just remove it entirely, but it appears Cygwin configuration uses it also.

Suggestions?  

One possibility is to disallow (by convention, perhaps), the use of pread() 
and read() against the same fd.  The only reason ReOpenFile is necessary at 
all is because some code somewhere is mixing read-styles against the same 
fd.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] contrib/diff-highlight: multibyte characters diff

2014-02-13 Thread Yoshihiro Sugi
Thanks for reviewing.
as you wrote, diff content may not be utf8 at all. and we don't know that the 
user's terminal watns is utf8.
I think your trying utf8 decode and fall back approach is better than my patch, 
and do work well.

is using "$@" for catching error like the patch below?
According to perldoc Encode.pm, encode/decode with "FB_CROAK" may destroy 
original string. We should probabry use "LEAVE_SRC" on decode_utf8's second 
argument.

---
 contrib/diff-highlight/diff-highlight | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/contrib/diff-highlight/diff-highlight 
b/contrib/diff-highlight/diff-highlight
index c4404d4..0743851 100755
--- a/contrib/diff-highlight/diff-highlight
+++ b/contrib/diff-highlight/diff-highlight
@@ -2,6 +2,7 @@
 
 use warnings FATAL => 'all';
 use strict;
+use Encode qw(decode_utf8 encode_utf8);
 
 # Highlight by reversing foreground and background. You could do
 # other things like bold or underline if you prefer.
@@ -73,13 +74,23 @@ sub show_hunk {
 
my @queue;
for (my $i = 0; $i < @$a; $i++) {
-   my ($rm, $add) = highlight_pair($a->[$i], $b->[$i]);
-   print $rm;
-   push @queue, $add;
+   my ($a_dec, $encode_rm) = decode($a->[$i]);
+   my ($b_dec, $encode_add) = decode($b->[$i]);
+   my ($rm, $add) = highlight_pair($a_dec, $b_dec);
+   print $encode_rm->($rm);
+   push @queue, $encode_add->($add);
}
print @queue;
 }
 
+sub decode {
+   my $orig = shift;
+   my $decoded = eval { decode_utf8($orig, Encode::FB_CROAK | 
Encode::LEAVE_SRC) };
+   return $@ ?
+  ($orig, sub { shift }) :
+  ($decoded, sub { encode_utf8(shift) });
+}
+
 sub highlight_pair {
my @a = split_line(shift);
my @b = split_line(shift);
-- 
1.8.5.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tests: turn on network daemon tests by default

2014-02-13 Thread Junio C Hamano
Junio C Hamano  writes:

> Jeff King  writes:
>
>>   test_normalize_tristate GIT_TEST_DAEMON
>
> Heh, great minds think alike.  This is what I am playing with,
> without committing (because I do like your "ask config if this is a
> kind of various boolean 'false' representations, which I haven't
> managed to add to it).

And this is with the "ask config" helper.

Two tangents.

 - We may want to do something similar in cvsserver and git-gui to
   make them more robust.

   $ git grep -e true --and -e 1 --and -e yes

 - Do we want to do something similar to GIT_TEST_CREDENTIAL_HELPER?

-- >8 --
From: Jeff King 
Date: Mon, 10 Feb 2014 16:29:37 -0500
Subject: [PATCH] tests: turn on network daemon tests by default

We do not run the httpd nor git-daemon tests by default, as
they are rather heavyweight and require network access
(albeit over localhost). However, it would be nice if more
pepole ran them, for two reasons:

  1. We would get more test coverage on more systems.

  2. The point of the test suite is to find regressions. It
 is very easy to change some of the underlying code and
 break the httpd code without realizing you are even
 affecting it. Running the httpd tests helps find these
 problems sooner (ideally before the patches even hit
 the list).

We still want to leave an "out", though, for people who really do
not want to run them. For that reason, the GIT_TEST_HTTPD and
GIT_TEST_GIT_DAEMON variables are now tri-state booleans
(true/false/auto), so you can say GIT_TEST_HTTPD=false to turn the
tests back off.  To support those who want a stable single way to
disable these tests across versions of Git before and after this
change, an empty string explicitly set to these variables is also
taken as "false", so the behaviour changes only for those who:

  a. did not express any preference by leaving these variables
 unset.  They did not test these features before, but now they
 do; or

  b. did express that they want to test these features by setting
 GIT_TEST_FEATURE=false (or any equivalent other ways to tell
 "false" to Git, e.g. "0"), which has been a valid but funny way
 to say that they do want to test the feature only because we
 used to interpret any non-empty string to mean "yes please
 test".  They no longer test that feature.

In addition, we are forgiving of common setup failures (e.g., you do
not have apache installed, or have an old version) when the
tri-state is "auto" (or empty), but report an error when it is
"true". This makes "auto" a sane default, as we should not cause
failures on setups where the tests cannot run. But it allows people
who use "true" to catch regressions in their system (e.g., they
uninstalled apache, but were expecting their automated test runs to
test git-httpd, and would want to be notified).

Signed-off-by: Jeff King 
Signed-off-by: Junio C Hamano 
---
 t/lib-git-daemon.sh |  8 ---
 t/lib-httpd.sh  | 22 +--
 t/test-lib-functions.sh | 58 +
 3 files changed, 74 insertions(+), 14 deletions(-)

diff --git a/t/lib-git-daemon.sh b/t/lib-git-daemon.sh
index 394b06b..615bf5d 100644
--- a/t/lib-git-daemon.sh
+++ b/t/lib-git-daemon.sh
@@ -16,9 +16,10 @@
 #  stop_git_daemon
 #  test_done
 
-if test -z "$GIT_TEST_GIT_DAEMON"
+test_tristate GIT_TEST_GIT_DAEMON
+if test "$GIT_TEST_GIT_DAEMON" = false
 then
-   skip_all="git-daemon testing disabled (define GIT_TEST_GIT_DAEMON to 
enable)"
+   skip_all="git-daemon testing disabled (unset GIT_TEST_GIT_DAEMON to 
enable)"
test_done
 fi
 
@@ -58,7 +59,8 @@ start_git_daemon() {
kill "$GIT_DAEMON_PID"
wait "$GIT_DAEMON_PID"
trap 'die' EXIT
-   error "git daemon failed to start"
+   test_skip_or_die $GIT_TEST_GIT_DAEMON \
+   "git daemon failed to start"
fi
 }
 
diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh
index bfdff2a..f9c2e22 100644
--- a/t/lib-httpd.sh
+++ b/t/lib-httpd.sh
@@ -30,9 +30,10 @@
 # Copyright (c) 2008 Clemens Buchacher 
 #
 
-if test -z "$GIT_TEST_HTTPD"
+test_tristate GIT_TEST_HTTPD
+if test "$GIT_TEST_HTTPD" = false
 then
-   skip_all="Network testing disabled (define GIT_TEST_HTTPD to enable)"
+   skip_all="Network testing disabled (unset GIT_TEST_HTTPD to enable)"
test_done
 fi
 
@@ -76,8 +77,7 @@ GIT_VALGRIND_OPTIONS=$GIT_VALGRIND_OPTIONS; export 
GIT_VALGRIND_OPTIONS
 
 if ! test -x "$LIB_HTTPD_PATH"
 then
-   skip_all="skipping test, no web server found at '$LIB_HTTPD_PATH'"
-   test_done
+   test_skip_or_die $GIT_TEST_HTTPD "no web server found at 
'$LIB_HTTPD_PATH'"
 fi
 
 HTTPD_VERSION=`$LIB_HTTPD_PATH -v | \
@@ -89,19 +89,20 @@ then
then
if ! test $HTTPD_VERSION -ge 2
then
-   skip_all="skipping test, at least Apache version 2 is 
required"
-   test_d

Re: [PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well

2014-02-13 Thread Junio C Hamano
Kirill Smelkov  writes:

> + /* until we go to it next round, .next holds how many bytes we
> +  * allocated (for faster realloc - we don't need copying old 
> data).
> +  */
> + p->next = (struct combine_diff_path *)alloclen;

I am getting this here:

tree-diff.c: In function '__path_appendnew':
tree-diff.c:140:13: error: cast to pointer from integer of different size 
[-Werror=int-to-pointer-cast]
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] combine-diff: speed it up, by using multiparent diff tree-walker directly

2014-02-13 Thread Junio C Hamano
Kirill Smelkov  writes:

> + if (need_generic_pathscan) {
> + /* NOTE generic case also handles --stat, as it computes
> +  * diff(sha1,parent_i) for all i to do the job, specifically
> +  * for parent0.
> +  */
> + paths = find_paths_generic(sha1, parents, &diffopts);
> + }
> + else {
> + paths = find_paths_multitree(sha1, parents, &diffopts);
> +
> + /* show stat against the first parent even
> +  * when doing combined diff.
> +  */
> + int stat_opt = (opt->output_format &
> + (DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT));

/*
 * We see decl-after-stmt here.
 * Also please have slash-asterisk and asterisk-slash
 * at the beginning and the end of a multi-line comment
 * block on their own line.
 */
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] for-each-ref: add option to omit newlines

2014-02-13 Thread Junio C Hamano
Øystein Walle  writes:

> On to the patch itself: I contemplated putting '\n' in the default format and
> removing it if -n was given, which would get rid of the need to pass an exta
> argument to show_ref(). But that means we would need to *insert it* when a
> format is given and -n is not...

I would rather see us go in the direction to add "-z" output option,
which is what everybody else that produces NUL terminated entries in
our suite of subcommands does.

IOW, something along with this line (untested).

 builtin/for-each-ref.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 51798b4..2c8cac8 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -94,6 +94,7 @@ static const char **used_atom;
 static cmp_type *used_atom_type;
 static int used_atom_cnt, need_tagged, need_symref;
 static int need_color_reset_at_eol;
+static int line_termination = '\n';
 
 /*
  * Used to parse format string and sort specifiers
@@ -1023,7 +1024,7 @@ static void show_ref(struct refinfo *info, const char 
*format, int quote_style)
resetv.s = color;
print_value(&resetv, quote_style);
}
-   putchar('\n');
+   putchar(line_termination);
 }
 
 static struct ref_sort *default_sort(void)
@@ -1088,6 +1089,9 @@ int cmd_for_each_ref(int argc, const char **argv, const 
char *prefix)
OPT_STRING(  0 , "format", &format, N_("format"), N_("format to 
use for the output")),
OPT_CALLBACK(0 , "sort", sort_tail, N_("key"),
N_("field name to sort on"), &opt_parse_sort),
+   OPT_SET_INT('z', NULL, &line_termination,
+   N_("Use NUL instead of LF to end each output 
records"),
+   '\0'),
OPT_END(),
};
 
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] t5537: move http tests out to t5539

2014-02-13 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> start_httpd is supposed to be at the beginning of the test file, not
> the middle of it. The "test_seq" line in "no shallow lines.." test is
> updated to compensate missing refs that are there in t5537, but not in
> the new t5539.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  On Thu, Feb 13, 2014 at 8:22 AM, Duy Nguyen  wrote:
>  > On Thu, Feb 13, 2014 at 5:12 AM, Jeff King  wrote:
>  >> lib-httpd was never designed to be included from anywhere except the
>  >> beginning of the file. But that wouldn't be right for t5537, because it
>  >> wants to run some of the tests, even if apache setup fails. The right
>  >> way to do it is probably to have lib-httpd do all of its work in a lazy
>  >> prereq. I don't know how clunky that will end up, though; it might be
>  >> simpler to just move the shallow http test into one of the http-fetch
>  >> scripts.
>  >
>  > I'll move it out later.
>  
>  Here it is, on top of nd/http-fetch-shallow-fix because the new test
>  in t5537 is picky and a simple merge resolution wouldn't do it.

Will queue; thanks.

>
>  t/t5537-fetch-shallow.sh   | 57 ---
>  t/t5539-fetch-http-shallow.sh (new +x) | 82 
> ++
>  2 files changed, 82 insertions(+), 57 deletions(-)
>
> diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh
> index 098f220..3ae9092 100755
> --- a/t/t5537-fetch-shallow.sh
> +++ b/t/t5537-fetch-shallow.sh
> @@ -173,61 +173,4 @@ EOF
>   )
>  '
>  
> -if test -n "$NO_CURL" -o -z "$GIT_TEST_HTTPD"; then
> - say 'skipping remaining tests, git built without http support'
> - test_done
> -fi
> -
> -. "$TEST_DIRECTORY"/lib-httpd.sh
> -start_httpd
> -
> -test_expect_success 'clone http repository' '
> - git clone --bare --no-local shallow 
> "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
> - git clone $HTTPD_URL/smart/repo.git clone &&
> - (
> - cd clone &&
> - git fsck &&
> - git log --format=%s origin/master >actual &&
> - cat  -7
> -6
> -5
> -4
> -3
> -EOF
> - test_cmp expect actual
> - )
> -'
> -
> -# This test is tricky. We need large enough "have"s that fetch-pack
> -# will put pkt-flush in between. Then we need a "have" the server
> -# does not have, it'll send "ACK %s ready"
> -test_expect_success 'no shallow lines after receiving ACK ready' '
> - (
> - cd shallow &&
> - for i in $(test_seq 10)
> - do
> - git checkout --orphan unrelated$i &&
> - test_commit unrelated$i &&
> - git push -q "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
> - refs/heads/unrelated$i:refs/heads/unrelated$i &&
> - git push -q ../clone/.git \
> - refs/heads/unrelated$i:refs/heads/unrelated$i ||
> - exit 1
> - done &&
> - git checkout master &&
> - test_commit new &&
> - git push  "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" master
> - ) &&
> - (
> - cd clone &&
> - git checkout --orphan newnew &&
> - test_commit new-too &&
> - GIT_TRACE_PACKET="$TRASH_DIRECTORY/trace" git fetch --depth=2 &&
> - grep "fetch-pack< ACK .* ready" ../trace &&
> - ! grep "fetch-pack> done" ../trace
> - )
> -'
> -
> -stop_httpd
>  test_done
> diff --git a/t/t5539-fetch-http-shallow.sh b/t/t5539-fetch-http-shallow.sh
> new file mode 100755
> index 000..94553e1
> --- /dev/null
> +++ b/t/t5539-fetch-http-shallow.sh
> @@ -0,0 +1,82 @@
> +#!/bin/sh
> +
> +test_description='fetch/clone from a shallow clone over http'
> +
> +. ./test-lib.sh
> +
> +if test -n "$NO_CURL"; then
> + skip_all='skipping test, git built without http support'
> + test_done
> +fi
> +
> +. "$TEST_DIRECTORY"/lib-httpd.sh
> +start_httpd
> +
> +commit() {
> + echo "$1" >tracked &&
> + git add tracked &&
> + git commit -m "$1"
> +}
> +
> +test_expect_success 'setup shallow clone' '
> + commit 1 &&
> + commit 2 &&
> + commit 3 &&
> + commit 4 &&
> + commit 5 &&
> + commit 6 &&
> + commit 7 &&
> + git clone --no-local --depth=5 .git shallow &&
> + git config --global transfer.fsckObjects true
> +'
> +
> +test_expect_success 'clone http repository' '
> + git clone --bare --no-local shallow 
> "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
> + git clone $HTTPD_URL/smart/repo.git clone &&
> + (
> + cd clone &&
> + git fsck &&
> + git log --format=%s origin/master >actual &&
> + cat  +7
> +6
> +5
> +4
> +3
> +EOF
> + test_cmp expect actual
> + )
> +'
> +
> +# This test is tricky. We need large enough "have"s that fetch-pack
> +# will put pkt-flush in between. Then we need a "have" the server
> +# does not have, it'll send "ACK %s ready"
> +test_expect_success 'no shallow lines after receiving ACK

Re: [PATCH] release notes: typo fixes

2014-02-13 Thread Junio C Hamano
Michael J Gruber  writes:

> Signed-off-by: Michael J Gruber 
> ---
> Just a few things I spotted while trying to keep myself informed :)

Thanks.  Very much appreciated.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git GSoC 2014

2014-02-13 Thread Thomas Rast
Jeff King  writes:

>   - somebody to be the backup admin (I am assuming I'll be the primary
> admin, but as always, if somebody else wants to...)

I can be backup, if Shawn doesn't want it.

>   - ideas ideas ideas

Here's my moonshot:

--- 8< ---
Replace object loading/writing layer by libgit2

Git reads objects from storage (loose and packed) through functions in
sha1_file.c.  Most commands only require very simple, opaque read and
write access to the object storage.  As a weatherballoon, show that it
is feasible to use libgit2 git_odb_* routines for these simple callers.

Aim for passing the git test suite using git_odb_* object storage
access, except for tests that verify behavior in the face of storage
corruption, replacement objects, alternate storage locations, and
similar quirks.  Of course it is even better if you pass the test suite
without exception.

Language: C
Difficulty: hard
Possible mentors: Thomas Rast and 
--- >8 ---

That absolutely requires a co-mentor from the libgit2 side to do,
however.  Perhaps you could talk someone into it? ;-)

Motivation: I believe that migrating to libgit2 is the better approach,
medium term, than rewriting everything ourselves to be nice, clean and
thread-safe.  I took a shot a while ago at making the pack reading code
thread-safe, but it's adding mess when we could simply replace it all by
the already thread-safe libgit2 calls.  It also helps shake out
incompatibilities in libgit2.

Downside: not listing "code merged" as a goal may not make the project
as shiny, neither for Git nor for the student.

-- 
Thomas Rast
t...@thomasrast.ch
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[ANNOUNCE] Git v1.8.5.5

2014-02-13 Thread Junio C Hamano
The latest maintenance release Git v1.8.5.5 is now available at
the usual places.  Hopefully this will be the last update to the
1.8.5.x series.

The release tarballs are found at:

http://code.google.com/p/git-core/downloads/list

and their SHA-1 checksums are:

7bb4ea883b1f8f6f7f927035f85e8e27b57e0194  git-1.8.5.5.tar.gz
39dd7979c8757d2dc4bc3aaa82741ba93557d566  git-htmldocs-1.8.5.5.tar.gz
a4a2aef1440d4751f37c65359da57c9bd51a7beb  git-manpages-1.8.5.5.tar.gz

The following public repositories all have a copy of the v1.8.5.5
tag and the maint branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://code.google.com/p/git-core/
  url = git://git.sourceforge.jp/gitroot/git-core/git.git
  url = git://git-core.git.sourceforge.net/gitroot/git-core/git-core
  url = https://github.com/gitster/git

Also, http://www.kernel.org/pub/software/scm/git/ has copies of the
release tarballs.

Git v1.8.5.5 Release Notes
==

Fixes since v1.8.5.4


 * The pathspec matching code, while comparing two trees (e.g. "git
   diff A B -- path1 path2") was too aggressive and failed to match
   some paths when multiple pathspecs were involved.

 * "git repack --max-pack-size=8g" stopped being parsed correctly when
   the command was reimplemented in C.

 * A recent update to "git send-email" broke platforms where
   /etc/ssl/certs/ directory exists but cannot be used as SSL_ca_path
   (e.g. Fedora rawhide).

 * A handful of bugs around interpreting $branch@{upstream} notation
   and its lookalike, when $branch part has interesting characters,
   e.g. "@", and ":", have been fixed.

 * "git clone" would fail to clone from a repository that has a ref
   directly under "refs/", e.g. "refs/stash", because different
   validation paths do different things on such a refname.  Loosen the
   client side's validation to allow such a ref.

 * "git log --left-right A...B" lost the "leftness" of commits
   reachable from A when A is a tag as a side effect of a recent
   bugfix.  This is a regression in 1.8.4.x series.

 * "git merge-base --octopus" used to leave cleaning up suboptimal
   result to the caller, but now it does the clean-up itself.

 * "git mv A B/", when B does not exist as a directory, should error
   out, but it didn't.

Also contains typofixes, documentation updates and trivial code clean-ups.



Changes since v1.8.5.4 are as follows:

Andy Spencer (1):
  tree_entry_interesting: match against all pathspecs

Jeff King (9):
  fetch-pack: do not filter out one-level refs
  interpret_branch_name: factor out upstream handling
  interpret_branch_name: rename "cp" variable to "at"
  interpret_branch_name: always respect "namelen" parameter
  interpret_branch_name: avoid @{upstream} past colon
  interpret_branch_name: find all possible @-marks
  repack: fix typo in max-pack-size option
  repack: make parsed string options const-correct
  repack: propagate pack-objects options as strings

Junio C Hamano (5):
  merge-base: separate "--independent" codepath into its own helper
  merge-base --octopus: reduce the result from get_octopus_merge_bases()
  revision: mark contents of an uninteresting tree uninteresting
  revision: propagate flag bits from tags to pointees
  Git 1.8.5.5

Ruben Kerkhof (1):
  send-email: /etc/ssl/certs/ directory may not be usable as ca_path

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git GSoC 2014

2014-02-13 Thread Junio C Hamano
Thomas Rast  writes:

> Downside: not listing "code merged" as a goal may not make the project
> as shiny, neither for Git nor for the student.

I'd actually view that as an upside. This sounds like a good first
step for a feasibility study that is really necessary.

I wonder why the handling of storage corruption and replacement
could be left broken, though. Is that because libgit2 has known
breakages in these areas, or is there some other reason?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-13 Thread Karsten Blees
Am 13.02.2014 19:38, schrieb Zachary Turner:

> The only reason ReOpenFile is necessary at 
> all is because some code somewhere is mixing read-styles against the same 
> fd.
> 

I don't understand...ReadFile with OVERLAPPED parameter doesn't modify the 
HANDLE's file position, so you should be able to mix read()/pread() however you 
like (as long as read() is only called from one thread).

I tried without ReOpenFile and it seems to work like a charm, or am I missing 
something?

8<
ssize_t mingw_pread(int fd, void *buf, size_t count, off64_t offset)
{
DWORD bytes_read;
OVERLAPPED overlapped;
memset(&overlapped, 0, sizeof(overlapped));
overlapped.Offset = (DWORD) offset;
overlapped.OffsetHigh = (DWORD) (offset >> 32);

if (!ReadFile((HANDLE) _get_osfhandle(fd), buf, count, &bytes_read,
&overlapped)) {
errno = err_win_to_posix(GetLastError());
return -1;
}
return (ssize_t) bytes_read;
}

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-13 Thread Stefan Zager
On Thu, Feb 13, 2014 at 2:51 PM, Karsten Blees  wrote:
> Am 13.02.2014 19:38, schrieb Zachary Turner:
>
>> The only reason ReOpenFile is necessary at
>> all is because some code somewhere is mixing read-styles against the same
>> fd.
>>
>
> I don't understand...ReadFile with OVERLAPPED parameter doesn't modify the 
> HANDLE's file position, so you should be able to mix read()/pread() however 
> you like (as long as read() is only called from one thread).

That is, apparently, a bald-faced lie in the ReadFile API doc.  First
implementation didn't use ReOpenFile, and it crashed all over the
place.  ReOpenFile fixed it.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Make the git codebase thread-safe

2014-02-13 Thread Zachary Turner
To elaborate a little bit more, you can verify with a sample program
that ReadFile with OVERLAPPED does in fact modify the HANDLE's file
position.  The documentation doesn't actually state one way or
another.   My original attempt at a patch didn't have the ReOpenFile,
and we experienced regular read corruption.  We scratched our heads
over it for a bit, and then hypothesized that someone must be mixing
read styles, which led to this ReOpenFile workaround, which
incidentally also solved the corruption problems.  We wrote a similar
sample program to verify that when using ReOpenHandle, and changing
the file pointer of the duplicated handle, that the file pointer of
the original handle is not modified.

We did not actually try to identify the source of the mixed read
styles, but it seems like the only possible explanation.

On Thu, Feb 13, 2014 at 2:53 PM, Stefan Zager  wrote:
> On Thu, Feb 13, 2014 at 2:51 PM, Karsten Blees  
> wrote:
>> Am 13.02.2014 19:38, schrieb Zachary Turner:
>>
>>> The only reason ReOpenFile is necessary at
>>> all is because some code somewhere is mixing read-styles against the same
>>> fd.
>>>
>>
>> I don't understand...ReadFile with OVERLAPPED parameter doesn't modify the 
>> HANDLE's file position, so you should be able to mix read()/pread() however 
>> you like (as long as read() is only called from one thread).
>
> That is, apparently, a bald-faced lie in the ReadFile API doc.  First
> implementation didn't use ReOpenFile, and it crashed all over the
> place.  ReOpenFile fixed it.
>
> Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Make the global packed_git variable static to sha1_file.c.

2014-02-13 Thread Stefan Zager
I uploaded a new patch; a few comments inline below...

On Wed, Feb 12, 2014 at 1:19 PM, Junio C Hamano  wrote:
> sza...@chromium.org writes:
>
> Also I'd suggest s/pack_data_fn/collect_pack_data/ or something.
> "_fn" may be a good suffix for typedef'ed typename used in a
> callback function, but for a concrete function, it only adds noise
> without giving us anything new.

Fixed this everywhere.

>>  static struct cached_object *find_cached_object(const unsigned char *sha1)
>> @@ -468,7 +469,6 @@ static unsigned int pack_open_fds;
>>  static unsigned int pack_max_fds;
>>  static size_t peak_pack_mapped;
>>  static size_t pack_mapped;
>> -struct packed_git *packed_git;
>
> Hmm, any particular reason why only this variable and not others are
> moved up?

No, just need packed_git declared before use.  I moved all the static
variables up, for clarity.

>> + foreach_packed_git(find_pack_fn, NULL, &fpd);
>> + if (fpd.found_pack && !exclude &&
>> + (incremental ||
>> +  (local && fpd.found_non_local_pack) ||
>> +  (ignore_packed_keep && fpd.found_pack_keep)))
>> + return 0;
>
> When told to do --incremental, we used to return 0 from this
> function immediately once we find the object in one pack, without
> going thru the list of packs.  Now we let foreach to loop thru all
> of them and then return 0.  Does this difference matter?  A similar
> difference may exist for local/keep but I did not think it through.

Fixed.



Thanks,

Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git GSoC 2014

2014-02-13 Thread Ramkumar Ramachandra
Jeff King wrote:
>   - ideas ideas ideas

I'll throw in a few ideas from half-finished work.

1. Speed up git-rebase--am.sh

Currently, git-rebase--am.sh is really slow because it dumps each
patch to a file using git-format-patch, and picks it up to apply
subsequently using git-am. Find a way to speed this up, without
sacrificing safety. You can use the continuation features of
cherry-pick, and dump to file only to persist state in the case of a
failure.

Language: Shell script, C
Difficulty: Most of the difficulty lies in "what to do", not so much
"how to do it". Might require modifying cherry-pick to do additional
work on failure.

2. Invent new conflict style

As an alternative to the diff3 conflict style, invent a conflict style
that shows the original unpatched segment along with the raw patch
text. The user can then apply the patch by hand.

Language: C
Difficulty: Since it was first written, very few people have touched
the xdiff portion of the code. Since the area is very core to git, the
series will have to go through a ton of iterations.

3. Rewrite git-branch to use git-for-each-ref

For higher flexibility in command-line options and output format, use
git for-each-ref to re-implement git-branch. The first task is to grow
features that are in branch but not fer into fer (like --column,
--merged, --contains). The second task is to refactor fer so that an
external program can call into it.

Language: C
Difficulty: fer was never written with the idea of being reusable; it
therefore persists a lot of global state, and even leaks memory in
some places. Refactoring it to be more modern is definitely a
challenge.

4. Implement @{publish}
(I just can't find the time to finish this)

@{publish} is a feature like @{upstream}, showing the state of the
publish-point in the case of triangular workflows. Implement this
while sharing code with git-push, and polish it until the prompt shows
publish-state.

Language: C, Shell script
Difficulty: Once you figure out how to share code with git-push, this
task should be relatively straightforward.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Make the global packed_git variable static to sha1_file.c.

2014-02-13 Thread szager
This is a first step in making the codebase thread-safe.  By and
large, the operations which might benefit from threading are those
that work with pack files (e.g., checkout, blame), so the focus of
this patch is stop leaking the global list of pack files outside of
sha1_file.c.

The next step will be to control access to the list of pack files
with a mutex.  However, that alone is not enough to make pack file
access thread safe.  Even in a read-only operation, the window list
associated with each pack file will need to be controlled.
Additionally, the global counters in sha1_file.c will need to be
controlled.

This patch is a pure refactor with no functional changes, so it
shouldn't require any additional tests.  Adding the actual locks
will be a functional change, and will require additional tests.

Signed-off-by: Stefan Zager 
---
 builtin/count-objects.c  |  44 ++-
 builtin/fsck.c   |  46 +++-
 builtin/gc.c |  26 +++
 builtin/pack-objects.c   | 189 ---
 builtin/pack-redundant.c |  37 +++---
 cache.h  |  18 -
 fast-import.c|   4 +-
 http-backend.c   |  28 ---
 http-push.c  |   4 +-
 http-walker.c|   2 +-
 pack-revindex.c  |  20 ++---
 server-info.c|  36 -
 sha1_file.c  |  52 ++---
 sha1_name.c  |  18 -
 14 files changed, 327 insertions(+), 197 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index a7f70cb..a27c006 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -83,14 +83,32 @@ static char const * const count_objects_usage[] = {
NULL
 };
 
+struct pack_data {
+   unsigned long packed;
+   off_t size_pack;
+   unsigned long num_pack;
+};
+
+static int count_pack_objects(struct packed_git *p, void *data)
+{
+   struct pack_data *pd = (struct pack_data *) data;
+   if (p->pack_local && !open_pack_index(p)) {
+   pd->packed += p->num_objects;
+   pd->size_pack += p->pack_size + p->index_size;
+   pd->num_pack++;
+   }
+   return 0;
+}
+
 int cmd_count_objects(int argc, const char **argv, const char *prefix)
 {
int i, verbose = 0, human_readable = 0;
const char *objdir = get_object_directory();
int len = strlen(objdir);
char *path = xmalloc(len + 50);
-   unsigned long loose = 0, packed = 0, packed_loose = 0;
+   unsigned long loose = 0, packed_loose = 0;
off_t loose_size = 0;
+   struct pack_data pd = {0, 0, 0};
struct option opts[] = {
OPT__VERBOSE(&verbose, N_("be verbose")),
OPT_BOOL('H', "human-readable", &human_readable,
@@ -118,41 +136,29 @@ int cmd_count_objects(int argc, const char **argv, const 
char *prefix)
closedir(d);
}
if (verbose) {
-   struct packed_git *p;
-   unsigned long num_pack = 0;
-   off_t size_pack = 0;
struct strbuf loose_buf = STRBUF_INIT;
struct strbuf pack_buf = STRBUF_INIT;
struct strbuf garbage_buf = STRBUF_INIT;
-   if (!packed_git)
-   prepare_packed_git();
-   for (p = packed_git; p; p = p->next) {
-   if (!p->pack_local)
-   continue;
-   if (open_pack_index(p))
-   continue;
-   packed += p->num_objects;
-   size_pack += p->pack_size + p->index_size;
-   num_pack++;
-   }
+   prepare_packed_git();
+   foreach_packed_git(count_pack_objects, NULL, &pd);
 
if (human_readable) {
strbuf_humanise_bytes(&loose_buf, loose_size);
-   strbuf_humanise_bytes(&pack_buf, size_pack);
+   strbuf_humanise_bytes(&pack_buf, pd.size_pack);
strbuf_humanise_bytes(&garbage_buf, size_garbage);
} else {
strbuf_addf(&loose_buf, "%lu",
(unsigned long)(loose_size / 1024));
strbuf_addf(&pack_buf, "%lu",
-   (unsigned long)(size_pack / 1024));
+   (unsigned long)(pd.size_pack / 1024));
strbuf_addf(&garbage_buf, "%lu",
(unsigned long)(size_garbage / 1024));
}
 
printf("count: %lu\n", loose);
printf("size: %s\n", loose_buf.buf);
-   printf("in-pack: %lu\n", packed);
-   printf("packs: %lu\n", num_pack);
+   printf("in-pack: %lu\n", pd.packed);
+   printf("packs: %lu\n", pd.num_pack);
printf("size-pac

git stash and --skip-worktree files

2014-02-13 Thread Mick Killianey
My team is starting to use the --skip-worktree flag on files to allow them to 
keep modified files in their working directories.  Ideally, they'd like an 
option for "git-stash" to apply to skip-worktree files, just as it can 
(optionally) be applied to untracked files, so that skip-worktree files could 
be included when saving and restored (ideally, with the skip-worktree bit set) 
on a git-stash pop.

(1)  I don't see an option for "git-stash" to support skip-worktree files...is 
it there and I'm just not seeing it?
(2)  Does anyone know if there's a reason why this is not possible/unreasonable 
to support/philosophically discouraged?
(3)  If it is reasonable, where would the right place be to suggest/request 
this feature?

I'm happy to look at making such changes to git-stash.sh, but before I do so, 
I'd love to know that such work would be aligned with the design philosophy of 
git-stash.  (For example, if the git team would only want such an option if it 
also included the ability to stash assume-unchanged files, too, that would be 
useful to know before I start making changes.)

Mick
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git GSoC 2014

2014-02-13 Thread Thomas Rast
Junio C Hamano  writes:

> Thomas Rast  writes:
>
>> Downside: not listing "code merged" as a goal may not make the project
>> as shiny, neither for Git nor for the student.
>
> I'd actually view that as an upside. This sounds like a good first
> step for a feasibility study that is really necessary.
>
> I wonder why the handling of storage corruption and replacement
> could be left broken, though. Is that because libgit2 has known
> breakages in these areas, or is there some other reason?

It's because I don't know enough about what libgit2's state is, and I
wanted to keep the scope limited.  Naturally, the next step would then
be to implement the lacking functionality (if any) in libgit2 so that
the test suite passes.  I just don't know if that's trivial, or
something for the "if we have time" section of the project, or too much
work.

(I did do a quick "can we reasonably link against libgit2" test where I
gave git-cat-file a --libgit2 option that loads blobs with libgit2.
There are some name collisions in the git_config* identifiers that need
to be resolved, but otherwise it seems entirely possible.)

-- 
Thomas Rast
t...@thomasrast.ch
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html