Re: pack bitmap woes on Windows
Am 2/12/2014 17:48, schrieb Jeff King: > On Wed, Feb 12, 2014 at 03:49:24PM +0100, Erik Faye-Lund wrote: > >> It seems the author of a201c20b41a2f0725977bcb89a2a66135d776ba2 >> assumes __BYTE_ORDER was guaranteed to always be defined, probably >> fooled by the following check: >> >> #if !defined(__BYTE_ORDER) >> # error "Cannot determine endianness" >> #endif >> >> However, that check is only performed if we're NOT building with GCC >> on x86/x64 nor MSVC (which I don't think defined __BYTE_ORDER either) >> >> The following would make __BYTE_ORDER a guarantee, and show that MinGW >> does not define it: > > Yes, that is the problem. Sorry, this got brought up earlier[1], but the > discussion went on and I did not notice that Brian's patch did not get > applied. > > After re-reading the discussion, I think the patch below is the best > solution. Can you confirm that it fixes the problem for you? Yes, it fixes the problem! t5310-pack-bitmaps passes all tests with the patch. Thanks, -- Hannes > > -Peff > > [1] http://thread.gmane.org/gmane.comp.version-control.git/241278 > > -- >8 -- > Subject: ewah: unconditionally ntohll ewah data > > Commit a201c20 tried to optimize out a loop like: > > for (i = 0; i < len; i++) > data[i] = ntohll(data[i]); > > in the big-endian case, because we know that ntohll is a > noop, and we do not need to pay the cost of the loop at all. > However, it mistakenly assumed that __BYTE_ORDER was always > defined, whereas it may not be on systems which do not > define it by default, and where we did not need to define it > to set up the ntohll macro. This includes OS X and Windows. > > We could muck with the ordering in compat/bswap.h to make > sure it is defined unconditionally, but it is simpler to > still to just execute the loop unconditionally. That avoids > the application code knowing anything about these magic > macros, and lets it depend only on having ntohll defined. > > And since the resulting loop looks like (on a big-endian > system): > > for (i = 0; i < len; i++) > data[i] = data[i]; > > any decent compiler can probably optimize it out. > > Original report and analysis by Brian Gernhardt. > > Signed-off-by: Jeff King > --- > ewah/ewah_io.c | 10 +++--- > 1 file changed, 3 insertions(+), 7 deletions(-) > > diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c > index 4a7fae6..f7f700e 100644 > --- a/ewah/ewah_io.c > +++ b/ewah/ewah_io.c > @@ -113,6 +113,7 @@ int ewah_serialize(struct ewah_bitmap *self, int fd) > int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len) > { > uint8_t *ptr = map; > + size_t i; > > self->bit_size = get_be32(ptr); > ptr += sizeof(uint32_t); > @@ -135,13 +136,8 @@ int ewah_read_mmap(struct ewah_bitmap *self, void *map, > size_t len) > memcpy(self->buffer, ptr, self->buffer_size * sizeof(uint64_t)); > ptr += self->buffer_size * sizeof(uint64_t); > > -#if __BYTE_ORDER != __BIG_ENDIAN > - { > - size_t i; > - for (i = 0; i < self->buffer_size; ++i) > - self->buffer[i] = ntohll(self->buffer[i]); > - } > -#endif > + for (i = 0; i < self->buffer_size; ++i) > + self->buffer[i] = ntohll(self->buffer[i]); > > self->rlw = self->buffer + get_be32(ptr); > > -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] tree-walk: be more specific about corrupt tree errors
On 02/12/2014 08:49 PM, Jeff King wrote: > When the tree-walker runs into an error, it just calls > die(), and the message is always "corrupt tree file". > However, we are actually covering several cases here; let's > give the user a hint about what happened. > > Let's also avoid using the word "corrupt", which makes it > seem like the data bit-rotted on disk. Our sha1 check would > already have found that. These errors are ones of data that > is malformed in the first place. > > Signed-off-by: Jeff King > --- > Michael and I have been looking off-list at some bogus trees (created by > a non-git.git implementation). git-fsck unhelpfully dies during the > tree-walk: > > $ git fsck > Checking object directories: 100% (256/256), done. > fatal: corrupt tree file > > I think in the long run we want to either teach fsck to avoid the > regular tree-walker or to set a special "continue as much as you can" > flag. That will let us keep going to find more errors, do our usual fsck > error checks (which are more strict), and especially report on _which_ > object was broken (we can't do that here because we are deep in the call > stack and may not even have a real object yet). > > This patch at least gives us slightly more specific error messages (both > for fsck and for other commands). And it may provide a first step in > clarity if we follow the "continue as much as you can" path. > > tree-walk.c | 10 ++ > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/tree-walk.c b/tree-walk.c > index 79dba1d..d53b084 100644 > --- a/tree-walk.c > +++ b/tree-walk.c > @@ -28,11 +28,13 @@ static void decode_tree_entry(struct tree_desc *desc, > const char *buf, unsigned > unsigned int mode, len; > > if (size < 24 || buf[size - 21]) > - die("corrupt tree file"); > + die("truncated tree file"); > I suggest splitting this into two separate checks, because the first boolean definitely indicates a truncated file, whereas the second boolean could indicate malformedness that is not caused by truncation. Another tiny point: I suppose that the number "24" comes from A one-digit mode SP A one-character filename NUL 20-byte SHA1 But given that you are detecting zero-length filenames a few lines later, I think it makes logical sense to admit for that possibility here, by reducing 24 -> 23. (I realize that it doesn't change the end result, because the only syntactically correct situation with length=23 would be a doubly-broken line that has a one-digit mode *and* a zero-length filename, and it's arbitrary which of the forms of brokenness we report in such a case.) > path = get_mode(buf, &mode); > - if (!path || !*path) > - die("corrupt tree file"); > + if (!path) > + die("malformed mode in tree entry"); > + if (!*path) > + die("empty filename in tree entry"); > len = strlen(path) + 1; > > /* Initialize the descriptor entry */ > @@ -81,7 +83,7 @@ void update_tree_entry(struct tree_desc *desc) > unsigned long len = end - (const unsigned char *)buf; > > if (size < len) > - die("corrupt tree file"); > + die("truncated tree file"); > buf = end; > size -= len; > desc->buffer = buf; > Otherwise, I think this is a nice improvement. Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Make the git codebase thread-safe
Am 2/12/2014 20:30, schrieb Stefan Zager: > On Wed, Feb 12, 2014 at 11:22 AM, Karsten Blees > wrote: >> Am 12.02.2014 19:37, schrieb Erik Faye-Lund: >>> ReOpenFile, that's fantastic. Thanks a lot! >> >> ...but should be loaded dynamically via GetProcAddress, or are we ready to >> drop XP support? > > Right, that is an issue. From our perspective, it's well past time to > drop XP support. Not from mine. -- Hannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Make the git codebase thread-safe
Mike Hommey writes: > On Wed, Feb 12, 2014 at 08:15:24PM +0100, David Kastrup wrote: >> Stefan Zager writes: >> >> > On Wed, Feb 12, 2014 at 10:50 AM, David Kastrup wrote: >> > >> >> Really, give the above patch a try. I am taking longer to finish it >> >> than anticipated (with a lot due to procrastination but that is, >> >> unfortunately, a large part of my workflow), and it's cutting into my >> >> "paychecks" (voluntary donations which to a good degree depend on timely >> >> and nontrivial progress reports for my freely available work on GNU >> >> LilyPond). >> > >> > I will give that a try. How much of a performance improvement have >> > you clocked? >> >> Depends on file type and size. With large files with lots of small >> changes, performance improvements get more impressive. >> >> Some ugly real-world examples are the Emacs repository, src/xdisp.c >> (performance improvement about a factor of 3), a large file in the style >> of /usr/share/dict/words clocking in at a factor of about 5. >> >> Again, that's with an SSD and ext4 filesystem on GNU/Linux, and there >> are no improvements in system time (I/O) except for patch 4 of the >> series which helps perhaps 20% or so. >> >> So the benefits of the patch will come into play mostly for big, bad >> files on Windows: other than that, the I/O time is likely to be the >> dominant player anyway. > > How much fragmentation does that add to the files, though? Uh, git-blame is a read-only operation. It does not add fragmentation to any file. The patch will add a diff of probably a few dozen hunks to builtin/blame.c. Do you call that "fragmentation"? It is small enough that I expect even git blame builtin/blame.c to be faster than before. But that interpretation of your question probably tries to make too much sense out of what is just nonsense in the given context. -- David Kastrup -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Make the git codebase thread-safe
David Kastrup writes: > Again, that's with an SSD and ext4 filesystem on GNU/Linux, and there > are no improvements in system time (I/O) except for patch 4 of the > series which helps perhaps 20% or so. > > So the benefits of the patch will come into play mostly for big, bad > files on Windows: other than that, the I/O time is likely to be the > dominant player anyway. > > If you have benchmarked the stuff, for annoying cases expect I/O time > to go down maybe 10-20%, and user time to drop by a factor of 4. > Under GNU/Linux, that makes for a significant overall improvement. On > Windows, the payback is likely quite less because of the worse I/O > performance. Pity. But of course, you can significantly reduce the relevant file open/close/search times by running git gc --aggressive While this does not actually help with performance in GNU/Linux (though with file space), dealing with few but compressed files under Windows is likely a reasonably big win since the uncompression happens in user space and cannot be bungled by Microsoft (apart from bad memory management strategies). -- David Kastrup -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFH] hackday and GSoC topic suggestions
On Tue, Feb 11, 2014 at 11:38:09AM -0800, Junio C Hamano wrote: > Junio C Hamano writes: > > > Jeff King writes: > > > >> - Branch rename breaks local downstream branches > >>http://article.gmane.org/gmane.comp.version-control.git/241228 > > > > If you have a branch B that builds on A, if you are renaming A to C, > > you may want B to automatically set to build on C in some cases, and > > in other cases your renaming A is done explicitly in order to severe > > the tie between A and B and setting the latter to build on C can be > > a bad thing---after all, the user's intention may be to create a > > branch A starting at some commit immediately after this rename so > > that B will keep building on that updated A. > > > > So I am not sure if this is a bug. > > Having said that, the current behaviour of leaving B half-configured > to build on a missing branch is undesirable. If we were to change > this so that any branch B that used to build on branch A being > renamed to build on the branch under the new name C, the user may > have to do an extra "--set-upstream-to A" on B after recreating A if > this was done to save away the current state of A to C and then keep > building B on an updated A, so we may have to give _some_ clue what > we are doing behind their back when we rename, e.g. > > $ git branch -m A C > warning: branch B is set to build on C now. > > or something. Yeah, I agree. I think most of the time people would want the dependency to move with the branch, and the key is being clear about it so the user can override. I don't think there is a problem with overriding with `--set-upstream-to` after the fact, rather than giving a special option to the move operation. There was a team working on this at the hackday, and they seemed to make reasonable progress, but I haven't seen the final output yet. Most teams did not finish their projects, or if they did, I didn't get a chance to coach them through the patch submission process. I'll hopefully be revisiting those in the next week or two as they finish them up offline. My goal is to get them all posting to the list themselves, but I fear I may have to pick up the pieces in some cases. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Make the git codebase thread-safe
Johannes Sixt writes: > Am 2/12/2014 20:30, schrieb Stefan Zager: >> On Wed, Feb 12, 2014 at 11:22 AM, Karsten Blees >> wrote: >>> Am 12.02.2014 19:37, schrieb Erik Faye-Lund: ReOpenFile, that's fantastic. Thanks a lot! >>> >>> ...but should be loaded dynamically via GetProcAddress, or are we >>> ready to drop XP support? >> >> Right, that is an issue. From our perspective, it's well past time to >> drop XP support. > > Not from mine. XP has not even reached end of life yet. As a point of comparison, there are tensions on the Emacs developer list several times a decade because some people suggest it might be time to drop the MSDOS port and/or the associated restriction of having filenames be unique in the 8+3 naming scheme. -- David Kastrup -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFH] hackday and GSoC topic suggestions
On Tue, Feb 11, 2014 at 10:57:22AM -0800, Junio C Hamano wrote: > > - git stash doesn't use --index as default > >http://article.gmane.org/gmane.comp.version-control.git/235892 > > I tend to think "git stash" was designed to work this way from day > one. Right. The thing that bothers me is the data loss of: echo 1 >foo git add foo echo 2 >foo git stash git stash pop The content in "1" is stashed, but gets lost on the apply, and our pop drops the stash. > > - using git commit-tree with "-F -" adds trailing newlines > >http://article.gmane.org/gmane.comp.version-control.git/236583 > > According to the initial commit that introduced this option, it > deliberately calls strbuf_complete_line() to make sure we do not > leave an incomplete line at the end. Yeah, I think this one is a non-bug. I hadn't read it carefully but the "trailing newlines" made me think it was adding multiple extra blank lines. But it looks like it is just the normal strbuf_complete_line, and that is the right thing to be doing (it has _never_ been encouraged to have a commit message that does not have a newline at all, and I think stripspace has enforced that from day one). > Many of the other items in your "bugs" section seem to be worth > fixing. As I mentioned elsewhere, I've yet to see the full results of the hackday. Hopefully soon. :) -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Process of updating a local branch from another with tracking
On 13/02/14 10:55, Robert Dailey wrote: > I have the following alias defined: > sync = "!f() { cbr=$(git rev-parse --abbrev-ref HEAD); git co $1 && > git pull && git co $cbr && git rebase $1; }; f" > > The goal is to basically update a local branch which tracks a branch > on a remote, and then rebase my local topic branch onto that updated > local branch. Generally speaking there is little benefit in manually keeping a local branch that just tracks a remote one because you already have a "remote tracking branch" which gets updated whenever you fetch from that remote. So you could be doing something like git fetch origin # or whatever other remotes you may have configured git rebase origin/master # or whatever branch you're tracking You can also tell git this kind of information when you create your topic branch and git will to this all for you. git checkout -b topic1 origin/master git pull -r # -r makes pull run fetch + rebase instead of pull + merge If you already have a topic branch and you want to tell it what remote branch (upstream) to track you can do that too. git checkout topic git branch -u origin/master git pull -r > I do this instead of just rebasing onto origin/master. Example: > > git checkout master > git pull --rebase origin master > git checkout topic1 > git rebase master > > I could do this instead but not sure if it is recommended: > > git checkout topic1 > git fetch > git rebase origin/master I do it all the time. Actually I set the upstream and I can just use 'git pull -r'. > Any thoughts on the alias I created? I'm a Windows user and still new > to linux stuff, so if it sucks or could be simplified just let me > know. Aliases are fine to save yourself some typing. In this case I think you can just set the upstream branch and use 'git pull -r'. There is a config option to make this the default but you probably want to be comfortable with the difference between merging and rebasing before you set that. > And as a secondary question, just curious what people think about > rebasing onto a remote tracking branch. When I do merges I usually > refer to the remote branch, but during rebase I use the local branch > instead, but I don't know if there is any functional reason to not > skip the local branch and go straight to the remote tracking branch > (it saves a step). Branches are branches. You might use multiple local branches to separate dependent topics that you're working on. You might do the same with remote branches if you have topics based on other peoples work. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFH] hackday and GSoC topic suggestions
On Thu, Feb 06, 2014 at 10:51:54AM +0100, Matthieu Moy wrote: > > Some of Matthieu's students worked on it a few years ago but didn't finish. > > Right. There was still quite some work to do, but this is most likely > too small for a GSoC project. But that could be a part of it. I'm not > sure how google welcomes GSoC projects made of multiple small tasks, but > my experience with students is that it's much better than a single (too) > big task, and I think that was the general feeling on this list when we > discussed it last year. I think Google leaves it up to us to decide. I'd be OK with a project made of multiple small tasks, as I think it would be an interesting experiment. I'd rather not do all of them like that, though. And bonus points if they are on a theme that will let the student use the ramp-up time from one for another. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
git blame: "Not Committed Yet" with clean WD
Hi, for files that contain windows line endings in a repository with core.autocrlf=input, git blame will show lines as "Not Committed Yet", even though they were not modified. Example: -- git init git config core.autocrlf false echo "foo" > a unix2dos a git add a git commit -m "initial commit" git config core.autocrlf input git status git blame a -- Output: -- Reinitialized existing Git repository in /.../testblame2/.git/ unix2dos: converting file a to DOS format ... On branch master nothing to commit, working directory clean On branch master nothing to commit, working directory clean (Not Committed Yet 2014-02-13 10:02:43 +0100 1) foo -- Is there an easy way to work around this; is this desired behaviour or mor a bug? Thanks - Eph -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Git GSoC 2014
The Google Summer of Code application process is upon us. We have about 34 hours until the deadline (2014-02-14T19:00 UTC) . That's not very much time, but I know some people have been thinking about projects for a while, so I have hope that we can put together an ideas page. What we need immediately is: - somebody to be the backup admin (I am assuming I'll be the primary admin, but as always, if somebody else wants to...) - ideas ideas ideas The email I sent out last week generated some response, but what I really need are people to write up full ideas as they would be read by students (and by GSoC personnel who are reading our application). As mentioned in a related thread, I'd really like to avoid just reposting proposals that didn't get any traction in previous years. Tomorrow I'll try to look through some of the links that have been posted, or collect stuff from the list. But I really need other people to contribute, or it's going to be a lame list. I've put the application draft here: http://git.github.io/SoC-2014-Org-Application.html You can make suggestions in this thread, or edit it by cloning: https://github.com/git/git.github.io and post a patch, or send a pull request, or I should be able to give people push access to the repo. It's a bit of an experiment, in the same realm as the wiki we've used in the past. The idea page is there, too, but there's nothing good on it yet: http://git.github.io/SoC-2014-Ideas.html -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFH] hackday and GSoC topic suggestions
On Thu, Feb 13, 2014 at 9:50 AM, Jeff King wrote: > On Thu, Feb 06, 2014 at 10:51:54AM +0100, Matthieu Moy wrote: > >> > Some of Matthieu's students worked on it a few years ago but didn't finish. >> >> Right. There was still quite some work to do, but this is most likely >> too small for a GSoC project. But that could be a part of it. I'm not >> sure how google welcomes GSoC projects made of multiple small tasks, but >> my experience with students is that it's much better than a single (too) >> big task, and I think that was the general feeling on this list when we >> discussed it last year. > > I think Google leaves it up to us to decide. I'd be OK with a project > made of multiple small tasks, as I think it would be an interesting > experiment. I'd rather not do all of them like that, though. And bonus > points if they are on a theme that will let the student use the ramp-up > time from one for another. Yeah, a student working on the "git bisect fix/unfixed" feature, could fix git bisect testing too many merge bases, and if there is still time work on moving more code from shell to C. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Make the git codebase thread-safe
On Thu, Feb 13, 2014 at 07:04:02AM +0100, David Kastrup wrote: > Mike Hommey writes: > > > On Wed, Feb 12, 2014 at 08:15:24PM +0100, David Kastrup wrote: > >> Stefan Zager writes: > >> > >> > On Wed, Feb 12, 2014 at 10:50 AM, David Kastrup wrote: > >> > > >> >> Really, give the above patch a try. I am taking longer to finish it > >> >> than anticipated (with a lot due to procrastination but that is, > >> >> unfortunately, a large part of my workflow), and it's cutting into my > >> >> "paychecks" (voluntary donations which to a good degree depend on timely > >> >> and nontrivial progress reports for my freely available work on GNU > >> >> LilyPond). > >> > > >> > I will give that a try. How much of a performance improvement have > >> > you clocked? > >> > >> Depends on file type and size. With large files with lots of small > >> changes, performance improvements get more impressive. > >> > >> Some ugly real-world examples are the Emacs repository, src/xdisp.c > >> (performance improvement about a factor of 3), a large file in the style > >> of /usr/share/dict/words clocking in at a factor of about 5. > >> > >> Again, that's with an SSD and ext4 filesystem on GNU/Linux, and there > >> are no improvements in system time (I/O) except for patch 4 of the > >> series which helps perhaps 20% or so. > >> > >> So the benefits of the patch will come into play mostly for big, bad > >> files on Windows: other than that, the I/O time is likely to be the > >> dominant player anyway. > > > > How much fragmentation does that add to the files, though? > > Uh, git-blame is a read-only operation. It does not add fragmentation > to any file. The patch will add a diff of probably a few dozen hunks to > builtin/blame.c. Do you call that "fragmentation"? It is small enough > that I expect even > > git blame builtin/blame.c > > to be faster than before. But that interpretation of your question > probably tries to make too much sense out of what is just nonsense in > the given context. Sorry, I thought you were talking about write operations, not reads. Mike -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Make the git codebase thread-safe
On Thu, Feb 13, 2014 at 06:34:39PM +0900, Mike Hommey wrote: > On Thu, Feb 13, 2014 at 07:04:02AM +0100, David Kastrup wrote: > > Mike Hommey writes: > > > > > On Wed, Feb 12, 2014 at 08:15:24PM +0100, David Kastrup wrote: > > >> Stefan Zager writes: > > >> > > >> > On Wed, Feb 12, 2014 at 10:50 AM, David Kastrup wrote: > > >> > > > >> >> Really, give the above patch a try. I am taking longer to finish it > > >> >> than anticipated (with a lot due to procrastination but that is, > > >> >> unfortunately, a large part of my workflow), and it's cutting into my > > >> >> "paychecks" (voluntary donations which to a good degree depend on > > >> >> timely > > >> >> and nontrivial progress reports for my freely available work on GNU > > >> >> LilyPond). > > >> > > > >> > I will give that a try. How much of a performance improvement have > > >> > you clocked? > > >> > > >> Depends on file type and size. With large files with lots of small > > >> changes, performance improvements get more impressive. > > >> > > >> Some ugly real-world examples are the Emacs repository, src/xdisp.c > > >> (performance improvement about a factor of 3), a large file in the style > > >> of /usr/share/dict/words clocking in at a factor of about 5. > > >> > > >> Again, that's with an SSD and ext4 filesystem on GNU/Linux, and there > > >> are no improvements in system time (I/O) except for patch 4 of the > > >> series which helps perhaps 20% or so. > > >> > > >> So the benefits of the patch will come into play mostly for big, bad > > >> files on Windows: other than that, the I/O time is likely to be the > > >> dominant player anyway. > > > > > > How much fragmentation does that add to the files, though? > > > > Uh, git-blame is a read-only operation. It does not add fragmentation > > to any file. The patch will add a diff of probably a few dozen hunks to > > builtin/blame.c. Do you call that "fragmentation"? It is small enough > > that I expect even > > > > git blame builtin/blame.c > > > > to be faster than before. But that interpretation of your question > > probably tries to make too much sense out of what is just nonsense in > > the given context. > > Sorry, I thought you were talking about write operations, not reads. Specifically, I thought you were talking about git checkout. Mike -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFH] hackday and GSoC topic suggestions
Christian Couder writes: > On Thu, Feb 13, 2014 at 9:50 AM, Jeff King wrote: > >> I think Google leaves it up to us to decide. I'd be OK with a project >> made of multiple small tasks, as I think it would be an interesting >> experiment. I'd rather not do all of them like that, though. And >> bonus points if they are on a theme that will let the student use the >> ramp-up time from one for another. > > Yeah, a student working on the "git bisect fix/unfixed" feature, could > fix git bisect testing too many merge bases, and if there is still > time work on moving more code from shell to C. In the context of programming tasks, "if there is still time" is a prime candidate for the successful application of branch prediction. -- David Kastrup -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] release notes: typo fixes
Signed-off-by: Michael J Gruber --- Just a few things I spotted while trying to keep myself informed :) Documentation/RelNotes/1.9.txt | 36 ++-- 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/Documentation/RelNotes/1.9.txt b/Documentation/RelNotes/1.9.txt index 43c7b68..a0cabcd 100644 --- a/Documentation/RelNotes/1.9.txt +++ b/Documentation/RelNotes/1.9.txt @@ -7,19 +7,19 @@ Backward compatibility notes "git submodule foreach $cmd $args" used to treat "$cmd $args" the same way "ssh" did, concatenating them into a single string and letting the shell unquote. Careless users who forget to sufficiently quote $args -gets their argument split at $IFS whitespaces by the shell, and got +get their argument split at $IFS whitespaces by the shell, and got unexpected results due to this. Starting from this release, the command line is passed directly to the shell, if it has an argument. Read-only support for experimental loose-object format, in which users -could optionally choose to write in their loose objects for a short -while between v1.4.3 to v1.5.3 era, has been dropped. +could optionally choose to write their loose objects for a short +while between v1.4.3 and v1.5.3 era, has been dropped. -The meanings of "--tags" option to "git fetch" has changed; the -command fetches tags _in addition to_ what are fetched by the same +The meanings of the "--tags" option to "git fetch" has changed; the +command fetches tags _in addition to_ what is fetched by the same command line without the option. -The way "git push $there $what" interprets $what part given on the +The way "git push $there $what" interprets the $what part given on the command line, when it does not have a colon that explicitly tells us what ref at the $there repository is to be updated, has been enhanced. @@ -96,7 +96,7 @@ UI, Workflows & Features primarily because the codepaths involved were not carefully vetted and we did not bother supporting such usage. This release attempts to allow object transfer out of a shallowly-cloned repository in a - more controlled way (i.e. the receiver become a shallow repository + more controlled way (i.e. the receiver becomes a shallow repository with a truncated history). * Just like we give a reasonable default for "less" via the LESS @@ -107,12 +107,12 @@ UI, Workflows & Features hierarchies, whose variables are predominantly three-level, were not completed by hitting a in bash and zsh completions. - * Fetching 'frotz' branch with "git fetch", while 'frotz/nitfol' + * Fetching a 'frotz' branch with "git fetch", while a 'frotz/nitfol' remote-tracking branch from an earlier fetch was still there, would error out, primarily because the command was not told that it is allowed to lose any information on our side. "git fetch --prune" - now can be used to remove 'frotz/nitfol' to make room to fetch and - store 'frotz' remote-tracking branch. + now can be used to remove 'frotz/nitfol' to make room for fetching and + storing the 'frotz' remote-tracking branch. * "diff.orderfile=" configuration variable can be used to pretend as if the "-O" option were given from the command @@ -218,7 +218,7 @@ track are contained in this release (see the maintenance releases' notes for details). * The pathspec matching code, while comparing two trees (e.g. "git - diff A B -- path1 path2") was too agrresive and failed to match + diff A B -- path1 path2") was too aggressive and failed to match some paths when multiple pathspecs were involved. (merge e4ddb05 as/tree-walk-fix-aggressive-short-cut later to maint). @@ -227,11 +227,11 @@ for details). (merge b861e23 sb/repack-in-c later to maint). * An earlier update in v1.8.4.x to "git rev-list --objects" with - negative ref had performance regression. + negative ref had a performance regression. (merge 200abe7 jk/mark-edges-uninteresting later to maint). * A recent update to "git send-email" broke platforms where - /etc/ssl/certs/ directory exists, but it cannot used as SSL_ca_path + /etc/ssl/certs/ directory exists but cannot be used as SSL_ca_path (e.g. Fedora rawhide). (merge 01645b7 rk/send-email-ssl-cert later to maint). @@ -275,8 +275,8 @@ for details). names. (merge 82246b7 nd/daemon-informative-errors-typofix later to maint). - * There is no reason to have a hardcoded upper limit of the number of - parents for an octopus merge, created via the graft mechanism, but + * There is no reason to have a hardcoded upper limit for the number of + parents of an octopus merge, created via the graft mechanism, but there was. (merge e228c17 js/lift-parent-count-limit later to maint). @@ -306,8 +306,8 @@ for details). * When we figure out how many file descriptors to allocate for keeping packfiles open, a system with non-working getrlimit() could cause us to die(), but because we make t
[no subject]
I CAN HELP YOU WITH THE LOAN YOU NEED AT 3% INTEREST RATE. CONTACT US AT:-ryanwrightfinance...@gmail.com -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] t5537: move http tests out to t5539
start_httpd is supposed to be at the beginning of the test file, not the middle of it. The "test_seq" line in "no shallow lines.." test is updated to compensate missing refs that are there in t5537, but not in the new t5539. Signed-off-by: Nguyễn Thái Ngọc Duy --- On Thu, Feb 13, 2014 at 8:22 AM, Duy Nguyen wrote: > On Thu, Feb 13, 2014 at 5:12 AM, Jeff King wrote: >> lib-httpd was never designed to be included from anywhere except the >> beginning of the file. But that wouldn't be right for t5537, because it >> wants to run some of the tests, even if apache setup fails. The right >> way to do it is probably to have lib-httpd do all of its work in a lazy >> prereq. I don't know how clunky that will end up, though; it might be >> simpler to just move the shallow http test into one of the http-fetch >> scripts. > > I'll move it out later. Here it is, on top of nd/http-fetch-shallow-fix because the new test in t5537 is picky and a simple merge resolution wouldn't do it. t/t5537-fetch-shallow.sh | 57 --- t/t5539-fetch-http-shallow.sh (new +x) | 82 ++ 2 files changed, 82 insertions(+), 57 deletions(-) diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh index 098f220..3ae9092 100755 --- a/t/t5537-fetch-shallow.sh +++ b/t/t5537-fetch-shallow.sh @@ -173,61 +173,4 @@ EOF ) ' -if test -n "$NO_CURL" -o -z "$GIT_TEST_HTTPD"; then - say 'skipping remaining tests, git built without http support' - test_done -fi - -. "$TEST_DIRECTORY"/lib-httpd.sh -start_httpd - -test_expect_success 'clone http repository' ' - git clone --bare --no-local shallow "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && - git clone $HTTPD_URL/smart/repo.git clone && - ( - cd clone && - git fsck && - git log --format=%s origin/master >actual && - catdone" ../trace - ) -' - -stop_httpd test_done diff --git a/t/t5539-fetch-http-shallow.sh b/t/t5539-fetch-http-shallow.sh new file mode 100755 index 000..94553e1 --- /dev/null +++ b/t/t5539-fetch-http-shallow.sh @@ -0,0 +1,82 @@ +#!/bin/sh + +test_description='fetch/clone from a shallow clone over http' + +. ./test-lib.sh + +if test -n "$NO_CURL"; then + skip_all='skipping test, git built without http support' + test_done +fi + +. "$TEST_DIRECTORY"/lib-httpd.sh +start_httpd + +commit() { + echo "$1" >tracked && + git add tracked && + git commit -m "$1" +} + +test_expect_success 'setup shallow clone' ' + commit 1 && + commit 2 && + commit 3 && + commit 4 && + commit 5 && + commit 6 && + commit 7 && + git clone --no-local --depth=5 .git shallow && + git config --global transfer.fsckObjects true +' + +test_expect_success 'clone http repository' ' + git clone --bare --no-local shallow "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && + git clone $HTTPD_URL/smart/repo.git clone && + ( + cd clone && + git fsck && + git log --format=%s origin/master >actual && + cat
Re: [PATCH 11/11] tree-diff: reuse base str(buf) memory on sub-tree recursion
On Fri, Feb 07, 2014 at 09:48:52PM +0400, Kirill Smelkov wrote: > instead of allocating it all the time for every subtree in > __diff_tree_sha1, let's allocate it once in diff_tree_sha1, and then all > callee just use it in stacking style, without memory allocations. > > This should be faster, and for me this change gives the following > slight speedups for `git log --raw --no-abbrev --no-renames` > > navy.gitlinux.git v3.10..v3.11 > > before 0.547s 1.791s > after 0.541s 1.777s > speedup 1.1%0.8% The timings above was done with `git log --raw --no-abbrev --no-renames --format='%H'` ^ Please change them to correct timings: navy.gitlinux.git v3.10..v3.11 before 0.618s 1.903s after 0.611s 1.889s speedup 1.1%0.7% Thanks, Kirill -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] Multiparent diff tree-walker + combine-diff speedup
Here go combine-diff speedup patches in form of first reworking diff tree-walker to work in general case - when a commit have several parents, not only one - we are traversing all 1+nparent trees in parallel. Then we are taking advantage of the new diff tree-walker for speeding up combine-diff, which for linux.git results in ~14 times speedup. I understand v1.9.0 is going to be released first, but wanted to finally send the patches, so that people could start reviewing them. Please apply on top of ks/tree-diff-more and thanks beforehand, Kirill Kirill Smelkov (2): tree-diff: rework diff_tree() to generate diffs for multiparent cases as well combine-diff: speed it up, by using multiparent diff tree-walker directly combine-diff.c | 85 +- diff.c | 2 + diff.h | 10 ++ tree-diff.c| 501 + 4 files changed, 529 insertions(+), 69 deletions(-) -- 1.9.rc1.181.g641f458 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] combine-diff: speed it up, by using multiparent diff tree-walker directly
As was recently shown (c839f1bd "combine-diff: optimize combine_diff_path sets intersection"), combine-diff runs very slowly. In that commit we optimized paths sets intersection, but that accounted only for ~ 25% of the slowness, and as my tracing showed, for linux.git v3.10..v3.11, for merges a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). In previous commit, we described the problem in more details, and reworked the diff tree-walker to be general one - i.e. to work in multiple parent case too. Now is the time to take advantage of it for finding paths for combine diff. The implementation is straightforward - if we know, we can get generated diff paths directly, and at present that means no diff filtering or rename/copy detection was requested(*), we can call multiparent tree-walker directly and get ready paths. (*) because e.g. at present, all diffcore transformations work on diff_filepair queues, but in the future, that limitation can be lifted, if filters would operate directly on combine_diff_paths. Timings for `git log --raw --no-abbrev --no-renames` without `-c` ("git log") and with `-c` ("git log -c") and with `-c --merges` ("git log -c --merges") before and after the patch are as follows: linux.git v3.10..v3.11 log log -c log -c --merges before 1.9s16.4s 15.2s after 1.9s 2.4s 1.1s The result stayed the same. Signed-off-by: Kirill Smelkov --- combine-diff.c | 85 ++ diff.c | 1 + 2 files changed, 81 insertions(+), 5 deletions(-) diff --git a/combine-diff.c b/combine-diff.c index 1732dfd..ddf7495 100644 --- a/combine-diff.c +++ b/combine-diff.c @@ -1303,7 +1303,7 @@ static const char *path_path(void *obj) /* find set of paths that every parent touches */ -static struct combine_diff_path *find_paths(const unsigned char *sha1, +static struct combine_diff_path *find_paths_generic(const unsigned char *sha1, const struct sha1_array *parents, struct diff_options *opt) { struct combine_diff_path *paths = NULL; @@ -1316,6 +1316,7 @@ static struct combine_diff_path *find_paths(const unsigned char *sha1, /* tell diff_tree to emit paths in sorted (=tree) order */ opt->orderfile = NULL; + /* D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (wrt paths) */ for (i = 0; i < num_parent; i++) { /* * show stat against the first parent even when doing @@ -1346,6 +1347,35 @@ static struct combine_diff_path *find_paths(const unsigned char *sha1, } +/* + * find set of paths that everybody touches, assuming diff is run without + * rename/copy detection, etc, comparing all trees simultaneously (= faster). + */ +static struct combine_diff_path *find_paths_multitree( + const unsigned char *sha1, const struct sha1_array *parents, + struct diff_options *opt) +{ + int i, nparent = parents->nr; + const unsigned char **parents_sha1; + struct combine_diff_path paths_head; + struct strbuf base; + + parents_sha1 = xmalloc(nparent * sizeof(parents_sha1[0])); + for (i = 0; i < nparent; i++) + parents_sha1[i] = parents->sha1[i]; + + /* fake list head, so worker can assume it is non-NULL */ + paths_head.next = NULL; + + strbuf_init(&base, PATH_MAX); + diff_tree_paths(&paths_head, sha1, parents_sha1, nparent, &base, opt); + + strbuf_release(&base); + free(parents_sha1); + return paths_head.next; +} + + void diff_tree_combined(const unsigned char *sha1, const struct sha1_array *parents, int dense, @@ -1355,6 +1385,7 @@ void diff_tree_combined(const unsigned char *sha1, struct diff_options diffopts; struct combine_diff_path *p, *paths; int i, num_paths, needsep, show_log_first, num_parent = parents->nr; + int need_generic_pathscan; /* nothing to do, if no parents */ if (!num_parent) @@ -1377,11 +1408,55 @@ void diff_tree_combined(const unsigned char *sha1, /* find set of paths that everybody touches * -* NOTE find_paths() also handles --stat, as it computes -* diff(sha1,parent_i) for all i to do the job, specifically -* for parent0. +* NOTE +* +* Diffcore transformations are bound to diff_filespec and logic +* comparing two entries - i.e. they do not apply directly to combine +* diff. +* +* If some of such transformations is requested - we launch generic +* path scanning, which works significantly slower compared to +* simultaneous all-trees-in-one-go scan in find_paths_multitree(). +* +* TODO some of the filters could be ported to work on +* combine_diff_paths - i.e. all
[PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well
Previously diff_tree(), which is now named __diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch *hurts* *performance* *badly* for generating combined diffs - in c839f1bd (combine-diff: optimize combine_diff_path sets intersection) I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could *skip* *recursing* into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(A,X1...Xn) calculation scheme --- D(A,X1...Xn) = D(A,X1) ^ ... ^ D(A,Xn) (regarding resulting paths set) D(A,Xj) - diff between A..Xj D(A,X1...Xn)- combined diff from A to parents X1,...,Xn We start from all trees, which are sorted, and compare their entries in lock-step: A X1 Xn - -- |a| |x1| |xn| |-| |--| ... |--| i = argmin(x1...xn) | | | | | | |-| |--| |--| |.| |. | |. | . .. . .. at any time there could be 3 cases: 1) a < xi; 2) a > xi; 3) a = xi. Schematic deduction of what every case means, and what to do, follows: 1) a < xi -> ∀j a ∉ Xj -> "+a" ∈ D(A,Xj) -> D += "+a"; a↓ 2) a > xi 2.1) ∃j: xj > xi -> "-xi" ∉ D(A,Xj) -> D += ø; ∀ xk=xi xk↓ 2.2) ∀j xj = xi -> xj ∉ A -> "-xj" ∈ D(A,Xj) -> D += "-xi"; ∀j xj↓ 3) a = xi 3.1) ∃j: xj > xi -> "+a" ∈ D(A,Xj) -> only xk=xi remains to investigate 3.2) xj = xi -> investigate δ(a,xj) | | v 3.1+3.2) looking at δ(a,xk) ∀k: xk=xi - if all != ø -> ⎧δ(a,xk) - if xk=xi -> D += ⎨ ⎩"+a" - if xk>xi in any case a↓ ∀ xk=xi xk↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme - A B - - |a| |b|a < b -> a ∉ B -> D(A,B) += +aa↓ |-| |-|a > b -> b ∉ A -> D(A,B) += -bb↓ | | | |a = b -> investigate δ(a,b)a↓ b↓ |-| |-| |.| |.| . . . . This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents tree
Re: [PATCH 00/11] More preparatory work for multiparent tree-walker
On Wed, Feb 12, 2014 at 09:25:51AM -0800, Junio C Hamano wrote: > Junio C Hamano writes: > > > Kirill Smelkov writes: > > > >> Sorry for the confusion. Could you please do the following: > >> > >> Patches should be applied over to ks/tree-diff-walk > >> (74aa4a18). Before applying the patches, please cherry-pick > >> > >> c90483d9(tree-diff: no need to manually verify that there is no > >> mode change for a path) > >> > >> ef4f0928(tree-diff: no need to pass match to > >> skip_uninteresting()) > >> > >> into that branch, and drop them from ks/combine-diff - we'll have one > >> branch reworking the diff tree-walker, and the other taking advantage of > >> it for combine-diff. > > > > As long as that does not lose changes to tests and clean-ups, I'm > > fine with that direction. For example, I do not know if you want to > > lose e3f62d12 (diffcore-order: export generic ordering interface, > > 2014-01-20), which is part of the combine-diff topic. > > Ahh, sorry, I misread the "drop" as "salvage these two and drop the > rest". The new series does apply cleanly on a commit in master..pu > that has both ks/tree-diff-walk and ks/combine-diff, and the latter > is not yet in 'next' so we are free to reorganize. > > Let me flip the latter topic around, also queue these updates and > push the result out on 'pu'. > > Thanks. Thank you. As we've managed to apply this to pu, I've send the final speedup patches. Please review them as time permits. Thanks beforehand, Kirill -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/11] tree-diff: consolidate code for emitting diffs and recursion in one place
Kirill Smelkov writes: > +static void show_path(struct strbuf *base, struct diff_options *opt, > + struct tree_desc *t1, struct tree_desc *t2) > { > unsigned mode; > const char *path; > - const unsigned char *sha1 = tree_entry_extract(desc, &path, &mode); > - int pathlen = tree_entry_len(&desc->entry); > + const unsigned char *sha1; > + int pathlen; > int old_baselen = base->len; > + int isdir, recurse = 0, emitthis = 1; > + > + /* at least something has to be valid */ > + assert(t1 || t2); > + > + if (t2) { > + /* path present in resulting tree */ > + sha1 = tree_entry_extract(t2, &path, &mode); > + pathlen = tree_entry_len(&t2->entry); > + isdir = S_ISDIR(mode); > + } > + else { > + /* a path was removed - take path from parent. Also take > + * mode from parent, to decide on recursion. > + */ > + tree_entry_extract(t1, &path, &mode); > + pathlen = tree_entry_len(&t1->entry); > + > + isdir = S_ISDIR(mode); > + sha1 = NULL; > + mode = 0; > + } > + > + if (DIFF_OPT_TST(opt, RECURSIVE) && isdir) { > + recurse = 1; > + emitthis = DIFF_OPT_TST(opt, TREE_IN_RECURSIVE); > + } > > strbuf_add(base, path, pathlen); > - if (DIFF_OPT_TST(opt, RECURSIVE) && S_ISDIR(mode)) { > - if (DIFF_OPT_TST(opt, TREE_IN_RECURSIVE)) > - opt->add_remove(opt, *prefix, mode, sha1, 1, base->buf, > 0); > > + if (emitthis) > + emit_diff(opt, base, t1, t2); > + > + if (recurse) { > strbuf_addch(base, '/'); > - diff_tree_sha1(*prefix == '-' ? sha1 : NULL, > -*prefix == '+' ? sha1 : NULL, base->buf, opt); > - } else > - opt->add_remove(opt, prefix[0], mode, sha1, 1, base->buf, 0); > + diff_tree_sha1(t1 ? t1->entry.sha1 : NULL, > +t2 ? t2->entry.sha1 : NULL, base->buf, opt); > + } After this step, "sha1" is assigned but never gets used. Please double-check the fix-up I queued in the series before merging it to 'pu'. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/11] tree-diff: consolidate code for emitting diffs and recursion in one place
On Thu, Feb 13, 2014 at 09:43:27AM -0800, Junio C Hamano wrote: > Kirill Smelkov writes: > > > +static void show_path(struct strbuf *base, struct diff_options *opt, > > + struct tree_desc *t1, struct tree_desc *t2) > > { > > unsigned mode; > > const char *path; > > - const unsigned char *sha1 = tree_entry_extract(desc, &path, &mode); > > - int pathlen = tree_entry_len(&desc->entry); > > + const unsigned char *sha1; > > + int pathlen; > > int old_baselen = base->len; > > + int isdir, recurse = 0, emitthis = 1; > > + > > + /* at least something has to be valid */ > > + assert(t1 || t2); > > + > > + if (t2) { > > + /* path present in resulting tree */ > > + sha1 = tree_entry_extract(t2, &path, &mode); > > + pathlen = tree_entry_len(&t2->entry); > > + isdir = S_ISDIR(mode); > > + } > > + else { > > + /* a path was removed - take path from parent. Also take > > +* mode from parent, to decide on recursion. > > +*/ > > + tree_entry_extract(t1, &path, &mode); > > + pathlen = tree_entry_len(&t1->entry); > > + > > + isdir = S_ISDIR(mode); > > + sha1 = NULL; > > + mode = 0; > > + } > > + > > + if (DIFF_OPT_TST(opt, RECURSIVE) && isdir) { > > + recurse = 1; > > + emitthis = DIFF_OPT_TST(opt, TREE_IN_RECURSIVE); > > + } > > > > strbuf_add(base, path, pathlen); > > - if (DIFF_OPT_TST(opt, RECURSIVE) && S_ISDIR(mode)) { > > - if (DIFF_OPT_TST(opt, TREE_IN_RECURSIVE)) > > - opt->add_remove(opt, *prefix, mode, sha1, 1, base->buf, > > 0); > > > > + if (emitthis) > > + emit_diff(opt, base, t1, t2); > > + > > + if (recurse) { > > strbuf_addch(base, '/'); > > - diff_tree_sha1(*prefix == '-' ? sha1 : NULL, > > - *prefix == '+' ? sha1 : NULL, base->buf, opt); > > - } else > > - opt->add_remove(opt, prefix[0], mode, sha1, 1, base->buf, 0); > > + diff_tree_sha1(t1 ? t1->entry.sha1 : NULL, > > + t2 ? t2->entry.sha1 : NULL, base->buf, opt); > > + } > > > After this step, "sha1" is assigned but never gets used. Please > double-check the fix-up I queued in the series before merging it to > 'pu'. Your fixup is correct - it was my overlook when preparing the patch - sha1 is needed for later patch (multiparent diff tree-walker), but I've mistakenly left it here. The two interesting patches I've sent you today, are already adjusted to this correction - it is safe to squash the fixup in. Thanks for noticing, Kirill -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] for-each-ref: add option to omit newlines
Even when having specified a format string for-each-ref still prints a newline after printing each ref according to the format. This breaks splitting the output on newlines, for example. Signed-off-by: Øystein Walle --- I was somewhat surprised by this behaviour; I expected to be in full control of the output but I could not get rid of the newlines. As far as I'm *personally* concerned the line "putchar('\n');" could just as well be removed completely and an '\n' appended to the default format string. But since this command (and this behaviour) as been around since 2006 I assume that it's in Git's best interest to not change the default behaviour. Hence the introduction of this option. Although I was taken aback by this behaviour, is it patch-worthy? The fix isn't very pretty. On to the patch itself: I contemplated putting '\n' in the default format and removing it if -n was given, which would get rid of the need to pass an exta argument to show_ref(). But that means we would need to *insert it* when a format is given and -n is not... I've changed how the default format is assigned so that we can check if it's NULL after option parsing to spit out an error. Alternatively we could just allow it. Documentation/git-for-each-ref.txt | 3 +++ builtin/for-each-ref.c | 20 +++- 2 files changed, 18 insertions(+), 5 deletions(-) diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt index 4240875..805914a 100644 --- a/Documentation/git-for-each-ref.txt +++ b/Documentation/git-for-each-ref.txt @@ -62,6 +62,9 @@ OPTIONS the specified host language. This is meant to produce a scriptlet that can directly be `eval`ed. +-n:: +--no-newlines:: + Do not print a newline after printing each formatted ref. FIELD NAMES --- diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c index 51798b4..32d6b12 100644 --- a/builtin/for-each-ref.c +++ b/builtin/for-each-ref.c @@ -998,7 +998,8 @@ static void emit(const char *cp, const char *ep) } } -static void show_ref(struct refinfo *info, const char *format, int quote_style) +static void show_ref(struct refinfo *info, const char *format, int quote_style, +int no_newlines) { const char *cp, *sp, *ep; @@ -1023,7 +1024,8 @@ static void show_ref(struct refinfo *info, const char *format, int quote_style) resetv.s = color; print_value(&resetv, quote_style); } - putchar('\n'); + if (!no_newlines) + putchar('\n'); } static struct ref_sort *default_sort(void) @@ -1067,9 +1069,9 @@ static char const * const for_each_ref_usage[] = { int cmd_for_each_ref(int argc, const char **argv, const char *prefix) { int i, num_refs; - const char *format = "%(objectname) %(objecttype)\t%(refname)"; + const char *format = NULL; struct ref_sort *sort = NULL, **sort_tail = &sort; - int maxcount = 0, quote_style = 0; + int maxcount = 0, quote_style = 0, no_newlines = 0; struct refinfo **refs; struct grab_ref_cbdata cbdata; @@ -1082,6 +1084,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) N_("quote placeholders suitably for python"), QUOTE_PYTHON), OPT_BIT(0 , "tcl", "e_style, N_("quote placeholders suitably for tcl"), QUOTE_TCL), + OPT_BOOL('n' , "no-newlines", &no_newlines, + N_("do not output a newline after each ref")), OPT_GROUP(""), OPT_INTEGER( 0 , "count", &maxcount, N_("show only matched refs")), @@ -1100,6 +1104,12 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) error("more than one quoting style?"); usage_with_options(for_each_ref_usage, opts); } + if (no_newlines && !format) { + error("--no-newlines without --format does not make sense"); + usage_with_options(for_each_ref_usage, opts); + } + if (!format) + format = "%(objectname) %(objecttype)\t%(refname)"; if (verify_format(format)) usage_with_options(for_each_ref_usage, opts); @@ -1120,6 +1130,6 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) if (!maxcount || num_refs < maxcount) maxcount = num_refs; for (i = 0; i < maxcount; i++) - show_ref(refs[i], format, quote_style); + show_ref(refs[i], format, quote_style, no_newlines); return 0; } -- 1.8.5 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] for-each-ref: add option to omit newlines
Øystein Walle gmail.com> writes: > > splitting the output on newlines, for example. > Ugh, I mean on null bytes, naturally. Øsse. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Why is there no execution of the pre-commit hook while rebasing?
Hi, I'm wondering why there is no execution of the pre-commit hook when rebasing. When I do a rebase of N commits (say, 3 or so) onto another branch, each of the rebased commits gets "recommitted" to the branch, am I right? If yes, wouldn't it make sense to run the pre-commit hook after each applied commit? Just wondering... -- Mit freundlichen Grüßen, Kind regards, Matthias Beyer Proudly sent with mutt. Happily signed with gnupg. pgpjBB2ibZKuU.pgp Description: PGP signature
Re: Make the git codebase thread-safe
On Thu, Feb 13, 2014 at 12:27 AM, Johannes Sixt wrote: > Am 2/12/2014 20:30, schrieb Stefan Zager: >> On Wed, Feb 12, 2014 at 11:22 AM, Karsten Blees >> wrote: >>> Am 12.02.2014 19:37, schrieb Erik Faye-Lund: ReOpenFile, that's fantastic. Thanks a lot! >>> >>> ...but should be loaded dynamically via GetProcAddress, or are we ready to >>> drop XP support? >> >> Right, that is an issue. From our perspective, it's well past time to >> drop XP support. > > Not from mine. All this really means is that the build config will test WIN_VER, and there will need to be an additional binary distribution of msysgit for newer Windows. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Make the git codebase thread-safe
Karsten Blees gmail.com> writes: > > Am 12.02.2014 19:37, schrieb Erik Faye-Lund: > > On Wed, Feb 12, 2014 at 7:34 PM, Stefan Zager google.com> wrote: > >> On Wed, Feb 12, 2014 at 10:27 AM, Erik Faye-Lund gmail.com> wrote: > >>> On Wed, Feb 12, 2014 at 7:20 PM, Stefan Zager google.com> wrote: > > I don't want to steal the thunder of my coworker, who wrote the > implementation. He plans to submit it upstream soon-ish. It relies > on using the lpOverlapped argument to ReadFile(), with some additional > tomfoolery to make sure that the implicit position pointer for the > file descriptor doesn't get modified. > >>> > >>> Is the code available somewhere? I'm especially interested in the > >>> "additional tomfoolery to make sure that the implicit position pointer > >>> for the file descriptor doesn't get modified"-part, as this was what I > >>> ended up butting my head into when trying to do this myself. > >> > >> https://chromium-review.googlesource.com/#/c/186104/ > > > > ReOpenFile, that's fantastic. Thanks a lot! > > ...but should be loaded dynamically via GetProcAddress, or are we ready to drop XP support? > > Original patch author here. In trying to prepare this patch to use GetProcAddress to load dynamically, I've run into a bit of a snag. NO_THREAD_SAFE_PREAD is a compile-time flag, which will be incompatible with any attempt to make this a runtime decision a la LoadLibrary / GetProcAddress. On XP, we would need to fallback to the single-threaded path, and on Vista+ we would use the thread-able path, and obviously this decision could not be made until runtime. If MinGW were the only configuration using NO_THREAD_SAFE_PREAD, I would just remove it entirely, but it appears Cygwin configuration uses it also. Suggestions? One possibility is to disallow (by convention, perhaps), the use of pread() and read() against the same fd. The only reason ReOpenFile is necessary at all is because some code somewhere is mixing read-styles against the same fd. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] contrib/diff-highlight: multibyte characters diff
Thanks for reviewing. as you wrote, diff content may not be utf8 at all. and we don't know that the user's terminal watns is utf8. I think your trying utf8 decode and fall back approach is better than my patch, and do work well. is using "$@" for catching error like the patch below? According to perldoc Encode.pm, encode/decode with "FB_CROAK" may destroy original string. We should probabry use "LEAVE_SRC" on decode_utf8's second argument. --- contrib/diff-highlight/diff-highlight | 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight index c4404d4..0743851 100755 --- a/contrib/diff-highlight/diff-highlight +++ b/contrib/diff-highlight/diff-highlight @@ -2,6 +2,7 @@ use warnings FATAL => 'all'; use strict; +use Encode qw(decode_utf8 encode_utf8); # Highlight by reversing foreground and background. You could do # other things like bold or underline if you prefer. @@ -73,13 +74,23 @@ sub show_hunk { my @queue; for (my $i = 0; $i < @$a; $i++) { - my ($rm, $add) = highlight_pair($a->[$i], $b->[$i]); - print $rm; - push @queue, $add; + my ($a_dec, $encode_rm) = decode($a->[$i]); + my ($b_dec, $encode_add) = decode($b->[$i]); + my ($rm, $add) = highlight_pair($a_dec, $b_dec); + print $encode_rm->($rm); + push @queue, $encode_add->($add); } print @queue; } +sub decode { + my $orig = shift; + my $decoded = eval { decode_utf8($orig, Encode::FB_CROAK | Encode::LEAVE_SRC) }; + return $@ ? + ($orig, sub { shift }) : + ($decoded, sub { encode_utf8(shift) }); +} + sub highlight_pair { my @a = split_line(shift); my @b = split_line(shift); -- 1.8.5.3 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] tests: turn on network daemon tests by default
Junio C Hamano writes: > Jeff King writes: > >> test_normalize_tristate GIT_TEST_DAEMON > > Heh, great minds think alike. This is what I am playing with, > without committing (because I do like your "ask config if this is a > kind of various boolean 'false' representations, which I haven't > managed to add to it). And this is with the "ask config" helper. Two tangents. - We may want to do something similar in cvsserver and git-gui to make them more robust. $ git grep -e true --and -e 1 --and -e yes - Do we want to do something similar to GIT_TEST_CREDENTIAL_HELPER? -- >8 -- From: Jeff King Date: Mon, 10 Feb 2014 16:29:37 -0500 Subject: [PATCH] tests: turn on network daemon tests by default We do not run the httpd nor git-daemon tests by default, as they are rather heavyweight and require network access (albeit over localhost). However, it would be nice if more pepole ran them, for two reasons: 1. We would get more test coverage on more systems. 2. The point of the test suite is to find regressions. It is very easy to change some of the underlying code and break the httpd code without realizing you are even affecting it. Running the httpd tests helps find these problems sooner (ideally before the patches even hit the list). We still want to leave an "out", though, for people who really do not want to run them. For that reason, the GIT_TEST_HTTPD and GIT_TEST_GIT_DAEMON variables are now tri-state booleans (true/false/auto), so you can say GIT_TEST_HTTPD=false to turn the tests back off. To support those who want a stable single way to disable these tests across versions of Git before and after this change, an empty string explicitly set to these variables is also taken as "false", so the behaviour changes only for those who: a. did not express any preference by leaving these variables unset. They did not test these features before, but now they do; or b. did express that they want to test these features by setting GIT_TEST_FEATURE=false (or any equivalent other ways to tell "false" to Git, e.g. "0"), which has been a valid but funny way to say that they do want to test the feature only because we used to interpret any non-empty string to mean "yes please test". They no longer test that feature. In addition, we are forgiving of common setup failures (e.g., you do not have apache installed, or have an old version) when the tri-state is "auto" (or empty), but report an error when it is "true". This makes "auto" a sane default, as we should not cause failures on setups where the tests cannot run. But it allows people who use "true" to catch regressions in their system (e.g., they uninstalled apache, but were expecting their automated test runs to test git-httpd, and would want to be notified). Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- t/lib-git-daemon.sh | 8 --- t/lib-httpd.sh | 22 +-- t/test-lib-functions.sh | 58 + 3 files changed, 74 insertions(+), 14 deletions(-) diff --git a/t/lib-git-daemon.sh b/t/lib-git-daemon.sh index 394b06b..615bf5d 100644 --- a/t/lib-git-daemon.sh +++ b/t/lib-git-daemon.sh @@ -16,9 +16,10 @@ # stop_git_daemon # test_done -if test -z "$GIT_TEST_GIT_DAEMON" +test_tristate GIT_TEST_GIT_DAEMON +if test "$GIT_TEST_GIT_DAEMON" = false then - skip_all="git-daemon testing disabled (define GIT_TEST_GIT_DAEMON to enable)" + skip_all="git-daemon testing disabled (unset GIT_TEST_GIT_DAEMON to enable)" test_done fi @@ -58,7 +59,8 @@ start_git_daemon() { kill "$GIT_DAEMON_PID" wait "$GIT_DAEMON_PID" trap 'die' EXIT - error "git daemon failed to start" + test_skip_or_die $GIT_TEST_GIT_DAEMON \ + "git daemon failed to start" fi } diff --git a/t/lib-httpd.sh b/t/lib-httpd.sh index bfdff2a..f9c2e22 100644 --- a/t/lib-httpd.sh +++ b/t/lib-httpd.sh @@ -30,9 +30,10 @@ # Copyright (c) 2008 Clemens Buchacher # -if test -z "$GIT_TEST_HTTPD" +test_tristate GIT_TEST_HTTPD +if test "$GIT_TEST_HTTPD" = false then - skip_all="Network testing disabled (define GIT_TEST_HTTPD to enable)" + skip_all="Network testing disabled (unset GIT_TEST_HTTPD to enable)" test_done fi @@ -76,8 +77,7 @@ GIT_VALGRIND_OPTIONS=$GIT_VALGRIND_OPTIONS; export GIT_VALGRIND_OPTIONS if ! test -x "$LIB_HTTPD_PATH" then - skip_all="skipping test, no web server found at '$LIB_HTTPD_PATH'" - test_done + test_skip_or_die $GIT_TEST_HTTPD "no web server found at '$LIB_HTTPD_PATH'" fi HTTPD_VERSION=`$LIB_HTTPD_PATH -v | \ @@ -89,19 +89,20 @@ then then if ! test $HTTPD_VERSION -ge 2 then - skip_all="skipping test, at least Apache version 2 is required" - test_d
Re: [PATCH 1/2] tree-diff: rework diff_tree() to generate diffs for multiparent cases as well
Kirill Smelkov writes: > + /* until we go to it next round, .next holds how many bytes we > + * allocated (for faster realloc - we don't need copying old > data). > + */ > + p->next = (struct combine_diff_path *)alloclen; I am getting this here: tree-diff.c: In function '__path_appendnew': tree-diff.c:140:13: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] combine-diff: speed it up, by using multiparent diff tree-walker directly
Kirill Smelkov writes: > + if (need_generic_pathscan) { > + /* NOTE generic case also handles --stat, as it computes > + * diff(sha1,parent_i) for all i to do the job, specifically > + * for parent0. > + */ > + paths = find_paths_generic(sha1, parents, &diffopts); > + } > + else { > + paths = find_paths_multitree(sha1, parents, &diffopts); > + > + /* show stat against the first parent even > + * when doing combined diff. > + */ > + int stat_opt = (opt->output_format & > + (DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT)); /* * We see decl-after-stmt here. * Also please have slash-asterisk and asterisk-slash * at the beginning and the end of a multi-line comment * block on their own line. */ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] for-each-ref: add option to omit newlines
Øystein Walle writes: > On to the patch itself: I contemplated putting '\n' in the default format and > removing it if -n was given, which would get rid of the need to pass an exta > argument to show_ref(). But that means we would need to *insert it* when a > format is given and -n is not... I would rather see us go in the direction to add "-z" output option, which is what everybody else that produces NUL terminated entries in our suite of subcommands does. IOW, something along with this line (untested). builtin/for-each-ref.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c index 51798b4..2c8cac8 100644 --- a/builtin/for-each-ref.c +++ b/builtin/for-each-ref.c @@ -94,6 +94,7 @@ static const char **used_atom; static cmp_type *used_atom_type; static int used_atom_cnt, need_tagged, need_symref; static int need_color_reset_at_eol; +static int line_termination = '\n'; /* * Used to parse format string and sort specifiers @@ -1023,7 +1024,7 @@ static void show_ref(struct refinfo *info, const char *format, int quote_style) resetv.s = color; print_value(&resetv, quote_style); } - putchar('\n'); + putchar(line_termination); } static struct ref_sort *default_sort(void) @@ -1088,6 +1089,9 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix) OPT_STRING( 0 , "format", &format, N_("format"), N_("format to use for the output")), OPT_CALLBACK(0 , "sort", sort_tail, N_("key"), N_("field name to sort on"), &opt_parse_sort), + OPT_SET_INT('z', NULL, &line_termination, + N_("Use NUL instead of LF to end each output records"), + '\0'), OPT_END(), }; -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] t5537: move http tests out to t5539
Nguyễn Thái Ngọc Duy writes: > start_httpd is supposed to be at the beginning of the test file, not > the middle of it. The "test_seq" line in "no shallow lines.." test is > updated to compensate missing refs that are there in t5537, but not in > the new t5539. > > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > On Thu, Feb 13, 2014 at 8:22 AM, Duy Nguyen wrote: > > On Thu, Feb 13, 2014 at 5:12 AM, Jeff King wrote: > >> lib-httpd was never designed to be included from anywhere except the > >> beginning of the file. But that wouldn't be right for t5537, because it > >> wants to run some of the tests, even if apache setup fails. The right > >> way to do it is probably to have lib-httpd do all of its work in a lazy > >> prereq. I don't know how clunky that will end up, though; it might be > >> simpler to just move the shallow http test into one of the http-fetch > >> scripts. > > > > I'll move it out later. > > Here it is, on top of nd/http-fetch-shallow-fix because the new test > in t5537 is picky and a simple merge resolution wouldn't do it. Will queue; thanks. > > t/t5537-fetch-shallow.sh | 57 --- > t/t5539-fetch-http-shallow.sh (new +x) | 82 > ++ > 2 files changed, 82 insertions(+), 57 deletions(-) > > diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh > index 098f220..3ae9092 100755 > --- a/t/t5537-fetch-shallow.sh > +++ b/t/t5537-fetch-shallow.sh > @@ -173,61 +173,4 @@ EOF > ) > ' > > -if test -n "$NO_CURL" -o -z "$GIT_TEST_HTTPD"; then > - say 'skipping remaining tests, git built without http support' > - test_done > -fi > - > -. "$TEST_DIRECTORY"/lib-httpd.sh > -start_httpd > - > -test_expect_success 'clone http repository' ' > - git clone --bare --no-local shallow > "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && > - git clone $HTTPD_URL/smart/repo.git clone && > - ( > - cd clone && > - git fsck && > - git log --format=%s origin/master >actual && > - cat-7 > -6 > -5 > -4 > -3 > -EOF > - test_cmp expect actual > - ) > -' > - > -# This test is tricky. We need large enough "have"s that fetch-pack > -# will put pkt-flush in between. Then we need a "have" the server > -# does not have, it'll send "ACK %s ready" > -test_expect_success 'no shallow lines after receiving ACK ready' ' > - ( > - cd shallow && > - for i in $(test_seq 10) > - do > - git checkout --orphan unrelated$i && > - test_commit unrelated$i && > - git push -q "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \ > - refs/heads/unrelated$i:refs/heads/unrelated$i && > - git push -q ../clone/.git \ > - refs/heads/unrelated$i:refs/heads/unrelated$i || > - exit 1 > - done && > - git checkout master && > - test_commit new && > - git push "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" master > - ) && > - ( > - cd clone && > - git checkout --orphan newnew && > - test_commit new-too && > - GIT_TRACE_PACKET="$TRASH_DIRECTORY/trace" git fetch --depth=2 && > - grep "fetch-pack< ACK .* ready" ../trace && > - ! grep "fetch-pack> done" ../trace > - ) > -' > - > -stop_httpd > test_done > diff --git a/t/t5539-fetch-http-shallow.sh b/t/t5539-fetch-http-shallow.sh > new file mode 100755 > index 000..94553e1 > --- /dev/null > +++ b/t/t5539-fetch-http-shallow.sh > @@ -0,0 +1,82 @@ > +#!/bin/sh > + > +test_description='fetch/clone from a shallow clone over http' > + > +. ./test-lib.sh > + > +if test -n "$NO_CURL"; then > + skip_all='skipping test, git built without http support' > + test_done > +fi > + > +. "$TEST_DIRECTORY"/lib-httpd.sh > +start_httpd > + > +commit() { > + echo "$1" >tracked && > + git add tracked && > + git commit -m "$1" > +} > + > +test_expect_success 'setup shallow clone' ' > + commit 1 && > + commit 2 && > + commit 3 && > + commit 4 && > + commit 5 && > + commit 6 && > + commit 7 && > + git clone --no-local --depth=5 .git shallow && > + git config --global transfer.fsckObjects true > +' > + > +test_expect_success 'clone http repository' ' > + git clone --bare --no-local shallow > "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" && > + git clone $HTTPD_URL/smart/repo.git clone && > + ( > + cd clone && > + git fsck && > + git log --format=%s origin/master >actual && > + cat +7 > +6 > +5 > +4 > +3 > +EOF > + test_cmp expect actual > + ) > +' > + > +# This test is tricky. We need large enough "have"s that fetch-pack > +# will put pkt-flush in between. Then we need a "have" the server > +# does not have, it'll send "ACK %s ready" > +test_expect_success 'no shallow lines after receiving ACK
Re: [PATCH] release notes: typo fixes
Michael J Gruber writes: > Signed-off-by: Michael J Gruber > --- > Just a few things I spotted while trying to keep myself informed :) Thanks. Very much appreciated. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Git GSoC 2014
Jeff King writes: > - somebody to be the backup admin (I am assuming I'll be the primary > admin, but as always, if somebody else wants to...) I can be backup, if Shawn doesn't want it. > - ideas ideas ideas Here's my moonshot: --- 8< --- Replace object loading/writing layer by libgit2 Git reads objects from storage (loose and packed) through functions in sha1_file.c. Most commands only require very simple, opaque read and write access to the object storage. As a weatherballoon, show that it is feasible to use libgit2 git_odb_* routines for these simple callers. Aim for passing the git test suite using git_odb_* object storage access, except for tests that verify behavior in the face of storage corruption, replacement objects, alternate storage locations, and similar quirks. Of course it is even better if you pass the test suite without exception. Language: C Difficulty: hard Possible mentors: Thomas Rast and --- >8 --- That absolutely requires a co-mentor from the libgit2 side to do, however. Perhaps you could talk someone into it? ;-) Motivation: I believe that migrating to libgit2 is the better approach, medium term, than rewriting everything ourselves to be nice, clean and thread-safe. I took a shot a while ago at making the pack reading code thread-safe, but it's adding mess when we could simply replace it all by the already thread-safe libgit2 calls. It also helps shake out incompatibilities in libgit2. Downside: not listing "code merged" as a goal may not make the project as shiny, neither for Git nor for the student. -- Thomas Rast t...@thomasrast.ch -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[ANNOUNCE] Git v1.8.5.5
The latest maintenance release Git v1.8.5.5 is now available at the usual places. Hopefully this will be the last update to the 1.8.5.x series. The release tarballs are found at: http://code.google.com/p/git-core/downloads/list and their SHA-1 checksums are: 7bb4ea883b1f8f6f7f927035f85e8e27b57e0194 git-1.8.5.5.tar.gz 39dd7979c8757d2dc4bc3aaa82741ba93557d566 git-htmldocs-1.8.5.5.tar.gz a4a2aef1440d4751f37c65359da57c9bd51a7beb git-manpages-1.8.5.5.tar.gz The following public repositories all have a copy of the v1.8.5.5 tag and the maint branch that the tag points at: url = https://kernel.googlesource.com/pub/scm/git/git url = git://repo.or.cz/alt-git.git url = https://code.google.com/p/git-core/ url = git://git.sourceforge.jp/gitroot/git-core/git.git url = git://git-core.git.sourceforge.net/gitroot/git-core/git-core url = https://github.com/gitster/git Also, http://www.kernel.org/pub/software/scm/git/ has copies of the release tarballs. Git v1.8.5.5 Release Notes == Fixes since v1.8.5.4 * The pathspec matching code, while comparing two trees (e.g. "git diff A B -- path1 path2") was too aggressive and failed to match some paths when multiple pathspecs were involved. * "git repack --max-pack-size=8g" stopped being parsed correctly when the command was reimplemented in C. * A recent update to "git send-email" broke platforms where /etc/ssl/certs/ directory exists but cannot be used as SSL_ca_path (e.g. Fedora rawhide). * A handful of bugs around interpreting $branch@{upstream} notation and its lookalike, when $branch part has interesting characters, e.g. "@", and ":", have been fixed. * "git clone" would fail to clone from a repository that has a ref directly under "refs/", e.g. "refs/stash", because different validation paths do different things on such a refname. Loosen the client side's validation to allow such a ref. * "git log --left-right A...B" lost the "leftness" of commits reachable from A when A is a tag as a side effect of a recent bugfix. This is a regression in 1.8.4.x series. * "git merge-base --octopus" used to leave cleaning up suboptimal result to the caller, but now it does the clean-up itself. * "git mv A B/", when B does not exist as a directory, should error out, but it didn't. Also contains typofixes, documentation updates and trivial code clean-ups. Changes since v1.8.5.4 are as follows: Andy Spencer (1): tree_entry_interesting: match against all pathspecs Jeff King (9): fetch-pack: do not filter out one-level refs interpret_branch_name: factor out upstream handling interpret_branch_name: rename "cp" variable to "at" interpret_branch_name: always respect "namelen" parameter interpret_branch_name: avoid @{upstream} past colon interpret_branch_name: find all possible @-marks repack: fix typo in max-pack-size option repack: make parsed string options const-correct repack: propagate pack-objects options as strings Junio C Hamano (5): merge-base: separate "--independent" codepath into its own helper merge-base --octopus: reduce the result from get_octopus_merge_bases() revision: mark contents of an uninteresting tree uninteresting revision: propagate flag bits from tags to pointees Git 1.8.5.5 Ruben Kerkhof (1): send-email: /etc/ssl/certs/ directory may not be usable as ca_path -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Git GSoC 2014
Thomas Rast writes: > Downside: not listing "code merged" as a goal may not make the project > as shiny, neither for Git nor for the student. I'd actually view that as an upside. This sounds like a good first step for a feasibility study that is really necessary. I wonder why the handling of storage corruption and replacement could be left broken, though. Is that because libgit2 has known breakages in these areas, or is there some other reason? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Make the git codebase thread-safe
Am 13.02.2014 19:38, schrieb Zachary Turner: > The only reason ReOpenFile is necessary at > all is because some code somewhere is mixing read-styles against the same > fd. > I don't understand...ReadFile with OVERLAPPED parameter doesn't modify the HANDLE's file position, so you should be able to mix read()/pread() however you like (as long as read() is only called from one thread). I tried without ReOpenFile and it seems to work like a charm, or am I missing something? 8< ssize_t mingw_pread(int fd, void *buf, size_t count, off64_t offset) { DWORD bytes_read; OVERLAPPED overlapped; memset(&overlapped, 0, sizeof(overlapped)); overlapped.Offset = (DWORD) offset; overlapped.OffsetHigh = (DWORD) (offset >> 32); if (!ReadFile((HANDLE) _get_osfhandle(fd), buf, count, &bytes_read, &overlapped)) { errno = err_win_to_posix(GetLastError()); return -1; } return (ssize_t) bytes_read; } -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Make the git codebase thread-safe
On Thu, Feb 13, 2014 at 2:51 PM, Karsten Blees wrote: > Am 13.02.2014 19:38, schrieb Zachary Turner: > >> The only reason ReOpenFile is necessary at >> all is because some code somewhere is mixing read-styles against the same >> fd. >> > > I don't understand...ReadFile with OVERLAPPED parameter doesn't modify the > HANDLE's file position, so you should be able to mix read()/pread() however > you like (as long as read() is only called from one thread). That is, apparently, a bald-faced lie in the ReadFile API doc. First implementation didn't use ReOpenFile, and it crashed all over the place. ReOpenFile fixed it. Stefan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Make the git codebase thread-safe
To elaborate a little bit more, you can verify with a sample program that ReadFile with OVERLAPPED does in fact modify the HANDLE's file position. The documentation doesn't actually state one way or another. My original attempt at a patch didn't have the ReOpenFile, and we experienced regular read corruption. We scratched our heads over it for a bit, and then hypothesized that someone must be mixing read styles, which led to this ReOpenFile workaround, which incidentally also solved the corruption problems. We wrote a similar sample program to verify that when using ReOpenHandle, and changing the file pointer of the duplicated handle, that the file pointer of the original handle is not modified. We did not actually try to identify the source of the mixed read styles, but it seems like the only possible explanation. On Thu, Feb 13, 2014 at 2:53 PM, Stefan Zager wrote: > On Thu, Feb 13, 2014 at 2:51 PM, Karsten Blees > wrote: >> Am 13.02.2014 19:38, schrieb Zachary Turner: >> >>> The only reason ReOpenFile is necessary at >>> all is because some code somewhere is mixing read-styles against the same >>> fd. >>> >> >> I don't understand...ReadFile with OVERLAPPED parameter doesn't modify the >> HANDLE's file position, so you should be able to mix read()/pread() however >> you like (as long as read() is only called from one thread). > > That is, apparently, a bald-faced lie in the ReadFile API doc. First > implementation didn't use ReOpenFile, and it crashed all over the > place. ReOpenFile fixed it. > > Stefan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] Make the global packed_git variable static to sha1_file.c.
I uploaded a new patch; a few comments inline below... On Wed, Feb 12, 2014 at 1:19 PM, Junio C Hamano wrote: > sza...@chromium.org writes: > > Also I'd suggest s/pack_data_fn/collect_pack_data/ or something. > "_fn" may be a good suffix for typedef'ed typename used in a > callback function, but for a concrete function, it only adds noise > without giving us anything new. Fixed this everywhere. >> static struct cached_object *find_cached_object(const unsigned char *sha1) >> @@ -468,7 +469,6 @@ static unsigned int pack_open_fds; >> static unsigned int pack_max_fds; >> static size_t peak_pack_mapped; >> static size_t pack_mapped; >> -struct packed_git *packed_git; > > Hmm, any particular reason why only this variable and not others are > moved up? No, just need packed_git declared before use. I moved all the static variables up, for clarity. >> + foreach_packed_git(find_pack_fn, NULL, &fpd); >> + if (fpd.found_pack && !exclude && >> + (incremental || >> + (local && fpd.found_non_local_pack) || >> + (ignore_packed_keep && fpd.found_pack_keep))) >> + return 0; > > When told to do --incremental, we used to return 0 from this > function immediately once we find the object in one pack, without > going thru the list of packs. Now we let foreach to loop thru all > of them and then return 0. Does this difference matter? A similar > difference may exist for local/keep but I did not think it through. Fixed. Thanks, Stefan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Git GSoC 2014
Jeff King wrote: > - ideas ideas ideas I'll throw in a few ideas from half-finished work. 1. Speed up git-rebase--am.sh Currently, git-rebase--am.sh is really slow because it dumps each patch to a file using git-format-patch, and picks it up to apply subsequently using git-am. Find a way to speed this up, without sacrificing safety. You can use the continuation features of cherry-pick, and dump to file only to persist state in the case of a failure. Language: Shell script, C Difficulty: Most of the difficulty lies in "what to do", not so much "how to do it". Might require modifying cherry-pick to do additional work on failure. 2. Invent new conflict style As an alternative to the diff3 conflict style, invent a conflict style that shows the original unpatched segment along with the raw patch text. The user can then apply the patch by hand. Language: C Difficulty: Since it was first written, very few people have touched the xdiff portion of the code. Since the area is very core to git, the series will have to go through a ton of iterations. 3. Rewrite git-branch to use git-for-each-ref For higher flexibility in command-line options and output format, use git for-each-ref to re-implement git-branch. The first task is to grow features that are in branch but not fer into fer (like --column, --merged, --contains). The second task is to refactor fer so that an external program can call into it. Language: C Difficulty: fer was never written with the idea of being reusable; it therefore persists a lot of global state, and even leaks memory in some places. Refactoring it to be more modern is definitely a challenge. 4. Implement @{publish} (I just can't find the time to finish this) @{publish} is a feature like @{upstream}, showing the state of the publish-point in the case of triangular workflows. Implement this while sharing code with git-push, and polish it until the prompt shows publish-state. Language: C, Shell script Difficulty: Once you figure out how to share code with git-push, this task should be relatively straightforward. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Make the global packed_git variable static to sha1_file.c.
This is a first step in making the codebase thread-safe. By and large, the operations which might benefit from threading are those that work with pack files (e.g., checkout, blame), so the focus of this patch is stop leaking the global list of pack files outside of sha1_file.c. The next step will be to control access to the list of pack files with a mutex. However, that alone is not enough to make pack file access thread safe. Even in a read-only operation, the window list associated with each pack file will need to be controlled. Additionally, the global counters in sha1_file.c will need to be controlled. This patch is a pure refactor with no functional changes, so it shouldn't require any additional tests. Adding the actual locks will be a functional change, and will require additional tests. Signed-off-by: Stefan Zager --- builtin/count-objects.c | 44 ++- builtin/fsck.c | 46 +++- builtin/gc.c | 26 +++ builtin/pack-objects.c | 189 --- builtin/pack-redundant.c | 37 +++--- cache.h | 18 - fast-import.c| 4 +- http-backend.c | 28 --- http-push.c | 4 +- http-walker.c| 2 +- pack-revindex.c | 20 ++--- server-info.c| 36 - sha1_file.c | 52 ++--- sha1_name.c | 18 - 14 files changed, 327 insertions(+), 197 deletions(-) diff --git a/builtin/count-objects.c b/builtin/count-objects.c index a7f70cb..a27c006 100644 --- a/builtin/count-objects.c +++ b/builtin/count-objects.c @@ -83,14 +83,32 @@ static char const * const count_objects_usage[] = { NULL }; +struct pack_data { + unsigned long packed; + off_t size_pack; + unsigned long num_pack; +}; + +static int count_pack_objects(struct packed_git *p, void *data) +{ + struct pack_data *pd = (struct pack_data *) data; + if (p->pack_local && !open_pack_index(p)) { + pd->packed += p->num_objects; + pd->size_pack += p->pack_size + p->index_size; + pd->num_pack++; + } + return 0; +} + int cmd_count_objects(int argc, const char **argv, const char *prefix) { int i, verbose = 0, human_readable = 0; const char *objdir = get_object_directory(); int len = strlen(objdir); char *path = xmalloc(len + 50); - unsigned long loose = 0, packed = 0, packed_loose = 0; + unsigned long loose = 0, packed_loose = 0; off_t loose_size = 0; + struct pack_data pd = {0, 0, 0}; struct option opts[] = { OPT__VERBOSE(&verbose, N_("be verbose")), OPT_BOOL('H', "human-readable", &human_readable, @@ -118,41 +136,29 @@ int cmd_count_objects(int argc, const char **argv, const char *prefix) closedir(d); } if (verbose) { - struct packed_git *p; - unsigned long num_pack = 0; - off_t size_pack = 0; struct strbuf loose_buf = STRBUF_INIT; struct strbuf pack_buf = STRBUF_INIT; struct strbuf garbage_buf = STRBUF_INIT; - if (!packed_git) - prepare_packed_git(); - for (p = packed_git; p; p = p->next) { - if (!p->pack_local) - continue; - if (open_pack_index(p)) - continue; - packed += p->num_objects; - size_pack += p->pack_size + p->index_size; - num_pack++; - } + prepare_packed_git(); + foreach_packed_git(count_pack_objects, NULL, &pd); if (human_readable) { strbuf_humanise_bytes(&loose_buf, loose_size); - strbuf_humanise_bytes(&pack_buf, size_pack); + strbuf_humanise_bytes(&pack_buf, pd.size_pack); strbuf_humanise_bytes(&garbage_buf, size_garbage); } else { strbuf_addf(&loose_buf, "%lu", (unsigned long)(loose_size / 1024)); strbuf_addf(&pack_buf, "%lu", - (unsigned long)(size_pack / 1024)); + (unsigned long)(pd.size_pack / 1024)); strbuf_addf(&garbage_buf, "%lu", (unsigned long)(size_garbage / 1024)); } printf("count: %lu\n", loose); printf("size: %s\n", loose_buf.buf); - printf("in-pack: %lu\n", packed); - printf("packs: %lu\n", num_pack); + printf("in-pack: %lu\n", pd.packed); + printf("packs: %lu\n", pd.num_pack); printf("size-pac
git stash and --skip-worktree files
My team is starting to use the --skip-worktree flag on files to allow them to keep modified files in their working directories. Ideally, they'd like an option for "git-stash" to apply to skip-worktree files, just as it can (optionally) be applied to untracked files, so that skip-worktree files could be included when saving and restored (ideally, with the skip-worktree bit set) on a git-stash pop. (1) I don't see an option for "git-stash" to support skip-worktree files...is it there and I'm just not seeing it? (2) Does anyone know if there's a reason why this is not possible/unreasonable to support/philosophically discouraged? (3) If it is reasonable, where would the right place be to suggest/request this feature? I'm happy to look at making such changes to git-stash.sh, but before I do so, I'd love to know that such work would be aligned with the design philosophy of git-stash. (For example, if the git team would only want such an option if it also included the ability to stash assume-unchanged files, too, that would be useful to know before I start making changes.) Mick -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Git GSoC 2014
Junio C Hamano writes: > Thomas Rast writes: > >> Downside: not listing "code merged" as a goal may not make the project >> as shiny, neither for Git nor for the student. > > I'd actually view that as an upside. This sounds like a good first > step for a feasibility study that is really necessary. > > I wonder why the handling of storage corruption and replacement > could be left broken, though. Is that because libgit2 has known > breakages in these areas, or is there some other reason? It's because I don't know enough about what libgit2's state is, and I wanted to keep the scope limited. Naturally, the next step would then be to implement the lacking functionality (if any) in libgit2 so that the test suite passes. I just don't know if that's trivial, or something for the "if we have time" section of the project, or too much work. (I did do a quick "can we reasonably link against libgit2" test where I gave git-cat-file a --libgit2 option that loads blobs with libgit2. There are some name collisions in the git_config* identifiers that need to be resolved, but otherwise it seems entirely possible.) -- Thomas Rast t...@thomasrast.ch -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html