[PATCH] gc: support temporarily preserving garbage
This patch adds a gc.garbageexpire setting that, when not set to "now",
makes gc (and prune, prune-packed, and repack) move garbage into a
temporary garbage directory instead of deleting it immediately. The
garbage directory is then cleared out based on gc.garbageexpire.

The motivation for this setting is to work around various NFS servers
not supporting delete-on-last-close semantics between NFS clients.
Without proper support for that, gc could potentially delete objects and
packs that are in use by git processes on other NFS clients. If another
git process has a deleted pack file mmap()ed, it could crash with a
SIGBUS error on Linux.

Signed-off-by: Brodie Rao
---
 .gitignore                             |  1 +
 Documentation/config.txt               | 20 +
 Documentation/git-gc.txt               |  7
 Documentation/git-prune-garbage.txt    | 55
 Documentation/git-prune-packed.txt     |  9
 Documentation/git-prune.txt            |  9
 Documentation/git-repack.txt           |  6 +++
 Documentation/git.txt                  |  6 +++
 Makefile                               |  2 +
 builtin.h                              |  1 +
 builtin/gc.c                           | 20 +
 builtin/prune-garbage.c                | 77 ++
 builtin/prune-packed.c                 |  3 +-
 builtin/prune.c                        |  5 ++-
 builtin/repack.c                       |  7 ++--
 cache.h                                |  2 +
 command-list.txt                       |  1 +
 contrib/completion/git-completion.bash |  2 +
 environment.c                          | 12 +-
 gc.c                                   | 60 ++
 gc.h                                   | 16 +++
 git.c                                  |  1 +
 t/t6502-gc-garbage-expire.sh           | 60 ++
 23 files changed, 375 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/git-prune-garbage.txt
 create mode 100644 builtin/prune-garbage.c
 create mode 100644 gc.c
 create mode 100644 gc.h
 create mode 100755 t/t6502-gc-garbage-expire.sh

diff --git a/.gitignore b/.gitignore
index a052419..a9a4e30 100644
--- a/.gitignore
+++ b/.gitignore
@@ -107,6 +107,7 @@
 /git-parse-remote
 /git-patch-id
 /git-prune
+/git-prune-garbage
 /git-prune-packed
 /git-pull
 /git-push
diff --git a/Documentation/config.txt b/Documentation/config.txt
index 9220725..0106d8f 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1213,6 +1213,26 @@ gc.autodetach::
 	Make `git gc --auto` return immediately and run in background if
 	the system supports it. Default is true.
 
+gc.garbageexpire::
+	When 'git gc' is run, objects and packs that are pruned are
+	immediately deleted from the file system. This setting can be
+	overridden to move pruned objects and packs to the garbage
+	directory. That leftover garbage will be deleted after the
+	specified grace period. The default value is "now", meaning
+	garbage is deleted immediately.
++
+Setting this to something other than "now" (e.g., "1.day.ago") can help
+work around issues with NFS servers that don't support
+delete-on-last-close semantics between NFS clients. 'git gc' will not
+unlink files immediately, so a git process on another NFS client that
+might be reading a garbage collected file will not crash.
++
+Note that this setting can cause the repository's size to increase as
+garbage collection passes are made. Care should be taken to make sure
+the grace period isn't too long. A grace period of one day might be
+reasonable if you make the assumption that your git processes over NFS
+won't run longer or have files open longer than one day.
+
 gc.packrefs::
 	Running `git pack-refs` in a repository renders it
 	unclonable by Git versions prior to 1.5.1.2 over dumb
diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index 273c466..f90dc0a 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -131,6 +131,12 @@ The optional configuration variable 'gc.pruneExpire' controls how old
 the unreferenced loose objects have to be before they are pruned. The
 default is "2 weeks ago".
 
+The optional configurable variable 'gc.garbageexpire' controls how
+pruned objects and packs are deleted. This can be overridden to move
+pruned objects and packs to the garbage directory. That leftover garbage
+will be deleted after the specified grace period. The default value is
+"now", meaning garbage is deleted immediately.
+
 Notes
 -----
 
@@ -156,6 +162,7 @@ linkgit:githooks[5] for more information.
 SEE ALSO
 --------
 linkgit:git-prune[1]
+linkgit:git-prune-garbage[1]
 linkgit:git-reflog[1]
 linkgit:git-repack[1]
 linkgit:git-rerere[1]
diff --git a/Documentation/git-prune-garbage.
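The behaviour the gc.garbageexpire documentation above describes (move pruned files aside, then delete them once a grace period has passed) can be sketched in a few lines of POSIX C. This is only an illustration under stated assumptions, not the code from the patch: move_to_garbage() and prune_garbage() are hypothetical names, the garbage directory is assumed to live on the same file system as the pruned file (rename() does not cross file systems), and using the file's mtime as "time it became garbage" is an assumption of this sketch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>
#include <utime.h>

#define SKETCH_PATH_MAX 4096

/*
 * Hypothetical helper: instead of unlink()ing `path`, move it into
 * `garbage_dir` and stamp it with the current time so that its age
 * can be measured later. Returns 0 on success, -1 on failure.
 */
int move_to_garbage(const char *path, const char *garbage_dir)
{
	const char *base;
	char dest[SKETCH_PATH_MAX];
	int n;

	assert(path && garbage_dir);
	base = strrchr(path, '/');
	base = base ? base + 1 : path;
	n = snprintf(dest, sizeof(dest), "%s/%s", garbage_dir, base);
	if (n < 0 || (size_t)n >= sizeof(dest))
		return -1;
	if (rename(path, dest) != 0)
		return -1;
	/* Assumption: mtime doubles as "time the file became garbage". */
	return utime(dest, NULL);
}

/*
 * Hypothetical helper: unlink every regular file in `garbage_dir`
 * whose mtime is at or before `expire`. Returns the number of files
 * removed, or -1 if the directory cannot be opened.
 */
int prune_garbage(const char *garbage_dir, time_t expire)
{
	DIR *dir = opendir(garbage_dir);
	struct dirent *de;
	int removed = 0;

	if (!dir)
		return -1;
	while ((de = readdir(dir)) != NULL) {
		char path[SKETCH_PATH_MAX];
		struct stat st;
		int n;

		if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
			continue;
		n = snprintf(path, sizeof(path), "%s/%s", garbage_dir, de->d_name);
		if (n < 0 || (size_t)n >= sizeof(path))
			continue;
		if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
			continue;
		if (st.st_mtime <= expire && unlink(path) == 0)
			removed++;
	}
	closedir(dir);
	return removed;
}
```

With a one-day grace period, a caller would pass `time(NULL) - 24 * 60 * 60` as `expire`. A reader on another NFS client that still has the moved file open keeps a valid file for as long as the grace period lasts.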
[PATCH] sha1_name: don't resolve refs when core.warnambiguousrefs is false
This change ensures get_sha1_basic() doesn't try to resolve full hashes
as refs when ambiguous ref warnings are disabled.

This provides a substantial performance improvement when passing many
hashes to a command (like "git rev-list --stdin") when
core.warnambiguousrefs is false. The check incurs 6 stat()s for every
hash supplied, which can be costly over NFS.
---
 sha1_name.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index e9c2999..10bd007 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -451,9 +451,9 @@ static int get_sha1_basic(const char *str, int len, unsigned char *sha1)
 	int at, reflog_len, nth_prior = 0;
 
 	if (len == 40 && !get_sha1_hex(str, sha1)) {
-		if (warn_on_object_refname_ambiguity) {
+		if (warn_ambiguous_refs && warn_on_object_refname_ambiguity) {
 			refs_found = dwim_ref(str, len, tmp_sha1, &real_ref);
-			if (refs_found > 0 && warn_ambiguous_refs) {
+			if (refs_found > 0) {
 				warning(warn_msg, len, str);
 				if (advice_object_name_warning)
 					fprintf(stderr, "%s\n", _(object_name_msg));
--
1.8.3.4 (Apple Git-47)

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] sha1_name: don't resolve refs when core.warnambiguousrefs is false
On Mon, Jan 6, 2014 at 7:32 PM, Brodie Rao wrote:
> This change ensures get_sha1_basic() doesn't try to resolve full hashes
> as refs when ambiguous ref warnings are disabled.
>
> This provides a substantial performance improvement when passing many
> hashes to a command (like "git rev-list --stdin") when
> core.warnambiguousrefs is false. The check incurs 6 stat()s for every
> hash supplied, which can be costly over NFS.

Forgot to add:

Signed-off-by: Brodie Rao

> ---
>  sha1_name.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/sha1_name.c b/sha1_name.c
> index e9c2999..10bd007 100644
> --- a/sha1_name.c
> +++ b/sha1_name.c
> @@ -451,9 +451,9 @@ static int get_sha1_basic(const char *str, int len,
> unsigned char *sha1)
>  	int at, reflog_len, nth_prior = 0;
>
>  	if (len == 40 && !get_sha1_hex(str, sha1)) {
> -		if (warn_on_object_refname_ambiguity) {
> +		if (warn_ambiguous_refs && warn_on_object_refname_ambiguity) {
>  			refs_found = dwim_ref(str, len, tmp_sha1, &real_ref);
> -			if (refs_found > 0 && warn_ambiguous_refs) {
> +			if (refs_found > 0) {
>  				warning(warn_msg, len, str);
>  				if (advice_object_name_warning)
>  					fprintf(stderr, "%s\n",
> _(object_name_msg));
> --
> 1.8.3.4 (Apple Git-47)
>
[PATCH] sha1_name: don't resolve refs when core.warnambiguousrefs is false
This change ensures get_sha1_basic() doesn't try to resolve full hashes
as refs when ambiguous ref warnings are disabled.

This provides a substantial performance improvement when passing many
hashes to a command (like "git rev-list --stdin") when
core.warnambiguousrefs is false. The check incurs 6 stat()s for every
hash supplied, which can be costly over NFS.

Signed-off-by: Brodie Rao
---
 sha1_name.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index e9c2999..10bd007 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -451,9 +451,9 @@ static int get_sha1_basic(const char *str, int len, unsigned char *sha1)
 	int at, reflog_len, nth_prior = 0;
 
 	if (len == 40 && !get_sha1_hex(str, sha1)) {
-		if (warn_on_object_refname_ambiguity) {
+		if (warn_ambiguous_refs && warn_on_object_refname_ambiguity) {
 			refs_found = dwim_ref(str, len, tmp_sha1, &real_ref);
-			if (refs_found > 0 && warn_ambiguous_refs) {
+			if (refs_found > 0) {
 				warning(warn_msg, len, str);
 				if (advice_object_name_warning)
 					fprintf(stderr, "%s\n", _(object_name_msg));
--
1.8.5.2
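To illustrate why moving warn_ambiguous_refs into the outer condition saves work, here is a minimal, self-contained sketch of the two control flows. Only the two flag names are taken from the patch. stub_dwim_ref(), check_before(), check_after() and count_lookups() are hypothetical names for this illustration, and the stub merely counts calls instead of touching the file system the way the real dwim_ref() does.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for git's globals of the same names (sketch only). */
static int warn_ambiguous_refs;
static int warn_on_object_refname_ambiguity = 1;

/* Number of times the (expensive) ref lookup was attempted. */
static int lookup_calls;

/* Hypothetical stub for dwim_ref(): counts calls, finds no ref. */
int stub_dwim_ref(const char *str, size_t len)
{
	(void)str;
	(void)len;
	lookup_calls++;
	return 0;
}

/* Pre-patch flow: the lookup runs even when warnings are disabled. */
int check_before(const char *hex, size_t len)
{
	int warned = 0;

	if (warn_on_object_refname_ambiguity) {
		int refs_found = stub_dwim_ref(hex, len);
		if (refs_found > 0 && warn_ambiguous_refs)
			warned = 1;
	}
	return warned;
}

/* Post-patch flow: the lookup is skipped when warnings are disabled. */
int check_after(const char *hex, size_t len)
{
	int warned = 0;

	if (warn_ambiguous_refs && warn_on_object_refname_ambiguity) {
		int refs_found = stub_dwim_ref(hex, len);
		if (refs_found > 0)
			warned = 1;
	}
	return warned;
}

/* Run `n` full-length hashes through one variant; return lookups made. */
int count_lookups(int (*check)(const char *, size_t), int warn_flag, int n)
{
	static const char hex[] = "0123456789abcdef0123456789abcdef01234567";
	int i;

	assert(n >= 0);
	warn_ambiguous_refs = warn_flag;
	lookup_calls = 0;
	for (i = 0; i < n; i++)
		check(hex, sizeof(hex) - 1);
	return lookup_calls;
}
```

With the flag off, `count_lookups(check_before, 0, n)` performs one lookup per hash, while `count_lookups(check_after, 0, n)` performs none. With the flag on, both variants behave the same.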
Re: [PATCH] gc: support temporarily preserving garbage
On Mon, Nov 17, 2014 at 1:34 PM, Jeff King wrote:
> On Fri, Nov 14, 2014 at 03:01:05PM -0800, Junio C Hamano wrote:
>
>> > 23 files changed, 375 insertions(+), 7 deletions(-)
>> [...]
>>
>> I am not sure if this much of code churn is warranted to work around
>> issues that only happen on repositories on NFS servers that do not
>> keep open-but-deleted files available. Is it an option to instead
>> have a copy of repository locally off NFS?
>
> I think it is also not sufficient. This patch seems to cover only
> objects. But we assume that we can atomically rename() new versions of
> files into place whenever we like without disrupting existing readers.
> This is the case for ref updates (and packed-refs), as well as the index
> file. The destination end of the rename is an unlink() in disguise, and
> would be susceptible to the same problems.

I'm not aware of renaming over files happening anywhere in gc-related
code. Do you think that's something that would need to be addressed in
the rest of the code base before going forward with this garbage
directory approach? If so, do you have any suggestions on how to
tackle that problem?
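For readers following the rename() point quoted above, the sketch below shows the local-file-system behaviour that the argument relies on. A reader that opened the file before it was replaced keeps reading the old contents, because the old inode stays alive until the last descriptor is closed. The function and file names are hypothetical and chosen only for this illustration. The sketch says nothing about how a particular NFS server behaves. It only demonstrates the guarantee that, per the thread, some NFS setups do not provide across clients.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write `text` to `path`, replacing previous content. 0 on success. */
static int write_file(const char *path, const char *text)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fputs(text, f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Hypothetical check: returns 1 if a descriptor opened before the
 * target was replaced via rename() still reads the old bytes while a
 * fresh open() sees the new ones, 0 if not, -1 on setup failure.
 */
int old_reader_survives_rename(const char *dir)
{
	char target[4096], tmp[4096], buf[16];
	ssize_t n;
	int fd, ok;

	assert(dir != NULL);
	snprintf(target, sizeof(target), "%s/rename-sketch-%ld",
		 dir, (long)getpid());
	snprintf(tmp, sizeof(tmp), "%s.tmp", target);

	if (write_file(target, "old") != 0)
		return -1;
	fd = open(target, O_RDONLY);	/* the "existing reader" */
	if (fd < 0)
		return -1;

	/* Replace the target: its old directory entry is dropped. */
	if (write_file(tmp, "new") != 0 || rename(tmp, target) != 0) {
		close(fd);
		return -1;
	}

	n = read(fd, buf, sizeof(buf) - 1);
	close(fd);
	ok = (n == 3 && memcmp(buf, "old", 3) == 0);

	fd = open(target, O_RDONLY);	/* a reader arriving afterwards */
	if (fd < 0)
		return -1;
	n = read(fd, buf, sizeof(buf) - 1);
	close(fd);
	ok = ok && (n == 3 && memcmp(buf, "new", 3) == 0);

	unlink(target);
	return ok;
}
```

Calling `old_reader_survives_rename("/tmp")` on a local POSIX file system exercises the case where both readers succeed. The thread's concern is the case where the old reader sits on a different NFS client and the server discards the replaced file.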