[Previously sent to the git-users mailing list, but it probably should
be addressed here.]
A number of commands invoke "git gc --auto" to clean up the repository
when there might be a lot of dangling objects and/or there might be
far too many unpacked files. The manual pages say:
git gc:
> From: Jeff King
> why are you setting the packsize limit to 99m in the first place?
I want to copy the Git repository to box.com as a backup measure, and
my account on box.com limits files to 100 MB.
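One way to check that nothing in the copy would exceed such a limit is to list files above a byte threshold. This is only a sketch, assuming a POSIX find; the helper name oversized_files and the scratch directory are made up for illustration, and the limit itself would be set separately with "git config pack.packSizeLimit 99m".

```shell
# Hypothetical helper (not part of Git): list regular files under
# directory $1 that are larger than $2 bytes.
oversized_files() {
    find "$1" -type f -size +"$2"c
}

# Scratch directory standing in for .git/objects/pack.
demo=$(mktemp -d)
printf '%010d' 0 >"$demo/small.pack"     # 10 bytes
printf '%0200d' 0 >"$demo/large.pack"    # 200 bytes

# With a 100-byte limit, only large.pack is reported.
oversized_files "$demo" 100
```

For the real case the call would be something like oversized_files .git/objects/pack 104857600, with the byte count chosen to match the service's limit.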
> There are more delta opportunities
In this repository, only the smallest files are text fi
> From: Junio C Hamano
> But if your definition of the boundary between "small" and "large"
> is unreasonably low (and/or your definition of "too many" is
> unreasonably small), you will always have the problem you found.
I would propose that a pack whose size is "close enough" to
packSizeLimit
> From: Jeff King
> That makes sense, though I question whether packs are really helping you
> in the first place. I wonder if you would be better off keeping your
> non-delta binaries as loose objects (this would require a new option to
> pack-objects and teaching "gc --auto" to ignore these when c
I have an unpleasant bug in git-gc:
Git version: 1.8.3.1
Running on: Fedora 19 GNU/Linux
I have an 11 GB repository. It passes git-fsck (though with a number
of dangling objects). But when I run git-gc on it, the file
refs/heads/master disappears. Since HEAD points to refs/heads/master,
this
> From: Jeff King
>
> > I have an 11 GB repository. It passes git-fsck (though with a number
> > of dangling objects). But when I run git-gc on it, the file
> > refs/heads/master disappears.
>
> That's the expected behavior. Gc runs "git pack-refs", which puts an
> entry into packed-refs and p
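The behavior described here can be reproduced in a scratch repository. The following is a sketch, assuming a Git recent enough to support "git -C" and the default files-based ref storage; the committer identity is a placeholder. It runs "git pack-refs --all" directly rather than a full gc.

```shell
# Build a throwaway repository with one commit.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m 'initial commit'

# Full name of the branch HEAD points at (e.g. refs/heads/master).
branch=$(git -C "$repo" symbolic-ref HEAD)
before=$(git -C "$repo" rev-parse --verify "$branch")

# Pack all refs; the loose file under .git/refs/heads is pruned.
git -C "$repo" pack-refs --all

# The ref still resolves, now via .git/packed-refs.
after=$(git -C "$repo" rev-parse --verify "$branch")
if [ -e "$repo/.git/$branch" ]; then
    echo "loose ref file still present"
else
    echo "loose ref file gone, ref resolves to $after"
fi
```

The point of the sketch is that a missing refs/heads/master file is not by itself a sign of damage; asking Git to resolve the ref is the reliable check.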
I've discovered a problem using Git. It's not clear to me what the
"correct" behavior should be, but it seems to me that Git is failing
in an undesirable way.
The problem arises when trying to handle a very large file. For
example:
$ git --version
git version 1.8.3.1
$ mkdir $$
> From: Duy Nguyen
> I don't know how many commands are hit by this. If you have time and
> gdb, please put a break point in die_builtin() function and send
> backtraces for those that fail. You could speed up the process by
> creating a smaller file and set the environment variable
> GIT_ALLOC_L
> From: Junio C Hamano
> You need to have enough memory (virtual is fine if you have enough
> time) to do fsck. Some part of index-pack could be refactored into
> a common helper function that could be called from fsck, but I think
> it would be a lot of work.
How much memory is "enough"? And
> From: David Lang
> Git was designed to track source code, there are warts that show up
> in the implementation when you use individual files >4GB
I'd expect that if you want to deal with files over 100k, you should
assume that it doesn't all fit in memory.
Dale
--
To unsubscribe from this lis
> From: David Lang
> well, as others noted, the problem is actually caused by doing the diffs, and
> that is something that is a very common thing to do with source code.
To some degree, my attitude comes from When I Was A Boy, when you got
16k for both your bytecode and your data, so you never
I have a large repository (17 GiB of disk used), although no single
file in the repository is over 1 GiB. (I have pack.packSizeLimit set
to "1g".) I don't know how many files are in the repository, but it
shouldn't exceed several tens of commits each containing several tens
of thousands of files.
> From: Jeff King
>
> On Mon, Dec 16, 2013 at 11:05:32AM -0500, Dale R. Worley wrote:
>
> > # git fsck
> > Checking object directories: 100% (256/256), done.
> > fatal: Out of memory, malloc failed (tried to allocate 80530636801 bytes)
> > #
>
> Can
> From: Jeff King
> One of the problems I ran into recently is that
> corrupt data can cause it to make a large allocation
One thing I notice is that in unpack_compressed_entry() in
sha1_file.c, there is a mallocz of "size" bytes. It appears that
"size" is the size of the object that is being u
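The figure in the fsck error message is suggestive. Assuming the extra byte is the trailing NUL that mallocz adds (an assumption, not something the message states), the requested object size works out to a round number:

```shell
# Allocation size reported by "git fsck" in the earlier message.
alloc=80530636801

# Assumption: mallocz asks for size + 1 bytes, so subtract one.
size=$((alloc - 1))

# Express the result in GiB (2^30 bytes).
gib=$((size / 1073741824))
rem=$((size % 1073741824))
echo "$size bytes = $gib GiB, remainder $rem"
```

This needs a shell with 64-bit arithmetic. A size of exactly 75 GiB is far larger than the 17 GiB repository, which is at least consistent with the size field being wrong rather than the object being genuinely that large.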
> From: Junio C Hamano
>
> > I've noticed that Git by default puts long output through "less" as a
> > pager. I don't like that, but this is not the time to change
> > established behavior. But while tracking that down, I noticed that
> > the paging behavior is controlled by at least 5 things:
So I set out to verify in the code that the order of priority of pager
specification is
GIT_PAGER > core.pager > PAGER > default
I discovered that there is also a pager.<cmd> configuration
variable.
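The priority order stated above can be written down as a small shell function. This is only a sketch of that order, not Git's code: the name resolve_pager is made up, the core.pager value is passed in as an argument instead of being read from the configuration, an empty argument is treated as "not configured", and the command-line options and pager.<cmd> are left out entirely.

```shell
# Hypothetical helper: print the pager chosen under the order
#   GIT_PAGER > core.pager > PAGER > default
# $1 is the core.pager value (empty string if not configured).
resolve_pager() {
    if [ -n "${GIT_PAGER+set}" ]; then
        printf '%s\n' "$GIT_PAGER"
    elif [ -n "$1" ]; then
        printf '%s\n' "$1"
    elif [ -n "${PAGER+set}" ]; then
        printf '%s\n' "$PAGER"
    else
        printf '%s\n' less
    fi
}

# Usage, each case in a subshell so the settings do not leak out.
( unset GIT_PAGER PAGER; resolve_pager "" )
( unset GIT_PAGER; PAGER=more; resolve_pager "" )
( unset GIT_PAGER; PAGER=more; resolve_pager "cat" )
( GIT_PAGER=most; PAGER=more; resolve_pager "cat" )
```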
I was expecting the code to be simple, uniform (with regard to the 5
sources), and reasonably well
> From: Junio C Hamano
> diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
> index b1630ba..33fbd8c 100644
> --- a/Documentation/git-diff.txt
> +++ b/Documentation/git-diff.txt
> @@ -28,11 +28,15 @@ two blob objects, or changes between two files on disk.
> words, the diff
> From: Matthieu Moy
> > const char *git_pager(int stdout_is_tty)
> > {
> >     const char *pager;
> >
> >     if (!stdout_is_tty)
> >         return NULL;
> >
> >     pager = getenv("GIT_PAGER");
> >     if (!pager) {
> >         if
> I've noticed that Git by default puts long output through "less" as a
> pager. I don't like that, but this is not the time to change
> established behavior. But while tracking that down, I noticed that
> the paging behavior is controlled by at least 5 things:
>
> the -p/--paginate/--no-pager o
I'm working on using "git filter-branch" to remove the history of a
large file from my repository so as to reduce the size of the
repository. This pattern of use is effective for me:
1. $ git filter-branch --index-filter 'git rm --cached --ignore-unmatch core.4563' HEAD
2. edit .git/packed-refs
In Git, one can set up a repository with a "detached worktree", where
the .git directory is not a subdirectory of the top directory of the
work tree.
In general, Git commands on a repository with a detached worktree can
be executed by cd'ing into the directory containing the .git
directory, and ex
> From: Junio C Hamano
>
> wor...@alum.mit.edu (Dale R. Worley) writes:
>
> > In general, Git commands on a repository with a detached worktree can
> > be executed by cd'ing into the directory containing the .git
> > directory, ...
>
> Eh? News to
> From: Junio C Hamano
> Side note: without GIT_WORK_TREE environment (or
> core.worktree), there is no way to tell where the top level
> is, so you were limited to always be at the top level of
> your working tree if you used GIT_DIR to refer to a
> repository that
> From: Junio C Hamano
> Now, when you say "the cwd contains the .git directory", do you mean
>
> cd /repositories
> git add ../working/trees/proj-wt1/file
>
> updates "file" in the /repositories/proj.git/index? Or do you mean
> this?
The pattern I use is to have this:
> From: Junio C Hamano
> It was unclear to me which part of our documentation needs updating
> and how, and that was (and still is) what I was primarily interested
> in finding out.
It seems to me that what is missing is a description of the
circumstances under which Git can be run. With Subver
> From: Junio C Hamano
> > ... it's not clear why GIT_WORK_TREE exists, ...
>
> The configuration item came _way_ later than the environment, and we
> need to keep users and scripts from old world working, that is why.
OK, that explains a great deal. IIRC, I first became aware that
detached wo
> The pattern I use is to have this:
>
> /repository/.git
> with core.worktree = /working
> /working/...
>
> then
>
> cd /repository
> git add /working/x/y
> git ...
The point I'm trying to make is that it appears that all of the Git
command
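For reference, a layout like the one quoted above can be set up in a scratch area as follows. This is a sketch under assumptions: the directory names are placeholders under a temporary directory, a Git with "git -C" is assumed, and the add is run from inside the work tree with --git-dir given explicitly, which is a more conservative variant than the "cd /repository" pattern under discussion.

```shell
# Scratch stand-ins for /repository and /working.
base=$(mktemp -d)
mkdir "$base/repository" "$base/working"

# Create the repository and point it at the separate work tree.
git init -q "$base/repository"
git -C "$base/repository" config core.worktree "$base/working"

# Put a file in the work tree and add it, with the current
# directory inside the work tree and the repository named explicitly.
echo hello >"$base/working/file"
(
    cd "$base/working" &&
    git --git-dir="$base/repository/.git" add file &&
    git --git-dir="$base/repository/.git" ls-files
)
```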
Here's a slightly simpler test case for adding a symbolic link. This
test exploits the fact that on my system, /bin/awk is a symbolic link
to "gawk". As you can see, the behavior of Git differs if the link's
path is given to "git add" as an absolute path or a relative path.
Here is the test scri
(The original problem and the discussion that ensued is on the
git-users mailing list:
https://groups.google.com/forum/#!topic/git-users/lNQ7Cn35EqA)
"git commit" (and probably other operations) fail if standard input
(fd 0) is closed when git starts. A simple test case follows. (The
execution i
I've been looking into writing a proper test for this patch. My first
attempt tests the symptom that was seen initially, that "git commit"
fails if fd 0 is closed.
One problem is how to arrange for fd 0 to be closed. I could use the
bash redirection "<&-", but I think you want to be more portabl
> From: Junio C Hamano
>
> That's just a plain-vanilla part of POSIX shell behaviour, no?
>
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_05
"Close standard input" is so weird I never thought it was POSIX. In
that case, we can eliminate the C helper progr
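The effect of the "<&-" redirection can be seen without Git at all. A minimal sketch, using cat only as a convenient program that reads standard input:

```shell
# With stdin open on /dev/null, cat sees end-of-file and succeeds.
if cat </dev/null; then
    echo "stdin open: cat succeeded"
fi

# With fd 0 closed by the shell, cat cannot read and exits non-zero.
if cat <&- 2>/dev/null; then
    echo "stdin closed: cat unexpectedly succeeded"
else
    echo "stdin closed: cat failed"
fi
```

A program started this way has no fd 0, so the first file it opens can be given descriptor 0, which is the situation the test for this fix needs to create.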
> From: Junio C Hamano
> > +test_expect_success 'git_mkstemps_mode does not fail if fd 0 is not open' '
> > + git init &&
> > + echo Test. >test-file &&
> > + git add test-file &&
> > + git commit -m Message. <&-
> > +'
> > +
>
> Yup. I wonder how it would fail without the fix, though ;
I've run into a problem (with Git 1.8.3.3) where I cannot add a
symbolic link (as such) to the repository *if* its path is given
absolutely; instead Git adds the file the symbolic link points to.
(If I give the path relatively, Git does what I expect, that is, adds
the symbolic link.)
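The expected relative-path behavior can be checked from the index mode that Git records. A sketch, assuming a filesystem that supports symbolic links and a Git with "git -C"; the file names are placeholders, and only the relative-path case is shown because the absolute-path result depends on the Git version.

```shell
# Scratch repository with a regular file and a symlink to it.
repo=$(mktemp -d)
git init -q "$repo"
echo data >"$repo/target"
ln -s target "$repo/link"

# Add the link by relative path, then show its index entry.
# A symbolic link stored as such has mode 120000.
git -C "$repo" add link
git -C "$repo" ls-files -s link
```

If the link had been followed instead, the entry would show a regular-file mode such as 100644.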
I've written
I'm working on writing a patch, but I'm running into a problem. The
patch itself is from this commit:
$ git log -1
commit 07a25537909dd277426818a39d9bc4235e755383
Author: Dale Worley
Date: Thu Jul 18 18:43:12 2013 -0400
open() returns -1 on failure, and indeed 0 is a p
> From: Duy Nguyen
> > With the above change, the test suite runs with zero failures, so it
> > doesn't affect any common Git usage.
>
> It means the test suite is incomplete. As you can see, the commit
> introducing this change does not come with a test case to catch people
> changing this.
Wh
> From: Duy Nguyen
> > Can someone give me advice on what this code *should* do?
>
> It does as the function name says: given cwd, a prefix (i.e. a
> relative path with no ".." components) and a path relative to
> cwd+prefix, convert 'path' to something relative to cwd. In the
> simplest case, p
> From: John Keeping
>
> git-format-patch(1) says:
>
> By default, the subject of a single patch is "[PATCH] " followed
> by the concatenation of lines from the commit message up to the
> first blank line (see the DISCUSSION section of git-commit(1)).
>
> I think that ac
Commit 52749 fixes a bug regarding testing the return of an open()
call for success/failure. Improve the testsuite test for that fix by
removing the helper program 'test-close-fd-0' and replacing it with
the shell redirection '<&-'. (The redirection is POSIX, so it should
be portable.)
Signed-of
Commit a2cb86 ("git_mkstemps: correctly test return value of open()",
12 Jul 2013) fixes a bug regarding testing the return of an open()
call for success/failure. Add a testsuite test for that fix. The
test exercises a situation where that open() is known to return 0.
Signed-off-by: Dale Worley
---
This is a first draft of a patch that clarifies a number of points
about how patches should be formatted that have tripped me up. I have
re-filled a few of the paragraphs, which makes it hard to see from the
diff what I've changed. This listing shows the changed words between
{ ... }:
{
> From: Junio C Hamano
>
> Thanks. I thought I've already queued
>
> Message-ID: <7vfvuokpr0@alter.siamese.dyndns.org>
> aka
> http://article.gmane.org/gmane.comp.version-control.git/231680
>
> which tests
>
> git commit --allow-empty -m message <&-
My mistake... I've been so inten
> git commit --allow-empty -m message <&-
Though as of [fb56570] "Sync with maint to grab trivial doc fixes",
that test doesn't fail for me if I revert to
    fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
    if (fd > 0)
        return fd;
I hav
Clarify documentation for git-diff: State that when not inside a
repository, --no-index is implied (and thus two arguments are
mandatory).
Clarify error message from diff-no-index to inform user that CWD is
not inside a repository and thus two arguments are mandatory.
Signed-off-by: Dale Worley
> From: Junio C Hamano
> I suspect that it may be a good idea to split the section altogether
> to reduce confusion like what triggered this thread, e.g.
>
> 'git diff' [--options] [--] [<path>...]::
>
> This form is to view the changes you made relative to
> the index (st
I've noticed that Git by default puts long output through "less" as a
pager. I don't like that, but this is not the time to change
established behavior. But while tracking that down, I noticed that
the paging behavior is controlled by at least 5 things:
the -p/--paginate/--no-pager options
the G
ine will have no place to go. And perhaps
it is an important part of the patch, since "git format-patch" outputs
it?
If you could give me some guidance in regard to the "From e87227..."
line, that would be helpful. (I suppose I should try to improve that
paragraph of SubmittingPatche
Describe how 'add' sets the submodule's logical name, which is used in
the configuration entry names.
Clarify that 'init' only sets up the configuration entries for
submodules that have already been added elsewhere. Describe that
arguments limit the submodules that are configured.
Signed-off-by
I'm having a problem with "git add" in version 1.7.7.6.
The situation is that I have a repository that is contained in a
second-level directory, a sub-sub-directory of "/". The core.worktree
of the repository is "/", so the working directory is the entire file
tree. I want this repository to tra
Several people have made similar mistakes in believing that "git
submodule init" can be used for adding submodules to a working
directory, whereas "git submodule add" is the command that should be
used. That *is* documented at the top of the manual page for "git
submodule", but my error was enhance
While learning about making a documentation patch, I noticed that
Documentation/CodingGuidelines isn't as clear as it could be regarding
how to edit the documentation. In particular, it says "Most (if not
all) of the documentation pages are written in AsciiDoc - and
processed into HTML output and ma
From e87227498ef3d50dc20584c24c53071cce63c555 Mon Sep 17 00:00:00 2001
From: Dale Worley
Date: Tue, 7 May 2013 13:39:46 -0400
Subject: [PATCH] CodingGuidelines: make it clear which files in
Documentation/ are the sources
Signed-off-by: Dale R. Worley
---
While learning about makin
I have found a situation where "git log" produces (apparently)
endless output. Presumably this is a bug. Following is a (Linux)
script that reliably reproduces the error for me (on Fedora 16):
--
set -ve
# Print the git version.
git --version
# Create repository.
rm -rf .git
git init
> From: Matthieu Moy
>
> In any case, I can't reproduce with 1.8.1.2.526.gf51a757: I don't get
> endless output. On the other hand, I get a slightly misformatted output:
>
> * commit a393ed598e9fb11436f85bd58f1a38c82f2cadb7 (from
> 2c1e6a36f4b712e914fac994463da7d0fdb2bc6d)
> |\ Merge: 2c1e6a
(git version 1.7.7.6)
I've been learning how to use Git. While exploring "git rebase", I've
discovered that if the branch being rebased contains an "evil" merge,
that is, a merge which contains changes that are in addition to the
changes in any of the parent commits, the rebase operation will
sil
This is how I see what rebase should do:
The simple case for rebase starts from
P---Q---R---S master
\
A---B---C topic
Then "git checkout topic ; git rebase master" will change it to
P---Q---R---S master
> From: Thomas Rast
>
> wor...@alum.mit.edu (Dale R. Worley) writes:
> [...snip...]
>
> Isn't that just a very long-winded way of restating what Junio said
> earlier:
>
> > > It was suggested to make it apply the first-parent diff and record
>