[PATCH v3 3/6] pack-objects: add --sparse option

2018-12-10 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Add a '--sparse' option flag to the pack-objects builtin. This allows the user to specify that they want to use the new logic for walking trees. This logic currently does not differ from the existing output, but will in a later change. Create a new test script, t5322-pack-ob

[PATCH v3 1/6] revision: add mark_tree_uninteresting_sparse

2018-12-10 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee In preparation for a new algorithm that walks fewer trees when creating a pack from a set of revisions, create a method that takes an oidset of tree oids and marks reachable objects as UNINTERESTING. The current implementation uses the existing mark_tree_uninteresting to rec

[PATCH v3 0/6] Add a new "sparse" tree walk algorithm

2018-12-10 Thread Derrick Stolee via GitGitGadget
One of the biggest remaining pain points for users of very large repositories is the time it takes to run 'git push'. We inspected some slow pushes by our developers and found that the "Enumerating Objects" phase of a push was very slow. This is unsurprising, because this is why reachability bitmap

[PATCH v3 6/6] pack-objects: create GIT_TEST_PACK_SPARSE

2018-12-10 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Create a test variable GIT_TEST_PACK_SPARSE to enable the sparse object walk algorithm by default during the test suite. Enabling this variable ensures coverage in many interesting cases, such as shallow clones, partial clones, and missing objects. Signed-off-by: Derrick Sto

[PATCH v3 5/6] pack-objects: create pack.useSparse setting

2018-12-10 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The '--sparse' flag in 'git pack-objects' changes the algorithm used to enumerate objects to one that is faster for individual users pushing new objects that change only a small cone of the working directory. The sparse algorithm is not recommended for a server, which likely

[PATCH v3 2/6] list-objects: consume sparse tree walk

2018-12-10 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When creating a pack-file using 'git pack-objects --revs' we provide a list of interesting and uninteresting commits. For example, a push operation would make the local topic branch be interesting and the known remote refs as uninteresting. We want to discover the set of new

[PATCH v3 4/6] revision: implement sparse algorithm

2018-12-10 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When enumerating objects to place in a pack-file during 'git pack-objects --revs', we discover the "frontier" of commits that we care about and the boundary with commits we find uninteresting. From that point, we walk trees to discover which trees and blobs are uninteresting.

[PATCH 0/5] Create 'expire' and 'repack' verbs for git-multi-pack-index

2018-12-10 Thread Derrick Stolee via GitGitGadget
The multi-pack-index provides a fast way to find an object among a large list of pack-files. It stores a single pack-reference for each object id, so duplicate objects are ignored. Among a list of pack-files storing the same object, the most-recently modified one is used. Create new verbs for the

[PATCH 4/5] multi-pack-index: prepare 'repack' verb

2018-12-10 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee In an environment where the multi-pack-index is useful, it is due to many pack-files and an inability to repack the object store into a single pack-file. However, it is likely that many of these pack-files are rather small, and could be repacked into a slightly larger pack-fi

[PATCH 1/5] multi-pack-index: prepare for 'expire' verb

2018-12-10 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The multi-pack-index tracks objects in a collection of pack-files. Only one copy of each object is indexed, using the modified time of the pack-files to determine tie-breakers. It is possible to have a pack-file with no referenced objects because all objects have a duplicate

[PATCH 5/5] midx: implement midx_repack()

2018-12-10 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee To repack using a multi-pack-index, first sort all pack-files by their modified time. Second, walk those pack-files from oldest to newest, adding the packs to a list if they are smaller than the given pack-size. Finally, collect the objects from the multi-pack-index that are
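
The first two steps described in this snippet (sort by modified time, then keep packs below the size threshold) can be sketched as follows. This is only an illustration, not the patch's C implementation: the `PackInfo` type, the `select_packs_for_repack` function, and the sample values are all hypothetical, and the final object-collection step is not shown because the snippet is truncated.

```python
from dataclasses import dataclass


@dataclass
class PackInfo:
    # Hypothetical record for one pack-file; field names are assumptions.
    name: str
    mtime: int  # modified time, e.g. seconds since the epoch
    size: int   # pack-file size in bytes


def select_packs_for_repack(packs, pack_size):
    """Return packs smaller than pack_size, ordered oldest to newest."""
    by_age = sorted(packs, key=lambda p: p.mtime)
    return [p for p in by_age if p.size < pack_size]


packs = [
    PackInfo("pack-c", mtime=300, size=10),
    PackInfo("pack-a", mtime=100, size=50),
    PackInfo("pack-b", mtime=200, size=5000),
]
selected = select_packs_for_repack(packs, pack_size=1000)
print([p.name for p in selected])  # ['pack-a', 'pack-c']
```

Here `pack-b` is skipped because it exceeds the threshold, while the two small packs are kept in age order.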

[PATCH 2/5] midx: refactor permutation logic

2018-12-10 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When writing a multi-pack-index, we keep track of an integer permutation, tracking the list of pack-files that we know about (both from the existing multi-pack-index and the new pack-files being introduced) and converting them into a sorted order for the new multi-pack-index.

[PATCH 3/5] multi-pack-index: implement 'expire' verb

2018-12-10 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The 'git multi-pack-index expire' command looks at the existing multi-pack-index, counts the number of objects referenced in each pack-file, deletes the pack-files with no referenced objects, and rewrites the multi-pack-index to no longer reference those packs. Refactor the wr
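
The counting step described in this snippet can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch's C code: the multi-pack-index is modeled as a plain dictionary from object id to the single pack it references, and `packs_to_expire` and all sample names are hypothetical.

```python
from collections import Counter


def packs_to_expire(midx_objects, pack_names):
    """Return the packs that no multi-pack-index entry refers to.

    midx_objects maps each object id to the one pack chosen for it;
    pack_names lists every pack the index knows about.
    """
    counts = Counter(midx_objects.values())
    return [name for name in pack_names if counts[name] == 0]


# Every object in pack-old also exists in a newer pack, so the index
# holds no reference to pack-old.
midx_objects = {"oid1": "pack-new", "oid2": "pack-new", "oid3": "pack-mid"}
pack_names = ["pack-old", "pack-mid", "pack-new"]
print(packs_to_expire(midx_objects, pack_names))  # ['pack-old']
```

Deleting the returned packs and rewriting the index without them would correspond to the remaining steps the snippet lists.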

[PATCH 0/1] .gitattributes: ensure t/oid-info/* has eol=lf

2018-12-11 Thread Derrick Stolee via GitGitGadget
I noticed that our CI builds (see [1] for an example) were returning success much faster than they did before Git v2.20.0. Turns out that there was a test script failure involving the new test hash logic. error: bug in the test script: bad hash algorithm make[1]: *** [Makefile:56: t-basic.sh]

[PATCH 1/1] .gitattributes: ensure t/oid-info/* has eol=lf

2018-12-11 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The new test_oid machinery in the test library requires reading some information from t/oid-info/hash-info and t/oid-info/oid. The shell logic that reads from these files is sensitive to CRLF line endings, causing a problem when the test suite is run on a Windows machine that

[PATCH v2 1/2] .gitattributes: ensure t/oid-info/* has eol=lf

2018-12-12 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The new test_oid machinery in the test library requires reading some information from t/oid-info/hash-info and t/oid-info/oid. The shell logic that reads from these files is sensitive to CRLF line endings, causing a problem when the test suite is run on a Windows machine that

[PATCH v2 0/2] Add more eol=lf to .gitattributes

2018-12-12 Thread Derrick Stolee via GitGitGadget
I noticed that our CI builds (see [1] for an example) were returning success much faster than they did before Git v2.20.0. Turns out that there was a test script failure involving the new test hash logic. error: bug in the test script: bad hash algorithm make[1]: *** [Makefile:56: t-basic.sh]

[PATCH v4 0/6] Add a new "sparse" tree walk algorithm

2018-12-14 Thread Derrick Stolee via GitGitGadget
One of the biggest remaining pain points for users of very large repositories is the time it takes to run 'git push'. We inspected some slow pushes by our developers and found that the "Enumerating Objects" phase of a push was very slow. This is unsurprising, because this is why reachability bitmap

[PATCH v4 2/6] list-objects: consume sparse tree walk

2018-12-14 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When creating a pack-file using 'git pack-objects --revs' we provide a list of interesting and uninteresting commits. For example, a push operation would make the local topic branch be interesting and the known remote refs as uninteresting. We want to discover the set of new

[PATCH v4 6/6] pack-objects: create GIT_TEST_PACK_SPARSE

2018-12-14 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Create a test variable GIT_TEST_PACK_SPARSE to enable the sparse object walk algorithm by default during the test suite. Enabling this variable ensures coverage in many interesting cases, such as shallow clones, partial clones, and missing objects. Signed-off-by: Derrick Sto

[PATCH v4 3/6] pack-objects: add --sparse option

2018-12-14 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Add a '--sparse' option flag to the pack-objects builtin. This allows the user to specify that they want to use the new logic for walking trees. This logic currently does not differ from the existing output, but will in a later change. Create a new test script, t5322-pack-ob

[PATCH v4 4/6] revision: implement sparse algorithm

2018-12-14 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When enumerating objects to place in a pack-file during 'git pack-objects --revs', we discover the "frontier" of commits that we care about and the boundary with commits we find uninteresting. From that point, we walk trees to discover which trees and blobs are uninteresting.

[PATCH v4 1/6] revision: add mark_tree_uninteresting_sparse

2018-12-14 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee In preparation for a new algorithm that walks fewer trees when creating a pack from a set of revisions, create a method that takes an oidset of tree oids and marks reachable objects as UNINTERESTING. The current implementation uses the existing mark_tree_uninteresting to rec

[PATCH v4 5/6] pack-objects: create pack.useSparse setting

2018-12-14 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The '--sparse' flag in 'git pack-objects' changes the algorithm used to enumerate objects to one that is faster for individual users pushing new objects that change only a small cone of the working directory. The sparse algorithm is not recommended for a server, which likely

[PATCH 0/1] commit-graph: writing missing parents is a BUG

2018-12-19 Thread Derrick Stolee via GitGitGadget
A user complained that they had the following message in a git command: fatal: invalid parent position 2147483647 In hex, this value is 0x7fffffff, corresponding to the GRAPH_MISSING_PARENT constant. This constant was intended as a way to have the commit-graph store commits with parents that are
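
The decimal position in the error message is the largest signed 32-bit value, which a quick conversion confirms. The variable name is purely illustrative.

```python
position = 2147483647  # value printed in the "invalid parent position" error

# 2147483647 is 2**31 - 1, the largest signed 32-bit integer.
assert position == 2**31 - 1

print(hex(position))  # 0x7fffffff
```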

[PATCH 1/1] commit-graph: writing missing parents is a BUG

2018-12-19 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When writing a commit-graph, we write GRAPH_MISSING_PARENT if the parent's object id does not appear in the list of commits to be written into the commit-graph. This was done as the initial design allowed commits to have missing parents, but the final version requires the com

[PATCH v2 1/7] repack: refactor pack deletion for future use

2018-12-21 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The repack builtin deletes redundant pack-files and their associated .idx, .promisor, .bitmap, and .keep files. We will want to re-use this logic in the future for other types of repack, so pull the logic into 'unlink_pack_path()' in packfile.c. The 'ignore_keep' parameter i

[PATCH v2 3/7] multi-pack-index: prepare for 'expire' subcommand

2018-12-21 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The multi-pack-index tracks objects in a collection of pack-files. Only one copy of each object is indexed, using the modified time of the pack-files to determine tie-breakers. It is possible to have a pack-file with no referenced objects because all objects have a duplicate

[PATCH v2 7/7] midx: implement midx_repack()

2018-12-21 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee To repack using a multi-pack-index, first sort all pack-files by their modified time. Second, walk those pack-files from oldest to newest, adding the packs to a list if they are smaller than the given pack-size. Finally, collect the objects from the multi-pack-index that are

[PATCH v2 6/7] multi-pack-index: prepare 'repack' subcommand

2018-12-21 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee In an environment where the multi-pack-index is useful, it is due to many pack-files and an inability to repack the object store into a single pack-file. However, it is likely that many of these pack-files are rather small, and could be repacked into a slightly larger pack-fi

[PATCH v2 4/7] midx: refactor permutation logic

2018-12-21 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When writing a multi-pack-index, we keep track of an integer permutation, tracking the list of pack-files that we know about (both from the existing multi-pack-index and the new pack-files being introduced) and converting them into a sorted order for the new multi-pack-index.

[PATCH v2 5/7] multi-pack-index: implement 'expire' verb

2018-12-21 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The 'git multi-pack-index expire' command looks at the existing multi-pack-index, counts the number of objects referenced in each pack-file, deletes the pack-files with no referenced objects, and rewrites the multi-pack-index to no longer reference those packs. Refactor the wr

[PATCH v2 2/7] Docs: rearrange subcommands for multi-pack-index

2018-12-21 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee We will add new subcommands to the multi-pack-index, and that will make the documentation a bit messier. Clean up the 'verb' descriptions by renaming the concept to 'subcommand' and removing the reference to the object directory. Helped-by: Stefan Beller Helped-by: Szeder G

[PATCH v2 0/7] Create 'expire' and 'repack' verbs for git-multi-pack-index

2018-12-21 Thread Derrick Stolee via GitGitGadget
The multi-pack-index provides a fast way to find an object among a large list of pack-files. It stores a single pack-reference for each object id, so duplicate objects are ignored. Among a list of pack-files storing the same object, the most-recently modified one is used. Create new subcommands fo

[PATCH 1/1] git-gc.txt: fix typo about gc.writeCommitGraph

2019-01-08 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Reported-by: Stefan Haller Signed-off-by: Derrick Stolee --- Documentation/git-gc.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt index c20ee6c789..a7442499f6 100644 --- a/Documentation/git-gc.txt

[PATCH 0/1] git-gc.txt: fix typo about gc.writeCommitGraph

2019-01-08 Thread Derrick Stolee via GitGitGadget
Thanks to Stefan Haller for sending me a private message about this typo. Derrick Stolee (1): git-gc.txt: fix typo about gc.writeCommitGraph Documentation/git-gc.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) base-commit: c7e8ce6d1dd02f6569ea785eebc8692e8e2edf72 Published-As: htt

[PATCH v3 0/9] Create 'expire' and 'repack' verbs for git-multi-pack-index

2019-01-09 Thread Derrick Stolee via GitGitGadget
The multi-pack-index provides a fast way to find an object among a large list of pack-files. It stores a single pack-reference for each object id, so duplicate objects are ignored. Among a list of pack-files storing the same object, the most-recently modified one is used. Create new subcommands fo

[PATCH v3 2/9] Docs: rearrange subcommands for multi-pack-index

2019-01-09 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee We will add new subcommands to the multi-pack-index, and that will make the documentation a bit messier. Clean up the 'verb' descriptions by renaming the concept to 'subcommand' and removing the reference to the object directory. Helped-by: Stefan Beller Helped-by: Szeder G

[PATCH v3 6/9] multi-pack-index: implement 'expire' verb

2019-01-09 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The 'git multi-pack-index expire' command looks at the existing multi-pack-index, counts the number of objects referenced in each pack-file, deletes the pack-files with no referenced objects, and rewrites the multi-pack-index to no longer reference those packs. Refactor the wr

[PATCH v3 5/9] midx: refactor permutation logic and pack sorting

2019-01-09 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee In anticipation of the expire subcommand, refactor the way we sort the packfiles by name. This will greatly simplify our approach to dropping expired packs from the list. First, create 'struct pack_info' to replace 'struct pack_pair'. This struct contains the necessary infor

[PATCH v3 8/9] midx: implement midx_repack()

2019-01-09 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee To repack using a multi-pack-index, first sort all pack-files by their modified time. Second, walk those pack-files from oldest to newest, adding the packs to a list if they are smaller than the given pack-size. Finally, collect the objects from the multi-pack-index that are

[PATCH v3 4/9] midx: simplify computation of pack name lengths

2019-01-09 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Before writing the multi-pack-index, we compute the length of the pack-index names concatenated together. This forms the data in the pack name chunk, and we precompute it to compute chunk offsets. The value is also modified to fit alignment needs. Previously, this computatio

[PATCH v3 1/9] repack: refactor pack deletion for future use

2019-01-09 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The repack builtin deletes redundant pack-files and their associated .idx, .promisor, .bitmap, and .keep files. We will want to re-use this logic in the future for other types of repack, so pull the logic into 'unlink_pack_path()' in packfile.c. The 'ignore_keep' parameter i

[PATCH v3 7/9] multi-pack-index: prepare 'repack' subcommand

2019-01-09 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee In an environment where the multi-pack-index is useful, it is due to many pack-files and an inability to repack the object store into a single pack-file. However, it is likely that many of these pack-files are rather small, and could be repacked into a slightly larger pack-fi

[PATCH v3 3/9] multi-pack-index: prepare for 'expire' subcommand

2019-01-09 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The multi-pack-index tracks objects in a collection of pack-files. Only one copy of each object is indexed, using the modified time of the pack-files to determine tie-breakers. It is possible to have a pack-file with no referenced objects because all objects have a duplicate

[PATCH v3 9/9] multi-pack-index: test expire while adding packs

2019-01-09 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee During development of the multi-pack-index expire subcommand, a version went out that improperly computed the pack order if a new pack was introduced while other packs were being removed. Part of the subtlety of the bug involved the new pack being placed before other packs th

[PATCH v5 2/5] list-objects: consume sparse tree walk

2019-01-16 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When creating a pack-file using 'git pack-objects --revs' we provide a list of interesting and uninteresting commits. For example, a push operation would make the local topic branch be interesting and the known remote refs as uninteresting. We want to discover the set of new

[PATCH v5 0/5] Add a new "sparse" tree walk algorithm

2019-01-16 Thread Derrick Stolee via GitGitGadget
One of the biggest remaining pain points for users of very large repositories is the time it takes to run 'git push'. We inspected some slow pushes by our developers and found that the "Enumerating Objects" phase of a push was very slow. This is unsurprising, because this is why reachability bitmap

[PATCH v5 1/5] revision: add mark_tree_uninteresting_sparse

2019-01-16 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee In preparation for a new algorithm that walks fewer trees when creating a pack from a set of revisions, create a method that takes an oidset of tree oids and marks reachable objects as UNINTERESTING. The current implementation uses the existing mark_tree_uninteresting to rec

[PATCH v5 4/5] pack-objects: create pack.useSparse setting

2019-01-16 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The '--sparse' flag in 'git pack-objects' changes the algorithm used to enumerate objects to one that is faster for individual users pushing new objects that change only a small cone of the working directory. The sparse algorithm is not recommended for a server, which likely

[PATCH v5 5/5] pack-objects: create GIT_TEST_PACK_SPARSE

2019-01-16 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Create a test variable GIT_TEST_PACK_SPARSE to enable the sparse object walk algorithm by default during the test suite. Enabling this variable ensures coverage in many interesting cases, such as shallow clones, partial clones, and missing objects. Signed-off-by: Derrick Sto

[PATCH v5 3/5] revision: implement sparse algorithm

2019-01-16 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When enumerating objects to place in a pack-file during 'git pack-objects --revs', we discover the "frontier" of commits that we care about and the boundary with commits we find uninteresting. From that point, we walk trees to discover which trees and blobs are uninteresting.

[PATCH 10/14] pack-objects: add trace2 regions

2019-01-22 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When studying the performance of 'git push' we would like to know how much time is spent at various parts of the command. One area that could cause performance trouble is 'git pack-objects'. Add trace2 regions around the three main actions taken in this command: 1. Enumerat

[PATCH 1/6] commit-graph: return with errors during write

2019-01-23 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The write_commit_graph() method uses die() to report failure and exit when confronted with an unexpected condition. This use of die() in a library function is incorrect and is now replaced by error() statements and an int return type. Now that we use 'goto cleanup' to jump t

[PATCH 0/6] Create commit-graph file format v2

2019-01-23 Thread Derrick Stolee via GitGitGadget
The commit-graph file format has some shortcomings that were discussed on-list: 1. It doesn't use the 4-byte format ID from the_hash_algo. 2. There is no way to change the reachability index from generation numbers to corrected commit date [1]. 3. The unused byte in the

[PATCH 5/6] commit-graph: implement file format version 2

2019-01-23 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The commit-graph file format had some shortcomings which we now correct: 1. The hash algorithm was determined by a single byte, instead of the 4-byte format identifier. 2. There was no way to update the reachability index we used. We currently only support gen

[PATCH 2/6] commit-graph: collapse parameters into flags

2019-01-23 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The write_commit_graph() and write_commit_graph_reachable() methods currently take two boolean parameters: 'append' and 'report_progress'. We will soon expand the possible options to send to these methods, so instead of complicating the parameter list, first simplify it. Col

[PATCH 4/6] commit-graph: add --version= option

2019-01-23 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Allow the commit-graph builtin to specify the file format version using the '--version=' option. Specify the version exactly in the verification tests as using a different version would change the offsets used in those tests. Signed-off-by: Derrick Stolee --- Documentation/

[PATCH 6/6] commit-graph: test verifying a corrupt v2 header

2019-01-23 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The commit-graph file format v2 changes the v1 data only in the header information. Add tests that check the 'verify' subcommand catches corruption in the v2 header. Signed-off-by: Derrick Stolee --- t/t5318-commit-graph.sh | 31 +++ 1 file chan

[PATCH 3/6] commit-graph: create new version flags

2019-01-23 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee In anticipation of a new commit-graph file format version, create a flag for the write_commit_graph() and write_commit_graph_reachable() methods to take a version number. When there is no specified version, the implementation selects a default value. Currently, the only vali

[PATCH v4 09/10] multi-pack-index: test expire while adding packs

2019-01-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee During development of the multi-pack-index expire subcommand, a version went out that improperly computed the pack order if a new pack was introduced while other packs were being removed. Part of the subtlety of the bug involved the new pack being placed before other packs th

[PATCH v4 04/10] midx: simplify computation of pack name lengths

2019-01-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Before writing the multi-pack-index, we compute the length of the pack-index names concatenated together. This forms the data in the pack name chunk, and we precompute it to compute chunk offsets. The value is also modified to fit alignment needs. Previously, this computatio

[PATCH v4 01/10] repack: refactor pack deletion for future use

2019-01-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The repack builtin deletes redundant pack-files and their associated .idx, .promisor, .bitmap, and .keep files. We will want to re-use this logic in the future for other types of repack, so pull the logic into 'unlink_pack_path()' in packfile.c. The 'ignore_keep' parameter i

[PATCH v4 00/10] Create 'expire' and 'repack' verbs for git-multi-pack-index

2019-01-24 Thread Derrick Stolee via GitGitGadget
The multi-pack-index provides a fast way to find an object among a large list of pack-files. It stores a single pack-reference for each object id, so duplicate objects are ignored. Among a list of pack-files storing the same object, the most-recently modified one is used. Create new subcommands fo

[PATCH v4 05/10] midx: refactor permutation logic and pack sorting

2019-01-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee In anticipation of the expire subcommand, refactor the way we sort the packfiles by name. This will greatly simplify our approach to dropping expired packs from the list. First, create 'struct pack_info' to replace 'struct pack_pair'. This struct contains the necessary infor

[PATCH v4 02/10] Docs: rearrange subcommands for multi-pack-index

2019-01-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee We will add new subcommands to the multi-pack-index, and that will make the documentation a bit messier. Clean up the 'verb' descriptions by renaming the concept to 'subcommand' and removing the reference to the object directory. Helped-by: Stefan Beller Helped-by: Szeder G

[PATCH v4 03/10] multi-pack-index: prepare for 'expire' subcommand

2019-01-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The multi-pack-index tracks objects in a collection of pack-files. Only one copy of each object is indexed, using the modified time of the pack-files to determine tie-breakers. It is possible to have a pack-file with no referenced objects because all objects have a duplicate

[PATCH v4 07/10] multi-pack-index: prepare 'repack' subcommand

2019-01-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee In an environment where the multi-pack-index is useful, it is due to many pack-files and an inability to repack the object store into a single pack-file. However, it is likely that many of these pack-files are rather small, and could be repacked into a slightly larger pack-fi

[PATCH v4 06/10] multi-pack-index: implement 'expire' subcommand

2019-01-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The 'git multi-pack-index expire' subcommand looks at the existing multi-pack-index, counts the number of objects referenced in each pack-file, deletes the pack-files with no referenced objects, and rewrites the multi-pack-index to no longer reference those packs. Refactor the

[PATCH v4 08/10] midx: implement midx_repack()

2019-01-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee To repack using a multi-pack-index, first sort all pack-files by their modified time. Second, walk those pack-files from oldest to newest, adding the packs to a list if they are smaller than the given pack-size. Finally, collect the objects from the multi-pack-index that are

[PATCH v4 10/10] midx: add test that 'expire' respects .keep files

2019-01-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The 'git multi-pack-index expire' subcommand may delete packs that are not needed from the perspective of the multi-pack-index. If a pack has a .keep file, then we should not delete that pack. Add a test that ensures we preserve a pack that would otherwise be expired. First,

[PATCH v2 10/14] pack-objects: add trace2 regions

2019-01-28 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When studying the performance of 'git push' we would like to know how much time is spent at various parts of the command. One area that could cause performance trouble is 'git pack-objects'. Add trace2 regions around the three main actions taken in this command: 1. Enumerat

[PATCH 0/1] Makefile: add prove and coverage-prove targets

2019-01-29 Thread Derrick Stolee via GitGitGadget
Sometimes there are test failures in the 'pu' branch. This is somewhat expected for a branch that takes the very latest topics under development, and those sometimes have semantic conflicts that only show up during test runs. This also can happen when running the test suite with different GIT_TEST_

[PATCH 1/1] Makefile: add prove and coverage-prove targets

2019-01-29 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When running the test suite for code coverage using 'make coverage-test', a single test failure stops the test suite from completing. This leads to significant undercounting of covered blocks. Add two new targets to the Makefile: * 'prove' runs the test suite using 'prove'.

[PATCH v2 0/1] Makefile: add prove and coverage-prove targets

2019-01-29 Thread Derrick Stolee via GitGitGadget
Sometimes there are test failures in the 'pu' branch. This is somewhat expected for a branch that takes the very latest topics under development, and those sometimes have semantic conflicts that only show up during test runs. This also can happen when running the test suite with different GIT_TEST_

[PATCH v2 1/1] Makefile: add coverage-prove target

2019-01-29 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Sometimes there are test failures in the 'pu' branch. This is somewhat expected for a branch that takes the very latest topics under development, and those sometimes have semantic conflicts that only show up during test runs. This also can happen when running the test suite w

[PATCH v3 10/14] pack-objects: add trace2 regions

2019-01-30 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When studying the performance of 'git push' we would like to know how much time is spent at various parts of the command. One area that could cause performance trouble is 'git pack-objects'. Add trace2 regions around the three main actions taken in this command: 1. Enumerat

[PATCH v4 10/14] pack-objects: add trace2 regions

2019-01-30 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When studying the performance of 'git push' we would like to know how much time is spent at various parts of the command. One area that could cause performance trouble is 'git pack-objects'. Add trace2 regions around the three main actions taken in this command: 1. Enumerat

[PATCH v5 10/15] trace2:data: pack-objects: add trace2 regions

2019-02-01 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee When studying the performance of 'git push' we would like to know how much time is spent at various parts of the command. One area that could cause performance trouble is 'git pack-objects'. Add trace2 regions around the three main actions taken in this command: 1. Enumerat

[PATCH v2 0/5] Create commit-graph file format v2

2019-04-24 Thread Derrick Stolee via GitGitGadget
The commit-graph file format has some shortcomings that were discussed on-list: 1. It doesn't use the 4-byte format ID from the_hash_algo. 2. There is no way to change the reachability index from generation numbers to corrected commit date [1]. 3. The unused byte in the

[PATCH v2 5/5] commit-graph: implement file format version 2

2019-04-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The commit-graph file format had some shortcomings which we now correct: 1. The hash algorithm was determined by a single byte, instead of the 4-byte format identifier. 2. There was no way to update the reachability index we used. We currently only support gen

[PATCH v2 3/5] commit-graph: create new version flags

2019-04-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee In anticipation of a new commit-graph file format version, create a flag for the write_commit_graph() and write_commit_graph_reachable() methods to take a version number. When there is no specified version, the implementation selects a default value. Currently, the only vali

[PATCH v2 1/5] commit-graph: return with errors during write

2019-04-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The write_commit_graph() method uses die() to report failure and exit when confronted with an unexpected condition. This use of die() in a library function is incorrect and is now replaced by error() statements and an int return type. Now that we use 'goto cleanup' to jump t

[PATCH v2 2/5] commit-graph: collapse parameters into flags

2019-04-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The write_commit_graph() and write_commit_graph_reachable() methods currently take two boolean parameters: 'append' and 'report_progress'. We will soon expand the possible options to send to these methods, so instead of complicating the parameter list, first simplify it. Col

[PATCH v2 4/5] commit-graph: add --version= option

2019-04-24 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Allow the commit-graph builtin to specify the file format version using the '--version=' option. Specify the version exactly in the verification tests as using a different version would change the offsets used in those tests. Signed-off-by: Derrick Stolee --- Documentation

[PATCH 1/1] commit-graph: improve error messages

2019-04-26 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The error messages when reading a commit-graph have a few problems: 1. Some values are output in hexadecimal, but that is not made clear by the message. Prepend "0x" to these values. 2. The version number does not need to be hexadecimal, and also should mention a "max

[PATCH 0/1] commit-graph: improve error messages

2019-04-26 Thread Derrick Stolee via GitGitGadget
Here is a small patch that revises the error messages from ab/commit-graph-fixes, as recommended by Ævar. Hopefully, it can be merged faster than the commit-graph v2 stuff, and I can update that series to include this change if we agree it is a good one. Thanks, -Stolee Cc: ava...@gmail.com In-R

[PATCH 2/2] midx: add packs to packed_git linked list

2019-04-29 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The multi-pack-index allows searching for objects across multiple packs using one object list. The original design gains many of these performance benefits by keeping the packs in the multi-pack-index out of the packed_git list. Unfortunately, this has one major drawback. If

[PATCH 1/2] midx: pass a repository pointer

2019-04-29 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Much of the multi-pack-index code focuses on the multi_pack_index struct, and so we only pass a pointer to the current one. However, we will insert a dependency on the packed_git linked list in a future change, so we will need a repository reference. Inserting these parameter

[PATCH 0/2] Multi-pack-index: Fix "too many file descriptors" bug

2019-04-29 Thread Derrick Stolee via GitGitGadget
Thanks to Jeff H for finding the problem with the multi-pack-index regarding many packs. Specifically: if we open too many packs, the close_one_pack() method cannot find the packs from the multi-pack-index to close. Jeff already fixed the problem explicitly in 'git multi-pack-index verify' which w

[PATCH v3 3/6] commit-graph: create new version parameter

2019-05-01 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee In anticipation of a new commit-graph file format version, create a parameter for the write_commit_graph() and write_commit_graph_reachable() methods to take a version number. When the given version is zero, the implementation selects a default value. Currently, the only val

[PATCH v3 1/6] commit-graph: return with errors during write

2019-05-01 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The write_commit_graph() method uses die() to report failure and exit when confronted with an unexpected condition. This use of die() in a library function is incorrect and is now replaced by error() statements and an int return type. Now that we use 'goto cleanup' to jump t

[PATCH v3 2/6] commit-graph: collapse parameters into flags

2019-05-01 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The write_commit_graph() and write_commit_graph_reachable() methods currently take two boolean parameters: 'append' and 'report_progress'. We will soon expand the possible options to send to these methods, so instead of complicating the parameter list, first simplify it. Col

[PATCH v3 5/6] commit-graph: implement file format version 2

2019-05-01 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The commit-graph file format had some shortcomings which we now correct: 1. The hash algorithm was determined by a single byte, instead of the 4-byte format identifier. 2. There was no way to update the reachability index we used. We currently only support gen

[PATCH v3 0/6] Create commit-graph file format v2

2019-05-01 Thread Derrick Stolee via GitGitGadget
The commit-graph file format has some shortcomings that were discussed on-list: 1. It doesn't use the 4-byte format ID from the_hash_algo. 2. There is no way to change the reachability index from generation numbers to corrected commit date [1]. 3. The unused byte in the

[PATCH v3 4/6] commit-graph: add --version= option

2019-05-01 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Allow the commit-graph builtin to specify the file format version using the '--version=' option. Specify the version exactly in the verification tests as using a different version would change the offsets used in those tests. Signed-off-by: Derrick Stolee --- Documentation

[PATCH v3 6/6] commit-graph: remove Future Work section

2019-05-01 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The commit-graph feature began with a long list of planned benefits, most of which are now complete. The future work section has only a few items left. As for making more algorithms aware of generation numbers, some are only waiting for generation number v2 to ensure the per

[PATCH 17/17] fetch: add fetch.writeCommitGraph config setting

2019-05-08 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Signed-off-by: Derrick Stolee --- builtin/fetch.c | 17 + 1 file changed, 17 insertions(+) diff --git a/builtin/fetch.c b/builtin/fetch.c index b620fd54b4..cf0944bad5 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -23,6 +23,7 @@ #include "packfile.h

[PATCH 14/17] commit-graph: load split commit-graph files

2019-05-08 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee Starting with commit-graph, load commit-graph files in a sequence as follows: commit-graph commit-graph-1 commit-graph-2 ... commit-graph-N This creates N + 1 files in order. Signed-off-by: Derrick Stolee --- commit-graph.c | 39

[PATCH 01/17] commit-graph: fix the_repository reference

2019-05-08 Thread Derrick Stolee via GitGitGadget
From: Derrick Stolee The parse_commit_buffer() method takes a repository pointer, so it should not refer to the_repository anymore. Signed-off-by: Derrick Stolee --- commit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/commit.c b/commit.c index a5333c7ac6..e4d1233226 10
