Re: [PATCH] SQUASH
Am 10.08.19 um 05:03 schrieb Carlo Marcelo Arenas Belón:
> Make using a general context (that is only needed with NED) to depend
> on NED being selected at compile time.

A custom general context is needed to convince PCRE2 to use xmalloc()
instead of malloc(). That is independent of the choice of allocator.
So this change would be a step backwards?

René
Re: [RFC PATCH v5 0/3] grep: almost no more leaks, hopefully no crashes
Am 10.08.19 um 05:05 schrieb Carlo Arenas:
> in macOS (obviously testing without NED) the following is the output
> of (a hacked version) of p7801 for maint (against chromium's
> repository), with René's patch on top

Do you mean p7820? And what did you change? Looking at the results you
removed basic and extended from the list of regex engines, right?

Ugh, cloning https://chromium.googlesource.com/chromium/src.git sends
more than 16GB across the wire. Is that even the right repository? Not
sure if my machine can keep the relevant parts cached while grepping --
I/O times could drown out any difference due to context allocation and
memory allocator choice. Let's see...

> Test                                      HEAD^              HEAD
> ---------------------------------------------------------------------------------
> 7820.1: perl grep 'how.to'                0.51(0.35+1.11)    0.48(0.33+1.16) -5.9%
> 7820.2: perl grep '^how to'               0.47(0.33+1.08)    0.45(0.34+1.11) -4.3%
> 7820.3: perl grep '[how] to'              0.49(0.40+1.11)    0.53(0.41+1.13) +8.2%
> 7820.4: perl grep '(e.t[^ ]*|v.ry) rare'  68.72(68.77+1.14)  72.10(72.15+1.20) +4.9%
> 7820.5: perl grep 'm(ú|u)lt.b(æ|y)te'     0.48(0.35+1.12)    0.50(0.40+1.23) +4.2%
>
> and this is with my squashed[2] change on top of that:
>
> Test                                      HEAD^              HEAD
> ---------------------------------------------------------------------------------
> 7820.1: perl grep 'how.to'                0.48(0.36+1.16)    0.46(0.33+1.09) -4.2%
> 7820.2: perl grep '^how to'               0.45(0.34+1.12)    0.42(0.29+0.99) -6.7%
> 7820.3: perl grep '[how] to'              0.48(0.40+1.13)    0.52(0.43+1.16) +8.3%
> 7820.4: perl grep '(e.t[^ ]*|v.ry) rare'  69.12(69.10+1.07)  69.19(69.19+1.18) +0.1%
> 7820.5: perl grep 'm(ú|u)lt.b(æ|y)te'     0.49(0.38+1.17)    0.46(0.35+1.13) -6.1%
>
> the degenerate case is not something we can't fix anyway, since it is
> likely a locking issue inside PCRE2 (I see at most 1 CPU doing work),
> and the numbers are noisy because of the other problems I mentioned
> before (hardcoded to 8 threads, running in a laptop with low number of
> cores), which is why testing for performance regressions in Windows is
> strongly encouraged, regardless
>
> Carlo
>
> [1] https://public-inbox.org/git/capuespgh1v1zo7smzqwcv4rx9pkvklv84gdsfcpdt7lffqx...@mail.gmail.com/
> [2] https://public-inbox.org/git/20190810030315.7519-1-care...@gmail.com/

So I pointed GIT_PERF_LARGE_REPO to the monster mentioned above, ran the
test once for warmup and here are the results of the second run:

Test                                           origin/master      pcre2-xmalloc            pcre2-xmalloc+nedmalloc
-----------------------------------------------------------------------------------------------------------------
7820.1: basic grep 'how.to'                    1.59(2.93+1.75)    1.60(3.04+1.74) +0.6%    1.64(2.87+1.90) +3.1%
7820.2: extended grep 'how.to'                 1.59(2.98+1.66)    1.55(2.83+1.76) -2.5%    1.67(3.15+1.70) +5.0%
7820.3: perl grep 'how.to'                     1.25(1.21+2.13)    1.25(1.24+2.08) +0.0%    1.29(1.32+2.08) +3.2%
7820.5: basic grep '^how to'                   1.52(2.82+1.66)    1.51(2.68+1.77) -0.7%    1.64(3.07+1.69) +7.9%
7820.6: extended grep '^how to'                1.57(2.84+1.75)    1.51(2.76+1.73) -3.8%    1.61(2.95+1.75) +2.5%
7820.7: perl grep '^how to'                    1.21(1.15+2.10)    1.22(1.26+1.98) +0.8%    1.27(1.22+2.09) +5.0%
7820.9: basic grep '[how] to'                  1.95(4.51+1.68)    1.96(4.48+1.69) +0.5%    2.00(4.66+1.65) +2.6%
7820.10: extended grep '[how] to'              1.96(4.54+1.65)    1.94(4.46+1.70) -1.0%    2.04(4.78+1.65) +4.1%
7820.11: perl grep '[how] to'                  1.29(1.58+1.88)    1.28(1.50+1.94) -0.8%    1.34(1.51+2.06) +3.9%
7820.13: basic grep '\(e.t[^ ]*\|v.ry\) rare'  8.17(13.18+1.50)   8.29(13.36+1.37) +1.5%   8.31(13.33+1.60) +1.7%
7820.14: extended grep '(e.t[^ ]*|v.ry) rare'  8.13(13.03+1.59)   8.14(13.12+1.47) +0.1%   8.30(13.35+1.56) +2.1%
7820.15: perl grep '(e.t[^ ]*|v.ry) rare'      34.96(35.80+1.68)  34.99(35.60+1.91) +0.1%  35.18(35.83+1.90) +0.6%
7820.17: basic grep 'm\(ú\|u\)lt.b\(æ\|y\)te'  1.57(3.03+1.64)    1.53(2.76+1.75) -2.5%    1.60(2.89+1.77) +1.9%
7820.18: extended grep 'm(ú|u)lt.b(æ|y)te'     1.52(2.83+1.69)    1.52(2.89+1.63) +0.0%    1.58(2.80+1.84) +3.9%
7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'        1.20(1.25+2.02)    1.21(1.30+1.96) +0.8%    1.25(1.22+2.11) +4.2%

pcre2-xmalloc is my patch on top of master, +nedmalloc has the warning
fixes I sent earlier and sets USE_NED_MALLOC.
I don't understand why my performance is lower by factor 2.5 than yours
for all perl regex tests except 7820.15 (your 7820.4), where my system
is two times faster. Debian Testing, GCC 9.1.0, i5-9600K, 16GB RAM.

Anyway, nedmalloc is slower across the board, but the impact of my
patch is in the noise. Right?

René
Re: [PATCH] SQUASH
On Sat, Aug 10, 2019 at 12:57 AM René Scharfe wrote:
>
> Am 10.08.19 um 05:03 schrieb Carlo Marcelo Arenas Belón:
> > Make using a general context (that is only needed with NED) to depend
> > on NED being selected at compile time.
>
> A custom general context is needed to convince PCRE2 to use xmalloc()
> instead of malloc(). That is independent of the choice of allocator.
> So this change would be a step backwards?

My bad, you are correct.

Do you mind then if I "adopt" your patch and submit a reroll with it? I
will also add an "equivalent" patch to fix PCRE1 as well, and we can
tackle any performance deficiencies or other issues in a future series.

Carlo
[L10N] Kickoff for Git 2.23.0 round #2
Hi,

Git v2.23.0-rc2 has been released, and it's time to start a new round
of git l10n. This time there are 4 updated messages that need to be
translated since the last update:

    l10n: git.pot: v2.23.0 round 2 (4 new, 6 removed)

    Generate po/git.pot from v2.23.0-rc2 for git v2.23.0 l10n round 2.

    Signed-off-by: Jiang Xin

You can get it from the usual place:

    https://github.com/git-l10n/git-po/

For how to update your XX.po and help to translate Git, please see
"Updating a XX.po file" and other sections in the "po/README" file.

--
Jiang Xin
Re: [RFC PATCH v5 0/3] grep: almost no more leaks, hopefully no crashes
On Sat, Aug 10, 2019 at 12:57 AM René Scharfe wrote:
>
> Am 10.08.19 um 05:05 schrieb Carlo Arenas:
> > in macOS (obviously testing without NED) the following is the output
> > of (a hacked version) of p7801 for maint (against chromium's
> > repository), with René's patch on top
>
> Do you mean p7820? And what did you change? Looking at the results you
> removed basic and extended from the list of regex engines, right?

Correct, that was a weird typo; apologies for the confusion. You were
also right about my changes; ironically, I didn't spell those changes
out to avoid confusion.

> Ugh, cloning https://chromium.googlesource.com/chromium/src.git sends
> more than 16GB across the wire. Is that even the right repository?

I noticed it was mentioned before by people[1] doing performance testing
on grep specifically, and thought that was better than trying to come up
with another one; the Linux kernel wouldn't work in macOS because it
breaks "run", as it fails to checkout in a case-insensitive filesystem.

> Not sure if my machine can keep the relevant parts cached while
> grepping -- I/O times could drown out any difference due to context
> allocation and memory allocator choice. Let's see...

;), try setting /proc/sys/vm/swappiness to 0 or get more RAM

> I don't understand why my performance is lower by factor 2.5 than yours
> for all perl regex tests except 7820.15 (your 7820.4), where my system
> is two times faster. Debian Testing, GCC 9.1.0, i5-9600K, 16GB RAM.

Interesting; did you also see at most 100% of one CPU being used? Yours
seems to be faster than mine, so this might be representative of
single-threaded performance.

> Anyway, nedmalloc is slower across the board, but the impact of my
> patch is in the noise. Right?

Yes, and there is a lot of noise.

Carlo

[1] https://public-inbox.org/git/e72330c6747218545cce1b6b1edfd1e448141a8f.1563570204.git.matheus.bernard...@usp.br/
Re: [PATCH] SQUASH
Hi Carlo,

On Fri, 9 Aug 2019, Carlo Marcelo Arenas Belón wrote:

> diff --git a/grep.c b/grep.c
> index 8255ec956e..233072ed80 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -482,6 +482,7 @@ static void free_pcre1_regexp(struct grep_pat *p)
>  #endif /* !USE_LIBPCRE1 */
>
>  #ifdef USE_LIBPCRE2
> +#ifdef USE_NED_ALLOCATOR

I really, really, really, really, _really_ don't want this to be a
compile-time thing.

Really.

No, really. I mean it.

Ciao,
Dscho

>  static void *pcre2_malloc(PCRE2_SIZE size, MAYBE_UNUSED void *memory_data)
>  {
>  	return xmalloc(size);
> @@ -491,6 +492,7 @@ static void pcre2_free(void *pointer, MAYBE_UNUSED void *memory_data)
>  {
>  	free(pointer);
>  }
> +#endif
>
>  static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt)
>  {
> @@ -505,7 +507,9 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
>
>  	assert(opt->pcre2);
>
> +#ifdef USE_NED_ALLOCATOR
>  	p->pcre2_general_context = pcre2_general_context_create(pcre2_malloc, pcre2_free, NULL);
> +#endif
>  	p->pcre2_compile_context = pcre2_compile_context_create(p->pcre2_general_context);
>
>  	if (opt->ignore_case) {
> --
> 2.23.0.rc2
Non-robust lock files in containers can lead to repo corruption
I tracked down a source of Git corrupting repositories to the lock file
design not being robust when containers / PID namespaces are present.

In my case, the corruption stemmed from premature release of the `git
gc` lock in the gc.pid file. But since the lock file code for that file
is in gc.c, there could be other lock files in Git affected by the same
design limitation as well.

The lock design of gc.pid stores the current hostname and PID of the
locking process in the file. If another process comes along and its
hostname matches the stored hostname, it checks to see if the listed
PID exists. If the PID is missing, it assumes the lock is stale and
releases the lock.

A limitation with this approach is it isn't robust in the presence of
containers / PID namespaces. In containers, it is common for the
hostname to match the container host's hostname, or the hostname will
be a static string. In Kubernetes, all containers within a pod share
the same hostname. Containers (almost always) run in separate PID
namespaces, so PIDs from outside the container aren't visible to the
container itself. This means that if e.g. 2 `git gc` processes are
running with the same hostname in separate containers / PID namespaces,
Git could prematurely release the lock file because it thinks the
"other" PID is dead, and repo corruption could ensue due to the 2
`git gc` processes racing with each other.

The on-disk format of lock files obviously needs to be backwards
compatible with older clients. One backwards compatible solution is to
append something to the hostname to disambiguate containers / PID
namespaces. Mercurial appends the current PID namespace identifier to
the hostname [1] and my experience is that this is sufficient to
mitigate the issue. It is possible more robust solutions are
achievable.

Gregory

[1] https://www.mercurial-scm.org/repo/hg/rev/1f151a33af8e
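Mercurial's mitigation from [1] can be sketched in a few lines. This is an illustrative sketch, not git's code: `lock_identity()` is a hypothetical helper name, and the `/proc/self/ns/pid` probe only works on Linux (elsewhere it falls back to the plain hostname, preserving the old behavior).

```python
import os
import socket


def lock_identity():
    """Build the '<pid> <hostname>' token a gc.pid-style lock could store,
    with the hostname disambiguated by the current PID namespace."""
    host = socket.gethostname()
    try:
        # e.g. 'pid:[4026531836]'; this differs between PID namespaces,
        # so two containers sharing a hostname no longer look identical
        ns = os.readlink('/proc/self/ns/pid')
        host = '%s/%s' % (host, ns)
    except OSError:
        pass  # non-Linux or /proc unavailable: keep the plain hostname
    return '%d %s' % (os.getpid(), host)
```

A stale-lock check comparing these tokens would then only treat a PID as checkable when both hostname and namespace match, which is exactly the disambiguation the report asks for.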
git-svn help for non-standard tags/branches !!
Hello,

I have been using the git-svn tool for SVN to Git migration. I see
issues in the migration of branches/tags located at different levels of
the SVN hierarchy. To some extent I can define the layout in the config
file, but it's difficult in a few cases, like:

https:/svnrepo
    a/b/branches/
    a/b/c/branches/
    a/b/c//d/e/branches/
    a/b/c/d/tags/
    a/tags/

How to manage this? Is there any wildcard or regular expression support
that can be used in the config file?

Thanks
Ibrahim
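As far as I know git-svn has no regular expression support in the layout configuration, but an [svn-remote] section accepts multiple branches/tags lines, each of which may contain a `*` glob (git-svn only supports one set of wildcards per entry). A sketch for a layout like the one above -- the ref names on the right-hand side are made up, and the paths are taken from the question:

```
[svn-remote "svn"]
	url = https://svnrepo
	fetch = trunk:refs/remotes/origin/trunk
	branches = a/b/branches/*:refs/remotes/origin/a-b/*
	branches = a/b/c/branches/*:refs/remotes/origin/a-b-c/*
	branches = a/b/c/d/e/branches/*:refs/remotes/origin/a-b-c-d-e/*
	tags = a/b/c/d/tags/*:refs/remotes/origin/tags/a-b-c-d/*
	tags = a/tags/*:refs/remotes/origin/tags/a/*
```

Each prefix of the hierarchy gets its own entry, mapped to a distinct ref namespace so branches with the same name at different levels don't collide.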
Re: [PATCH] SQUASH
Am 10.08.19 um 10:42 schrieb Carlo Arenas:
> On Sat, Aug 10, 2019 at 12:57 AM René Scharfe wrote:
>>
>> Am 10.08.19 um 05:03 schrieb Carlo Marcelo Arenas Belón:
>>> Make using a general context (that is only needed with NED) to depend
>>> on NED being selected at compile time.
>>
>> A custom general context is needed to convince PCRE2 to use xmalloc()
>> instead of malloc(). That is independent of the choice of allocator.
>> So this change would be a step backwards?
>
> My bad, you are correct.
>
> Do you mind then if I "adopt" your patch and submit a reroll with it,
> will also add an "equivalent" one to fix PCRE1 as well then, and we
> will tackle any performance deficiencies or other issues in a future
> series.

I don't mind, sounds good.

René
Incorrect behavior from git checkout --patch
I am observing git checkout --patch making changes to the wrong lines
of a file. This is with a clean install of git version 2.20.1 on a
debian docker container (image tag 10.0 which is also "latest" as of
this writing).

With a diff that looks like the following:

diff --git a/file.txt b/file.txt
index 868aa22..ea4d786 100644
--- a/file.txt
+++ b/file.txt
@@ -1,35 +1,51 @@
 block_one {
 line 1
+line 1.5
 line 2
+line 2.5
 line 3
 2 4 6 8
 line 4
+line 4.5
 line 5
+line 5.5
 line 6
 2 4 6 8
 line 7
+line 7.5
 line 8
+line 8.5
 line 9
 2 4 6 8
 line 10
+line 10.5
 line 11
+line 11.5
 line 12
 }
 block_two {
 line 1
+line 1.5
 line 2
+line 2.5
 line 3
-2 5 6 9
+2 4 6 8
 line 4
+line 4.5
 line 5
+line 5.5
 line 6
-2 5 6 9
+2 4 6 8
 line 7
+line 7.5
 line 8
+line 8.5
 line 9
-2 5 6 9
+2 4 6 8
 line 10
+line 10.5
 line 11
+line 11.5
 line 12
 }

doing a `git checkout --patch -- ./file.txt`, splitting the diff into
hunks, and discarding all of the hunks that begin with numbers, e.g.

@@ -22,3 +32,3 @@
 line 3
-2 5 6 9
+2 4 6 8
 line 4

the expected state of the file in the working directory is this:

diff --git a/file.txt b/file.txt
index 868aa22..9ab67a1 100644
--- a/file.txt
+++ b/file.txt
@@ -1,35 +1,51 @@
 block_one {
 line 1
+line 1.5
 line 2
+line 2.5
 line 3
 2 4 6 8
 line 4
+line 4.5
 line 5
+line 5.5
 line 6
 2 4 6 8
 line 7
+line 7.5
 line 8
+line 8.5
 line 9
 2 4 6 8
 line 10
+line 10.5
 line 11
+line 11.5
 line 12
 }
 block_two {
 line 1
+line 1.5
 line 2
+line 2.5
 line 3
 2 5 6 9
 line 4
+line 4.5
 line 5
+line 5.5
 line 6
 2 5 6 9
 line 7
+line 7.5
 line 8
+line 8.5
 line 9
 2 5 6 9
 line 10
+line 10.5
 line 11
+line 11.5
 line 12
 }

but instead the actual state of the file is this:

diff --git a/file.txt b/file.txt
index 868aa22..76fe65d 100644
--- a/file.txt
+++ b/file.txt
@@ -1,35 +1,51 @@
 block_one {
 line 1
+line 1.5
 line 2
+line 2.5
 line 3
 2 4 6 8
 line 4
+line 4.5
 line 5
+line 5.5
 line 6
 2 4 6 8
 line 7
+line 7.5
 line 8
+line 8.5
 line 9
-2 4 6 8
+2 5 6 9
 line 10
+line 10.5
 line 11
+line 11.5
 line 12
 }
 block_two {
 line 1
+line 1.5
 line 2
+line 2.5
 line 3
 2 5 6 9
 line 4
+line 4.5
 line 5
+line 5.5
 line 6
 2 5 6 9
 line 7
+line 7.5
 line 8
+line 8.5
 line 9
-2 5 6 9
+2 4 6 8
 line 10
+line 10.5
 line 11
+line 11.5
 line 12
 }

See the incorrect changes between "line 9" and "line 10" in both
blocks.
[GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases
They were disabled at 53b8d93 ("grep: disable threading in non-worktree
case", 12-12-2011), due to observable performance drops. But now that
zlib inflation can be performed in parallel, for some of git-grep's
options, we can regain the speedup.

Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*'
("Regex 2") at chromium's repository[1] I got:

 Threads | Regex 1  | Regex 2
---------|----------|----------
    1    | 17.3557s | 20.8410s
    2    |  9.7170s | 11.2415s
    8    |  6.1723s |  6.9378s

These are all means of 30 executions after 2 warmup runs. All tests
were executed on an i7-7700HQ with 16GB of RAM and SSD. But to make
sure the optimization also performs well on HDD, the tests were
repeated on an AMD Turion 64 X2 TL-62 (dual-core) with 4GB of RAM and
HDD (SATA-150, 5400 rpm):

 Threads | Regex 1  | Regex 2
---------|----------|----------
    1    | 40.3347s | 47.6173s
    2    | 27.6547s | 35.1797s

Unfortunately, textconv and submodules' operations remain
thread-unsafe, needing locks to be safely executed when threaded.
Because of that, it's not currently worth it to grep in parallel with
them. So, when --textconv or --recurse-submodules are given for a
non-worktree case, threads are kept disabled. In order to clarify this
behavior, let's also add a "NOTES" section to
Documentation/git-grep.txt explaining the thread usage details.

[1]: chromium's repo at commit 03ae96f ("Add filters testing at
DSF=2", 04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares
---
 Documentation/git-grep.txt | 12 ++++++++++++
 builtin/grep.c             |  3 ++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 2d27969057..9686875fbc 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -330,6 +330,18 @@ EXAMPLES
 `git grep solution -- :^Documentation`::
 	Looks for `solution`, excluding files in `Documentation`.

+NOTES
+-----
+
+The --threads option (and grep.threads configuration) will be ignored when
+--open-files-in-pager is used, forcing a single-threaded execution.
+
+When grepping the index file (with --cached or giving tree objects), the
+following options will also suppress thread creation:
+
+  --recurse-submodules
+  --textconv
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/grep.c b/builtin/grep.c
index fa5139..e5a9da471a 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1073,7 +1073,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	pathspec.recursive = 1;
 	pathspec.recurse_submodules = !!recurse_submodules;

-	if (list.nr || cached || show_in_pager) {
+	if (show_in_pager ||
+	    ((list.nr || cached) && (recurse_submodules || opt.allow_textconv))) {
 		if (num_threads > 1)
 			warning(_("invalid option combination, ignoring --threads"));
 		num_threads = 1;
--
2.22.0
[GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended()
Allow read_object_file_extended() to be called by multiple threads
protecting it with a lock. The lock usage can be toggled with
enable_obj_read_lock() and disable_obj_read_lock().

Probably there are many spots in read_object_file_extended()'s call
chain that could be executed unlocked (and thus, in parallel). But, for
now, we are only interested in allowing parallel access to zlib
inflation. This is one of the sections where object reading spends most
of the time and it's already thread-safe. So, to take advantage of
that, the lock is released when entering it and re-acquired right
after. We may refine the lock to also exploit other possible parallel
spots in the future, but threaded zlib inflation should already give
great speedups.

Note that add_delta_base_cache() was also modified to skip adding
already present entries to the cache. This wasn't possible before, but
now it is since phase I and phase III of unpack_entry() may execute
concurrently.

Signed-off-by: Matheus Tavares
---
 object-store.h |  4 ++++
 packfile.c     |  7 +++++++
 sha1-file.c    | 61 +++++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 67 insertions(+), 5 deletions(-)

diff --git a/object-store.h b/object-store.h
index 7f7b3cdd80..cfc9484995 100644
--- a/object-store.h
+++ b/object-store.h
@@ -159,6 +159,10 @@ const char *loose_object_path(struct repository *r, struct strbuf *buf,
 void *map_loose_object(struct repository *r, const struct object_id *oid,
 		       unsigned long *size);

+void enable_obj_read_lock(void);
+void disable_obj_read_lock(void);
+void obj_read_lock(void);
+void obj_read_unlock(void);
 void *read_object_file_extended(struct repository *r,
 				const struct object_id *oid,
 				enum object_type *type,
diff --git a/packfile.c b/packfile.c
index fc43a6c52c..de93dc50e2 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1115,7 +1115,9 @@ unsigned long get_size_from_delta(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
 		 stream.total_out < sizeof(delta_head));
@@ -1468,6 +1470,9 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
 	struct list_head *lru, *tmp;

+	if (get_delta_base_cache_entry(p, base_offset))
+		return;
+
 	delta_base_cached += base_size;

 	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
@@ -1597,7 +1602,9 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		if (!stream.avail_out)
 			break; /* the payload is larger than it should be */
 		curpos += stream.next_in - in;
diff --git a/sha1-file.c b/sha1-file.c
index 84fd02f107..f5ff51aedb 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1560,16 +1560,54 @@ int pretend_object_file(void *buf, unsigned long len, enum object_type type,
 	return 0;
 }

+static pthread_mutex_t obj_read_mutex;
+static int obj_read_use_lock = 0;
+
+/*
+ * Enabling the object read lock allows multiple threads to safely call the
+ * following functions in parallel: repo_read_object_file(), read_object_file()
+ * and read_object_file_extended().
+ */
+void enable_obj_read_lock(void)
+{
+	if (obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 1;
+	pthread_mutex_init(&obj_read_mutex, NULL);
+}
+
+void disable_obj_read_lock(void)
+{
+	if (!obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 0;
+	pthread_mutex_destroy(&obj_read_mutex);
+}
+
+void obj_read_lock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_lock(&obj_read_mutex);
+}
+
+void obj_read_unlock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_unlock(&obj_read_mutex);
+}
+
 /*
  * This function dies on corrupt objects; the callers who want to
  * deal with them should arrange to call read_object() and give error
  * messages themselves.
  */
-void *read_object_file_extended(struct repository *r,
-				const struct object_id *oid,
-				enum object_type *type,
-				unsigned long *size,
-				int lookup_replace)
+static void *do_read_object_file_extended(struct repository *r,
+					  const struct object_id *oid,
+
[GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation
This series focuses on allowing parallel access to zlib inflation and
using that to perform a faster git-grep in the non-worktree case.

Threads were disabled for this case at 53b8d93 ("grep: disable
threading in non-worktree case", 12-12-2011), due to performance drops.
However, by allowing threads to perform inflation in parallel, we can
regain the speedup. This is a good hotspot for parallelism as some test
cases[1] showed that it can account for up to 48% of execution time.
And besides that, inflation tasks are already independent of each
other.

As a result, grepping 'abcd[02]' ("Regex 1") and '(static|extern)
(int|double) \*' ("Regex 2") at chromium's repository[2], I got (means
of 30 executions):

 Threads | Regex 1  | Regex 2
---------|----------|----------
    1    | 17.3557s | 20.8410s
    2    |  9.7170s | 11.2415s
    8    |  6.1723s |  6.9378s

As a reference, just enabling threads in the non-worktree case, without
parallel inflation, I got:

 Threads | Regex 1  | Regex 2
---------|----------|----------
    1    | 17.1359s | 20.8306s
    2    | 14.5036s | 15.4172s
    8    | 13.6304s | 13.8659s

For now, the optimization is not supported when --textconv or
--recurse-submodules are used, but I hope to send another patchset for
that still during GSoC. We may also try to allow even more parallelism,
refining the added 'obj_read_lock'.

[1]: https://matheustavares.gitlab.io/posts/week-6-working-at-zlib-inflation#multithreading-zlib-inflation
[2]: chromium's repo at commit 03ae96f ("Add filters testing at
DSF=2", 04-06-2019), after a 'git gc' execution.

travis build: https://travis-ci.org/matheustavares/git/builds/570255029

Matheus Tavares (4):
  object-store: add lock to read_object_file_extended()
  grep: allow locks to be enabled individually
  grep: disable grep_read_mutex when possible
  grep: re-enable threads in some non-worktree cases

 Documentation/git-grep.txt | 12 ++++++++++++
 builtin/grep.c             | 22 +++++++++++++-------
 grep.c                     |  4 ++--
 grep.h                     |  8 +++++---
 object-store.h             |  4 ++++
 packfile.c                 |  7 +++++++
 sha1-file.c                | 61 ++++++++++++++++++++++++++++++++------
 7 files changed, 105 insertions(+), 13 deletions(-)

--
2.22.0
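The core idea of the series -- hold a single object-read lock, but drop it around inflation so decompression of independent objects runs in parallel -- can be sketched outside of git. Python stands in for the C code here; the names mirror the patches ('obj_read_mutex', the unlock/inflate/lock dance) but nothing below is git's actual implementation:

```python
import threading
import zlib

obj_read_mutex = threading.Lock()


def read_object(compressed):
    # Analogous to unpack_compressed_entry() after the patch: shared
    # bookkeeping stays serialized, but inflation of each object is
    # independent and CPU-bound, so the lock is dropped around it.
    with obj_read_mutex:
        pass  # ... locate the object, touch shared caches ...
    data = zlib.decompress(compressed)  # runs WITHOUT the lock held
    with obj_read_mutex:
        pass  # ... update shared state with the result ...
    return data


def grep_worker(blobs, pattern, hits):
    # Each grep thread inflates its own objects; only the small
    # critical sections above are serialized across threads.
    for blob in blobs:
        if pattern in read_object(blob):
            hits.append(blob)
```

With inflation inside the lock, the threads would serialize on exactly the work that dominates the profile; moving it outside is what turns the 17s/1-thread case into the 6s/8-thread case in the tables above.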
[GSoC][PATCH 2/4] grep: allow locks to be enabled individually
git-grep has some internal locks to protect thread-unsafe operations
when running with threads. The usage of these locks can be toggled
through the variable 'grep_use_locks'. However, it's not currently
possible to enable each lock individually. And since object reading has
its own locks now, it is desirable to disable the respective grep lock
(and only that) in cases where we can do so. To do that, transform
'grep_use_locks' from a binary variable to a bitmask, which controls
each lock individually. The actual disabling of grep_read_lock, when
possible, will be done in the following patch.

Signed-off-by: Matheus Tavares
---
 builtin/grep.c | 2 +-
 grep.c         | 4 ++--
 grep.h         | 8 ++++++--
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 560051784e..a871bad8ad 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -205,7 +205,7 @@ static void start_threads(struct grep_opt *opt)
 	pthread_cond_init(&cond_add, NULL);
 	pthread_cond_init(&cond_write, NULL);
 	pthread_cond_init(&cond_result, NULL);
-	grep_use_locks = 1;
+	grep_use_locks = GREP_USE_ALL_LOCKS;

 	for (i = 0; i < ARRAY_SIZE(todo); i++) {
 		strbuf_init(&todo[i].out, 0);
diff --git a/grep.c b/grep.c
index cd952ef5d3..3aca0db435 100644
--- a/grep.c
+++ b/grep.c
@@ -1523,13 +1523,13 @@ pthread_mutex_t grep_attr_mutex;

 static inline void grep_attr_lock(void)
 {
-	if (grep_use_locks)
+	if (grep_use_locks & GREP_USE_ATTR_LOCK)
 		pthread_mutex_lock(&grep_attr_mutex);
 }

 static inline void grep_attr_unlock(void)
 {
-	if (grep_use_locks)
+	if (grep_use_locks & GREP_USE_ATTR_LOCK)
 		pthread_mutex_unlock(&grep_attr_mutex);
 }
diff --git a/grep.h b/grep.h
index 1875880f37..02bffacfa2 100644
--- a/grep.h
+++ b/grep.h
@@ -229,6 +229,10 @@ int grep_source(struct grep_opt *opt, struct grep_source *gs);
 struct grep_opt *grep_opt_dup(const struct grep_opt *opt);
 int grep_threads_ok(const struct grep_opt *opt);

+#define GREP_USE_READ_LOCK (1 << 0)
+#define GREP_USE_ATTR_LOCK (1 << 1)
+#define GREP_USE_ALL_LOCKS (~0)
+
 /*
  * Mutex used around access to the attributes machinery if
  * opt->use_threads. Must be initialized/destroyed by callers!
@@ -239,13 +243,13 @@ extern pthread_mutex_t grep_read_mutex;

 static inline void grep_read_lock(void)
 {
-	if (grep_use_locks)
+	if (grep_use_locks & GREP_USE_READ_LOCK)
 		pthread_mutex_lock(&grep_read_mutex);
 }

 static inline void grep_read_unlock(void)
 {
-	if (grep_use_locks)
+	if (grep_use_locks & GREP_USE_READ_LOCK)
 		pthread_mutex_unlock(&grep_read_mutex);
 }
--
2.22.0
[GSoC][PATCH 3/4] grep: disable grep_read_mutex when possible
git-grep uses 'grep_read_mutex' to protect some object reading
operations. But these have their own internal lock now, which ensures
better performance (with more parallel regions). So, disable the former
when it's possible to use the latter, with enable_obj_read_lock().

Signed-off-by: Matheus Tavares
---
 builtin/grep.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index a871bad8ad..fa5139 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -205,7 +205,17 @@ static void start_threads(struct grep_opt *opt)
 	pthread_cond_init(&cond_add, NULL);
 	pthread_cond_init(&cond_write, NULL);
 	pthread_cond_init(&cond_result, NULL);
-	grep_use_locks = GREP_USE_ALL_LOCKS;
+	if (recurse_submodules || opt->allow_textconv) {
+		/*
+		 * textconv and submodules' operations are not thread-safe yet
+		 * so we must use grep_read_lock when grepping multithreaded
+		 * with these options.
+		 */
+		grep_use_locks = GREP_USE_ALL_LOCKS;
+	} else {
+		grep_use_locks = GREP_USE_ATTR_LOCK;
+		enable_obj_read_lock();
+	}

 	for (i = 0; i < ARRAY_SIZE(todo); i++) {
 		strbuf_init(&todo[i].out, 0);
@@ -227,7 +237,7 @@ static void start_threads(struct grep_opt *opt)
 	}
 }

-static int wait_all(void)
+static int wait_all(struct grep_opt *opt)
 {
 	int hit = 0;
 	int i;
@@ -263,6 +273,9 @@ static int wait_all(void)
 	pthread_cond_destroy(&cond_write);
 	pthread_cond_destroy(&cond_result);
 	grep_use_locks = 0;
+	if (!recurse_submodules && !opt->allow_textconv) {
+		disable_obj_read_lock();
+	}

 	return hit;
 }
@@ -1140,7 +1153,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	}

 	if (num_threads > 1)
-		hit |= wait_all();
+		hit |= wait_all(&opt);
 	if (hit && show_in_pager)
 		run_pager(&opt, prefix);
 	clear_pathspec(&pathspec);
--
2.22.0
Re: [RFC PATCH v5 0/3] grep: almost no more leaks, hopefully no crashes
Am 10.08.19 um 14:40 schrieb Carlo Arenas:
> On Sat, Aug 10, 2019 at 12:57 AM René Scharfe wrote:
>> I don't understand why my performance is lower by factor 2.5 than yours
>> for all perl regex tests except 7820.15 (your 7820.4), where my system
>> is two times faster. Debian Testing, GCC 9.1.0, i5-9600K, 16GB RAM.
>
> interesting; did you also see at most 100% of one CPU being used?

For 7820.15 yes -- 100% of a single core, the rest idling. The other
tests had only brief sequential phases, if at all, and used all cores.

René
[RFC PATCH] http: use xmalloc with cURL
4a30976e28 ([PATCH] support older versions of libcurl, 2005-07-28)
added support for conditionally initializing cURL, but when f0ed8226c9
(Add custom memory allocator to MinGW and MacOS builds, 2009-05-31)
came along, that support wasn't updated to make sure cURL will use the
same allocator as git if compiled with USE_NED_ALLOCATOR=YesPlease
(usually done in Windows).

Tested in macOS 10.14.6 with the system-provided cURL (7.54.0) and the
latest (7.65.3). While the API used should be added starting around
7.12.0 (mid 2014), I couldn't get a release that old to build, and
therefore the current mismatch is unlikely to be found while testing
because of that.

cURL is very strict about its allocator being thread safe, so that
might be an issue to look for.

Signed-off-by: Carlo Marcelo Arenas Belón
---
 http.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/http.h b/http.h
index b429f1cf04..59ec4cbd30 100644
--- a/http.h
+++ b/http.h
@@ -27,6 +27,9 @@
 #endif
 #if LIBCURL_VERSION_NUM < 0x070800
 #define curl_global_init(a) do { /* nothing */ } while (0)
+#else
+#define curl_global_init(a) curl_global_init_mem(a, xmalloc, free, \
+						xrealloc, xstrdup, xcalloc)
 #endif

 #if (LIBCURL_VERSION_NUM < 0x070c04) || (LIBCURL_VERSION_NUM == 0x071000)
--
2.23.0.rc2
Re: [RFC PATCH] http: use xmalloc with cURL
On Sat, 10 Aug 2019, Carlo Marcelo Arenas Belón wrote:

> tested in macOS 10.14.6 with the system provided cURL (7.54.0)
> and latest (7.65.3) and while the API used should be added starting
> around 7.12.0 (mid 2014).

Let me just gently point out that 7.12.0 was released mid *2004*, see
https://curl.haxx.se/docs/releases.html

--
 / daniel.haxx.se
Re: [RFC PATCH] http: use xmalloc with cURL
On Sat, Aug 10, 2019 at 3:17 PM Daniel Stenberg wrote:
>
> On Sat, 10 Aug 2019, Carlo Marcelo Arenas Belón wrote:
>
> > tested in macOS 10.14.6 with the system provided cURL (7.54.0)
> > and latest (7.65.3) and while the API used should be added starting
> > around 7.12.0 (mid 2014).
>
> Let me just gently point out that 7.12.0 was released mid *2004*, see
> https://curl.haxx.se/docs/releases.html

Indeed; just a typo, but relevant to get right for the discussion.
Thanks for the gentle clarification.

Carlo
How to determine when to stop receiving pack content
Hi,

I am trying to write an implementation of git clone over ssh and am a
little confused about how to determine when a server response has
ended. Specifically, after a client sends its requested 'want', the
server sends the pack content over. However, how does the client know
when to stop reading data? If I run a simple read() of the file
descriptor:

A. If I use blocking reads, the client will wait until new data is
   available, potentially forever.
B. If I use non-blocking reads, the client might stop reading too
   early, when in reality new data is still in transit.

I do not see a mechanism to specify the size or to indicate the end of
the pack content. Am I missing something?

Thanks
---
Farhan Khan
PGP Fingerprint: 1312 89CE 663E 1EB2 179C 1C83 C41D 2281 F8DA C0DE
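For reference, a version-2 packfile is self-delimiting: it starts with a 12-byte header (the bytes "PACK", a 4-byte version, and a 4-byte network-order object count) and ends with a 20-byte SHA-1 over everything before the trailer, so a reader knows both how many objects to inflate and where the stream stops (and when side-band is negotiated, the pkt-line framing additionally delimits the data with a flush packet). A minimal sketch of the header/trailer check -- real clients parse and hash incrementally rather than buffering the whole pack:

```python
import hashlib
import struct


def parse_pack_header(buf):
    """Return the object count announced by a packfile after verifying
    its framing: 'PACK' signature, version 2 or 3, and a trailing
    SHA-1 of all preceding bytes."""
    signature, version, nr_objects = struct.unpack('>4sLL', buf[:12])
    if signature != b'PACK' or version not in (2, 3):
        raise ValueError('not a packfile')
    body, trailer = buf[:-20], buf[-20:]
    if hashlib.sha1(body).digest() != trailer:
        raise ValueError('pack checksum mismatch')
    return nr_objects
```

So the answer to "when do I stop reading?" is: inflate exactly the announced number of objects, then read the 20-byte trailer and verify it; nothing follows the trailer in the pack stream.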