git gui blame: Possible to pass more options in?
Hi all, I tried to run git gui blame -C somefile but I got the following error message: error: Unknown switch `C' usage: git cat-file (-t [What exactly "git cat-file -C" prints out for me] Is there a way to pass in additional options to git gui blame for now? Thanks! Hong
Re: [PATCH v2] transport-helper: enforce atomic in push_refs_with_push
Junio, On Thu, Jul 18, 2019 at 05:22:34PM +0200, SZEDER Gábor wrote: > Subject: [PATCH] travis-ci: build with GCC 4.8 as well This patch conflicts with topic 'js/trace2-json-schema', and the current conflict resolution in 'pu' is not quite correct. > diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh > index cdd2913440..ff0ef7f08e 100755 > --- a/ci/run-build-and-tests.sh > +++ b/ci/run-build-and-tests.sh > @@ -11,9 +11,9 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw > "$cache_dir/.prove")";; > esac > > make > -make test > -if test "$jobname" = "linux-gcc" > -then > +case "$jobname" in > +linux-gcc) > + make test This 'make test' here is important, but the confict resolution accidentally removed it. > export GIT_TEST_SPLIT_INDEX=yes > export GIT_TEST_FULL_IN_PACK_ARRAY=true > export GIT_TEST_OE_SIZE=10 > @@ -21,7 +21,16 @@ then > export GIT_TEST_COMMIT_GRAPH=1 > export GIT_TEST_MULTI_PACK_INDEX=1 > make test > -fi > + ;; > +linux-gcc-4.8) > + # Don't run the tests; we only care about whether Git can be > + # built with GCC 4.8, as it errors out on some undesired (C99) > + # constructs that newer compilers seem to quietly accept. > + ;; > +*) > + make test > + ;; > +esac > > check_unignored_build_artifacts > > -- > 2.22.0.810.g50207c7d84 > >
Re: [PATCH v2] transport-helper: enforce atomic in push_refs_with_push
SZEDER Gábor writes: > Junio, > > On Thu, Jul 18, 2019 at 05:22:34PM +0200, SZEDER Gábor wrote: >> Subject: [PATCH] travis-ci: build with GCC 4.8 as well > > This patch conflicts with topic 'js/trace2-json-schema', and the > current conflict resolution in 'pu' is not quite correct. Thanks. "git diff ...MERGE_HEAD" during a merge of js/trace2-json-schema gives me this patch: diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index cdd2913440..ec38bf379a 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -14,6 +14,8 @@ make make test if test "$jobname" = "linux-gcc" then + make -C t/trace_schema_validator + export GIT_TRACE2_EVENT=$(mktemp) export GIT_TEST_SPLIT_INDEX=yes export GIT_TEST_FULL_IN_PACK_ARRAY=true export GIT_TEST_OE_SIZE=10 @@ -21,6 +23,10 @@ then export GIT_TEST_COMMIT_GRAPH=1 export GIT_TEST_MULTI_PACK_INDEX=1 make test + t/trace_schema_validator/trace_schema_validator \ + --trace2_event_file=${GIT_TRACE2_EVENT} \ + --schema_file=t/trace_schema_validator/strict_schema.json \ + --progress=1 fi check_unignored_build_artifacts i.e. they want to run an extra make in that validator directory, export another environment variable, and then run the validator *after* running the normal "make test", in linux-gcc job. >> diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh >> index cdd2913440..ff0ef7f08e 100755 >> --- a/ci/run-build-and-tests.sh >> +++ b/ci/run-build-and-tests.sh >> @@ -11,9 +11,9 @@ windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw >> "$cache_dir/.prove")";; >> esac >> >> make >> -make test >> -if test "$jobname" = "linux-gcc" >> -then >> +case "$jobname" in >> +linux-gcc) >> +make test > > This 'make test' here is important, but the confict resolution > accidentally removed it. Right. Thanks for spotting. I can see in "git diff pu~2 pu~1" that indeed the first 'make test' that we want to run without any of these environment variables is lost in the merge. We want to run two tests, with or without these environment variables, and the validator wants to piggyback on the second one. Will fix in the meantime, but I was expecting that this "validator in CI" business to be redone differently, perhaps with a different validator implementation and either in a separate job or just part of an existing job but trace enabled only for some subset of the tests and/or only for new tests specifically written for trace coverage, so after that happens this may turn into a moot point. The change the merge brings in to the file now reads like this. Thanks, again. diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index ff0ef7f08e..35ff4d3038 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -14,6 +14,9 @@ make case "$jobname" in linux-gcc) make test + + make -C t/trace_schema_validator + export GIT_TRACE2_EVENT=$(mktemp) export GIT_TEST_SPLIT_INDEX=yes export GIT_TEST_FULL_IN_PACK_ARRAY=true export GIT_TEST_OE_SIZE=10 @@ -21,6 +24,11 @@ linux-gcc) export GIT_TEST_COMMIT_GRAPH=1 export GIT_TEST_MULTI_PACK_INDEX=1 make test + + t/trace_schema_validator/trace_schema_validator \ + --trace2_event_file=${GIT_TRACE2_EVENT} \ + --schema_file=t/trace_schema_validator/strict_schema.json \ + --progress=1 ;; linux-gcc-4.8) # Don't run the tests; we only care about whether Git can be
Push force-with-lease with multi-URL remote
Hi folks, When a single remote has multiple push URLs, Git’s force-with-lease logic appears to be: For each URL: 1. Read refs/heads/mybranch (call this commit X) 2. Read refs/remotes/myremote/mybranch (call this commit Y) 3. Send to the URL an atomic compare-and-swap, replacing Y with X. 4. If step 3 succeeded, change refs/remotes/myremote/mybranch to X. This means that, assuming both URLs start out identical, the second URL will always fail because refs/remots/myremote/mybranch has been updated from Y to X, and therefore the second compare-and-swap fails. I can’t imagine any situation in which this behaviour is actually useful. This is what I would expect: 1. Read refs/heads/mybranch (call this commit X) 2. Read refs/remotes/myremote/mybranch (call this commit Y) 3. For each URL: 3a. Send to the URL an atomic compare-and-swap, replacing Y with X. 4. If any (or maybe all) of the CAS operations in 3a succeeded, change refs/remotes/myremote/mybranch to X. Thoughts? Does anyone have a use case for the existing behaviour? I have attached a shell script which constructs some repos and demonstrates the situation. Thanks! -- Christopher Head test Description: Binary data
Re: Push force-with-lease with multi-URL remote
Christopher Head writes: > Hi folks, > When a single remote has multiple push URLs, Git’s force-with-lease > logic appears to be: > > For each URL: > 1. Read refs/heads/mybranch (call this commit X) > 2. Read refs/remotes/myremote/mybranch (call this commit Y) > 3 > from Y to X, and therefore the second compare-and-swap fails. I can’t > imagine any situation in which this behaviour is actually useful. Quite honestly, the true culprit of the above story is that you are letting multiple logically different remote repositories [*1*] use the same single remote-tracking refes/remotes/myremote/ hierarchy. If your previous "git push myremote" (with or without lease does not matter, as this discussion is to illustrate that your setup is fundamentally wrong) updated X but for some reason failed to update Y (perhaps the network to Y was unreachable back then), and refs/remotes/myremote/mybrach got updated to reflect the update to X, what happens to your next "git push myremote" (or more specifically, "git push Y")? The repository on your local side thinks that the other party has already took the previous push but in reality that is the state of X, and Y hasn't seen that previous push.
Re: Push force-with-lease with multi-URL remote
On July 27, 2019 10:46:43 AM PDT, Junio C Hamano wrote: >Christopher Head writes: > >Quite honestly, the true culprit of the above story is that you are >letting multiple logically different remote repositories [*1*] use >the same single remote-tracking refes/remotes/myremote/ hierarchy. They weren’t supposed to be logically different (if I understand what you mean by that phrase). My intention was for the different URLs to be mirrors of each other, and multiple push URLs seemed to be the easiest way to update all the mirrors without having to mess around with making them trust each other and sending notifications and such. If not this, then what are multiple push URLs on a single remote meant for? >If your previous "git push myremote" (with or without lease does not >matter, as this discussion is to illustrate that your setup is >fundamentally wrong) updated X but for some reason failed to update >Y (perhaps the network to Y was unreachable back then), and >refs/remotes/myremote/mybrach got updated to reflect the update to >X, what happens to your next "git push myremote" (or more >specifically, "git push Y")? The repository on your local side >thinks that the other party has already took the previous push but >in reality that is the state of X, and Y hasn't seen that previous >push. Of course it wouldn’t be perfect even with my proposed behaviour. It just seemed more useful than the existing behaviour, which will essentially *never* do anything useful as far as I can tell. -- Christopher Head
Re: What's cooking in git.git (Jul 2019, #06; Thu, 25)
Hi Junio On Thu, 25 Jul 2019 17:19:23 -0700 Junio C Hamano wrote: > > [...] > [New Topics] > > * bb/grep-pcre2-bug-message-fix (2019-07-23) 1 commit > (merged to 'next' on 2019-07-23 at 8bd5a68618) > + grep: print the pcre2_jit_on value > > BUG() message fix. > > The codepath may want to just simply be removed, though. > > > * ra/rebase-i-more-options (2019-07-23) 4 commits > - SQUASH??? There are only 3 commits in this "series". > - rebase -i: support --committer-date-is-author-date > - sequencer: add NULL checks under read_author_script > - rebase -i: add --ignore-whitespace flag The correct order should be: - rebase -i: add --ignore-whitespace flag - sequencer: add NULL checks under read_author_script - rebase -i: support --committer-date-is-author-date I'll soon send another revision and while on it, let's merge these topics into one. Should I also rebase them on the tip of git/git's master? > [...] Best Rohit
[PATCH 1/3] grep: make pcre1_tables version agnostic
6d4b5747f0 ("grep: change internal *pcre* variable & function names to be *pcre1*", 2017-05-25), renamed most variables to be PCRE1 specific to give space to similarly named variables for PCRE2, but in this case the change wasn't needed as the types were compatible enough (const unsigned char* vs const uint8_t*) to be shared. Revert that change, as 94da9193a6 ("grep: add support for PCRE v2", 2017-06-01) failed to create an equivalent PCRE2 version. Signed-off-by: Carlo Marcelo Arenas Belón --- grep.c | 6 +++--- grep.h | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/grep.c b/grep.c index f7c3a5803e..cc65f7a987 100644 --- a/grep.c +++ b/grep.c @@ -389,14 +389,14 @@ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt) if (opt->ignore_case) { if (has_non_ascii(p->pattern)) - p->pcre1_tables = pcre_maketables(); + p->pcre_tables = pcre_maketables(); options |= PCRE_CASELESS; } if (is_utf8_locale() && has_non_ascii(p->pattern)) options |= PCRE_UTF8; p->pcre1_regexp = pcre_compile(p->pattern, options, &error, &erroffset, - p->pcre1_tables); + p->pcre_tables); if (!p->pcre1_regexp) compile_regexp_failed(p, error); @@ -462,7 +462,7 @@ static void free_pcre1_regexp(struct grep_pat *p) { pcre_free(p->pcre1_extra_info); } - pcre_free((void *)p->pcre1_tables); + pcre_free((void *)p->pcre_tables); } #else /* !USE_LIBPCRE1 */ static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt *opt) diff --git a/grep.h b/grep.h index 1875880f37..d34f66b384 100644 --- a/grep.h +++ b/grep.h @@ -89,7 +89,7 @@ struct grep_pat { pcre *pcre1_regexp; pcre_extra *pcre1_extra_info; pcre_jit_stack *pcre1_jit_stack; - const unsigned char *pcre1_tables; + const unsigned char *pcre_tables; int pcre1_jit_on; pcre2_code *pcre2_pattern; pcre2_match_data *pcre2_match_data; -- 2.22.0
[PATCH 2/3] grep: use pcre_tables also for PCRE2
94da9193a6 ("grep: add support for PCRE v2", 2017-06-01) added a local variable to keep track of the chartables (used when locale is not UTF-8 but non-ASCII characters are needed) Remove that local variable in favor of the shared one within the grep structure and that can be shared by all functions. Signed-off-by: Carlo Marcelo Arenas Belón --- grep.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/grep.c b/grep.c index cc65f7a987..d04635fad4 100644 --- a/grep.c +++ b/grep.c @@ -488,7 +488,6 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt PCRE2_UCHAR errbuf[256]; PCRE2_SIZE erroffset; int options = PCRE2_MULTILINE; - const uint8_t *character_tables = NULL; int jitret; int patinforet; size_t jitsizearg; @@ -499,9 +498,9 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt if (opt->ignore_case) { if (has_non_ascii(p->pattern)) { - character_tables = pcre2_maketables(NULL); + p->pcre_tables = pcre2_maketables(NULL); p->pcre2_compile_context = pcre2_compile_context_create(NULL); - pcre2_set_character_tables(p->pcre2_compile_context, character_tables); + pcre2_set_character_tables(p->pcre2_compile_context, p->pcre_tables); } options |= PCRE2_CASELESS; } -- 2.22.0
[PATCH 3/3] grep: plug leak of pcre chartables in PCRE2
Just as it is done with PCRE1, make sure that the allocated chartables get free at cleanup time. This assumes no global context is used (NULL passed when created the tables), but will likely be updated in tandem if that ever changes. Signed-off-by: Carlo Marcelo Arenas Belón --- grep.c | 1 + 1 file changed, 1 insertion(+) diff --git a/grep.c b/grep.c index d04635fad4..d9768c5f05 100644 --- a/grep.c +++ b/grep.c @@ -604,6 +604,7 @@ static void free_pcre2_pattern(struct grep_pat *p) pcre2_match_data_free(p->pcre2_match_data); pcre2_jit_stack_free(p->pcre2_jit_stack); pcre2_match_context_free(p->pcre2_match_context); + free((void *)p->pcre_tables); } #else /* !USE_LIBPCRE2 */ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt) -- 2.22.0
[PATCH 0/3] grep: memory leak in PCRE2
This series fixes a small memory leak that was introduced when PCRE2 support was added, and that should be visible with t7813 and valgrind Applies cleanly to maint and master but will conflict slightly with ab/no-kwset (currently in next) and most likely other changes introduced about this same codebase (ex: ab/pcre-jit-fixes) but hopefully the spreading on short simple commits helps with reviewing. Carlo Marcelo Arenas Belón (3): grep: make pcre1_tables version agnostic grep: use pcre_tables also for PCRE2 grep: plug leak of pcre chartables in PCRE2 grep.c | 12 ++-- grep.h | 2 +- 2 files changed, 7 insertions(+), 7 deletions(-) -- 2.22.0
Re: What's cooking in git.git (Jul 2019, #06; Thu, 25)
Hi Rohit, Let me attempt to answer on Junio's behalf... On Sat, Jul 27, 2019 at 12:48 PM Rohit Ashiwal wrote: > > Hi Junio > > On Thu, 25 Jul 2019 17:19:23 -0700 Junio C Hamano wrote: > > > > * ra/rebase-i-more-options (2019-07-23) 4 commits > > - SQUASH??? > > There are only 3 commits in this "series". There are four, including Junio's commit he had to add in order to make the series merge with pu (a rename of your t3431 to the unoccupied t3433 slot). He labelled that commit "SQUASH???" and it's still quoted above. However, in general, when you submit the next round of your series, you should certainly include his fixups from his squash (or alternative fixes) inside your commits in order to get rid of the need for the squash commit. > > - rebase -i: support --committer-date-is-author-date > > - sequencer: add NULL checks under read_author_script > > - rebase -i: add --ignore-whitespace flag > > The correct order should be: >- rebase -i: add --ignore-whitespace flag >- sequencer: add NULL checks under read_author_script >- rebase -i: support --committer-date-is-author-date Are you thinking in order of application, or order that would be shown by `git log --oneline`? Junio includes the latter in his report. > I'll soon send another revision and while on it, let's merge > these topics into one. Should I also rebase them on the tip > of git/git's master? What do you mean by merge these topics into one? Do you mean merge all the commits into a single commit (which would be bad), or that your two original topics should be one, much like Junio already did? In general, once submitted, avoid rebasing unless needed to integrate with someone else's work and clean up conflicts. Hope that helps, Elijah
Re: Push force-with-lease with multi-URL remote
Christopher Head writes: > On July 27, 2019 10:46:43 AM PDT, Junio C Hamano wrote: >>Christopher Head writes: >> >>Quite honestly, the true culprit of the above story is that you are >>letting multiple logically different remote repositories [*1*] use >>the same single remote-tracking refes/remotes/myremote/ hierarchy. > > They weren’t supposed to be logically different (if I understand > what you mean by that phrase). My intention was for the different > URLs to be mirrors of each other,... What I would call "logically the same" is the set of repositories that are synchronized at the server side, which may or may not have multiple URLs to reach it, but behave as if it is just a single one without your doing anything special. Your wanting to actively make them in sync by the above definition makes them logically different set of repositories. But the phrasing does not matter much. One repository at a hosting service (which may iternally be replicated, but that is not even observable from outside) may be reached over https:// or ssh://, and the result of pushing to either one of the URLs would be observable by immediately fetching back from either one. Having both URLs and trying to use either one that works may help under flaky proxy situation, for example. In the reverse direction, I think "git fetch" supports the notion of of repositories, so that fetch from multiple remotes can be initiated with a single command, but I am not sure if we added the same concept to the pushing side. I personally want to have finer control, so when I push my work to multiple repositories, each of them are treated as totally different push targets, and a script controls multiple "git push" processes to each of them in parallel, with retries and all. I think having multiple pushURL and pushing to them is sort-of OK, but what is broken in your configuration is that you have a single remote-tracking branch hierarchy---if you get rid of it, so that refs/remotes/myremote/ does not exist and does not get updated, I think things would work fine.
Re: What's cooking in git.git (Jul 2019, #06; Thu, 25)
Hi Elijah On Sat, 27 Jul 2019 13:40:13 -0700 Elijah Newren wrote: > > Let me attempt to answer on Junio's behalf... :) > [...] > There are four, including Junio's commit he had to add in order to > make the series merge with pu (a rename of your t3431 to the > unoccupied t3433 slot). He labelled that commit "SQUASH???" and it's > still quoted above. However, in general, when you submit the next > round of your series, you should certainly include his fixups from his > squash (or alternative fixes) inside your commits in order to get rid > of the need for the squash commit. Understood! > > > - rebase -i: support --committer-date-is-author-date > > > - sequencer: add NULL checks under read_author_script > > > - rebase -i: add --ignore-whitespace flag > > > > The correct order should be: > >- rebase -i: add --ignore-whitespace flag > >- sequencer: add NULL checks under read_author_script > >- rebase -i: support --committer-date-is-author-date > > Are you thinking in order of application, or order that would be shown > by `git log --oneline`? Junio includes the latter in his report. If applied in this order, I think, there is no need of fixups. But renaming t3431 to t3433 is still required. > > I'll soon send another revision and while on it, let's merge > > these topics into one. Should I also rebase them on the tip > > of git/git's master? > > What do you mean by merge these topics into one? Do you mean merge > all the commits into a single commit (which would be bad), or that > your two original topics should be one, much like Junio already did? I am thinking of mergin the original topics, yes, just like Junio did. > In general, once submitted, avoid rebasing unless needed to integrate > with someone else's work and clean up conflicts. I have not checked but git/git:master is like 569 commits ahead of r1walz/git:master, there _might_ be conflicts. Should I rebase if need be? Thanks Rohit
Re: [GSoC] Blogging with Rohit
Hey Everyone! Here an update about my last week[1]. There were reviews on patches I submitted which need to be accounted. There is also some _mess_ (commit ordering and merging topics) that needs to be taken out. Polishing them will teach rebase -i some new flags! Thanks Rohit [1]: https://rashiwal.me/2019/second-evaluation/
Re: What's cooking in git.git (Jul 2019, #06; Thu, 25)
On Sat, Jul 27, 2019 at 2:00 PM Rohit Ashiwal wrote: > > In general, once submitted, avoid rebasing unless needed to integrate > > with someone else's work and clean up conflicts. > > I have not checked but git/git:master is like 569 commits ahead of > r1walz/git:master, there _might_ be conflicts. Should I rebase if > need be? First get your topic ready using the same base as you did for your earlier submission(s) of your series. Then when your changes are ready: * Try to merge with origin/master. If there are no conflicts, undo the merge. * Try to merge with origin/next. If there are no conflicts, undo the merge. * Try to merge with origin/pu. If there are no conflicts, undo the merge. If there are no conflicts in any of these steps, submit the next round of your series. If there are conflicts in any step, there's more work to do. If it conflicts with master, then yeah, just rebase on master. If it conflicts with next or pu, there are a few different steps you could take: (1) rebase on the topic that yours conflicts with (assuming it's just one), (2) rebase on next (if next conflicts, though this means your topic can't advance until everything else in next does first so is strongly discouraged), (3) if the conflicts are small/trivial you could just submit anyway and prominently call it out in your cover letter. If there were any conflicts or you rebased at all, make sure to call it out in your cover letter, especially if your series depends on anything not in master. And if you did anything other than rebasing on master, then expect a discussion to start around who should rebase on whom and what order we want to apply topics in, and maybe other steps to take. Elijah
Re: Push force-with-lease with multi-URL remote
On July 27, 2019 1:57:18 PM PDT, Junio C Hamano wrote: >What I would call "logically the same" is the set of repositories >that are synchronized at the server side, which may or may not have >multiple URLs to reach it, but behave as if it is just a single one >without your doing anything special. Your wanting to actively make >them in sync by the above definition makes them logically different >set of repositories. But the phrasing does not matter much. OK, I would probably have just used one push URL in this scenario. >One repository at a hosting service (which may iternally be >replicated, but that is not even observable from outside) may be >reached over https:// or ssh://, and the result of pushing to either >one of the URLs would be observable by immediately fetching back >from either one. Having both URLs and trying to use either one that >works may help under flaky proxy situation, for example. That makes sense, I guess, if the unusable URLs can be expected to fail fast. >In the reverse direction, I think "git fetch" supports the notion of > of repositories, so that fetch from multiple remotes can be >initiated with a single command, but I am not sure if we added the >same concept to the pushing side. I personally want to have >finer control, so when I push my work to multiple repositories, each >of them are treated as totally different push targets, and a script >controls multiple "git push" processes to each of them in parallel, >with retries and all. I think having multiple pushURL and pushing >to them is sort-of OK, but what is broken in your configuration is >that you have a single remote-tracking branch hierarchy---if you get >rid of it, so that refs/remotes/myremote/ does not exist and does >not get updated, I think things would work fine. Yes, I agree, the presence of only a single remote tracking ref is what makes the use of a single remote with multiple URLs suboptimal here—it was just a better than the other options. I tend to have pretty reliable Internet connectivity, and I don’t particularly want to go writing custom scripts. I just want to use the normal push and fetch commands. Using multiple URLs on one remote is OK, though the single remote tracking ref is annoying. Using separate remotes works, but is more annoying due to having to remember to push to all of them. If I understand what remote groups are (separate remotes but you can act on all of them with one command?) then they should be perfect. However it does not look like they work for pushing. Would it make sense to add? -- Christopher Head
Settings for minimizing repacking (and keeping 'rsync' happy)
Hi! Some of my Git repositories have mirrors, maintained with 'rsync'. I want to have some level of repacking, so that the repositories are efficient, but I also want it to minimize it, so that 'rsync' never has to perform a big transfer for the repositories. For example, I think it would be fine if files are repacked just once in their lifetimes, and then that resulting pack file is never repacked again. I did read the gc.bigPackThreshold and gc.autoPackLimit settings, but I don't think they would accomplish that. Basically, what I'm describing is the behaviour of not packing files until the resulting pack would be a given size (say 10MB for example), and then never repack such ~10MB packs again, ever. Can this be done with some Git settings? And do you foresee any kind of serious drawback or potential problem with this kind of behaviour? Thanks!! ardi
Re: [PATCH 1/3] grep: make pcre1_tables version agnostic
On Sat, Jul 27 2019, Carlo Marcelo Arenas Belón wrote: > 6d4b5747f0 ("grep: change internal *pcre* variable & function names > to be *pcre1*", 2017-05-25), renamed most variables to be PCRE1 > specific to give space to similarly named variables for PCRE2, but > in this case the change wasn't needed as the types were compatible > enough (const unsigned char* vs const uint8_t*) to be shared. Both the v1 and v2 functions return const unsigned char *. I don't know where I got the uint8_t from. This makes more sense. This series looks good to me. Thanks for the fix. Just one caveat: The point of 6d4b5747f0 was not to only split out those variables we couldn't get away with re-using. Then I would have later re-used e.g. pcre1_jit_on & pcre2_jit_on as just pcre_jit_on. We could also do that now. I think doing that & this part of the your changes makes things less readable. The two code branches we compile with ifdefs are mutually exclusive, so having the variables be unique helps with eyeballing / reasoning when changing the code. > Revert that change, as 94da9193a6 ("grep: add support for PCRE v2", > 2017-06-01) failed to create an equivalent PCRE2 version. > > Signed-off-by: Carlo Marcelo Arenas Belón > --- > grep.c | 6 +++--- > grep.h | 2 +- > 2 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/grep.c b/grep.c > index f7c3a5803e..cc65f7a987 100644 > --- a/grep.c > +++ b/grep.c > @@ -389,14 +389,14 @@ static void compile_pcre1_regexp(struct grep_pat *p, > const struct grep_opt *opt) > > if (opt->ignore_case) { > if (has_non_ascii(p->pattern)) > - p->pcre1_tables = pcre_maketables(); > + p->pcre_tables = pcre_maketables(); > options |= PCRE_CASELESS; > } > if (is_utf8_locale() && has_non_ascii(p->pattern)) > options |= PCRE_UTF8; > > p->pcre1_regexp = pcre_compile(p->pattern, options, &error, &erroffset, > - p->pcre1_tables); > + p->pcre_tables); > if (!p->pcre1_regexp) > compile_regexp_failed(p, error); > > @@ -462,7 +462,7 @@ static void free_pcre1_regexp(struct grep_pat *p) > { > pcre_free(p->pcre1_extra_info); > } > - pcre_free((void *)p->pcre1_tables); > + pcre_free((void *)p->pcre_tables); > } > #else /* !USE_LIBPCRE1 */ > static void compile_pcre1_regexp(struct grep_pat *p, const struct grep_opt > *opt) > diff --git a/grep.h b/grep.h > index 1875880f37..d34f66b384 100644 > --- a/grep.h > +++ b/grep.h > @@ -89,7 +89,7 @@ struct grep_pat { > pcre *pcre1_regexp; > pcre_extra *pcre1_extra_info; > pcre_jit_stack *pcre1_jit_stack; > - const unsigned char *pcre1_tables; > + const unsigned char *pcre_tables; > int pcre1_jit_on; > pcre2_code *pcre2_pattern; > pcre2_match_data *pcre2_match_data;
Re: [PATCH 3/3] grep: plug leak of pcre chartables in PCRE2
On Sat, Jul 27 2019, Carlo Marcelo Arenas Belón wrote: > Just as it is done with PCRE1, make sure that the allocated chartables > get free at cleanup time. > > This assumes no global context is used (NULL passed when created the > tables), but will likely be updated in tandem if that ever changes. > > Signed-off-by: Carlo Marcelo Arenas Belón > --- > grep.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/grep.c b/grep.c > index d04635fad4..d9768c5f05 100644 > --- a/grep.c > +++ b/grep.c > @@ -604,6 +604,7 @@ static void free_pcre2_pattern(struct grep_pat *p) > pcre2_match_data_free(p->pcre2_match_data); > pcre2_jit_stack_free(p->pcre2_jit_stack); > pcre2_match_context_free(p->pcre2_match_context); > + free((void *)p->pcre_tables); Is the cast really needed? I'm rusty on the rules, removing it from the pcre_free() you might have copied this from produces a warning for me, but not for free() itself. This is on GCC 8.3.0. How about for you & what compiler(s)?
Re: [PATCH 3/3] grep: plug leak of pcre chartables in PCRE2
On Sat, Jul 27, 2019 at 4:48 PM Ævar Arnfjörð Bjarmason wrote: > > + free((void *)p->pcre_tables); > > Is the cast really needed? I'm rusty on the rules, removing it from the > pcre_free() you might have copied this from produces a warning for me, > but not for free() itself. This is on GCC 8.3.0. How about for you & > what compiler(s)? both will trigger warnings for the same reason (-Wincompatible-pointer-types-discards-qualifiers) with Apple LLVM version 10.0.1 (clang-1001.0.46.4) gcc-9 in macOS triggers 2 "warnings"; one for discarding the const qualifier (-Wdiscarded-qualifiers) and another for mismatching parameter to free(): note: expected 'void *' but argument is of type 'const uint8_t *' {aka 'const unsigned char *'} Carlo
Re: [PATCH 1/3] grep: make pcre1_tables version agnostic
On Sat, Jul 27, 2019 at 4:47 PM Ævar Arnfjörð Bjarmason wrote: > > On Sat, Jul 27 2019, Carlo Marcelo Arenas Belón wrote: > > > 6d4b5747f0 ("grep: change internal *pcre* variable & function names > > to be *pcre1*", 2017-05-25), renamed most variables to be PCRE1 > > specific to give space to similarly named variables for PCRE2, but > > in this case the change wasn't needed as the types were compatible > > enough (const unsigned char* vs const uint8_t*) to be shared. reading this again, had to admit there is a fair amount of guessing on intent, but reading commits and the email discussion couldn't come up with a better explanation. is the root cause for the bug different?, could it be that the pcre2 API was misunderstood? and you expected this pointer will be free together with the context? (as it is done when a cloned context with tables is used?) > Both the v1 and v2 functions return const unsigned char *. I don't know > where I got the uint8_t from. This makes more sense. the type in PCRE2 is uint8_t, the documentation has a bug[1] almost every platform git cares for would have unsigned char = uint8_t though. > The point of 6d4b5747f0 was not to only split out those variables we > couldn't get away with re-using. Then I would have later re-used > e.g. pcre1_jit_on & pcre2_jit_on as just pcre_jit_on. We could also do > that now. IMHO pcre_jit_on makes more sense as a bit, with local variables being used for the pcre*_config() call with the right type.(uint32_t != int) > I think doing that & this part of the your changes makes things less > readable. The two code branches we compile with ifdefs are mutually > exclusive, so having the variables be unique helps with eyeballing / > reasoning when changing the code. I thought that too until the typo[2] in the pcre?_jit_on variable (now in next) kind of proved us wrong. Maybe the names are too similar? either way, would you rather drop this patch and make a replica variable? should I rebase to next with Reviewed-By tags so that all other changes in flight that would conflict could be corrected? Carlo [1] https://bugs.exim.org/show_bug.cgi?id=2420 [2] https://public-inbox.org/git/20190722181923.21572-1-dev+...@drbeat.li/