Re: [PATCH 1/2 v8] pack-objects: respect --local/--honor-pack-keep/--incremental when bitmap is in use
On Mon, Sep 12, 2016 at 11:23:18PM -0700, Junio C Hamano wrote: > Kirill Smelkov writes: > > > +static int want_found_object(int exclude, struct packed_git *p) > > +{ > > + if (exclude) > > + return 1; > > + if (incremental) > > + return 0; > > + > > + /* > > +* When asked to do --local (do not include an object that appears in a > > +* pack we borrow from elsewhere) or --honor-pack-keep (do not include > > +* an object that appears in a pack marked with .keep), finding a pack > > +* that matches the criteria is sufficient for us to decide to omit it. > > +* However, even if this pack does not satisfy the criteria, we need to > > +* make sure no copy of this object appears in _any_ pack that makes us > > +* to omit the object, so we need to check all the packs. > > +* > > +* We can however first check whether these options can possible matter; > > +* if they do not matter we know we want the object in generated pack. > > +* Otherwise, we signal "-1" at the end to tell the caller that we do > > +* not know either way, and it needs to check more packs. > > +*/ > > + if (!ignore_packed_keep && > > + (!local || !have_non_local_packs)) > > + return 1; > > + > > + if (local && !p->pack_local) > > + return 0; > > + if (ignore_packed_keep && p->pack_local && p->pack_keep) > > + return 0; > > + > > + /* we don't know yet; keep looking for more packs */ > > + return -1; > > +} > > Moving this logic out to this helper made the main logic in the > caller easier to grasp. > > > @@ -958,15 +993,30 @@ static int want_object_in_pack(const unsigned char > > *sha1, > >off_t *found_offset) > > { > > struct packed_git *p; > > + int want; > > > > if (!exclude && local && has_loose_object_nonlocal(sha1)) > > return 0; > > > > + /* > > +* If we already know the pack object lives in, start checks from that > > +* pack - in the usual case when neither --local was given nor .keep > > files > > +* are present we will determine the answer right now. > > +*/ > > + if (*found_pack) { > > + want = want_found_object(exclude, *found_pack); > > + if (want != -1) > > + return want; > > + } > > > > for (p = packed_git; p; p = p->next) { > > + off_t offset; > > + > > + if (p == *found_pack) > > + offset = *found_offset; > > + else > > + offset = find_pack_entry_one(sha1, p); > > + > > if (offset) { > > if (!*found_pack) { > > if (!is_pack_valid(p)) > > @@ -974,31 +1024,9 @@ static int want_object_in_pack(const unsigned char > > *sha1, > > *found_offset = offset; > > *found_pack = p; > > } > > + want = want_found_object(exclude, p); > > + if (want != -1) > > + return want; > > } > > } > > As Peff noted in his earlier review, however, MRU code needed to be > grafted in to the caller (an update to the MRU list was done in the > code that was moved to the want_found_object() helper). I think I > did it correctly, which ended up looking like this: > > want = want_found_object(exclude, p); > if (!exclude && want > 0) > mru_mark(packed_git_mru, entry); > if (want != -1) > return want; > > I somewhat feel that it is ugly that the helper knows about exclude > (i.e. in the original code, we immediately returned 1 without > futzing with the MRU when we find an entry that is to be excluded, > which now is done in the helper), and the caller also knows about > exclude (i.e. the caller knows that the helper may return positive > in two cases, it knows that MRU marking needs to happen only one of > the two cases, and it also knows that "exclude" is what > differentiates between the two cases) at the same time. > > But probably the reason why I feel it ugly is only because I knew > how the original looked like. I dunno. Junio, the code above is correct semantic merge of pack-mru and my topic, because in pack-mru if found and exclude=1, 1 was returned without marking found pack. But I wonder: even if we exclude an object, we were still looking for it in packs, and when we found it, we found the corresponding pack too. So, that pack _was_ most-recently-used, and it is correct to mark it as MRU. We can do the simplification in the follow-up patch after the merge, so merge does not change semantics and it is all bisectable, etc. Jeff?
Re: [PATCH] [git-p4.py] Add --checkpoint-period option to sync/clone
On 12 September 2016 at 23:02, Ori Rawlings wrote: > Importing a long history from Perforce into git using the git-p4 tool > can be especially challenging. The `git p4 clone` operation is based > on an all-or-nothing transactionality guarantee. Under real-world > conditions like network unreliability or a busy Perforce server, > `git p4 clone` and `git p4 sync` operations can easily fail, forcing a > user to restart the import process from the beginning. The longer the > history being imported, the more likely a fault occurs during the > process. Long enough imports thus become statistically unlikely to ever > succeed. That would never happen :-) The usual thing that I find is that my Perforce login session expires. > > The underlying git fast-import protocol supports an explicit checkpoint > command. The idea here is to optionally allow the user to force an > explicit checkpoint every seconds. If the sync/clone operation fails > branches are left updated at the appropriate commit available during the > latest checkpoint. This allows a user to resume importing Perforce > history while only having to repeat at most approximately seconds > worth of import activity. I think this ought to work, and could be quite useful. It would be good to have some kind of test case for it though, and updated documentation. Luke > > Signed-off-by: Ori Rawlings > --- > git-p4.py | 8 > 1 file changed, 8 insertions(+) > > diff --git a/git-p4.py b/git-p4.py > index fd5ca52..40cb64f 100755 > --- a/git-p4.py > +++ b/git-p4.py > @@ -2244,6 +2244,7 @@ class P4Sync(Command, P4UserMap): > optparse.make_option("-/", dest="cloneExclude", > action="append", type="string", > help="exclude depot path"), > +optparse.make_option("--checkpoint-period", > dest="checkpointPeriod", type="int", help="Period in seconds between explict > git fast-import checkpoints (by default, no explicit checkpoints are > performed)"), > ] > self.description = """Imports from Perforce into a git repository.\n > example: > @@ -2276,6 +2277,7 @@ class P4Sync(Command, P4UserMap): > self.tempBranches = [] > self.tempBranchLocation = "refs/git-p4-tmp" > self.largeFileSystem = None > +self.checkpointPeriod = -1 Or use None? > > if gitConfig('git-p4.largeFileSystem'): > largeFileSystemConstructor = > globals()[gitConfig('git-p4.largeFileSystem')] > @@ -3031,6 +3033,8 @@ class P4Sync(Command, P4UserMap): > > def importChanges(self, changes): > cnt = 1 > +if self.checkpointPeriod > -1: > +self.lastCheckpointTime = time.time() Could you just always set the lastCheckpointTime? > for change in changes: > description = p4_describe(change) > self.updateOptionDict(description) > @@ -3107,6 +3111,10 @@ class P4Sync(Command, P4UserMap): > self.initialParent) > # only needed once, to connect to the previous commit > self.initialParent = "" > + > +if self.checkpointPeriod > -1 and time.time() - > self.lastCheckpointTime > self.checkpointPeriod: > +self.checkpoint() > +self.lastCheckpointTime = time.time() If you use time.time(), then this could fail to work as expected if the system time goes backwards (e.g. NTP updates). However, Python 2 doesn't have access to clock_gettime() without jumping through hoops, so perhaps we leave this as a bug until git-p4 gets ported to Python 3. > except IOError: > print self.gitError.read() > sys.exit(1) > -- > 2.7.4 (Apple Git-66) >
Greetings to you and your family,,
-- Greetings to you and your family, Good Day, I am writing to request your assistance to transfer the sum of $8, 600.000.00 (Eight million Six hundred thousand USD) into your accounts. Please delete it if you are not interested. The total sum will be shared as follows: 60% for me, 40% for you. The transfer is risk free hence you will follow my instructions till the fund get to your account. Official application form will be forwarded to you to explain more comprehensively what is required of you, feel free to call me through my private telephone number. Full Name Sex... Age. Country... Occupation.. Mobile . Best Regards Mr. Abdulaiye Rahmani TEL: +226 75448132 PRIVATE EMAIL: abdulaiyerahma...@yahoo.com
RE: [PATCH v2] checkout: eliminate unnecessary merge for trivial checkout
> -Original Message- > From: Junio C Hamano [mailto:gits...@pobox.com] > Sent: Monday, September 12, 2016 4:32 PM > To: Ben Peart > Cc: git@vger.kernel.org; pclo...@gmail.com; 'Ben Peart' > > Subject: Re: [PATCH v2] checkout: eliminate unnecessary merge for trivial > checkout > > "Ben Peart" writes: > > > I completely agree that optimizing within merge_working_tree would > > provide more opportunities for optimization. I can certainly move the > > test into that function as a first step. > > Note that "optimizing more" was not the primary point of my response. > > Quite honestly, I'd rather see us speed up _ONLY_ obviously correct and > commonly used cases, while leaving most cases that _MAY_ turn out to be > optimizable (if we did careful analysis) unoptimized, and instead have them > handled by generic but known to be correct codepath, if it means we do NOT > to have to spend mental bandwidth to analyze not-common case--that is a > much better tradeoff. > > The suggestion to move the check one level down in the callchain was > primarily to avoid the proposed optimization from being overly eager and > ending up skipping necessary parts of what merge_working_tree() does (e.g. > like I suspected in the review that the proposed patch skips the check for > "you have unmerged entries" situation). The check for unmerged entries makes complete sense when you are about to attempt to merge different commit trees and generate an updated index and working directory. This optimization however is trying to skip those expensive steps for the specific case of creating a new branch and switching to it. In this narrow (but common) case, all that needs to happen is that a new ref is created and HEAD switched to it. Since we're not doing a merge, I don't believe the check is necessary.
Re: [ANNOUNCE] Git User's Survey 2016
On Mon, Sep 12, 2016 at 09:51:09PM +0200, Jakub Narębski wrote: > Hello all, > > > P.P.S. Different announcements use different URLs (different channels) > to better track where one got information about this survey. > > Thanks in advance for taking time to answer the survey, Can we get a channel for the freenode/#git channel? I'm trying to announce the survey there too.
Re: [ANNOUNCE] Git User's Survey 2016
Hello Kevin, On 13 September 2016 at 15:32, Kevin Daudt wrote: > On Mon, Sep 12, 2016 at 09:51:09PM +0200, Jakub Narębski wrote: >> >> P.P.S. Different announcements use different URLs (different channels) >> to better track where one got information about this survey. >> >> Thanks in advance for taking time to answer the survey, > > Can we get a channel for the freenode/#git channel? I'm trying to announce the > survey there too. Here you have it: https://survs.com/survey/c6wjrerw87 -- Jakub Narebski
[PATCH v3] checkout: eliminate unnecessary merge for trivial checkout
Teach git to avoid unnecessary merge during trivial checkout. When running 'git checkout -b foo' git follows a common code path through the expensive merge_working_tree even when it is unnecessary. As a result, 95% of the time is spent in merge_working_tree doing the 2-way merge between the new and old commit trees that is unneeded. The time breakdown is as follows: merge_working_tree <-- 95% unpack_trees <-- 80% traverse_trees <-- 50% cache_tree_update <-- 17% mark_new_skip_worktree <-- 10% With a large repo, this cost is pronounced. Using "git checkout -b r" to create and switch to a new branch costs 166 seconds (all times worst case with a cold file system cache). git.c:406 trace: built-in: git 'checkout' '-b' 'r' read-cache.c:1667 performance: 17.442926555 s: read_index_from name-hash.c:128 performance: 2.912145231 s: lazy_init_name_hash read-cache.c:2208 performance: 4.387713335 s: write_locked_index trace.c:420 performance: 166.458921289 s: git command: 'c:\Users\benpeart\bin\git.exe' 'checkout' '-b' 'r' Switched to a new branch 'r' By adding a test to skip the unnecessary call to merge_working_tree in this case reduces the cost to 16 seconds. git.c:406 trace: built-in: git 'checkout' '-b' 's' read-cache.c:1667 performance: 16.100742476 s: read_index_from trace.c:420 performance: 16.461547867 s: git command: 'c:\Users\benpeart\bin\git.exe' 'checkout' '-b' 's' Switched to a new branch 's' Signed-off-by: Ben Peart --- builtin/checkout.c | 92 ++ 1 file changed, 92 insertions(+) diff --git a/builtin/checkout.c b/builtin/checkout.c index 8672d07..8b2f428 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -38,6 +38,10 @@ struct checkout_opts { int ignore_skipworktree; int ignore_other_worktrees; int show_progress; + /* +* If new checkout options are added, needs_working_tree_merge +* should be updated accordingly. +*/ const char *new_branch; const char *new_branch_force; @@ -460,11 +464,99 @@ static void setup_branch_path(struct branch_info *branch) branch->path = strbuf_detach(&buf, NULL); } +static int needs_working_tree_merge(const struct checkout_opts *opts, + const struct branch_info *old, + const struct branch_info *new) +{ + /* +* We must do the merge if we are actually moving to a new +* commit tree. +*/ + if (!old->commit || !new->commit || + oidcmp(&old->commit->tree->object.oid, &new->commit->tree->object.oid)) + return 1; + + /* +* opts->patch_mode cannot be used with switching branches so is +* not tested here +*/ + + /* +* opts->quiet only impacts output so doesn't require a merge +*/ + + /* +* Honor the explicit request for a three-way merge or to throw away +* local changes +*/ + if (opts->merge || opts->force) + return 1; + + /* +* Checking out the requested commit may require updating the working +* directory and index, let the merge handle it. +*/ + if (opts->force_detach) + return 1; + + /* +* opts->writeout_stage cannot be used with switching branches so is +* not tested here +*/ + + /* +* Honor the explicit ignore requests +*/ + if (!opts->overwrite_ignore || opts->ignore_skipworktree || + opts->ignore_other_worktrees) + return 1; + + /* +* opts->show_progress only impacts output so doesn't require a merge +*/ + + /* +* If we're not creating a new branch, by definition we're changing +* the existing one so need to do the merge +*/ + if (!opts->new_branch) + return 1; + + /* +* new_branch_force is defined to "create/reset and checkout a branch" +* so needs to go through the merge to do the reset +*/ + if (opts->new_branch_force) + return 1; + + /* +* A new orphaned branch requrires the index and the working tree to be +* adjusted to +*/ + if (opts->new_orphan_branch) + return 1; + + /* +* Remaining variables are not checkout options but used to track state +* that doesn't trigger the need for a merge. +*/ + + return 0; +} + static int merge_working_tree(const struct checkout_opts *opts, struct branch_info *old, struct branch_info *new, int *writeout_error) { + /* +* Optimize the performance of "git checkout -b foo" by avoiding +* the expens
Re: [RFC/PATCH 01/17] diff: move line ending check into emit_hunk_header
Am 13.09.2016 um 06:45 schrieb Stefan Beller: In a later patch, I want to propose an option to detect&color moved lines in a diff, which cannot be done in a one-pass over the diff. Instead we need to go over the whole diff twice, because we cannot detect the first line of the two corresponding lines (+ and -) that got moved. So to prepare the diff machinery for two pass algorithms (i.e. buffer it all up and then operate on the result), move all emissions to places, such that the only emitting function is emit_line_0. This patch moves code that is conceptually part of emit_hunk_header, but was using output in fn_out_consume, back to emit_hunk_header. Signed-off-by: Stefan Beller --- diff.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index cc8e812..aa50b2d 100644 --- a/diff.c +++ b/diff.c @@ -610,6 +610,9 @@ static void emit_hunk_header(struct emit_callback *ecbdata, } strbuf_add(&msgbuf, line + len, org_len - len); + if (line[org_len - 1] != '\n') + strbuf_addch(&msgbuf, '\n'); + Using strbuf_complete_line() would be nicer. emit_line(ecbdata->opt, "", "", msgbuf.buf, msgbuf.len); strbuf_release(&msgbuf); } @@ -1247,8 +1250,6 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) len = sane_truncate_line(ecbdata, line, len); find_lno(line, ecbdata); emit_hunk_header(ecbdata, line, len); - if (line[len-1] != '\n') - putc('\n', o->file); return; }
Re: [PATCH v7 10/10] convert: add filter..process option
On 12.09.16 11:49, Lars Schneider wrote: >> How do we send pathnames the have '\n' ? >> Not really recommended, but allowed. >> And here I am a little bit lost, is each of the lines packed into >> a pkt-line ? >> command=smudge is packet as pkt-line and pathname= is packed into >> another one ? (The we don't need the '\n' at all) > > Every line is a dedicated packet. That's why '\n' in a path name would > not be a problem as the receiver is expected to read the entire packet > when parsing the value (and the receiver knows the packet length, too). > > The '\n' at the end is required by the pkt-line format: > "A non-binary line SHOULD BE terminated by an LF..." > (see protocol-common.txt) That is only the half part of the story: A non-binary line SHOULD BE terminated by an LF, which if present MUST be included in the total length. Receivers MUST treat pkt-lines with non-binary data the same whether or not they contain the trailing LF (stripping the LF if present, and not complaining when it is missing). How do we treat pathnames ? They can have each byte value except '\0'. What should a receiver do, which reads a string like "ABC\n\n" ? Is it "ABC\n" or "ABC\n\n" ? I would really consider to treat pathnames as binary, and not add a trailing '\n', are there other opinions ?
Re: [ANNOUNCE] Git User's Survey 2016
On Tue, Sep 13, 2016 at 03:52:28PM +0200, Jakub Narębski wrote: > Hello Kevin, > > On 13 September 2016 at 15:32, Kevin Daudt wrote: > > On Mon, Sep 12, 2016 at 09:51:09PM +0200, Jakub Narębski wrote: > > >> > >> P.P.S. Different announcements use different URLs (different channels) > >> to better track where one got information about this survey. > >> > >> Thanks in advance for taking time to answer the survey, > > > > Can we get a channel for the freenode/#git channel? I'm trying to announce > > the > > survey there too. > > Here you have it: > > https://survs.com/survey/c6wjrerw87 > > -- > Jakub Narebski Thanks.
Re: [PATCH v7 10/10] convert: add filter..process option
larsxschnei...@gmail.com writes: > diff --git a/contrib/long-running-filter/example.pl > b/contrib/long-running-filter/example.pl > ... > +sub packet_read { > +my $buffer; > +my $bytes_read = read STDIN, $buffer, 4; > +if ( $bytes_read == 0 ) { > + > +# EOF - Git stopped talking to us! > +exit(); > +... > +packet_write( "clean=true\n" ); > +packet_write( "smudge=true\n" ); > +packet_flush(); > + > +while (1) { These extra SP around the contents inside () pair look unfamiliar and somewhat strange to me, but as long as they are consistently done (and I think you are mostly being consistent), it is OK. > +#define CAP_CLEAN(1u<<0) > +#define CAP_SMUDGE (1u<<1) As these are meant to be usable together, i.e. bits in a single flag word, they are of type "unsigned int", which makes perfect sense. Make sure your variables and fields that store them are of the same type. I think I saw "int' used to pass them in at least one place. > +static int apply_filter(const char *path, const char *src, size_t len, > +int fd, struct strbuf *dst, struct convert_driver > *drv, > +const int wanted_capability) > +{ > + const char* cmd = NULL; "const char *cmd = NULL;" of course. > diff --git a/unpack-trees.c b/unpack-trees.c > index 11c37fb..f6798f8 100644 > --- a/unpack-trees.c > +++ b/unpack-trees.c > @@ -10,6 +10,7 @@ > #include "attr.h" > #include "split-index.h" > #include "dir.h" > +#include "convert.h" > > /* > * Error messages expected by scripts out of plumbing commands such as Why? The resulting file seems to compile without this addition.
Re: git-am includes escape characters from 'From' field
On Mon, Sep 12, 2016 at 10:10:23PM +0200, Swift Geek wrote: > git-am seems to add backslash that escapes double quote character, example > git format-patch > > From 63da989a5295214f9bd06cd7b409a86a65241eea Mon Sep 17 00:00:00 2001 > From: "Sebastian \"Swift Geek\" Grzywna" This looks correct; the output of format-patch is an rfc2822 message, and it requires this quoting. The part you don't show, and that I think is wrong, is that if you then "git am" this patch, it pulls the backslashes into the commit object. The culprit looks like "parse_mail()" in builtin/am.c (or possibly mailinfo() that it calls), which blindly picks up the name portion without doing any rfc2822 de-quoting. I don't think we have any existing de-quoting routines to plug in, so fixing it would probably start with writing one. -Peff
Re: [PATCH v7 00/10] Git filter protocol
larsxschnei...@gmail.com writes: > From: Lars Schneider > > The goal of this series is to avoid launching a new clean/smudge filter > process for each file that is filtered. I tentatively queued this in 'pu' because I wanted to see these changes in context and also wanted to know if there are overlaps and conflicts with other topics in flight. As your earlier steps renamed packet_write() to packet_write_fmt() but didn't add a new packet_write() that has different semantics, it was pleasantly easy to make sure there is no new caller added in the meantime (it did conflict with Duy's shallow-deepen topic and the resolution had to touch outside the <<< conflicted === regions >>>, but it was otherwise trivial). Some details of the protocol (I think Torsten has already pointed out that not all paths can be representable with the current incarnation; there may be other minor nits like that) may still need to be worked out, but I think the series at this point gets the basic code structure right and such additional fixes would not change things too drastically (IOW, I think we are very close to being 'next'-ready). Thanks.
RE: [ANNOUNCE] Git User's Survey 2016
Hi Jakub, You said: P.S. At request I can open a separate channel in survey, with a separate survey URL, so that responses from particular site or organization could be separated out. Please can you open a channel for use by Ericsson? Many thanks David DAVID BAINBRIDGE Product Manager SW Development Strategic Product Manager, Configuration and Compliance Management BNEP EITTE SP&S Platforms&Infrastructure Ericsson 8500 Decarie Montreal, H4P 2N2, Canada Phone +1 514 345 7900 x42014 Mobile +1 438 990 2452 Office ECN 810 42014 david.bainbri...@ericsson.com www.ericsson.com -Original Message- From: Jakub Narębski [mailto:jna...@gmail.com] Sent: Monday, September 12, 2016 15:51 To: git@vger.kernel.org Cc: Doug Rathbone ; David Bainbridge ; Stefan Beller ; Andrew Ardill ; Eric Wong Subject: [ANNOUNCE] Git User's Survey 2016 Hello all, We would like to ask you a few questions about your use of the Git version control system. This survey is mainly to understand who is using Git, how and why. The results will be published to the Git wiki on the GitSurvey2016 page (https://git.wiki.kernel.org/index.php/GitSurvey2016) and discussed on the git mailing list. The survey would be open from 12 September to 20 October 2016. Please devote a few minutes of your time to fill this simple questionnaire, it will help a lot the git community to understand your needs, what you like of Git, and of course what you don't like. The survey can be found here: https://tinyurl.com/GitSurvey2016 https://survs.com/survey/lmo7ed3439 There is also alternate version which does not require cookies, but it doesn't allow one to go back to response and edit it. https://tinyurl.com/GitSurvey2016-anon https://survs.com/survey/naeec8kwd8 P.S. At request I can open a separate channel in survey, with a separate survey URL, so that responses from particular site or organization could be separated out. Please send me a email with name of channel, and I will return with a separate survey URL to use. Note that the name of the channel would be visible to others. P.P.S. Different announcements use different URLs (different channels) to better track where one got information about this survey. Thanks in advance for taking time to answer the survey, -- Jakub Narębski on behalf of Git Development Community
Re: [PATCH v2 09/14] i18n: notes: mark error messages for translation
A Seg, 12-09-2016 às 14:23 +0200, Jean-Noël Avila escreveu: > Le 12/09/2016 à 13:29, Vasco Almeida a écrit : > > > > Signed-off-by: Vasco Almeida > > --- > > builtin/notes.c | 18 +- > > 1 file changed, 9 insertions(+), 9 deletions(-) > > > > diff --git a/builtin/notes.c b/builtin/notes.c > > index f848b89..abacae2 100644 > > --- a/builtin/notes.c > > +++ b/builtin/notes.c > > @@ -340,7 +340,7 @@ static struct notes_tree > > *init_notes_check(const char *subcommand, > > > > ref = (flags & NOTES_INIT_WRITABLE) ? t->update_ref : t- > > >ref; > > if (!starts_with(ref, "refs/notes/")) > > - die("Refusing to %s notes in %s (outside of > > refs/notes/)", > > + die(_("Refusing to %s notes in %s (outside of > > refs/notes/)"), > > subcommand, ref); > > return t; > > } > > Not sure this one will be easy to localize. The verb is passed as a > parameter : see line 366 "list", line 426 "add", line 517 "copy", > line > 658 "show", line 816 "merge", line 908 "remove" or line 595 with > argv[0]. > > If all the verbs are real subcommands, then the translators should be > warned that this is some english twisting, but that they need to > refer > to the subcommand on the command line. Yes, these verbs are git notes subcommands. I will add a Translators comment to it explaining so. Or we can unfold that error messages like if (!strcmp(subcommand, "add") die(_("Refusing to add notes in %s (outside of refs/notes/)"), ref); elseif ... else die(_("Refusing to %s notes in %s (outside of refs/notes/)"), subcommand, ref); This is more verbose but translations would benefit from it being more natural. What do we prefer: (1) concise source and a little unnatural translations or (2) verbose code and natural translations? Compare, imaging that English is a target translation language, the user would read: "Refusing to do add of notes in /path [...]" (1) "Refusing do add notes in /path [...]" (2) > Otherwise, > > Acked-by: Jean-Noël Avila > > JN
Re: [PATCH v2] ls-files: adding support for submodules
Brandon Williams writes: > Allow ls-files to recognize submodules in order to retrieve a list of > files from a repository's submodules. This is done by forking off a > process to recursively call ls-files on all submodules. Also added an > output-path-prefix command in order to prepend paths to child processes. > > Signed-off-by: Brandon Williams > @@ -68,6 +71,21 @@ static void write_eolinfo(const struct cache_entry *ce, > const char *path) > static void write_name(const char *name) > { > /* > + * NEEDSWORK: To make this thread-safe, full_name would have to be owned > + * by the caller. > + * > + * full_name get reused across output lines to minimize the allocation > + * churn. > + */ > + static struct strbuf full_name = STRBUF_INIT; > + if (output_path_prefix != '\0') { > + strbuf_reset(&full_name); > + strbuf_addstr(&full_name, output_path_prefix); > + strbuf_addstr(&full_name, name); > + name = full_name.buf; > + } At first glance it was surprising that no test caught this lack of dereference; the reason is because you initialize output_path_prefix to an empty string, not NULL, causing full_name.buf always used, which does not have an impact on the output. I think initializing it to NULL is a more typical way to say "this option has not been given", and if you took that route, the condition would become if (output_path_prefix && *output_path_prefix) { ... In any case, the fact that only this much change was required to add output-path-prefix shows two good things: (1) the original code was already well structured, funneling any pathname we need to emit through this single function so that we can do this kind of updates, and (2) the author of the patch was competent to spot this single point that needs to be updated. Nice. > + status = run_command(&cp); > + if (status) > + exit(status); run_command()'s return value comes from either start_command() or finish_command(). These signal failure by returning a non-zero value, and in practice they are negative small integers. Feeding negative value to exit() is not quite kosher. Perhaps exit(128) to mimick as if we called die() is better. If your primary interest is to support the "find in the working tree files that are tracked, recursively in submodules" grep, I think this "when we hit a submodule, spawn a separate ls-files in there" is sufficient and a solid base to build on it. On the other hand, if you are more ambitious and "grep" is merely an example of things that can be helped by having a list of paths across module boundaries, we may want to "libify" ls-files in such a way that a single process can instantiate one or more instances of "ls-files machinery", that takes which repository to work in and other arguments that specifies which paths to report, and instead of always showing the result to the standard output, makes repeated calls to a callback function to report the discovered path and other attributes associated with the path that were asked for (the object name, values of tag_*, etc.), without spawning a separate "ls-files" process. The latter would be a lot bigger task and I do not necessarily think it is needed, but that is one possible future direction to keep in mind. Thanks, will queue with a minimum fix.
[PATCH] strbuf: use valid pointer in strbuf_remove()
The fourth argument of strbuf_splice() is passed to memcpy(3), which is not supposed to handle NULL pointers. Let's be extra careful and use a valid empty string instead. It even shortens the source code. :) Signed-off-by: Rene Scharfe --- strbuf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/strbuf.c b/strbuf.c index f3bd571..b839be4 100644 --- a/strbuf.c +++ b/strbuf.c @@ -187,7 +187,7 @@ void strbuf_insert(struct strbuf *sb, size_t pos, const void *data, size_t len) void strbuf_remove(struct strbuf *sb, size_t pos, size_t len) { - strbuf_splice(sb, pos, len, NULL, 0); + strbuf_splice(sb, pos, len, "", 0); } void strbuf_add(struct strbuf *sb, const void *data, size_t len) -- 2.10.0
Re: [PATCH v7 10/10] convert: add filter..process option
Torsten Bögershausen writes: > I would really consider to treat pathnames as binary, and not add a trailing > '\n', > are there other opinions ? It would be the most consistent if the same format as write_name_quoted() is used for this, I would think.
[PATCH] pathspec: removed unnecessary function prototypes
removed function prototypes from pathspec.h which don't have a corresponding implementation. Signed-off-by: Brandon Williams --- pathspec.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/pathspec.h b/pathspec.h index 4a80f6f..59809e4 100644 --- a/pathspec.h +++ b/pathspec.h @@ -96,7 +96,5 @@ static inline int ps_strcmp(const struct pathspec_item *item, extern char *find_pathspecs_matching_against_index(const struct pathspec *pathspec); extern void add_pathspec_matches_against_index(const struct pathspec *pathspec, char *seen); -extern const char *check_path_for_gitlink(const char *path); -extern void die_if_path_beyond_symlink(const char *path, const char *prefix); #endif /* PATHSPEC_H */ -- 2.8.0.rc3.226.g39d4020
Re: [ANNOUNCE] Git User's Survey 2016
On 13 September 2016 at 18:15, David Bainbridge wrote: > Hi Jakub, > > You said: > P.S. At request I can open a separate channel in survey, with > a separate survey URL, so that responses from particular site > or organization could be separated out. > > Please can you open a channel for use by Ericsson? Sent (privately to David). Best regards, -- Jakub Narębski
Re: [PATCH v2 07/14] i18n: merge-recursive: mark error messages for translation
A Seg, 12-09-2016 às 09:04 -0700, Junio C Hamano escreveu: > Vasco Almeida writes: > > > > > Lowercase first word of such error messages following the usual > > style. > > "Change X to lowercase" is fine, but "Lowercase" is not a verb. > > Reword it to "Downcase the first word...", perhaps (not limited to > this step). Lowercase is a verb [1] meaning to print or write with a lowercase letter or letters. Knowing that can the commit message be kept? [1] http://www.dictionary.com/browse/lowercase
[PATCH] checkout: constify parameters of checkout_stage() and checkout_merged()
Document the fact that checkout_stage() and checkout_merged() don't change the objects passed to them by adding the modifier const. Signed-off-by: Rene Scharfe --- builtin/checkout.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index 8672d07..afbff3e 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -154,8 +154,8 @@ static int check_stages(unsigned stages, const struct cache_entry *ce, int pos) return 0; } -static int checkout_stage(int stage, struct cache_entry *ce, int pos, - struct checkout *state) +static int checkout_stage(int stage, const struct cache_entry *ce, int pos, + const struct checkout *state) { while (pos < active_nr && !strcmp(active_cache[pos]->name, ce->name)) { @@ -169,7 +169,7 @@ static int checkout_stage(int stage, struct cache_entry *ce, int pos, return error(_("path '%s' does not have their version"), ce->name); } -static int checkout_merged(int pos, struct checkout *state) +static int checkout_merged(int pos, const struct checkout *state) { struct cache_entry *ce = active_cache[pos]; const char *path = ce->name; -- 2.10.0
Bug
To whom this may concern, I found a bug in git while trying to push my website. I redid the process and it happened again. I also tried it on another computer and it happened again. I was wondering how to claim a bug? Thank you, Michael Hawes
Re: Bug
Hi, Michael. It would be helpful to get more context on what triggered this bug. I'm not a 'core' dev, so there may be a better way to send this. In general, you want to state the following: 0) Information about your git installation, host system, etc. 1) Information about your repo (was it GitHub? local? self-hosted?) 2) What did you do? (git push origin master? git push?) 3) What happened instead of working? (the error message would be helpful. Hope this helps. Cheers! -Santiago. On Tue, Sep 13, 2016 at 01:18:52PM -0400, Mike Hawes wrote: > To whom this may concern, > > I found a bug in git while trying to push my website. > > I redid the process and it happened again. > > I also tried it on another computer and it happened again. > > I was wondering how to claim a bug? > > Thank you, > > > Michael Hawes
[PATCH] unpack-trees: pass checkout state explicitly to check_updates()
Add a parameter for the struct checkout variable to check_updates() instead of using a static global variable. Passing it explicitly makes object ownership and usage more easily apparent. And we get rid of a static variable; those can be problematic in library-like code. Signed-off-by: Rene Scharfe --- unpack-trees.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/unpack-trees.c b/unpack-trees.c index 11c37fb..74d6dd4 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -218,8 +218,8 @@ static void unlink_entry(const struct cache_entry *ce) schedule_dir_for_removal(ce->name, ce_namelen(ce)); } -static struct checkout state; -static int check_updates(struct unpack_trees_options *o) +static int check_updates(struct unpack_trees_options *o, +const struct checkout *state) { unsigned cnt = 0, total = 0; struct progress *progress = NULL; @@ -264,7 +264,7 @@ static int check_updates(struct unpack_trees_options *o) display_progress(progress, ++cnt); ce->ce_flags &= ~CE_UPDATE; if (o->update && !o->dry_run) { - errs |= checkout_entry(ce, &state, NULL); + errs |= checkout_entry(ce, state, NULL); } } } @@ -1094,6 +1094,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options int i, ret; static struct cache_entry *dfc; struct exclude_list el; + struct checkout state; if (len > MAX_UNPACK_TREES) die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES); @@ -1239,7 +1240,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options } o->src_index = NULL; - ret = check_updates(o) ? (-2) : 0; + ret = check_updates(o, &state) ? (-2) : 0; if (o->dst_index) { if (!ret) { if (!o->result.cache_tree) -- 2.10.0
[PATCH] sha1_file: use llist_mergesort() for sorting packs
Sort the linked list of packs directly using llist_mergesort() instead of building an array, calling qsort(3) and fixing up the list pointers. This is shorter and less complicated. Signed-off-by: Rene Scharfe --- Peff: Or do you have other plans, e.g. to replace packed_git with packed_git_mru completely? sha1_file.c | 39 +++ 1 file changed, 15 insertions(+), 24 deletions(-) diff --git a/sha1_file.c b/sha1_file.c index 472ccb2..66dccaa 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -25,6 +25,7 @@ #include "dir.h" #include "mru.h" #include "list.h" +#include "mergesort.h" #ifndef O_NOATIME #if defined(__linux__) && (defined(__i386__) || defined(__PPC__)) @@ -1380,10 +1381,20 @@ static void prepare_packed_git_one(char *objdir, int local) strbuf_release(&path); } +static void *get_next_packed_git(const void *p) +{ + return ((const struct packed_git *)p)->next; +} + +static void set_next_packed_git(void *p, void *next) +{ + ((struct packed_git *)p)->next = next; +} + static int sort_pack(const void *a_, const void *b_) { - struct packed_git *a = *((struct packed_git **)a_); - struct packed_git *b = *((struct packed_git **)b_); + const struct packed_git *a = a_; + const struct packed_git *b = b_; int st; /* @@ -1410,28 +1421,8 @@ static int sort_pack(const void *a_, const void *b_) static void rearrange_packed_git(void) { - struct packed_git **ary, *p; - int i, n; - - for (n = 0, p = packed_git; p; p = p->next) - n++; - if (n < 2) - return; - - /* prepare an array of packed_git for easier sorting */ - ary = xcalloc(n, sizeof(struct packed_git *)); - for (n = 0, p = packed_git; p; p = p->next) - ary[n++] = p; - - qsort(ary, n, sizeof(struct packed_git *), sort_pack); - - /* link them back again */ - for (i = 0; i < n - 1; i++) - ary[i]->next = ary[i + 1]; - ary[n - 1]->next = NULL; - packed_git = ary[0]; - - free(ary); + packed_git = llist_mergesort(packed_git, get_next_packed_git, +set_next_packed_git, sort_pack); } static void prepare_packed_git_mru(void) -- 2.10.0
Re: [PATCH] sha1_file: use llist_mergesort() for sorting packs
On Tue, Sep 13, 2016 at 07:54:42PM +0200, René Scharfe wrote: > Sort the linked list of packs directly using llist_mergesort() instead > of building an array, calling qsort(3) and fixing up the list pointers. > This is shorter and less complicated. Makes sense. > Peff: Or do you have other plans, e.g. to replace packed_git with > packed_git_mru completely? Nope. I haven't looked into it, but I think there would be some benefit to replacing packed_git_mru with the code in list.h. But I don't see any benefit in replacing packed_git with that, or with the MRU itself (once list.h is in use, one _could_ shove the MRU into packed_git itself, but I think we would still retain the original link order for reference). Thanks for asking. -Peff
Re: [PATCH] strbuf: use valid pointer in strbuf_remove()
On Tue, Sep 13, 2016 at 06:40:22PM +0200, René Scharfe wrote: > The fourth argument of strbuf_splice() is passed to memcpy(3), which is > not supposed to handle NULL pointers. Let's be extra careful and use a > valid empty string instead. It even shortens the source code. :) Heh. Looks obviously correct and like a good thing to do. -Peff
Re: [PATCH] pathspec: removed unnecessary function prototypes
On Tue, Sep 13, 2016 at 09:52:51AM -0700, Brandon Williams wrote: > removed function prototypes from pathspec.h which don't have a > corresponding implementation. I'm always curious of the "why" in cases like this. Did we forget to add them? Did they get renamed? Did they go away? Looks like the latter; 5a76aff (add: convert to use parse_pathspec, 2013-07-14) just forgot to remove them. -Peff
Re: [PATCH] checkout: constify parameters of checkout_stage() and checkout_merged()
On Tue, Sep 13, 2016 at 07:11:52PM +0200, René Scharfe wrote: > Document the fact that checkout_stage() and checkout_merged() don't > change the objects passed to them by adding the modifier const. Hmm. Sometimes these big "context" objects are hard to make const, because we end up using them to hold or pass state between functions (e.g., see diff_options). So I'd worry slightly that we'll end up un-consting this in the long run. That being said, it is easy to revert, and it provides some small benefit, so I don't mind it in the meantime. -Peff
Re: [ANNOUNCE] Git User's Survey 2016
> On 13 Sep 2016, at 17:54, Jakub Narębski wrote: > > On 13 September 2016 at 18:15, David Bainbridge > wrote: >> Hi Jakub, >> >> You said: >> P.S. At request I can open a separate channel in survey, with >> a separate survey URL, so that responses from particular site >> or organization could be separated out. >> >> Please can you open a channel for use by Ericsson? > > Sent (privately to David). Could you send me a channel for Autodesk, too? Thank you, Lars
Re: [PATCH v2 09/14] i18n: notes: mark error messages for translation
On mardi 13 septembre 2016 16:35:05 CEST Vasco Almeida wrote: > A Seg, 12-09-2016 às 14:23 +0200, Jean-Noël Avila escreveu: > > Not sure this one will be easy to localize. The verb is passed as a > > parameter : see line 366 "list", line 426 "add", line 517 "copy", > > line > > 658 "show", line 816 "merge", line 908 "remove" or line 595 with > > argv[0]. > > > > If all the verbs are real subcommands, then the translators should be > > warned that this is some english twisting, but that they need to > > refer > > to the subcommand on the command line. > > Yes, these verbs are git notes subcommands. I will add a Translators > comment to it explaining so. Or we can unfold that error messages like > > if (!strcmp(subcommand, "add") > die(_("Refusing to add notes in %s (outside of refs/notes/)"), > ref); > elseif ... > > else > die(_("Refusing to %s notes in %s (outside of refs/notes/)"), > subcommand, ref); This would be counter productive to use the inject strings as keys just to test them just after. > > This is more verbose but translations would benefit from it being more > natural. What do we prefer: (1) concise source and a little unnatural > translations or (2) verbose code and natural translations? > > Compare, imaging that English is a target translation language, the > user would read: > "Refusing to do add of notes in /path [...]" (1) > "Refusing do add notes in /path [...]" (2) Having one sentence per action is cumbersome, but avoiding sentence lego is mandatory for proper i18n. How about just adding quotes around the subcommand and warn translators ?
Re: [RFC 0/3] http: avoid repeatedly adding curl easy to curlm
Eric Wong writes: > The key patch here is 3/3 which seems like an obvious fix to > adding the problem of adding a curl easy handle to a curl multi > handle repeatedly. Yeah, sounds like the right thing to do and 2/3 makes it really easy to read the resulting code. > I will investigate those failures in a week or two when I regain > regular computer access. Thanks. Will tentatively queue on 'pu' and wait for updates.
Re: [ANNOUNCE] Git User's Survey 2016
On 13 September 2016 at 22:11, Lars Schneider wrote: >> On 13 Sep 2016, at 17:54, Jakub Narębski wrote: >> On 13 September 2016 at 18:15, David Bainbridge >> wrote: >>> Hi Jakub, >>> >>> You said: >>> P.S. At request I can open a separate channel in survey, with >>> a separate survey URL, so that responses from particular site >>> or organization could be separated out. >>> >>> Please can you open a channel for use by Ericsson? >> >> Sent (privately to David). > > Could you send me a channel for Autodesk, too? Here it is: https://survs.com/survey/c51qiuw394 Please tell me if you want the channel anonymized in survey results (after its closing). -- Jakub Narębski
Re: [PATCH 01/16] t1007: factor out repeated setup
On Mon, Sep 12, 2016 at 8:23 PM, Jeff King wrote: > We have a series of 3 CRLF tests that do exactly the same > (long) setup sequence. Let's pull it out into a common setup > test, which is shorter, more efficient, and will make it > easier to add new tests. > > Note that we don't have to worry about cleaning up any of > the setup which was previously per-test; we call pop_repo > after the CRLF tests, which cleans up everything. > > Signed-off-by: Jeff King This makes sense and looks good, Thanks, Reviewed-by: Stefan Beller
Left with empty files after "git stash pop" when system hung
I have used "git stash --include-untracked", checked out another branch, went back, and "git stash pop"ed the changes. Then my system crashed/hung (music that was playing was repeated in a loop). I have waited for some minutes, and then turned it off. Afterwards, the repository in question was in a state where all files contained in the stash were empty. "git status" looked good on first sight: all the untracked and modified files were listed there; but they were empty. % git fsck --lost-found error: object file .git/objects/04/1e659b5dbfd3f0be351a782b54743692875aec is empty error: object file .git/objects/04/1e659b5dbfd3f0be351a782b54743692875aec is empty fatal: loose object 041e659b5dbfd3f0be351a782b54743692875aec (stored in .git/objects/04/1e659b5dbfd3f0be351a782b54743692875aec) is corrupt % find .git/objects -size 0|wc -l 12 I would have assumed that the "stash pop" operation would be "atomic", i.e. it should not remove the stash object before other objects have been written successfully. The filesystem in question is ext4, and I am using Arch Linux. I have removed all empty files in .git/objects and tried to find the previous stash with `gitk --all $( git fsck | awk '{print $3}' )` then, but it appears to have disappeared. Please CC me in replies. Cheers, Daniel. -- http://daniel.hahler.de/ signature.asc Description: OpenPGP digital signature
Re: [PATCH 06/16] diff: always try to set up the repository
On Mon, Sep 12, 2016 at 8:23 PM, Jeff King wrote: > 2. If you're in a subdirectory of a repository, then we > still try to read ".git/config", but it generally > doesn't exist. So "diff --no-index" there does not > respect repo config. Nit: So IIUC your cover letter even this /used/ to work but broke only recently? So I feel like the message is a bit misleading (i.e. you argue for a change in behavior instead of calling it a bug fix for a regression. I think a bug fix for a regression is harder to revert as compared to a "new" behavior) I agree on the code, though.
Re: [PATCH v2 09/14] i18n: notes: mark error messages for translation
Jean-Noël AVILA writes: >> Yes, these verbs are git notes subcommands > > Having one sentence per action is cumbersome, but avoiding sentence lego is > mandatory for proper i18n. How about just adding quotes around the > subcommand > and warn translators ? I think that is a sensible way to go. I do not think it adds value to "translate" the action names that needs to be typed verbatim in the message.
Re: [PATCH v7 10/10] convert: add filter..process option
> On 10 Sep 2016, at 17:40, Torsten Bögershausen wrote: > > [] > > One general question up here, more comments inline. > The current order for a clean-filter is like this, I removed the error > handling: > > int convert_to_git() > { > ret |= apply_filter(path, src, len, -1, dst, filter); > if (ret && dst) { > src = dst->buf; > len = dst->len; > } > ret |= crlf_to_git(path, src, len, dst, ca.crlf_action, checksafe); > return ret | ident_to_git(path, src, len, dst, ca.ident); > } > > The first step is the clean filter, the CRLF-LF conversion (if needed), > then ident. > The current implementation streams the whole file content to the filter, > (STDIN of the filter) and reads back STDOUT from the filter into a STRBUF. > This is to use the UNIX-like STDIN--STDOUT method for writing a filter. > > However, int would_convert_to_git_filter_fd() and convert_to_git_filter_fd() > offer a sort of short-cut: > The filter reads from the file directly, and the output of the filter is > read into a STRBUF. Are you sure? As far as I understand the code the filter does not read from the file in any case today. The functions would_convert_to_git_filter_fd() and convert_to_git_filter_fd() just avoid avoid mapping the file in Git. The content is still streamed via pipes: https://github.com/git/git/commit/9035d75a2be9d80d82676504d69553245017f6d4 > It looks as if the multi-filter approach can use this in a similar way: > Give the pathname to the filter, the filter opens the file for reading > and stream the result via the pkt-line protocol into Git. > This needs some more changes, and may be very well go into a separate patch > series. (and should). > > What I am asking for: > When a multi-filter is used, the content is handled to the filter via > pkt-line, > and the result is given to Git via pkt-line ? > Nothing wrong with it, I just wonder, if it should be mentioned somewhere. That is most certainly a good idea and the main reason I added "capabilities" to the protocol. Joey Hess worked on this topic (not merged, yet) and I would like to make this available to the long-running filter protocol as soon as the feature is available: http://public-inbox.org/git/1468277112-9909-1-git-send-email-jo...@joeyh.name/ >> +sub packet_read { >> +my $buffer; >> +my $bytes_read = read STDIN, $buffer, 4; >> +if ( $bytes_read == 0 ) { >> + >> +# EOF - Git stopped talking to us! >> +exit(); >> +} >> +elsif ( $bytes_read != 4 ) { >> +die "invalid packet size '$bytes_read' field"; >> +} > > This is half-kosher, I would say, > (And I really. really would like to see an implementation in C ;-) Would you be willing to contribute a patch? :-) > A read function may look like this: > > ret = read(0, &buffer, 4); > if (ret < 0) { > /* Error, nothing we can do */ > exit(1); > } else if (ret == 0) { > /* EOF */ > exit(0); > } else if (ret < 4) { > /* > * Go and read more, until we have 4 bytes or EOF or Error */ > } else { > /* Good case, see below */ > } I see. However, my intention was to provide an absolute minimal example to teach a reader how the protocol works. I consider all proper error handling an exercise for the reader ;-) >> +#define CAP_CLEAN(1u<<0) >> +#define CAP_SMUDGE (1u<<1) > > Is CAP_ too generic, and GIT_FILTER_CAP (or so) less calling for trouble ? I had something like that but Junio suggested these names in V4: http://public-inbox.org/git/xmqq8twd8uld@gitster.mtv.corp.google.com/ >> + >> +err = (strlen(filter_type) > PKTLINE_DATA_MAXLEN); > > Extra () needed ? > More () in the code... I thought it might improve readability, but I will remove them if you think this would be more consistent with existing Git code. Thanks, Lars
Re: [PATCH v7 04/10] pkt-line: add packet_flush_gently()
> On 13 Sep 2016, at 00:30, Junio C Hamano wrote: > > larsxschnei...@gmail.com writes: > >> From: Lars Schneider >> >> packet_flush() would die in case of a write error even though for some >> callers an error would be acceptable. Add packet_flush_gently() which >> writes a pkt-line flush packet and returns `0` for success and `-1` for >> failure. >> ... >> +int packet_flush_gently(int fd) >> +{ >> +packet_trace("", 4, 1); >> +if (write_in_full(fd, "", 4) == 4) >> +return 0; >> +error("flush packet write failed"); >> +return -1; > > It is more idiomatic to do > > return error(...); > > but more importantly, does the caller even want an error message > unconditionally printed here? > > I suspect that it is a strong sign that the caller wants to be in > control of when and what error message is produced; otherwise it > wouldn't be calling the _gently() variant, no? Agreed! Thanks, Lars
Re: [PATCH 16/16] init: reset cached config when entering new repo
I have reviewed all patches though I am no expert on the init routines. They all look good except for the one nit I noted for the commit message in patch 6. With that, the whole series is: Reviewed-by: Stefan Beller Thanks!
Re: [PATCH 06/16] diff: always try to set up the repository
On Tue, Sep 13, 2016 at 03:00:17PM -0700, Stefan Beller wrote: > On Mon, Sep 12, 2016 at 8:23 PM, Jeff King wrote: > > > 2. If you're in a subdirectory of a repository, then we > > still try to read ".git/config", but it generally > > doesn't exist. So "diff --no-index" there does not > > respect repo config. > > Nit: > So IIUC your cover letter even this /used/ to work but > broke only recently? So I feel like the message is a bit > misleading (i.e. you argue for a change in behavior instead of > calling it a bug fix for a regression. I think a bug fix for a regression > is harder to revert as compared to a "new" behavior) No, this has always been broken. What broke recently is the tests covered in patch 14 related to git-init. IOW, we have always done this "blind read" of .git/config, ever since the early days. It was always wrong, but was mostly overlooked because it worked often enough and usually didn't cause other problems. But because of the caching and lazy-reading done by git_config() these days (and get_shared_repository() which builds on it), you can get quite confusing and buggy effects if you lazy-read at the wrong time. -Peff
Re: [PATCH v3] checkout: eliminate unnecessary merge for trivial checkout
Ben Peart writes: > +static int needs_working_tree_merge(const struct checkout_opts *opts, > + const struct branch_info *old, > + const struct branch_info *new) > +{ > +... > +} I do not think I need to repeat the same remarks on the conditions in this helper, which hasn't changed since v2. Many "comments" in the code do not explain why skipping is justified, or what they claim to check looks to me just plain wrong. For example, there is /* * If we're not creating a new branch, by definition we're changing * the existing one so need to do the merge */ if (!opts->new_branch) return 1; but "git checkout" (no other argument) hits this condition. It disables the most trivial optimization opportunity, because we are not "creating". "By definition, we're changing"? Really? Not quite. If you disable this bogus check, "git checkout" (no other argument) would be allowed to skip the merge_working_tree(), and that in turn reveals another case that the helper is not checking when unpack_trees() MUST be called. Note: namely, when sparse checkout is in effect, switching from HEAD to HEAD can nuke existing working tree files outside the sparse pattern -- YUCK! See penultimate test in t1011 for an example. This yuckiness is not your fault, but needs_working_tree_merge() logic you added needs to refrain from skipping unpack_trees() call when sparse thing is in effect. I'd expect "git checkout -b foo" instead of "git checkout" (no other argument) would fail to honor the sparse thing and reveal this bug, because the above bogus "!opts->new_branch" check will not protect you for that case. In other words, these random series of "if (...) return 1" are bugs hiding other real bugs and we need to reason about which ones are bugs that are hiding what other bugs that are not covered by this function. As Peff said earlier for v1, this is still an unreadable mess. We need to figure out a way to make sure we are skipping on the right condition and not accidentally hiding a bug of failing to check the right condition. I offhand do not have a good suggestion on this; sorry. > static int merge_working_tree(const struct checkout_opts *opts, > struct branch_info *old, > struct branch_info *new, > int *writeout_error) > { > + /* > + * Optimize the performance of "git checkout -b foo" by avoiding > + * the expensive merge, index and working directory updates if they > + * are not needed. > + */ > + if (!needs_working_tree_merge(opts, old, new)) > + return 0; > + > int ret; > struct lock_file *lock_file = xcalloc(1, sizeof(struct lock_file)); With the change you made at the beginning of this function, it no longer compiles with -Wdecl-after-stmt, but that is the smallest of the problems. It is a small step in the right direction to move the call to the helper from the caller to this function, but it is a bit too small. Notice that the lines after the above context look like this: hold_locked_index(lock_file, 1); if (read_cache_preload(NULL) < 0) return error(_("index file corrupt")); resolve_undo_clear(); if (opts->force) { ret = reset_tree(new->commit->tree, opts, 1, writeout_error); if (ret) return ret; } else { struct tree_desc trees[2]; ... I would have expected that the check goes inside the "else" thing that actually does a two-tree merge, and the helper loses the check with opts->force, at least. That would still be a change smaller than desired, but at least a meaningful improvement compared to the previous one. As I have already pointed out, in the "else" clause there is a check "is the index free of conflicted entries? if so error out", and that must be honored in !opt->force case, no matter what your needs_working_tree_merge() says. I also was hoping that you would notice, when you were told about the unmerged check, by reading the remainder of the merge_working_tree(), that we need to call show_local_changes() when we are not doing force and when we are not quiet---returning early like the above patch will never be able to call that one downstream in the function. Regardless of what the actual checks end up to be, the right place to do this "optimization" would look more like: builtin/checkout.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index 2b50a49..a6b9e17 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -508,14 +508,19 @@ static int merge_working_tree(const struct checkout_opts *opts, topts.dir->flags |= DIR_SHOW_IGNORED; setup_standard_excludes(topts.dir); } + + if ( we kno
Re: [RFC/PATCH 01/17] diff: move line ending check into emit_hunk_header
On Tue, Sep 13, 2016 at 7:42 AM, René Scharfe wrote: >> >> strbuf_add(&msgbuf, line + len, org_len - len); >> + if (line[org_len - 1] != '\n') >> + strbuf_addch(&msgbuf, '\n'); >> + > > > Using strbuf_complete_line() would be nicer. That makes sense! Thanks, Stefan
Re: [PATCH v7 04/10] pkt-line: add packet_flush_gently()
Lars Schneider writes: >> On 13 Sep 2016, at 00:30, Junio C Hamano wrote: >> >> larsxschnei...@gmail.com writes: >> >>> From: Lars Schneider >>> >>> packet_flush() would die in case of a write error even though for some >>> callers an error would be acceptable. Add packet_flush_gently() which >>> writes a pkt-line flush packet and returns `0` for success and `-1` for >>> failure. >>> ... >>> +int packet_flush_gently(int fd) >>> +{ >>> + packet_trace("", 4, 1); >>> + if (write_in_full(fd, "", 4) == 4) >>> + return 0; >>> + error("flush packet write failed"); >>> + return -1; >> >> It is more idiomatic to do >> >> return error(...); >> >> but more importantly, does the caller even want an error message >> unconditionally printed here? >> >> I suspect that it is a strong sign that the caller wants to be in >> control of when and what error message is produced; otherwise it >> wouldn't be calling the _gently() variant, no? > > Agreed! I am also OK with the current form, too. Those who need to enhance it to packet_flush_gently(int fd, int quiet) can come later.
Re: [RFC/PATCH 05/17] diff.c: emit_line_0 can handle no color setting
Stefan Beller writes: > In a later patch, I want to propose an option to detect&color > moved lines in a diff, which cannot be done in a one-pass over > the diff. Instead we need to go over the whole diff twice, > because we cannot detect the first line of the two corresponding > lines (+ and -) that got moved. > > So to prepare the diff machinery for two pass algorithms > (i.e. buffer it all up and then operate on the result), > move all emissions to places, such that the only emitting > function is emit_line_0. > > In later patches we may pass lines that are not colored to > the central function emit_line_0, so we > need to emit the color only when it is non-NULL. Explained this way, a reader would find that this step is here before the underlying code is ready--we are not even buffering at this step yet. But that is OK. It used to be that passing "" as set/reset was the way to get a --no-color output. Now you can pass NULL instead of empty strings. That would be an alternative explanation why this is an acceptable change (as your later step probably has a good reason why it cannot pass "" instead of NULL). > > Signed-off-by: Stefan Beller > --- > diff.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/diff.c b/diff.c > index b6a40ae..5d57130 100644 > --- a/diff.c > +++ b/diff.c > @@ -473,11 +473,13 @@ static void emit_line_0(struct diff_options *o, const > char *set, const char *res > } > > if (len || !nofirst) { > - fputs(set, file); > + if (set) > + fputs(set, file); > if (!nofirst) > fputc(first, file); > fwrite(line, len, 1, file); > - fputs(reset, file); > + if (reset) > + fputs(reset, file); > } > if (has_trailing_carriage_return) > fputc('\r', file);
Re: [RFC/PATCH 06/17] diff.c: convert fn_out_consume to use emit_line_*
Stefan Beller writes: > In a later patch, I want to propose an option to detect&color > moved lines in a diff, which cannot be done in a one-pass over > the diff. Instead we need to go over the whole diff twice, > because we cannot detect the first line of the two corresponding > lines (+ and -) that got moved. > > So to prepare the diff machinery for two pass algorithms > (i.e. buffer it all up and then operate on the result), > move all emissions to places, such that the only emitting > function is emit_line_0. > > This covers the remaining parts of fn_out_consume. name_x_tab are colored as before, which you are already aware of and we'd need to find a way to handle, but other than that, this is a no-op conversion, getting us closer to the goal of making everything go through a single funnel. > name_a_tab = strchr(ecbdata->label_path[0], ' ') ? "\t" : ""; > name_b_tab = strchr(ecbdata->label_path[1], ' ') ? "\t" : ""; > - > - fprintf(o->file, "%s%s--- %s%s%s\n", > - line_prefix, meta, ecbdata->label_path[0], reset, > name_a_tab); > - fprintf(o->file, "%s%s+++ %s%s%s\n", > - line_prefix, meta, ecbdata->label_path[1], reset, > name_b_tab); > + emit_line_fmt(o, meta, reset, "--- %s%s\n", > + ecbdata->label_path[0], name_a_tab); > + emit_line_fmt(o, meta, reset, "+++ %s%s\n", > + ecbdata->label_path[1], name_b_tab); > ecbdata->label_path[0] = ecbdata->label_path[1] = NULL; > }
Re: [RFC/PATCH 10/17] submodule.c: convert show_submodule_summary to use emit_line_fmt
Stefan Beller writes: > In a later patch, I want to propose an option to detect&color > moved lines in a diff, which cannot be done in a one-pass over > the diff. Instead we need to go over the whole diff twice, > because we cannot detect the first line of the two corresponding > lines (+ and -) that got moved. > > So to prepare the diff machinery for two pass algorithms > (i.e. buffer it all up and then operate on the result), > move all emissions to places, such that the only emitting > function is emit_line_0. > > This prepares the code for submodules to go through the > emit_line_0 function. > > Signed-off-by: Stefan Beller > --- I wonder how this interacts with the jk/diff-submodule-diff-inline topic by Jacob that has graduated recently to 'master'. IIRC, it just lets a separate "git diff" instance that is spawned in the submodule directory emit its findings to the output of the driving "git diff" in the superproject.
Re: [RFC/PATCH 16/17] diff: buffer output in emit_line_0
Stefan Beller writes: > +struct line_emission { > + const char *set; > + const char *line; > + const char *ws; > + const char *reset; > + int first; > + int len; > + int whitespace_check; > + unsigned ws_rule; > + int has_trailing_carriage_return; > + int has_trailing_newline; > +}; It is somewhat strange to see whitespace things are per-line here. I'd understand it if it were per-path, though.
Re: [PATCH] pathspec: removed unnecessary function prototypes
Jeff King writes: > On Tue, Sep 13, 2016 at 09:52:51AM -0700, Brandon Williams wrote: > >> removed function prototypes from pathspec.h which don't have a >> corresponding implementation. > > I'm always curious of the "why" in cases like this. Did we forget to add > them? Did they get renamed? Did they go away? > > Looks like the latter; 5a76aff (add: convert to use parse_pathspec, > 2013-07-14) just forgot to remove them. Thanks for digging.
Re: [RFC/PATCH 10/17] submodule.c: convert show_submodule_summary to use emit_line_fmt
On Tue, Sep 13, 2016 at 4:02 PM, Junio C Hamano wrote: > Stefan Beller writes: > >> In a later patch, I want to propose an option to detect&color >> moved lines in a diff, which cannot be done in a one-pass over >> the diff. Instead we need to go over the whole diff twice, >> because we cannot detect the first line of the two corresponding >> lines (+ and -) that got moved. >> >> So to prepare the diff machinery for two pass algorithms >> (i.e. buffer it all up and then operate on the result), >> move all emissions to places, such that the only emitting >> function is emit_line_0. >> >> This prepares the code for submodules to go through the >> emit_line_0 function. >> >> Signed-off-by: Stefan Beller >> --- > > I wonder how this interacts with the jk/diff-submodule-diff-inline > topic by Jacob that has graduated recently to 'master'. IIRC, it > just lets a separate "git diff" instance that is spawned in the > submodule directory emit its findings to the output of the driving > "git diff" in the superproject. > The easiest way to find out is to merge HEAD^ of this patch series (i.e. "diff: buffer output in emit_line_0") with whatever we suspect can cause breakage for the goal of channeling everything though emit_line_* functions. Looking at that series, I think I'll have to redo this (maybe even including sb/diff-cleanup, to have it all in one series) to capture all output there. I suspected a breakage, but as the patch series grew larger and larger, I first wanted to get into a working state before paying attention to solving conflicts as resolving conflicts is easier when I know where this series is headed. Thanks! Stefan
Re: [PATCH] unpack-trees: pass checkout state explicitly to check_updates()
René Scharfe writes: > Add a parameter for the struct checkout variable to check_updates() > instead of using a static global variable. Passing it explicitly makes > object ownership and usage more easily apparent. And we get rid of a > static variable; those can be problematic in library-like code. > ... > diff --git a/unpack-trees.c b/unpack-trees.c > index 11c37fb..74d6dd4 100644 > --- a/unpack-trees.c > +++ b/unpack-trees.c > @@ -218,8 +218,8 @@ static void unlink_entry(const struct cache_entry *ce) > schedule_dir_for_removal(ce->name, ce_namelen(ce)); > } > > -static struct checkout state; > ... > @@ -1094,6 +1094,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, > struct unpack_trees_options > int i, ret; > static struct cache_entry *dfc; > struct exclude_list el; > + struct checkout state; Does the distinction between this thing in BSS implicitly cleared and the new one on stack that does not seem to have any initialization matter? ... goes and looks ... OK, after this hunk we clear and set up everything in state, so there is no difference in behaviour. Just we got rid of an unnecessary file-scope global. Nice. Thanks. > if (len > MAX_UNPACK_TREES) > die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES); > @@ -1239,7 +1240,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, > struct unpack_trees_options > } > > o->src_index = NULL; > - ret = check_updates(o) ? (-2) : 0; > + ret = check_updates(o, &state) ? (-2) : 0; > if (o->dst_index) { > if (!ret) { > if (!o->result.cache_tree)
Re: [RFC/PATCH 16/17] diff: buffer output in emit_line_0
On Tue, Sep 13, 2016 at 4:06 PM, Junio C Hamano wrote: > Stefan Beller writes: > >> +struct line_emission { >> + const char *set; >> + const char *line; >> + const char *ws; >> + const char *reset; >> + int first; >> + int len; >> + int whitespace_check; >> + unsigned ws_rule; >> + int has_trailing_carriage_return; >> + int has_trailing_newline; >> +}; > > It is somewhat strange to see whitespace things are per-line here. > I'd understand it if it were per-path, though. Yeah we have to have it at least per path as that is the granularity the user can configure it. So would we rather want to keep the ecbdata around for each file pair and just reference that? I thought we deliberately want to avoid ecbdata, so maybe we rather want to have another struct that keeps path related information around (pointer to the blob and white space information). Thanks, Stefan
Re: [RFC/PATCH 16/17] diff: buffer output in emit_line_0
Stefan Beller writes: > So would we rather want to keep the ecbdata around for each file pair and > just reference that? I thought we deliberately want to avoid ecbdata, so maybe > we rather want to have another struct that keeps path related information > around (pointer to the blob and white space information). I would expect that there would be two structs, one per path "struct buffered_patch" that has the per-path thing, and another per line "struct buffered_patch_line" that describes what each line is, and has a pointer to the former.
[RFC 0/1] de-quote quoted-strings in mailinfo
This is my first 'big' C patch, so first an RFC. This patch implements RFC2822 dequoting of quoted-pairs in quoted strings, which was not done yet. This means removing the "\" as escape character from header fields, but only quoted strings, and comments (text between braces). According to the RFC, comments can also appear in square brackets in the e-mail domain, but that has not been implemented. In fact, just like other functions, it just looks at the whole header line. Please let me know what you think. Kevin Daudt (1): mailinfo: de-quote quoted-pair in header fields mailinfo.c | 46 ++ t/t5100-mailinfo.sh| 5 + t/t5100/quoted-pair.expect | 5 + t/t5100/quoted-pair.in | 9 + t/t5100/quoted-pair.info | 5 + 5 files changed, 70 insertions(+) create mode 100644 t/t5100/quoted-pair.expect create mode 100644 t/t5100/quoted-pair.in create mode 100644 t/t5100/quoted-pair.info -- 2.10.0.rc2
Re: [RFC/PATCH 16/17] diff: buffer output in emit_line_0
On Tue, Sep 13, 2016 at 4:32 PM, Junio C Hamano wrote: > Stefan Beller writes: > >> So would we rather want to keep the ecbdata around for each file pair and >> just reference that? I thought we deliberately want to avoid ecbdata, so >> maybe >> we rather want to have another struct that keeps path related information >> around (pointer to the blob and white space information). > > I would expect that there would be two structs, one per path > "struct buffered_patch" that has the per-path thing, and another per > line "struct buffered_patch_line" that describes what each line is, > and has a pointer to the former. > Heh, I was trying to come up with a clever thing to save that pointer, as we would need to have that pointer once per line, so in large patches that would save a bit of space, but probably I should not try to be too smart about it. So I'd split up the struct line_emission into the two proposed buffered_patch_line as well as buffered_patch. However the naming is a bit off than I would expect. Historically you had one patch per file, so it was natural to name a change of multiple files a "patchset" (c.f. a commit in Gerrit is called "patchset"/revision) Today as Git is quite successful, one "patch" is easily understood as the equivalent of one patch, i.e. what format-patch produced. So I'd prefer to go with buffer_filepair and buffer_line maybe?
[RFC 0/1] mailinfo: de-quote quoted-pair in header fields
rfc2822 has provisions for quoted strings in structured header fields, but also allows for escaping these with so-called quoted-pairs. git currently does not do anything with this at all, and verbatim takes over the field body. Make sure to properly dequote these quoted-strings and comments. Signed-off-by: Kevin Daudt --- mailinfo.c | 46 ++ t/t5100-mailinfo.sh| 5 + t/t5100/quoted-pair.expect | 5 + t/t5100/quoted-pair.in | 9 + t/t5100/quoted-pair.info | 5 + 5 files changed, 70 insertions(+) create mode 100644 t/t5100/quoted-pair.expect create mode 100644 t/t5100/quoted-pair.in create mode 100644 t/t5100/quoted-pair.info diff --git a/mailinfo.c b/mailinfo.c index e19abe3..3b7ae8a 100644 --- a/mailinfo.c +++ b/mailinfo.c @@ -445,6 +445,51 @@ static void decode_header(struct mailinfo *mi, struct strbuf *it) mi->input_error = -1; } +static int unescape_quoted_pair(struct mailinfo *mi, struct strbuf *line) +{ + struct strbuf outbuf = STRBUF_INIT; + const char *in = line->buf; + int c, skip=0; + char escape_context=0; + + while ((c = *in++) != 0) { + if (!skip) { + switch (c) { + case '"': + if (!escape_context) + escape_context = '"'; + else if (escape_context == '"') + escape_context = 0; + break; + case '\\': + if (escape_context) { + skip = 1; + continue; + } + break; + case '(': + if (!escape_context) + escape_context = '('; + break; + case ')': + if (escape_context == '(') + escape_context = 0; + break; + } + } else { + skip = 0; + } + + strbuf_addch(&outbuf, c); + } + + strbuf_reset(line); + strbuf_addbuf(line, &outbuf); + + return 0; + +} + static int check_header(struct mailinfo *mi, const struct strbuf *line, struct strbuf *hdr_data[], int overwrite) @@ -461,6 +506,7 @@ static int check_header(struct mailinfo *mi, */ strbuf_add(&sb, line->buf + len + 2, line->len - len - 2); decode_header(mi, &sb); + unescape_quoted_pair(mi, &sb); handle_header(&hdr_data[i], &sb); ret = 1; goto check_header_out; diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh index 1a5a546..2be61bf 100755 --- a/t/t5100-mailinfo.sh +++ b/t/t5100-mailinfo.sh @@ -142,4 +142,9 @@ test_expect_success 'mailinfo unescapes with --mboxrd' ' test_cmp expect mboxrd/msg ' +test_expect_success 'mailinfo unescapes rfc2822 quoted-pair' ' +git mailinfo /dev/null /dev/null <"$TEST_DIRECTORY"/t5100/quoted-pair.in >"$TEST_DIRECTORY"/t5100/quoted-pair.info && +test_cmp "$TEST_DIRECTORY"/t5100/quoted-pair.expect "$TEST_DIRECTORY"/t5100/quoted-pair.info +' + test_done diff --git a/t/t5100/quoted-pair.expect b/t/t5100/quoted-pair.expect new file mode 100644 index 000..9fe72e9 --- /dev/null +++ b/t/t5100/quoted-pair.expect @@ -0,0 +1,5 @@ +Author: "Author "The Author" Name" +Email: someb...@example.com +Subject: testing quoted-pair +Date: Sun, 25 May 2008 00:38:18 -0700 + diff --git a/t/t5100/quoted-pair.in b/t/t5100/quoted-pair.in new file mode 100644 index 000..e2e627a --- /dev/null +++ b/t/t5100/quoted-pair.in @@ -0,0 +1,9 @@ +From 1234567890123456789012345678901234567890 Mon Sep 17 00:00:00 2001 +From: "Author \"The Author\" Name" +Date: Sun, 25 May 2008 00:38:18 -0700 +Subject: [PATCH] testing quoted-pair + + + +--- +patch diff --git a/t/t5100/quoted-pair.info b/t/t5100/quoted-pair.info new file mode 100644 index 000..9fe72e9 --- /dev/null +++ b/t/t5100/quoted-pair.info @@ -0,0 +1,5 @@ +Author: "Author "The Author" Name" +Email: someb...@example.com +Subject: testing quoted-pair +Date: Sun, 25 May 2008 00:38:18 -0700 + -- 2.10.0.rc2
[RFC 0/1] de-quote quoted-strings in mailinfo
This is my first 'big' C patch, so first an RFC. This patch implements RFC2822 dequoting of quoted-pairs in quoted strings, which was not done yet. This means removing the "\" as escape character from header fields, but only quoted strings, and comments (text between braces). According to the RFC, comments can also appear in square brackets in the e-mail domain, but that has not been implemented. In fact, just like other functions, it just looks at the whole header line. Please let me know what you think. Kevin Daudt (1): mailinfo: de-quote quoted-pair in header fields mailinfo.c | 46 ++ t/t5100-mailinfo.sh| 5 + t/t5100/quoted-pair.expect | 5 + t/t5100/quoted-pair.in | 9 + t/t5100/quoted-pair.info | 5 + 5 files changed, 70 insertions(+) create mode 100644 t/t5100/quoted-pair.expect create mode 100644 t/t5100/quoted-pair.in create mode 100644 t/t5100/quoted-pair.info -- 2.10.0.rc2
Re: [RFC 0/1] mailinfo: de-quote quoted-pair in header fields
Kevin Daudt writes: > +static int unescape_quoted_pair(struct mailinfo *mi, struct strbuf *line) > +{ > + struct strbuf outbuf = STRBUF_INIT; > + const char *in = line->buf; > + int c, skip=0; > + char escape_context=0; Have SP around '=', i.e. int c, skip = 0; char escape_context = 0; > + while ((c = *in++) != 0) { > + if (!skip) { > + switch (c) { > + case '"': > +... > + break; > + } > + } else { > + skip = 0; > + } > + > + strbuf_addch(&outbuf, c); > + } It often is easier to read if smaller of the two are in the if part and the larger in else part. Also your switch/case is indented one level too deep. I.e. while (...) { if (skip) { skip = 0; } else { switch (c) { case '"': do this; ... } } strbuf_addch(...); } I found the variable name "skip" a bit hard to reason about. What it does is to signal the next round of the processing that we have seen a single-byte quote and it should keep the byte it will get, no matter what its value is. It is "skipping" the conditional processing, but I'd imagine most people would consider it is "keeping the byte". > @@ -461,6 +506,7 @@ static int check_header(struct mailinfo *mi, >*/ > strbuf_add(&sb, line->buf + len + 2, line->len - len - > 2); > decode_header(mi, &sb); > + unescape_quoted_pair(mi, &sb); > handle_header(&hdr_data[i], &sb); > ret = 1; > goto check_header_out; I wonder why this call is only in here, not on other headers that all call decode_header(). For that matter, I wonder if the call (or the logic of the helper function itself) should go at the end of decode_header(). After all, this is different kind of decoding; the current one knows how to do b/q encoding but forgot about the more traditional quoting done with backslash, and you are teaching the code that the current decoding it does is insufficient and how to handle the one that the original implementors forgot about. Thanks.
Re: [RFC 0/1] mailinfo: de-quote quoted-pair in header fields
On Tue, Sep 13, 2016 at 05:04:45PM -0700, Junio C Hamano wrote: > Kevin Daudt writes: > > > It often is easier to read if smaller of the two are in the if part > and the larger in else part. Also your switch/case is indented one > level too deep. I.e. > Thanks, I've switched the order and fixed indentation. > > I found the variable name "skip" a bit hard to reason about. What > it does is to signal the next round of the processing that we have > seen a single-byte quote and it should keep the byte it will get, no > matter what its value is. It is "skipping" the conditional > processing, but I'd imagine most people would consider it is > "keeping the byte". Yes, I agree and was trying to find a better name. I have renamed it to "take_next_literally", which indicates better what it means. > > > @@ -461,6 +506,7 @@ static int check_header(struct mailinfo *mi, > > */ > > strbuf_add(&sb, line->buf + len + 2, line->len - len - > > 2); > > decode_header(mi, &sb); > > + unescape_quoted_pair(mi, &sb); > > handle_header(&hdr_data[i], &sb); > > ret = 1; > > goto check_header_out; > > I wonder why this call is only in here, not on other headers that > all call decode_header(). For that matter, I wonder if the call (or > the logic of the helper function itself) should go at the end of > decode_header(). After all, this is different kind of decoding; the > current one knows how to do b/q encoding but forgot about the more > traditional quoting done with backslash, and you are teaching the > code that the current decoding it does is insufficient and how to > handle the one that the original implementors forgot about. Makes sense, it should be applied to all headers (I missed the other decode_header calls). I will send a new version later.
Re: [RFC 0/1] mailinfo: de-quote quoted-pair in header fields
On Tue, Sep 13, 2016 at 05:04:45PM -0700, Junio C Hamano wrote: > > @@ -461,6 +506,7 @@ static int check_header(struct mailinfo *mi, > > */ > > strbuf_add(&sb, line->buf + len + 2, line->len - len - > > 2); > > decode_header(mi, &sb); > > + unescape_quoted_pair(mi, &sb); > > handle_header(&hdr_data[i], &sb); > > ret = 1; > > goto check_header_out; > > I wonder why this call is only in here, not on other headers that > all call decode_header(). For that matter, I wonder if the call (or > the logic of the helper function itself) should go at the end of > decode_header(). After all, this is different kind of decoding; the > current one knows how to do b/q encoding but forgot about the more > traditional quoting done with backslash, and you are teaching the > code that the current decoding it does is insufficient and how to > handle the one that the original implementors forgot about. It has been a while since I looked at rfc2822, but aren't the quoting and syntax rules different for addresses versus other headers? We would not want to dequote a Subject header, I think. -Peff
Re: [RFC 0/1] mailinfo: de-quote quoted-pair in header fields
On Wed, Sep 14, 2016 at 01:46:12AM +0200, Kevin Daudt wrote: > diff --git a/t/t5100/quoted-pair.expect b/t/t5100/quoted-pair.expect > new file mode 100644 > index 000..9fe72e9 > --- /dev/null > +++ b/t/t5100/quoted-pair.expect > @@ -0,0 +1,5 @@ > +Author: "Author "The Author" Name" > +Email: someb...@example.com > +Subject: testing quoted-pair > +Date: Sun, 25 May 2008 00:38:18 -0700 So obviously this is much better than including the backslashed quotes. But I have to wonder why the first line is not: Author: Author "The Author" Name Who is responsible for stripping out the other quotes? I know that they _do_ get stripped out even in the current code, but it is not clear to me if that is intentional or an accident. In Git's world-view (e.g., in commit headers), an ident name continues until we get to the "<" of the email (or a "\n" terminates the header line completely). So if mailinfo is converting rfc2822 headers into Git ident, I'd expect it to fully remove any quotes that are not intended to be in the name, and everything after "Author: " up to the newline would become the name. It's entirely possible I'm missing something subtle about the design of mailinfo, though. -Peff
Re: [RFC 0/1] mailinfo: de-quote quoted-pair in header fields
Jeff King writes: > It has been a while since I looked at rfc2822, but aren't the quoting > and syntax rules different for addresses versus other headers? We would > not want to dequote a Subject header, I think. You're absolutely right. RFC2822 does not quite _want_ to dequote anything. As you pointed out in a separate message, we are the one who want to strip out "" quoting when mailinfo says Author: "Jeff King" to its standard output (aka "info"), and turn it into GIT_AUTHOR_NAME='Jeff King' and do so ONLY for the author name. So I would think it is the responsibility of the one that reads the "info" file that is produced by mailinfo to dequote the backslash thing if the mailinfo gave us Author: "Jeff \"Peff\" King"
Re: [PATCH v3] checkout: eliminate unnecessary merge for trivial checkout
Sorry for bothering, why not introduce a brand new option like git checkout -b foo --skip-worktree-merge for such rare optimization use case? On Wed, Sep 14, 2016 at 12:34 AM, Junio C Hamano wrote: > Ben Peart writes: > >> +static int needs_working_tree_merge(const struct checkout_opts *opts, >> + const struct branch_info *old, >> + const struct branch_info *new) >> +{ >> +... >> +} > > I do not think I need to repeat the same remarks on the conditions > in this helper, which hasn't changed since v2. Many "comments" in > the code do not explain why skipping is justified, or what they > claim to check looks to me just plain wrong. > > For example, there is > >/* > * If we're not creating a new branch, by definition we're changing > * the existing one so need to do the merge > */ >if (!opts->new_branch) >return 1; > > but "git checkout" (no other argument) hits this condition. It > disables the most trivial optimization opportunity, because we are > not "creating". > > "By definition, we're changing"? Really? Not quite. > > If you disable this bogus check, "git checkout" (no other argument) > would be allowed to skip the merge_working_tree(), and that in turn > reveals another case that the helper is not checking when > unpack_trees() MUST be called. > > Note: namely, when sparse checkout is in effect, switching from > HEAD to HEAD can nuke existing working tree files outside the > sparse pattern -- YUCK! See penultimate test in t1011 for > an example. > > This yuckiness is not your fault, but needs_working_tree_merge() > logic you added needs to refrain from skipping unpack_trees() call > when sparse thing is in effect. I'd expect "git checkout -b foo" > instead of "git checkout" (no other argument) would fail to honor > the sparse thing and reveal this bug, because the above bogus > "!opts->new_branch" check will not protect you for that case. > > In other words, these random series of "if (...) return 1" are bugs > hiding other real bugs and we need to reason about which ones are > bugs that are hiding what other bugs that are not covered by this > function. As Peff said earlier for v1, this is still an unreadable > mess. We need to figure out a way to make sure we are skipping on > the right condition and not accidentally hiding a bug of failing to > check the right condition. I offhand do not have a good suggestion > on this; sorry. > >> static int merge_working_tree(const struct checkout_opts *opts, >> struct branch_info *old, >> struct branch_info *new, >> int *writeout_error) >> { >> + /* >> + * Optimize the performance of "git checkout -b foo" by avoiding >> + * the expensive merge, index and working directory updates if they >> + * are not needed. >> + */ >> + if (!needs_working_tree_merge(opts, old, new)) >> + return 0; >> + >> int ret; >> struct lock_file *lock_file = xcalloc(1, sizeof(struct lock_file)); > > With the change you made at the beginning of this function, it no > longer compiles with -Wdecl-after-stmt, but that is the smallest of > the problems. > > It is a small step in the right direction to move the call to the > helper from the caller to this function, but it is a bit too small. > > Notice that the lines after the above context look like this: > > hold_locked_index(lock_file, 1); > if (read_cache_preload(NULL) < 0) > return error(_("index file corrupt")); > > resolve_undo_clear(); > if (opts->force) { > ret = reset_tree(new->commit->tree, opts, 1, writeout_error); > if (ret) > return ret; > } else { > struct tree_desc trees[2]; > ... > > I would have expected that the check goes inside the "else" thing > that actually does a two-tree merge, and the helper loses the check > with opts->force, at least. That would still be a change smaller > than desired, but at least a meaningful improvement compared to the > previous one. As I have already pointed out, in the "else" clause > there is a check "is the index free of conflicted entries? if so > error out", and that must be honored in !opt->force case, no matter > what your needs_working_tree_merge() says. I also was hoping that > you would notice, when you were told about the unmerged check, by > reading the remainder of the merge_working_tree(), that we need to > call show_local_changes() when we are not doing force and when we > are not quiet---returning early like the above patch will never be > able to call that one downstream in the function. > > Regardless of what the actual checks end up to be, the right place > to do this "optimization" would look more like: > > builtin/checkout.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git
[PATCH] vcs-svn/fast_export: fix timestamp fmt specifiers
Two instances of %ld being used for unsigned longs Signed-off-by: Mike Ralphson --- vcs-svn/fast_export.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c index bd0f2c2..97cba39 100644 --- a/vcs-svn/fast_export.c +++ b/vcs-svn/fast_export.c @@ -73,7 +73,7 @@ void fast_export_begin_note(uint32_t revision, const char *author, static int firstnote = 1; size_t loglen = strlen(log); printf("commit %s\n", note_ref); - printf("committer %s <%s@%s> %ld +\n", author, author, "local", timestamp); + printf("committer %s <%s@%s> %lu +\n", author, author, "local", timestamp); printf("data %"PRIuMAX"\n", (uintmax_t)loglen); fwrite(log, loglen, 1, stdout); if (firstnote) { @@ -107,7 +107,7 @@ void fast_export_begin_commit(uint32_t revision, const char *author, } printf("commit %s\n", local_ref); printf("mark :%"PRIu32"\n", revision); - printf("committer %s <%s@%s> %ld +\n", + printf("committer %s <%s@%s> %lu +\n", *author ? author : "nobody", *author ? author : "nobody", *uuid ? uuid : "local", timestamp); -- https://github.com/git/git/pull/293