Re: [GSoC] Info: new blog series of my work on Git GSoC '18

2018-05-02 Thread Johannes Schindelin
Hi Pratik,

On Wed, 2 May 2018, Pratik Karki wrote:

> As promised in my proposal, I've started
> to write a blog series of GSoC '18 with Git. The initial blog is up.
> You can find it here[1]. The initial one is just to get started;
> from the next iterations, I'll start detailing my work towards
> converting rebase to a builtin.
> 
> [1]: https://prertik.github.io/categories/git/

This is awesome! Thanks for doing this,
Dscho


Re: git merge banch w/ different submodule revision

2018-05-02 Thread Middelschulte, Leif
Am Montag, den 30.04.2018, 19:02 +0200 schrieb Heiko Voigt:
> On Thu, Apr 26, 2018 at 03:19:36PM -0700, Stefan Beller wrote:
> > Stefan wrote:
> > > See 
> > > https://github.com/git/git/commit/68d03e4a6e448aa557f52adef92595ac4d6cd4bd
> > > (68d03e4a6e (Implement automatic fast-forward merge for submodules, 
> > > 2010-07-07)
> > > to explain the situation you encounter. (specifically merge_submodule
> > > at the end of the diff)
> > 
> > +cc Heiko, author of that commit.
> 
> In that commit we tried to be very careful about when to merge. I do not understand
> the situation in which the current strategy would be wrong by default.
> 
> We only merge if the following applies:
> 
>  * The changes in the superproject on both sides point forward in the
>    submodule.
> 
>  * One side is contained in the other. Contained from the submodule
>    perspective. Sides from the superproject merge perspective.
> 
> So in case of the mentioned rewind of a submodule: Only one side of the
> 3-way merge would point forward and the merge would fail.
> 
> I can imagine, that in case of a temporary revert of a commit in the
> submodule that you would not want that merged into some other branch.
> But that would be the same without submodules. If you merge a temporary
> revert from another branch you will not get any conflict.
> 
> So maybe someone can explain the use case in which one would get the
> results that seem wrong?
In an ideal world, where there are no regressions between revisions, a
fast-forward is appropriate. However, we might have regressions within
submodules.

So the use case is the following:

Environment:
- We have a base library L that is developed by some team (Team B).
- Another team (Team A) develops a product P based on those libraries using
git-flow.

Case:
The problem occurs when a developer (D) of Team A tries to have a feature
that he developed on a branch accepted by a core developer of P:
If a core developer of P advanced the reference of L within P (linear history),
he might deem D's work insufficient. Not because of the actual work by D, but
because of regressions that snuck into L. The core developer will not be
informed about the mismatching revisions of L.

So it would be nice if there was some kind of switch or at least some trigger.
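
For illustration, here is a minimal, self-contained sketch of the automatic
resolution described above (the repository layout and names are made up, and
it assumes plain git with no special configuration):

-- snip --
#!/bin/sh
# Both superproject branches advance the submodule "lib"; one submodule tip
# contains the other, so "git merge" resolves the gitlink instead of
# producing a conflict for the reviewer to look at.
set -e
tmp=$(mktemp -d) && cd "$tmp"

git init -q lib
git -C lib commit -q --allow-empty -m L1
git -C lib commit -q --allow-empty -m L2
git -C lib commit -q --allow-empty -m L3

git init -q super && cd super
# protocol.file.allow is only needed on newer Git versions
git -c protocol.file.allow=always submodule add "$tmp/lib" lib
l1=$(git -C lib rev-parse HEAD~2)
l2=$(git -C lib rev-parse HEAD~1)
l3=$(git -C lib rev-parse HEAD)
git -C lib checkout -q "$l1"
git commit -qam "super: lib at L1"

git checkout -q -b feature          # developer D advances lib by one step
git -C lib checkout -q "$l2"
git commit -qam "feature: lib at L2"

git checkout -q -                   # core developer advances lib to the tip
git -C lib checkout -q "$l3"
git commit -qam "mainline: lib at L3"

git merge feature                   # succeeds without a conflict: the
                                    # gitlink resolves to L3, which contains L2
-- snap --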

Cheers,

Leif


> 
> Cheers Heiko
> 

Re: [PATCH 2/6] t1406: prepare for the refs code to fail with BUG()

2018-05-02 Thread Johannes Schindelin
Hi,

On Wed, 2 May 2018, Junio C Hamano wrote:

> Duy Nguyen  writes:
> 
> > On Mon, Apr 30, 2018 at 12:17 AM, Johannes Schindelin
> >  wrote:
> >> t1406 specifically verifies that certain code paths fail with a BUG: ...
> >> message.
> >>
> >> In the upcoming commit, we will convert that message to be generated via
> >> BUG() instead of die("BUG: ..."), which implies SIGABRT instead of a
> >> regular exit code.
> >
> > On the other hand, SIGABRT on linux creates core dumps. And on some
> > setup (like mine) core dumps may be redirected to some central place
> > via /proc/sys/kernel/core_pattern. I think systemd does it too but I
> > didn't check.
> >
> > Moving to SIGABRT, when we know it _will_ happen when running the
> > test suite, will accumulate core dumps over time that are not cleaned
> > up by the test suite. Maybe keeping die("BUG: ...") here is a good compromise.
> 
> I do not think it is.  At regular runtime, we _do_ want Git to dump
> core if it triggers a BUG() condition, whose point is to mark
> conditions that should never happen.

Indeed.

> As discussed in this thread, tests that use t/helper/ executables
> that try to trigger the BUG() code path to ensure that these "should
> never happen" conditions are caught do need to deal with it.  If
> dumping core is undesirable, tweaking BUG() implementation so that
> it becomes die("BUG: ...") *ONLY* when the caller knows what it is
> doing (e.g. running t/helper/ commands) is probably a good idea.
> Perhaps GIT_TEST_OPTS can gain one feature "--bug-no-abort" and set
> an environment variable so that implementation of BUG() can notice,
> or something.

I think we can do even better than that. t/helper/*.c could set a global
variable that no other code is supposed to set, to trigger an alternative
to SIGABRT. Something like

-- snip --
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 87066ced62a..5176f9f20ae 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -47,7 +47,9 @@ static struct test_cmd cmds[] = {
 int cmd_main(int argc, const char **argv)
 {
int i;
+   extern int BUG_exit_code;
 
+   BUG_exit_code = 99;
if (argc < 2)
die("I need a test name!");
 
diff --git a/usage.c b/usage.c
index cdd534c9dfc..9c84dccfa97 100644
--- a/usage.c
+++ b/usage.c
@@ -210,6 +210,9 @@ void warning(const char *warn, ...)
va_end(params);
 }
 
+/* Only set this, ever, from t/helper/, when verifying that bugs are
caught. */
+int BUG_exit_code;
+
 static NORETURN void BUG_vfl(const char *file, int line, const char *fmt,
va_list params)
 {
char prefix[256];
@@ -221,6 +224,8 @@ static NORETURN void BUG_vfl(const char *file, int
line, const char *fmt, va_lis
snprintf(prefix, sizeof(prefix), "BUG: ");
 
vreportf(prefix, fmt, params);
+   if (BUG_exit_code)
+   exit(BUG_exit_code);
abort();
 }
 
-- snap --

I'll try to find some time to play with this.

Ciao,
Dscho
> 
> When we are testing normal parts of Git outside t/helper/, we never
> want to hit BUG().  Aborting and dumping core when that happens is
> a desirable outcome.  From that point of view, the idea in 1/6 of
> this patch series to annotate test_must_fail and say "we know this
> one is going to hit BUG()" is a sound one.  The implementation in
> 1/6 to treat SIGABRT as an acceptable failure needs to be replaced
> to instead use the above mechanism you would use to tell BUG() not
> to abort but die with message to arrange that to happen before
> running the git command (most likely something from t/helper/) under
> test_must_fail ok=sigabrt; and then those who regularly break their
> Git being tested (read: us devs) and hit BUG() could instead set the
> environment variable (i.e. internal implementation detail) manually
> in their environment to turn these BUG()s into die("BUG: ...")s while
> testing their early mistakes if they do not want core (of course,
> you could just do "ulimit -c", and that may be a simpler solution to
> your "testing Git contaminates central core depot" issue).


Re: [PATCH v4 10/10] commit-graph.txt: update design document

2018-05-02 Thread Jakub Narebski
Derrick Stolee  writes:

> On 4/30/2018 7:32 PM, Jakub Narebski wrote:
>> Derrick Stolee  writes:
[...]
>>>   - After computing and storing generation numbers, we must make graph
>>> walks aware of generation numbers to gain the performance benefits they
>>> enable. This will mostly be accomplished by swapping a 
>>> commit-date-ordered
>>> priority queue with one ordered by generation number. The following
>>> -  operations are important candidates:
>>> +  operation is an important candidate:
>>>   -- paint_down_to_common()
>>>   - 'log --topo-order'
>>
>> Other possible candidates:
>>
>> - remove_redundant() - see comment in previous patch
>> - still_interesting() - where Git uses date slop to stop walking
>>   too far
>
> remove_redundant() will be included in v5, thanks.

Oh.  Nice.

I'll try to review the new patch in detail soon.

> Instead of "still_interesting()" I'll add "git tag --merged" as the
> candidate to consider, as discussed in [1].
>
> [1] https://public-inbox.org/git/87fu3g67ry@lant.ki.iif.hu/t/#u
>     "branch --contains / tag --merged inconsistency"

All right.  I have mentioned still_interesting() as a hint about where a
possible additional generation-number-based optimization may lurk
(because that is where a heuristic based on dates is used, similarly to
how it was done in this series with paint_down_to_common()).

[...]
>> One important issue left is handling features that change view of
>> project history, and their interaction with commit-graph feature.
>>
>> What would happen, if we turn on the commit-graph feature, generate the
>> commit-graph file, and then:
>>
>>* use a graft file, or remove graft entries, to cut history, remove a
>>  cut, or join two [independent] histories
>>* use the git-replace mechanism to do the same
>>* in a shallow clone, deepen or shorten the clone
>>
>> What would happen if, without re-generating the commit-graph file (assuming
>> that Git wouldn't do it for us), we run some feature that makes use of
>> commit-graph data:
>>
>>- git branch --contains
>>- git tag --contains
>>- git rev-list A..B
>>
>
> The commit-graph is not supported in these scenarios (yet). grafts are
> specifically mentioned in the future work section.
>
> I'm not particularly interested in supporting these features, so they
> are good venues for other contributors to get involved in the
> commit-graph feature. Eventually, they will be blockers to making the
> commit-graph feature a "default" feature. That is when I will pay
> attention to these situations. For now, a user must opt-in to having a
> commit-graph file (and that same user has possibly opted in to these
> history modifying features).

Well, that is a sensible approach.  Get the commit-graph feature into working
condition, and worry about being able to make it on by default later.

Nice to have it clarified.  I'll stop nagging about that, then ;-P

One issue: 'grafts' are mentioned in the future work section of the
technical documentation, but we don't have *any* warning about
commit-graph limitations in the user-facing documentation, that is, the
git-commit-graph(1) manpage.

Best,
-- 
Jakub Narębski


Re: [PATCH] gitweb: Measure offsets against UTF-8 flagged string

2018-05-02 Thread Junio C Hamano
Shin Kojima  writes:

> Offset positions should not be counted by byte length, but by actual
> character length.
> ...
>  # escape tabs (convert tabs to spaces)
>  sub untabify {
> - my $line = shift;
> + my $line = to_utf8(shift);
>  
>   while ((my $pos = index($line, "\t")) != -1) {
>   if (my $count = (8 - ($pos % 8))) {

Some codepaths in the resulting codeflow look even hackier than they
already are.  For example, format_rem_add_lines_pair() calls
untabify() and then feeds its result to esc_html().  The first thing
done in esc_html() is to call to_utf8().  I know that to_utf8()
cheats and leaves the incoming string as-is if it is already UTF-8,
so this may be a safe conversion, but ideally we should be able to
say "function X takes non-UTF8 and works on it", "function Y takes
UTF8 and works on it", and "function Z takes non-UTF8 and gives UTF8
data back" for each functions clearly, not "function W can take
either UTF8 or any other garbage and tries to return UTF8".

Also, does it even "fix" the problem to use to_utf8() here in the
first place?  Untabify is about aligning the character after a HT to a
multiple-of-8 display position, so we'd want to measure the display width,
which is not the same as either byte count or char count.


Re: [PATCH 0/3] Add --progress and --dissociate to git submodule

2018-05-02 Thread Casey Fitzpatrick
Thanks, I was not aware of that option.

On Wed, May 2, 2018 at 12:37 AM, Junio C Hamano  wrote:
> Casey Fitzpatrick  writes:
>
>> These patches add --progress and --dissociate options to git submodule.
>>
>> The --progress option existed beforehand, but only for the update command and
>> it was left undocumented.
>>
>> Both add and update submodule commands supported --reference, but not its 
>> pair
>> option --dissociate which allows for independent clones rather than depending
>> on the reference.
>>
>> This is a resubmission with comments from Stefan Beller and Eric Sunshine
>> addressed.
>
> Readers would really appreciate it if these are prepared with
> format-patch with -v$N option.  Unless they read faster than you
> post patches, they will find a few messages identically titled, all
> unread in their mailbox, and it is not always easy to tell which
> ones are the latest.
>
> Thanks.


[PATCH v2 1/4] test-tool: help verifying BUG() code paths

2018-05-02 Thread Johannes Schindelin
When we call BUG(), we signal via SIGABRT that something bad happened,
dumping cores if so configured. In some setups these coredumps are
redirected to some central place such as /proc/sys/kernel/core_pattern,
which is a good thing.

However, when we try to verify in our test suite that bugs are caught in
certain code paths, we do *not* want to clutter such a central place
with unnecessary coredumps.

So let's special-case the test helpers (which we use to verify such code
paths) so that the BUG() calls will *not* call abort() but exit with a
special-purpose exit code instead.

Helped-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Johannes Schindelin 
---
 t/helper/test-tool.c | 2 ++
 usage.c  | 5 +
 2 files changed, 7 insertions(+)

diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 87066ced62a..5176f9f20ae 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -47,7 +47,9 @@ static struct test_cmd cmds[] = {
 int cmd_main(int argc, const char **argv)
 {
int i;
+   extern int BUG_exit_code;
 
+   BUG_exit_code = 99;
if (argc < 2)
die("I need a test name!");
 
diff --git a/usage.c b/usage.c
index cdd534c9dfc..9c84dccfa97 100644
--- a/usage.c
+++ b/usage.c
@@ -210,6 +210,9 @@ void warning(const char *warn, ...)
va_end(params);
 }
 
+/* Only set this, ever, from t/helper/, when verifying that bugs are caught. */
+int BUG_exit_code;
+
 static NORETURN void BUG_vfl(const char *file, int line, const char *fmt, 
va_list params)
 {
char prefix[256];
@@ -221,6 +224,8 @@ static NORETURN void BUG_vfl(const char *file, int line, 
const char *fmt, va_lis
snprintf(prefix, sizeof(prefix), "BUG: ");
 
vreportf(prefix, fmt, params);
+   if (BUG_exit_code)
+   exit(BUG_exit_code);
abort();
 }
 
-- 
2.17.0.windows.1.36.gdf4ca5fb72a




[PATCH v2 0/4] Finish the conversion from die("BUG: ...") to BUG()

2018-05-02 Thread Johannes Schindelin
The BUG() macro was introduced in this patch series:
https://public-inbox.org/git/20170513032414.mfrwabt4hovuj...@sigill.intra.peff.net

The second patch in that series converted one caller from die("BUG: ")
to use the BUG() macro.

It seems that there was no concrete plan to address the same issue in
the rest of the code base.

This patch series tries to do that.

Note: I would be very surprised if the monster commit 3/4 ("Replace all
die("BUG: ...") calls by BUG() ones") would apply cleanly on `pu` (I
develop this on top of `master`).

For that reason, the commit message contains the precise Unix shell
invocation (GNU sed semantics, not BSD sed ones, because I know that the
Git maintainer as well as the author of the patch introducing BUG() both
use Linux and not macOS or any other platform that would offer a BSD
sed). It should be straightforward to handle merge
conflicts/non-applying patches by simply re-running that command.

Changes since v1:

- Avoided the entire discussion about the previous 2/6 (now dropped)
  that prepared t1406 to handle SIGABRT by side-stepping the issue: the
  ref-store test helper will no longer call abort() in BUG() calls but
  exit with exit code 99 instead.

- As a consequence, I could fold 3/6 into 4/6 (i.e. *not* special-case
  the refs/*.c code but do one wholesale die("BUG: ...") -> BUG()
  conversion).

- Further consequence: 1/6, which added handling for ok=sigabrt to
  test_must_fail, was dropped.

- I also used the opportunity to explain in the commit message of the last
  commit why the localization was dropped from one die(_("BUG: ...")) call.


Johannes Schindelin (4):
  test-tool: help verifying BUG() code paths
  run-command: use BUG() to report bugs, not die()
  Replace all die("BUG: ...") calls by BUG() ones
  Convert remaining die*(BUG) messages

 apply.c  |  4 ++--
 archive-tar.c|  2 +-
 attr.c   | 10 +-
 blame.c  |  2 +-
 builtin/am.c | 20 +--
 builtin/branch.c |  2 +-
 builtin/cat-file.c   |  4 ++--
 builtin/clone.c  |  2 +-
 builtin/commit.c |  2 +-
 builtin/config.c |  2 +-
 builtin/fast-export.c|  2 +-
 builtin/fsck.c   |  2 +-
 builtin/index-pack.c |  4 ++--
 builtin/init-db.c|  2 +-
 builtin/ls-files.c   |  2 +-
 builtin/notes.c  |  8 
 builtin/pack-objects.c   |  4 ++--
 builtin/pull.c   |  2 +-
 builtin/receive-pack.c   |  2 +-
 builtin/replace.c|  2 +-
 builtin/update-index.c   |  2 +-
 bulk-checkin.c   |  2 +-
 color.c  |  4 ++--
 column.c |  2 +-
 config.c | 12 +--
 date.c   |  2 +-
 diff.c   | 12 +--
 dir-iterator.c   |  2 +-
 git-compat-util.h|  2 +-
 grep.c   | 16 +++
 http.c   |  8 
 imap-send.c  |  2 +-
 lockfile.c   |  2 +-
 mailinfo.c   |  2 +-
 merge-recursive.c| 12 +--
 notes-merge.c|  4 ++--
 pack-bitmap-write.c  |  2 +-
 pack-bitmap.c|  6 +++---
 pack-objects.c   |  2 +-
 packfile.c   |  6 +++---
 pathspec.c   | 12 +--
 pkt-line.c   |  2 +-
 prio-queue.c |  2 +-
 ref-filter.c |  4 ++--
 refs.c   | 34 
 refs/files-backend.c | 20 +--
 refs/iterator.c  |  6 +++---
 refs/packed-backend.c| 16 +++
 refs/ref-cache.c |  2 +-
 remote.c |  2 +-
 revision.c   |  4 ++--
 run-command.c| 33 ++-
 setup.c  |  4 ++--
 sha1-lookup.c|  2 +-
 sha1-name.c  |  4 ++--
 shallow.c|  6 +++---
 sigchain.c   |  2 +-
 strbuf.c |  4 ++--
 submodule.c  |  8 
 t/helper/test-example-decorate.c | 16 +++
 t/helper/test-tool.c |  2 ++
 tmp-objdir.c |  2 +-
 trailer.c|  6 +++---
 transport.c  |  4 ++--
 unpack-trees.c   |  2 +-
 usage.c  |  5 +
 vcs-svn/fast_export.c|  6 --
 worktree.c   |  2 +-
 wrapper.c|  4 ++--
 wt-status.c  

[PATCH v2 2/4] run-command: use BUG() to report bugs, not die()

2018-05-02 Thread Johannes Schindelin
The function bug_die(), despite its slightly misleading name, is actually
called unconditionally, and it only reports a bug if the passed-in
parameter `err` is non-zero.

It uses die_errno() to report the bug, to helpfully include the error
message corresponding to `err`.

However, as these messages indicate bugs, we really should use BUG().
And as BUG() is a macro (so that it can report the exact file and line
number), we need to convert bug_die() to a macro as well, instead of only
replacing the die_errno() call by a call to BUG().

While at it, use a name more indicative of the purpose: CHECK_BUG().

Signed-off-by: Johannes Schindelin 
---
 run-command.c | 23 ++-
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/run-command.c b/run-command.c
index 12c94c1dbe5..0ad6f135d5a 100644
--- a/run-command.c
+++ b/run-command.c
@@ -471,15 +471,12 @@ struct atfork_state {
sigset_t old;
 };
 
-#ifndef NO_PTHREADS
-static void bug_die(int err, const char *msg)
-{
-   if (err) {
-   errno = err;
-   die_errno("BUG: %s", msg);
-   }
-}
-#endif
+#define CHECK_BUG(err, msg) \
+   do { \
+   int e = (err); \
+   if (e) \
+   BUG("%s: %s", msg, strerror(e)); \
+   } while(0)
 
 static void atfork_prepare(struct atfork_state *as)
 {
@@ -491,9 +488,9 @@ static void atfork_prepare(struct atfork_state *as)
if (sigprocmask(SIG_SETMASK, &all, &as->old))
die_errno("sigprocmask");
 #else
-   bug_die(pthread_sigmask(SIG_SETMASK, &all, &as->old),
+   CHECK_BUG(pthread_sigmask(SIG_SETMASK, &all, &as->old),
"blocking all signals");
-   bug_die(pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &as->cs),
+   CHECK_BUG(pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &as->cs),
"disabling cancellation");
 #endif
 }
@@ -504,9 +501,9 @@ static void atfork_parent(struct atfork_state *as)
if (sigprocmask(SIG_SETMASK, &as->old, NULL))
die_errno("sigprocmask");
 #else
-   bug_die(pthread_setcancelstate(as->cs, NULL),
+   CHECK_BUG(pthread_setcancelstate(as->cs, NULL),
"re-enabling cancellation");
-   bug_die(pthread_sigmask(SIG_SETMASK, &as->old, NULL),
+   CHECK_BUG(pthread_sigmask(SIG_SETMASK, &as->old, NULL),
"restoring signal mask");
 #endif
 }
-- 
2.17.0.windows.1.36.gdf4ca5fb72a




[PATCH v2 3/4] Replace all die("BUG: ...") calls by BUG() ones

2018-05-02 Thread Johannes Schindelin
In d8193743e08 (usage.c: add BUG() function, 2017-05-12), a new macro
was introduced to use for reporting bugs instead of die(). It was then
subsequently used to convert one single caller in 588a538ae55
(setup_git_env: convert die("BUG") to BUG(), 2017-05-12).

The cover letter of the patch series containing this patch
(cf 20170513032414.mfrwabt4hovuj...@sigill.intra.peff.net) is not
terribly clear why only one call site was converted, or what the plan
is for other, similar calls to die() to report bugs.

Let's just convert all remaining ones in one fell swoop.

This trick was performed by this invocation:

sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c)

Signed-off-by: Johannes Schindelin 
---
 apply.c  |  4 ++--
 archive-tar.c|  2 +-
 attr.c   | 10 +-
 blame.c  |  2 +-
 builtin/am.c | 20 +--
 builtin/branch.c |  2 +-
 builtin/cat-file.c   |  4 ++--
 builtin/clone.c  |  2 +-
 builtin/commit.c |  2 +-
 builtin/config.c |  2 +-
 builtin/fast-export.c|  2 +-
 builtin/fsck.c   |  2 +-
 builtin/index-pack.c |  4 ++--
 builtin/init-db.c|  2 +-
 builtin/ls-files.c   |  2 +-
 builtin/notes.c  |  8 
 builtin/pack-objects.c   |  4 ++--
 builtin/pull.c   |  2 +-
 builtin/receive-pack.c   |  2 +-
 builtin/replace.c|  2 +-
 builtin/update-index.c   |  2 +-
 bulk-checkin.c   |  2 +-
 color.c  |  4 ++--
 column.c |  2 +-
 config.c | 12 +--
 date.c   |  2 +-
 diff.c   | 12 +--
 dir-iterator.c   |  2 +-
 grep.c   | 16 +++
 http.c   |  8 
 imap-send.c  |  2 +-
 lockfile.c   |  2 +-
 mailinfo.c   |  2 +-
 merge-recursive.c| 12 +--
 notes-merge.c|  4 ++--
 pack-bitmap-write.c  |  2 +-
 pack-bitmap.c|  6 +++---
 pack-objects.c   |  2 +-
 packfile.c   |  6 +++---
 pathspec.c   | 10 +-
 pkt-line.c   |  2 +-
 prio-queue.c |  2 +-
 ref-filter.c |  4 ++--
 refs.c   | 34 
 refs/files-backend.c | 20 +--
 refs/iterator.c  |  6 +++---
 refs/packed-backend.c| 16 +++
 refs/ref-cache.c |  2 +-
 remote.c |  2 +-
 revision.c   |  4 ++--
 run-command.c| 10 +-
 setup.c  |  4 ++--
 sha1-lookup.c|  2 +-
 sha1-name.c  |  4 ++--
 shallow.c|  6 +++---
 sigchain.c   |  2 +-
 strbuf.c |  4 ++--
 submodule.c  |  6 +++---
 t/helper/test-example-decorate.c | 16 +++
 tmp-objdir.c |  2 +-
 trailer.c|  6 +++---
 transport.c  |  4 ++--
 unpack-trees.c   |  2 +-
 worktree.c   |  2 +-
 wrapper.c|  4 ++--
 wt-status.c  | 20 +--
 zlib.c   |  4 ++--
 67 files changed, 190 insertions(+), 190 deletions(-)

diff --git a/apply.c b/apply.c
index 7e5792c996f..3866adbc97f 100644
--- a/apply.c
+++ b/apply.c
@@ -2375,7 +2375,7 @@ static void update_pre_post_images(struct image *preimage,
if (postlen
? postlen < new_buf - postimage->buf
: postimage->len < new_buf - postimage->buf)
-   die("BUG: caller miscounted postlen: asked %d, orig = %d, used 
= %d",
+   BUG("caller miscounted postlen: asked %d, orig = %d, used = %d",
(int)postlen, (int) postimage->len, (int)(new_buf - 
postimage->buf));
 
/* Fix the length of the whole thing */
@@ -3509,7 +3509,7 @@ static int load_current(struct apply_state *state,
unsigned mode = patch->new_mode;
 
if (!patch->is_new)
-   die("BUG: patch to %s is not a creation", patch->old_name);
+   BUG("patch to %s is not a creation", patch->old_name);
 
pos = cache_name_pos(name, strlen(name));
if (pos < 0)
diff --git a/archive-tar.c b/archive-tar.c
index 3563bcb9f26..61a1a2547cc 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -441,7 +441,7 @@ static int write_tar_filter_archive(const struct ar

[PATCH v2 4/4] Convert remaining die*(BUG) messages

2018-05-02 Thread Johannes Schindelin
These were not caught by the previous commit, as they did not match the
regular expression.

While at it, remove the localization from one instance: we never want
BUG() messages to be translated, as they target Git developers, not the
end user (hence it would be quite unhelpful to not only burden the
translators, but then even end up with a bug report in a language that
no core Git contributor understands).

Signed-off-by: Johannes Schindelin 
---
 git-compat-util.h | 2 +-
 pathspec.c| 2 +-
 submodule.c   | 2 +-
 vcs-svn/fast_export.c | 6 --
 4 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 07e383257b4..3a7216f5313 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1052,7 +1052,7 @@ int git_qsort_s(void *base, size_t nmemb, size_t size,
 
 #define QSORT_S(base, n, compar, ctx) do { \
if (qsort_s((base), (n), sizeof(*(base)), compar, ctx)) \
-   die("BUG: qsort_s() failed");   \
+   BUG("qsort_s() failed");\
 } while (0)
 
 #ifndef REG_STARTEND
diff --git a/pathspec.c b/pathspec.c
index a637a6d15c0..27cd6067860 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -486,7 +486,7 @@ static void init_pathspec_item(struct pathspec_item *item, 
unsigned flags,
/* sanity checks, pathspec matchers assume these are sane */
if (item->nowildcard_len > item->len ||
item->prefix > item->len) {
-   die ("BUG: error initializing pathspec_item");
+   BUG("error initializing pathspec_item");
}
 }
 
diff --git a/submodule.c b/submodule.c
index 733db441714..c282fa8fe51 100644
--- a/submodule.c
+++ b/submodule.c
@@ -2043,7 +2043,7 @@ const char *get_superproject_working_tree(void)
 
if (super_sub_len > cwd_len ||
strcmp(&cwd[cwd_len - super_sub_len], super_sub))
-   die (_("BUG: returned path string doesn't match cwd?"));
+   BUG("returned path string doesn't match cwd?");
 
super_wt = xstrdup(cwd);
super_wt[cwd_len - super_sub_len] = '\0';
diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 3fd047a8b82..b5b8913cb0f 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -320,7 +320,8 @@ const char *fast_export_read_path(const char *path, 
uint32_t *mode_out)
err = fast_export_ls(path, mode_out, &buf);
if (err) {
if (errno != ENOENT)
-   die_errno("BUG: unexpected fast_export_ls error");
+   BUG("unexpected fast_export_ls error: %s",
+   strerror(errno));
/* Treat missing paths as directories. */
*mode_out = S_IFDIR;
return NULL;
@@ -338,7 +339,8 @@ void fast_export_copy(uint32_t revision, const char *src, 
const char *dst)
err = fast_export_ls_rev(revision, src, &mode, &data);
if (err) {
if (errno != ENOENT)
-   die_errno("BUG: unexpected fast_export_ls_rev error");
+   BUG("unexpected fast_export_ls_rev error: %s",
+   strerror(errno));
fast_export_delete(dst);
return;
}
-- 
2.17.0.windows.1.36.gdf4ca5fb72a


Re: [PATCH 2/6] t1406: prepare for the refs code to fail with BUG()

2018-05-02 Thread Junio C Hamano
Johannes Schindelin  writes:

>> As discussed in this thread, tests that use t/helper/ executables
>> that try to trigger the BUG() code path to ensure that these "should
>> never happen" conditions are caught do need to deal with it.  If
>> dumping core is undesirable, tweaking BUG() implementation so that
>> it becomes die("BUG: ...") *ONLY* when the caller knows what it is
>> doing (e.g. running t/helper/ commands) is probably a good idea.
>> Perhaps GIT_TEST_OPTS can gain one feature "--bug-no-abort" and set
>> an environment variable so that implementation of BUG() can notice,
>> or something.
>
> I think we can do even better than that. t/helper/*.c could set a global
> variable that no other code is supposed to set, to trigger an alternative
> to SIGABRT. Something like

Yes, I agree with that solution for t/helper/ part.

But we also need to arrange a way for things outside t/helper/ to
set the BUG_exit_code at runtime, for those like Duy who cause
BUG() to trigger in some "git cmd" under development when running
tests for themselves and who do not want that BUG() to dump core and
contaminate the global core storage.


Re: What's cooking in git.git (Apr 2018, #04; Mon, 30)

2018-05-02 Thread Junio C Hamano
Johannes Schindelin  writes:

>> As I wrote elsewhere, for a low-impact and relatively old issue like
>> this, it is OK for a fix to use supporting code that only exists in
>> more recent codebase and become unmergeable to anything older than
>> the concurrent 'maint' track as of the time when the fix is written.
>> Even though it is sometimes nicer if the fix were written to be
>> mergeable to codebase near the point where the issue originates, it
>> is often not worth doing so if it requires bending backwards to
>> refrain from using a newer and more convenient facility.
>
> So do you want me to clean up the backporting branches? I mean, we could...

For a relatively obscure and low-impact bug that is old like this
one, I'd actually prefer to be able to say "heh, if that hurts,
either refrain from doing so, or update to the recent maintenance
track that has a fix for it", to use the fix as an incentive to move
the world forward ;-).  After all, people have lived with the bug
for a long time.



[PATCH] checkout & worktree: introduce a core.DWIMRemote setting

2018-05-02 Thread Ævar Arnfjörð Bjarmason
Introduce a core.DWIMRemote setting which can be used to designate a
remote to prefer (via core.DWIMRemote=origin) when running e.g. "git
checkout master" to mean origin/master, even though there's other
remotes that have the "master" branch.

I want this because it's very handy to use this workflow to check out a
repository and create a topic branch, then get back to a "master" as
retrieved from upstream:

(
rm -rf /tmp/tbdiff &&
git clone g...@github.com:trast/tbdiff.git &&
cd tbdiff &&
git branch -m topic &&
git checkout master
)

That will output:

Branch 'master' set up to track remote branch 'master' from 'origin'.
Switched to a new branch 'master'

But as soon as a new remote is added (e.g. just to inspect something
from someone else) the DWIMery goes away:

(
rm -rf /tmp/tbdiff &&
git clone g...@github.com:trast/tbdiff.git &&
cd tbdiff &&
git branch -m topic &&
git remote add avar g...@github.com:avar/tbdiff.git &&
git fetch avar &&
git checkout master
)

Will output:

error: pathspec 'master' did not match any file(s) known to git.

The new core.DWIMRemote config allows me to say that whenever that
ambiguity comes up I'd like to prefer "origin", and it'll still work
as though the only remote I had was "origin".

I considered calling the config setting checkout.DWIMRemote, but then
discovered that this behavior is also used by "worktree" for similar
purposes, so it makes sense to have it under core.*. As noted in the
documentation we may also want to use this in other commands in the
future if they have similar DWIM behavior in the presence of one
remote.
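
Spelled out as commands (the setting name is the one this patch proposes;
nothing beyond standard git-config usage is assumed):

-- snip --
git config core.DWIMRemote origin   # prefer "origin" for the checkout DWIM
git checkout master                 # resolves to origin/master again, even
                                    # with other remotes carrying "master"
-- snap --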

See also 70c9ac2f19 ("DWIM "git checkout frotz" to "git checkout -b
frotz origin/frotz"", 2009-10-18) which introduced this DWIM feature
to begin with, and 4e85333197 ("worktree: make add <path> <branch>
dwim", 2017-11-26) which added it to git-worktree.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt   | 15 +++
 Documentation/git-checkout.txt |  5 +
 Documentation/git-worktree.txt |  5 +
 builtin/checkout.c |  3 ++-
 checkout.c | 17 +++--
 t/t2024-checkout-dwim.sh   | 10 ++
 t/t2025-worktree-add.sh| 18 ++
 7 files changed, 70 insertions(+), 3 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153cb3..d4740fcdb9 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -898,6 +898,21 @@ core.notesRef::
 This setting defaults to "refs/notes/commits", and it can be overridden by
 the `GIT_NOTES_REF` environment variable.  See linkgit:git-notes[1].
 
+core.DWIMRemote::
+   Various git commands will look at references on the configured
+   remotes and DW[YI]M (Do What (You|I) Mean) if the reference
+   only exists on one remote. This setting allows for setting the
+   name of a special remote that should always win when it comes
+   to disambiguation. The typical use-case is to set this to
+   `origin`.
++
+Currently this is used by linkgit:git-checkout[1] when `git checkout
+<branch>' will checkout the '<branch>' branch on another remote,
+and linkgit:git-worktree[1] when 'git worktree add' similarly DWYM
+when a branch is unique across remotes, or this setting is set to a
+special remote. This setting might be used for other commands or
+functionality in the future when appropriate.
+
 core.sparseCheckout::
Enable "sparse checkout" feature. See section "Sparse checkout" in
linkgit:git-read-tree[1] for more information.
diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
index ca5fc9c798..8ad7be6c53 100644
--- a/Documentation/git-checkout.txt
+++ b/Documentation/git-checkout.txt
@@ -38,6 +38,11 @@ equivalent to
$ git checkout -b <branch> --track <remote>/<branch>
 
 +
+If the branch exists in multiple remotes the `core.DWIMRemote`
+variable can be used to pick the remote you really mean. Set it to
+e.g. `core.DWIMRemote=origin` to always checkout remote branches from
+there. See also `core.DWIMRemote` in linkgit:git-config[1].
++
 You could omit , in which case the command degenerates to
 "check out the current branch", which is a glorified no-op with
 rather expensive side-effects to show only the tracking information,
diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
index 2755ca90e3..37db12f816 100644
--- a/Documentation/git-worktree.txt
+++ b/Documentation/git-worktree.txt
@@ -60,6 +60,11 @@ with a matching name, treat as equivalent to:
$ git worktree add --track -b <branch> <path> <remote>/<branch>
 
 +
+It's also possible to use the `core.DWIMRemote` setting to designate a
+special remote this rule should be applied to, even if the branch
+isn't unique across all remotes. See `core.DWIMRemote` in
+linkgit:git-config[1].
++
If `<commit-ish>` is omitted and neither `-b` nor `-B` nor `--detach` used,
 then, as a convenience, a 

Re: [PATCH v4 09/10] merge: check config before loading commits

2018-05-02 Thread Jakub Narebski
Derrick Stolee  writes:
> On 4/30/2018 6:54 PM, Jakub Narebski wrote:
>> Derrick Stolee  writes:
>>
>>> Now that we use generation numbers from the commit-graph, we must
>>> ensure that all commits that exist in the commit-graph are loaded
>>> from that file instead of from the object database. Since the
>>> commit-graph file is only checked if core.commitGraph is true, we
>>> must check the default config before we load any commits.
>>>
>>> In the merge builtin, the config was checked after loading the HEAD
>>> commit. This was due to the use of the global 'branch' when checking
>>> merge-specific config settings.
>>>
>>> Move the config load to be between the initialization of 'branch' and
>>> the commit lookup.
>>
>> Sidenote: I wonder why reading config was postponed to later in the
>> command lifetime... I guess it was to avoid having to read config if
>> HEAD was invalid.
>
> The 'branch' does need to be loaded before the call to git_config (as
> I found out after moving the config call too early), so I suppose it
> was natural to pair that with resolving head_commit.

Right, so there was only a limited number of places where the call to
git_config could be put correctly. Now I wonder no more.

[...]
>>> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
>>> index a380419b65..77d85aefe7 100755
>>> --- a/t/t5318-commit-graph.sh
>>> +++ b/t/t5318-commit-graph.sh
>>> @@ -221,4 +221,13 @@ test_expect_success 'write graph in bare repo' '
>>>   graph_git_behavior 'bare repo with graph, commit 8 vs merge 1' bare 
>>> commits/8 merge/1
>>>   graph_git_behavior 'bare repo with graph, commit 8 vs merge 2' bare 
>>> commits/8 merge/2
>>>   +test_expect_success 'perform fast-forward merge in full repo' '
>>> +   cd "$TRASH_DIRECTORY/full" &&
>>> +   git checkout -b merge-5-to-8 commits/5 &&
>>> +   git merge commits/8 &&
>>> +   git show-ref -s merge-5-to-8 >output &&
>>> +   git show-ref -s commits/8 >expect &&
>>> +   test_cmp expect output
>>> +'
>> All right.  (Though I wonder if this test catches all problems where
>> BUG("bad generation skip") could have been encountered.)
>
> We will never know until we have this series running in the wild (and
> even then, some features are very obscure) and enough people turn on
> the config setting.
>
> One goal of the "fsck and gc" series is to get this feature running
> during the rest of the test suite as much as possible, so we can get
> additional coverage. Also to get more experience from the community
> dogfooding the feature.

Sidenote: for two out of three features that change the view of history
we could also update commit-graph automatically:
* the shortening or deepening of shallow clone could also re-calculate
  the commit graph (or invalidate it)
* git-replace could check if the replacement modifies history, and if
  so, recalculate the commit graph (or invalidate it/check its validity)
* there is no such possibility for grafts, but they are deprecated anyway

-- 
Jakub Narębski


Re: [PATCH] gitweb: Measure offsets against UTF-8 flagged string

2018-05-02 Thread Shin Kojima
> ideally we should be able to say "function X takes non-UTF8 and
> works on it", "function Y takes UTF8 and works on it", and "function
> Z takes non-UTF8 and gives UTF8 data back" for each functions
> clearly, not "function W can take either UTF8 or any other garbage
> and tries to return UTF8".

Yes, I totally agree with that.


> Some codepaths in the resulting codeflow look even hackier than they
> already are.  For example, format_rem_add_lines_pair() calls
> untabify() and then feeds its result to esc_html().

Honestly, I discovered this problem exactly from
format_rem_add_lines_pair().  In my environment ($fallback_encoding
= 'cp932'), some commitdiffs show garbled text only inside the
color-words portions.

I added a reproduce process at the end of this message.

After my investigation, I thought that format_rem_add_lines_pair() tries
to use split()/index()/substr()/etc. against raw blob contents and that
this produces the funny characters.  These builtin functions should be
used on decoded strings.

untabify() looks like the proper place to decode blob contents
beforehand, as it is definitely not used for binary contents
like images and compressed snapshot files.

I'm sure using to_utf8() in untabify() fixes this problem.  In fact,
there is also another similar problem in the blame function, which
assumes blob contents are UTF-8 encoded:

binmode $fd, ':utf8';

I personally consider that text blob contents should be decoded as soon
as possible and that esc_html() should not contain to_utf8(), but the
codeflow is rather vast and I couldn't rule out calls from somewhere
else that do not care about character encodings.

So yes, I would appreciate hearing your thoughts.


> Also, does it even "fix" the problem to use to_utf8() here in the
> first place?  Untabify is about aligning the character after a HT to
> multiple of 8 display position, so we'd want measure display width,
> which is not the same as either byte count or char count.

Following is a reproduce process:

$ git --version
git version 2.17.0

$ mkdir test
$ cd test
$ git init
$ echo 'モバイル' | iconv -f UTF-8 -t Shift_JIS > dummy
$ git add .
$ git commit -m 'init'
$ echo 'インスタント' | iconv -f UTF-8 -t Shift_JIS > dummy
$ git commit -am 'change'
$ git instaweb
$ echo 'our $fallback_encoding = "cp932";' >> .git/gitweb/gitweb_config.perl
$ w3m -dump 'http://127.0.0.1:1234/?p=.git;a=commitdiff'

What I got:

gitprojects / .git / commitdiff
[commit   ] ? search: [] [ ]re
summary | shortlog | log | commit | commitdiff | tree
raw | patch | inline | side by side (parent: 79e26fe)
change master

authorShin Kojima 
  Wed, 2 May 2018 10:55:01 + (19:55 +0900)
committer Shin Kojima 
  Wed, 2 May 2018 10:55:01 + (19:55 +0900)

dummy  patch | blob | history


diff --git a/dummy b/dummy
index ac37f38..31fb96a 100644 (file)
--- a/dummy
+++ b/dummy
@@ -1 +1 @@
-coイル
+Cンスタント
Unnamed repository; edit this file 'description' to name the repository.
RSS Atom

What I expected:

gitprojects / .git / commitdiff
[commit   ] ? search: [] [ ]re
summary | shortlog | log | commit | commitdiff | tree
raw | patch | inline | side by side (parent: 79e26fe)
change master

authorShin Kojima 
  Wed, 2 May 2018 10:55:01 + (19:55 +0900)
committer Shin Kojima 
  Wed, 2 May 2018 10:55:01 + (19:55 +0900)

dummy  patch | blob | history


diff --git a/dummy b/dummy
index ac37f38..31fb96a 100644 (file)
--- a/dummy
+++ b/dummy
@@ -1 +1 @@
-モバイル
+インスタント
Unnamed repository; edit this file 'description' to name the repository.
RSS Atom


-- 
Shin Kojima


Re: [PATCH v3 00/12] get_short_oid UI improvements

2018-05-02 Thread Derrick Stolee

On 5/1/2018 2:40 PM, Ævar Arnfjörð Bjarmason wrote:

The biggest change in v3 is that there is no change at all to the code,
but a lengthy explanation of why I didn't go for Derrick's simpler
implementation. Maybe I'm wrong about that, but I felt uneasy
offloading undocumented (or, if I documented it, it would only be for
this one edge case) magic onto the oid_array API. Instead I'm just
making this patch a bit more complex.


I think that's fair. Thanks for going along with me on the thought 
experiment.


-Stolee


Re: [PATCH v4 08/10] commit: add short-circuit to paint_down_to_common()

2018-05-02 Thread Jakub Narebski
Derrick Stolee  writes:
> On 4/30/2018 6:19 PM, Jakub Narebski wrote:
>> Derrick Stolee  writes:
[...]
>>> @@ -831,6 +834,13 @@ static struct commit_list *paint_down_to_common(struct 
>>> commit *one, int n, struc
>>> struct commit_list *parents;
>>> int flags;
>>>   + if (commit->generation > last_gen)
>>> +   BUG("bad generation skip");
>>> +   last_gen = commit->generation;
>> Shouldn't we provide more information about where the problem is to the
>> user, to make it easier to debug the repository / commit-graph data?
>>
>> Good to have this sanity check here.
>
> This BUG() _should_ only be seen by developers who add callers which
> do not load commits from the commit-graph file. There is a chance that
> there are cases not covered by this patch and the added tests,
> though. Hopefully we catch them all by dogfooding the feature before
> turning it on by default.
>
> I can add the following to help debug these bad situations:
>
> + BUG("bad generation skip %d > %d at %s",
> + commit->generation, last_gen,
> + oid_to_hex(&commit->object.oid));

On one hand, after thinking about this a bit, I agree that this BUG() is
more about catching errors in the Git code, rather than in the repository.

On the other hand, the more detailed information could help determine
what the problem is (e.g. that the commit "at <oid>" is at HEAD).

Hopefully we won't see which is which, as it would mean bugs in Git ;))

[...]
>>> @@ -946,7 +956,7 @@ static int remove_redundant(struct commit **array, int 
>>> cnt)
>>> filled_index[filled] = j;
>>> work[filled++] = array[j];
>>> }
>>> -   common = paint_down_to_common(array[i], filled, work);
>>> +   common = paint_down_to_common(array[i], filled, work, 0);
>>
>> Here we are interested not only if "one"/array[i] is reachable from
>> "twos"/work, but also if "twos" is reachable from "one".  Simple cutoff
>> only works in one way, though I wonder if we couldn't use cutoff being
>> minimum generation number of "one" and "twos" together.
>>
>> But that may be left for a separate commit (after checking that the
>> above is correct).
>>
>> Not as simple and obvious as paint_down_to_common() used in
>> in_merge_bases_any(), so it is all right.
>
> Thanks for reporting this. Since we are only concerned about
> reachability in this method, it is a good candidate to use
> min_generation. It is also subtle enough that we should leave it as a
> separate commit.

Thanks for checking this, and for the followup.

>  Also, we can measure performance improvements
> separately, as I will mention in my commit message (but I'll copy it
> here):
>
>     For a copy of the Linux repository, we measured the following
>     performance improvements:
>
>     git merge-base v3.3 v4.5
>
>     Before: 234 ms
>     After:  208 ms
>     Rel %:  -11%
>
>     git merge-base v4.3 v4.5
>
>     Before: 102 ms
>     After:   83 ms
>     Rel %:  -19%
>
>     The experiments above were chosen to demonstrate that we are
>     improving the filtering of the merge-base set. In the first
>     example, more time is spent walking the history to find the
>     set of merge bases before the remove_redundant() call. The
>     starting commits are closer together in the second example,
>     therefore more time is spent in remove_redundant(). The relative
>     change in performance differs as expected.

Nice.

I was not expecting as much performance improvement as we got for the
--contains tests, because remove_redundant() is a final step in a longer
process dominated by the main calculations.  Still, nothing to sneeze at.

Best regards,
-- 
Jakub Narębski



Re: [PATCH v4 08/10] commit: add short-circuit to paint_down_to_common()

2018-05-02 Thread Derrick Stolee

On 5/2/2018 9:05 AM, Jakub Narebski wrote:

Derrick Stolee  writes:

    For a copy of the Linux repository, we measured the following
    performance improvements:

    git merge-base v3.3 v4.5

    Before: 234 ms
    After:  208 ms
    Rel %:  -11%

    git merge-base v4.3 v4.5

    Before: 102 ms
    After:   83 ms
    Rel %:  -19%

    The experiments above were chosen to demonstrate that we are
    improving the filtering of the merge-base set. In the first
    example, more time is spent walking the history to find the
    set of merge bases before the remove_redundant() call. The
    starting commits are closer together in the second example,
    therefore more time is spent in remove_redundant(). The relative
    change in performance differs as expected.

Nice.

I was not expecting as much performance improvement as we got for the
--contains tests, because remove_redundant() is a final step in a longer
process dominated by the main calculations.  Still, nothing to sneeze at.


One reason these numbers are not too surprising is that 
remove_redundant() can demonstrate quadratic behavior. It is calculating 
pair-wise reachability by starting a walk at each of the candidates (in 
the worst case). In typical cases, the first walk marks many of the 
other candidates as redundant and we don't need to start walks from 
those commits.


A possible optimization could be to sort the candidates by descending
generation so that the first walk is likely to mark the rest as
redundant. But this may already be the case if the candidates are added
to the list in order of "discovery", which already simulates this
behavior.


Thanks,
-Stolee


Re: [PATCH v3 00/12] get_short_oid UI improvements

2018-05-02 Thread Derrick Stolee

On 5/2/2018 8:42 AM, Derrick Stolee wrote:

On 5/1/2018 2:40 PM, Ævar Arnfjörð Bjarmason wrote:

The biggest change in v3 is that there is no change at all to the code,
but a lengthy explanation of why I didn't go for Derrick's simpler
implementation. Maybe I'm wrong about that, but I felt uneasy
offloading undocumented (or, if I documented it, it would only be for
this one edge case) magic onto the oid_array API. Instead I'm just
making this patch a bit more complex.


I think that's fair. Thanks for going along with me on the thought 
experiment.


Also, v3 looks good to me.

Reviewed-by: Derrick Stolee 



Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults

2018-05-02 Thread Ben Peart



On 5/1/2018 8:08 PM, Elijah Newren wrote:

On Tue, May 1, 2018 at 4:11 PM, Junio C Hamano  wrote:

Elijah Newren  writes:





I also just realized that in addition to status being inconsistent
with diff/log/show, it was also inconsistent with itself -- it handles
staged and unstaged changes differently.
(wt_status_collect_changes_worktree() had different settings for break
detection than wt_status_collect_changes_index().)

Eckhard, can you add some comments to your commit message mentioning
the email pointed to by Junio about break detection and rename
detection being unsafe to use together, as well as the inconsistencies
in break detection between different commands?  That may help future
readers understand why break detection was turned off for
wt_status_collect_changes_index().



Wow, lots of inconsistent behaviors here with diff/merge/status and the 
various options being used.  Let me add another one:


I've been sitting on another patch that we've been using internally for 
some time that enables us to turn off rename and break detection for 
status via config settings and new command-line options.


The issue that triggered the creation of the patch was that if someone 
ran status while in a merge conflict state, the status would take a very 
long time.  Turning off rename and break detection "fixed" the problem.


I was waiting for some of these inflight changes to settle down and get 
accepted before I started another patch series but I thought I should at 
least let everyone know about this additional issue that will need to be 
addressed.
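
(Roughly the kind of knob being described; the setting name below is an
assumption for the sake of illustration, not taken from the actual patch:)

-- snip --
# Hypothetical configuration; the real option names may differ.
git config status.renames false        # skip rename detection in "git status"
git -c status.renames=false status     # or disable it for one slow invocation
-- snap --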


[PATCH] git-send-email: allow re-editing of message

2018-05-02 Thread Drew DeVault
When shown the email summary, an opportunity is presented for the user
to edit the email as if they had specified --annotate. This also permits
them to edit it multiple times.

Signed-off-by: Drew DeVault 
Reviewed-by: Simon Ser 

---
Sent to the ML without comment, sending back around with more people
Cc'd for better visibility.

 git-send-email.perl | 35 ---
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2fa7818ca..14f2e8ae4 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1330,9 +1330,14 @@ sub file_name_is_absolute {
return File::Spec::Functions::file_name_is_absolute($path);
 }
 
-# Returns 1 if the message was sent, and 0 otherwise.
-# In actuality, the whole program dies when there
-# is an error sending a message.
+# Prepares the email, then asks the user what to do.
+#
+# If the user chooses to send the email, it's sent and 1 is returned.
+# If the user chooses not to send the email, 0 is returned.
+# If the user decides they want to make further edits, -1 is returned and the
+# caller is expected to call send_message again after the edits are performed.
+#
+# If an error occurs sending the email, this just dies.
 
 sub send_message {
my @recipients = unique_email_list(@to);
@@ -1404,15 +1409,17 @@ Message-Id: $message_id
 
 EOF
}
-   # TRANSLATORS: Make sure to include [y] [n] [q] [a] in your
+   # TRANSLATORS: Make sure to include [y] [n] [e] [q] [a] in your
# translation. The program will only accept English input
# at this point.
-   $_ = ask(__("Send this email? ([y]es|[n]o|[q]uit|[a]ll): "),
-valid_re => qr/^(?:yes|y|no|n|quit|q|all|a)/i,
+   $_ = ask(__("Send this email? ([y]es|[n]o|[e]dit|[q]uit|[a]ll): 
"),
+valid_re => qr/^(?:yes|y|no|n|edit|e|quit|q|all|a)/i,
 default => $ask_default);
die __("Send this email reply required") unless defined $_;
if (/^n/i) {
return 0;
+   } elsif (/^e/i) {
+   return -1;
} elsif (/^q/i) {
cleanup_compose_files();
exit(0);
@@ -1552,7 +1559,9 @@ $references = $initial_in_reply_to || '';
 $subject = $initial_subject;
 $message_num = 0;
 
-foreach my $t (@files) {
+sub process_file {
+   my ($t) = @_;
+
open my $fh, "<", $t or die sprintf(__("can't open file %s"), $t);
 
my $author = undef;
@@ -1755,6 +1764,10 @@ foreach my $t (@files) {
}
 
my $message_was_sent = send_message();
+   if ($message_was_sent == -1) {
+   do_edit($t);
+   return 0;
+   }
 
# set up for the next message
if ($thread && $message_was_sent &&
@@ -1776,6 +1789,14 @@ foreach my $t (@files) {
undef $auth;
sleep($relogin_delay) if defined $relogin_delay;
}
+
+   return 1;
+}
+
+foreach my $t (@files) {
+   while (!process_file($t)) {
+   # This space deliberately left blank
+   }
 }
 
 # Execute a command (e.g. $to_cmd) to get a list of email addresses
-- 
2.17.0



Re:

2018-05-02 Thread Ben Peart



On 4/30/2018 12:12 PM, Elijah Newren wrote:

On Mon, Apr 30, 2018 at 6:11 AM, Ben Peart  wrote:

On 4/27/2018 2:19 PM, Elijah Newren wrote:


From: Elijah Newren 

On Thu, Apr 26, 2018 at 5:54 PM, Ben Peart  wrote:


Can you write the documentation that clearly explains the exact behavior
you
want?  That would kill two birds with one stone... :)



Sure, something like the following is what I envision, and I've tried to
include the suggestion from Junio to document the copy behavior in the
merge-recursive documentation.





Thanks Elijah. I've applied this patch and reviewed and tested it.  It works
and addresses the concerns around the settings inheritance from
diff.renames.  I still _prefer_ the simpler model that doesn't do the
partial inheritance but I can use this model as well.

I'm unsure on the protocol here.  Should I incorporate this patch and submit
a reroll or can it just be applied as is?


I suspect you'll want to re-roll anyway, to base your series on
en/rename-directory-detection-reboot instead of on master.  (Junio
plans to merge it down to next, and your series has four different
merge conflicts with it.)

There are two other loose ends with this series that Junio will need
to weigh in on:

- I'm obviously a strong proponent of the inherited setting, but Junio
may change his mind after reading Dscho's arguments against it (or
after reading my arguments for it).

- I like the setting as-is, and think we could allow a "copy" setting
for merge.renames to specify that the post-merge diffstat should
detect copies (not part of your series, but a useful addition I'd like
to tackle afterwards).  However, Junio had comments in
xmqqwox19ohw@gitster-ct.c.googlers.com about merge.renames
handling the scoring as well, like -Xfind-renames.  Those sound
incompatible to me for a single setting, and I'm unsure if Junio would
resolve them the way I do or still feels strongly about the scoring.



I think this patch series (including Elijah's fixup!) improves the 
situation from where we were and it provides the necessary functionality 
to solve the problem I started out to solve.  While there are other 
changes that could be made, I think they should be done in separate 
follow up patches.


I'm happy to reroll this incorporating the fixup! so that we can make 
progress.  Junio, would you prefer I reroll this based on 
en/rename-directory-detection-reboot or master?


Re: [PATCH] git-p4 - Add option --sha1 to submit in p4

2018-05-02 Thread Luke Diamand
On 2 May 2018 at 15:32, Merland Romain  wrote:
> From 4867808cad2b759ebf31c275356e602b72c5659f Mon Sep 17 00:00:00 2001
> From: Romain Merland 
> To: git@vger.kernel.org
> Cc: Junio C Hamano , Jeff King , Luke
> Diamand , Vinicius Kursancew 
> Date: Wed, 2 May 2018 15:02:11 +0200
> Subject: [PATCH] git-p4 - Add option --sha1 to submit in p4
>
> Add option --sha1 to the command 'git-p4 submit' in order to submit to
> p4 a commit
> that is not necessarily on master.
> In that case, don't rebase the submitted changes.

That could be very useful, I often find the commit I want to submit is
half-way down a long list of other commits.

Currently I end up cherry-picking the one I want into a clean repo,
but that's much more awkward than your --sha1 change.

A few comments inline:

> Signed-off-by: Romain Merland 
> ---
>  git-p4.py | 21 +++--
>  1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/git-p4.py b/git-p4.py
> index 7bb9cadc6..d64ff79dd 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -1352,7 +1352,9 @@ class P4Submit(Command, P4UserMap):
>                  optparse.make_option("--update-shelve", dest="update_shelve", action="append", type="int",
>                                       metavar="CHANGELIST",
>                                       help="update an existing shelved changelist, implies --shelve, "
> -                                          "repeat in-order for multiple shelved changelists")
> +                                          "repeat in-order for multiple shelved changelists"),
> +                optparse.make_option("--sha1", dest="sha1", metavar="SHA1",
> +                                     help="submit only the specified commit, don't rebase afterwards")

Is there a better name than "sha1" ? If git ever changes its hash to
something else will this still make sense?

I wonder why you wouldn't rebase afterwards?

Perhaps an additional option to skip the rebase?

>  ]
>  self.description = "Submit changes from git to the perforce depot."
>  self.usage += " [name of git branch to submit into perforce depot]"
> @@ -1362,6 +1364,7 @@ class P4Submit(Command, P4UserMap):
>  self.dry_run = False
>  self.shelve = False
>  self.update_shelve = list()
> +self.sha1 = ""
>  self.prepare_p4_only = False
>  self.conflict_behavior = None
>  self.isWindows = (platform.system() == "Windows")
> @@ -2103,9 +2106,12 @@ class P4Submit(Command, P4UserMap):
>  else:
>  commitish = 'HEAD'
>
> -        for line in read_pipe_lines(["git", "rev-list", "--no-merges", "%s..%s" % (self.origin, commitish)]):
> -            commits.append(line.strip())
> -        commits.reverse()
> +        if self.sha1 != "":
> +            commits.append(self.sha1)
> +        else:
> +            for line in read_pipe_lines(["git", "rev-list", "--no-merges", "%s..%s" % (self.origin, commitish)]):
> +                commits.append(line.strip())
> +            commits.reverse()
>
>  if self.preserveUser or gitConfigBool("git-p4.skipUserNameCheck"):
>  self.checkAuthorship = False
> @@ -2215,8 +2221,11 @@ class P4Submit(Command, P4UserMap):
>  sync.branch = self.branch
>  sync.run([])
>
> -rebase = P4Rebase()
> -rebase.rebase()
> +if self.sha1 == "":

if not self.skip_rebase:

> +rebase = P4Rebase()
> +rebase.rebase()
> +else:
> +print "You will have to do 'git p4 sync' and rebase."
>
>  else:
>  if len(applied) == 0:
> --
> 2.17.0
>
>

This would be better with some documentation in git-p4.txt and a test case!

Regards!
Luke


Re: [PATCH v2 1/4] test-tool: help verifying BUG() code paths

2018-05-02 Thread Duy Nguyen
On Wed, May 2, 2018 at 11:38 AM, Johannes Schindelin
 wrote:
> When we call BUG(), we signal via SIGABRT that something bad happened,
> dumping cores if so configured. In some setups these coredumps are
> redirected to some central place such as /proc/sys/kernel/core_pattern,
> which is a good thing.
>
> However, when we try to verify in our test suite that bugs are caught in
> certain code paths, we do *not* want to clutter such a central place
> with unnecessary coredumps.
>
> So let's special-case the test helpers (which we use to verify such code
> paths) so that the BUG() calls will *not* call abort() but exit with a
> special-purpose exit code instead.
>
> Helped-by: Nguyễn Thái Ngọc Duy 
> Signed-off-by: Johannes Schindelin 
> ---
>  t/helper/test-tool.c | 2 ++
>  usage.c  | 5 +
>  2 files changed, 7 insertions(+)
>
> diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
> index 87066ced62a..5176f9f20ae 100644
> --- a/t/helper/test-tool.c
> +++ b/t/helper/test-tool.c
> @@ -47,7 +47,9 @@ static struct test_cmd cmds[] = {
>  int cmd_main(int argc, const char **argv)
>  {
> int i;
> +   extern int BUG_exit_code;
>
> +   BUG_exit_code = 99;

It may be even better to let individual tests in t1406 control this,
pretty much like your original patch, except that we tell usage.c to
not send SIGABRT and just return a special fault code. That way we
don't accidentally suppress BUG()'s SIGABRT behavior in tests that do
not anticipate it (even in t1406).

But this patch is ok for me too if you don't want another reroll.
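
One possible shape for that: have usage.c consult an environment
variable so only the tests that expect a BUG() opt in. A minimal
sketch, assuming a made-up GIT_TEST_BUG_EXIT_CODE variable (not an
existing knob):

    /* In usage.c -- sketch only, the variable name is illustrative. */
    static int bug_exit_code_from_env(void)
    {
            const char *v = getenv("GIT_TEST_BUG_EXIT_CODE");

            /* unset or empty means: keep the normal abort() behavior */
            return (v && *v) ? atoi(v) : 0;
    }

BUG_vfl() would then exit() with this value when it is non-zero and
fall through to abort() otherwise, and t1406 would export the variable
only around the commands that are expected to hit a BUG().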

> if (argc < 2)
> die("I need a test name!");
>
> diff --git a/usage.c b/usage.c
> index cdd534c9dfc..9c84dccfa97 100644
> --- a/usage.c
> +++ b/usage.c
> @@ -210,6 +210,9 @@ void warning(const char *warn, ...)
> va_end(params);
>  }
>
> +/* Only set this, ever, from t/helper/, when verifying that bugs are caught. 
> */
> +int BUG_exit_code;
> +
>  static NORETURN void BUG_vfl(const char *file, int line, const char *fmt, 
> va_list params)
>  {
> char prefix[256];
> @@ -221,6 +224,8 @@ static NORETURN void BUG_vfl(const char *file, int line, 
> const char *fmt, va_lis
> snprintf(prefix, sizeof(prefix), "BUG: ");
>
> vreportf(prefix, fmt, params);
> +   if (BUG_exit_code)
> +   exit(BUG_exit_code);
> abort();
>  }
>
> --
> 2.17.0.windows.1.36.gdf4ca5fb72a
>
>



-- 
Duy


Re: [PATCH] checkout & worktree: introduce a core.DWIMRemote setting

2018-05-02 Thread Duy Nguyen
On Wed, May 2, 2018 at 12:54 PM, Ævar Arnfjörð Bjarmason
 wrote:
> Introduce a core.DWIMRemote setting which can be used to designate a
> remote to prefer (via core.DWIMRemote=origin) when running e.g. "git
> checkout master" to mean origin/master, even though there's other
> remotes that have the "master" branch.

Do we anticipate more dwimy customizations to justify a new dwim group
in config (i.e. dwim.remote instead of core.dwimRemote)?
-- 
Duy


Re: [PATCH 08/41] packfile: abstract away hash constant values

2018-05-02 Thread Duy Nguyen
On Wed, May 2, 2018 at 2:11 AM, brian m. carlson
 wrote:
> On Tue, May 01, 2018 at 12:22:43PM +0200, Duy Nguyen wrote:
>> On Mon, Apr 23, 2018 at 11:39:18PM +, brian m. carlson wrote:
>> > There are several instances of the constant 20 and 20-based values in
>> > the packfile code.  Abstract away dependence on SHA-1 by using the
>> > values from the_hash_algo instead.
>>
>> While we're abstracting away 20. There's the only 20 left in
>> sha1_file.c that should also be gone. But I guess you could do that
>> later since you need to rename fill_sha1_path to
>> fill_loose_object_path or something.
>
> I'm already working on knocking those out.

Yeah, I checked out your part14 branch after writing this note :P You
still need to rename the function, though. I can remind you of that
again when part14 is sent out.

>> > @@ -507,15 +509,15 @@ static int open_packed_git_1(struct packed_git *p)
>> >  " while index indicates %"PRIu32" objects",
>> >  p->pack_name, ntohl(hdr.hdr_entries),
>> >  p->num_objects);
>> > -   if (lseek(p->pack_fd, p->pack_size - sizeof(sha1), SEEK_SET) == -1)
>> > +   if (lseek(p->pack_fd, p->pack_size - hashsz, SEEK_SET) == -1)
>> > return error("end of packfile %s is unavailable", 
>> > p->pack_name);
>> > -   read_result = read_in_full(p->pack_fd, sha1, sizeof(sha1));
>> > +   read_result = read_in_full(p->pack_fd, hash, hashsz);
>> > if (read_result < 0)
>> > return error_errno("error reading from %s", p->pack_name);
>> > -   if (read_result != sizeof(sha1))
>> > +   if (read_result != hashsz)
>> > return error("packfile %s signature is unavailable", 
>> > p->pack_name);
>> > -   idx_sha1 = ((unsigned char *)p->index_data) + p->index_size - 40;
>> > -   if (hashcmp(sha1, idx_sha1))
>> > +   idx_hash = ((unsigned char *)p->index_data) + p->index_size - hashsz * 
>> > 2;
>> > +   if (hashcmp(hash, idx_hash))
>>
>> Since the hash size is abstracted away, shouldn't this hashcmp become
>> oidcmp? (which still does not do the right thing, but at least it's
>> one less place to worry about)
>
> Unfortunately, I can't, because it's not an object ID.  I think the
> decision was made to not transform non-object ID hashes into struct
> object_id, which makes sense.  I suppose we could have an equivalent
> struct hash or something for those other uses.

I'm probably missing something: is hashcmp() supposed to stay after the
conversion? And will it compare any hash (as configured in the_algo),
or will it do so for SHA-1 only?

If hashcmp() will eventually compare the_algo->rawsz bytes, then yes,
this makes sense.
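
If it does end up comparing the_hash_algo->rawsz bytes, the eventual
shape would presumably be little more than the following sketch (not
the code currently in the tree, just the shape being discussed):

    /* Hedged sketch: the length comes from the configured algorithm,
     * not a hard-coded 20 (assumes cache.h for the_hash_algo). */
    static inline int hashcmp(const unsigned char *a, const unsigned char *b)
    {
            return memcmp(a, b, the_hash_algo->rawsz);
    }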

>> > @@ -675,7 +677,8 @@ struct packed_git *add_packed_git(const char *path, 
>> > size_t path_len, int local)
>> > p->pack_size = st.st_size;
>> > p->pack_local = local;
>> > p->mtime = st.st_mtime;
>> > -   if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1))
>> > +   if (path_len < the_hash_algo->hexsz ||
>> > +   get_sha1_hex(path + path_len - the_hash_algo->hexsz, p->sha1))
>>
>> get_sha1_hex looks out of place when we start going with
>> the_hash_algo. Maybe change to get_oid_hex() too.
>
> I believe this is the pack hash, which isn't an object ID.  I will
> transform it to be called something other than "sha1" and allocate more
> memory for it in a future series, though.

Ah ok, it's the "sha1" in the name that bugged me. I'm all good then.
-- 
Duy


Re: [PATCH v2 00/42] object_id part 13

2018-05-02 Thread Duy Nguyen
On Wed, May 02, 2018 at 12:25:28AM +, brian m. carlson wrote:
> Changes from v1:
> * Add missing sign-off.
> * Removed unneeded braces from init_pack_info.
> * Express 51 in terms of the_hash_algo->hexsz.
> * Fix comments referring to SHA-1.
> * Update commit messages as suggested.
> * Add and use empty_tree_oid_hex and empty_blob_oid_hex.

Interdiff for people who don't have time to read 42 patches yet

diff --git a/builtin/merge.c b/builtin/merge.c
index 8d75ebe64b..7084bcfdea 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -290,7 +290,7 @@ static void read_empty(const struct object_id *oid, int 
verbose)
args[i++] = "-v";
args[i++] = "-m";
args[i++] = "-u";
-   args[i++] = oid_to_hex(the_hash_algo->empty_tree);
+   args[i++] = empty_tree_oid_hex();
args[i++] = oid_to_hex(oid);
args[i] = NULL;
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index c31ceb30c2..dca523f50f 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -968,7 +968,7 @@ static const char *push_to_deploy(unsigned char *sha1,
return "Working directory has unstaged changes";
 
/* diff-index with either HEAD or an empty tree */
-   diff_index[4] = head_has_history() ? "HEAD" : 
oid_to_hex(the_hash_algo->empty_tree);
+   diff_index[4] = head_has_history() ? "HEAD" : empty_tree_oid_hex();
 
child_process_init(&child);
child.argv = diff_index;
diff --git a/cache.h b/cache.h
index c5b041019b..71b3c1b15b 100644
--- a/cache.h
+++ b/cache.h
@@ -1033,6 +1033,9 @@ static inline int is_empty_tree_oid(const struct 
object_id *oid)
return !oidcmp(oid, the_hash_algo->empty_tree);
 }
 
+const char *empty_tree_oid_hex(void);
+const char *empty_blob_oid_hex(void);
+
 /* set default permissions by passing mode arguments to open(2) */
 int git_mkstemps_mode(char *pattern, int suffix_len, int mode);
 int git_mkstemp_mode(char *pattern, int mode);
diff --git a/http.c b/http.c
index ec70676748..312a5e1833 100644
--- a/http.c
+++ b/http.c
@@ -2070,7 +2070,7 @@ int http_get_info_packs(const char *base_url, struct 
packed_git **packs_head)
get_sha1_hex(data + i + 6, hash);
fetch_and_setup_pack_index(packs_head, hash,
  base_url);
-   i += 51;
+   i += hexsz + 11;
break;
}
default:
diff --git a/sequencer.c b/sequencer.c
index 12c1e1cdbb..94b6513402 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1480,8 +1480,7 @@ static int do_pick_commit(enum todo_command command, 
struct commit *commit,
unborn = get_oid("HEAD", &head);
if (unborn)
oidcpy(&head, the_hash_algo->empty_tree);
-   if (index_differs_from(unborn ?
-  oid_to_hex(the_hash_algo->empty_tree) : 
"HEAD",
+   if (index_differs_from(unborn ? empty_tree_oid_hex() : "HEAD",
   NULL, 0))
return error_dirty_index(opts);
}
diff --git a/server-info.c b/server-info.c
index 828ec5e538..7ce6dcd67b 100644
--- a/server-info.c
+++ b/server-info.c
@@ -223,11 +223,9 @@ static void init_pack_info(const char *infofile, int force)
else
stale = 1;
 
-   for (i = 0; i < num_pack; i++) {
-   if (stale) {
+   for (i = 0; i < num_pack; i++)
+   if (stale)
info[i]->old_num = -1;
-   }
-   }
 
/* renumber them */
QSORT(info, num_pack, compare_info);
diff --git a/sha1_file.c b/sha1_file.c
index 794753bd54..bf6c8da3ff 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -109,6 +109,18 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = {
},
 };
 
+const char *empty_tree_oid_hex(void)
+{
+   static char buf[GIT_MAX_HEXSZ + 1];
+   return oid_to_hex_r(buf, the_hash_algo->empty_tree);
+}
+
+const char *empty_blob_oid_hex(void)
+{
+   static char buf[GIT_MAX_HEXSZ + 1];
+   return oid_to_hex_r(buf, the_hash_algo->empty_blob);
+}
+
 /*
  * This is meant to hold a *small* number of objects that you would
  * want read_sha1_file() to be able to return, but yet you do not want
diff --git a/submodule.c b/submodule.c
index 22a96b7af0..ee7eea4877 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1567,7 +1567,7 @@ static void submodule_reset_index(const char *path)
   get_super_prefix_or_empty(), path);
argv_array_pushl(&cp.args, "read-tree", "-u", "--reset", NULL);
 
-   argv_array_push(&cp.args, oid_to_hex(the_hash_algo->empty_tree));
+   argv_array_push(&cp.args, empty_tree_oid_hex());
 
if (run_command(&cp))
die("could not reset submodule index");
@@ -1659,9 +1659,9 @@ int s

Re: [PATCH v2 1/2] commit: fix --short and --porcelain options

2018-05-02 Thread Samuel Lijin
On Tue, May 1, 2018 at 10:50 PM Junio C Hamano  wrote:

> Samuel Lijin  writes:

> > Mark the commitable flag in the wt_status object in the call to
> > `wt_status_collect()`, instead of in `wt_longstatus_print_updated()`,
> > and simplify the logic in the latter function to take advantage of the
> > logic shifted to the former. This means that callers do not need to use
> > `wt_longstatus_print_updated()` to collect the `commitable` flag;
> > calling `wt_status_collect()` is sufficient.
> >
> > As a result, invoking `git commit` with `--short` or `--porcelain`
> > (which imply `--dry-run`, but previously returned an error code
> > inconsistent with dry-run behavior) correctly returns status code
> > zero when there is something to commit. This fixes two bugs documented
> > in the test suite.


> Hmm, I couldn't quite get what the above two paragraphs were trying
> to say, but I think I figured out by looking at wt_status.c before
> applying this patch, so let me see if I correctly understand what
> this patch is about by thinking a bit aloud.

> There are only two assignments to s->commitable in wt-status.c; one
> happens in wt_longstatus_print_updated(), when the function notices
> there is even one record to be shown (i.e. there is an "updated"
> path) and the other in show_merge_in_progress() which is called by
> wt_longstatus_prpint_state().  The latter codepath happens when we
> are in a merge and there is no remaining conflicted paths (the code
> allows the contents to be committed to be identical to HEAD).  Both
> are called from wt_longstatus_print(), which in turn is called by
> wt_status_print().

> The implication of the above observation is that we do not set
> commitable bit (by the way, shouldn't we spell it with two 'T's?)

Yep, MW confirms: https://www.merriam-webster.com/dictionary/commitable

I didn't think to check how common "commitable" is in the codebase, but
judging from the output of `git grep commitable` there aren't too many
instances, so I'll add that to the patch series when I reroll.

> if we are not doing the long format status.  The title hints at it
> but "fix" is too vague.  It would be easier to understand if it
> began like this (i.e. state problem clearly first, before outlining
> the solution):

>  [PATCH 1/2] commit: fix exit status under --short/--porcelain options

>  In wt-status.c, s->commitable bit is set only in the
>  codepaths reachable from wt_status_print() when output
>  format is STATUS_FORMAT_LONG as a side effect of printing
>  bits of status.  Consequently, when running with --short and
>  --porcelain options, the bit is not set to reflect if there
>  is anything to be committed, and "git commit --short" or
>  "--porcelain" (both of which imply "--dry-run") failed to
>  signal if there is anything to commit with its exit status.

>  Instead, update s->commitable bit in wt_status_collect(),
>  regardless of the output format. ...

> Is that what is going on here?  Yours made it sound as if moving the
> code to _collect() was done for the sake of moving code around and
> simplifying the logic, and bugfix fell out of the move merely as a
> side effect, which probably was the source of my confusion.

Yep, that's right. I wasn't sure if the imperative tone was required for
the whole commit or just the description, hence the awkward structure. (I
also wasn't sure how strict the 70 char limit on the description was.)

> > +static void wt_status_mark_commitable(struct wt_status *s) {
> > + int i;
> > +
> > + for (i = 0; i < s->change.nr; i++) {
> > + struct wt_status_change_data *d = (s->change.items[i]).util;
> > +
> > + if (d->index_status && d->index_status != DIFF_STATUS_UNMERGED) {
> > + s->commitable = 1;
> > + return;
> > + }
> > + }
> > +}

> I am not sure if this is sufficient.  From a cursory look of the
> existing code (and vague recollection in my ageing brain ;-), I
> think we say it is committable if

>   (1) when not merging, there is something to show in the "to be
>   committed" section (i.e. there must be something changed since
>   HEAD in the index).

>   (2) when merging, no conflicting paths remain (i.e. change.nr being
>   zero is fine).

> So it is unclear to me how you are dealing with (2) under "--short"
> option, which does not call show_merge_in_progress() to catch that
> case.

And the answer there is that I'm not :) I had hoped that there was a test
to catch mistakes like this but evidently not. Thanks for pointing that
out, I'll add a test to catch that.
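
For case (2), one possible shape would be to fold the merge check into
the collection step as well. A sketch, assuming the collection code can
see the merge state (the helper and field names here loosely follow
wt-status.c and may not match the final reroll):

    static void wt_status_check_committable(struct wt_status *s,
                                            struct wt_status_state *state)
    {
            int i;

            if (state->merge_in_progress) {
                    /* a merge result identical to HEAD is still committable */
                    s->commitable = !has_unmerged(s);
                    return;
            }
            for (i = 0; i < s->change.nr; i++) {
                    struct wt_status_change_data *d = s->change.items[i].util;

                    if (d->index_status &&
                        d->index_status != DIFF_STATUS_UNMERGED) {
                            s->commitable = 1;
                            return;
                    }
            }
    }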

I'm also realizing that I didn't const-ify wt_longstatus_print() in the
next patch, which is another reason I didn't catch this.

This seems a bit trickier to handle. What do you think of this approach:
(1) move wt_status_state into wt_status and (2) move the call to
wt_status_get_sta

[PATCH v4 0/3] add additional config settings for merge

2018-05-02 Thread Ben Peart
This version incorporates Elijah's fixup patch and is now based on
en/rename-directory-detection in next.

Base Ref: en/rename-directory-detection
Web-Diff: https://github.com/benpeart/git/commit/16b175c25f
Checkout: git fetch https://github.com/benpeart/git merge-renames-v4 && git 
checkout 16b175c25f

### Patches

Ben Peart (3):
  merge: update documentation for {merge,diff}.renameLimit
  merge: Add merge.renames config setting
  merge: pass aggressive when rename detection is turned off

 Documentation/diff-config.txt |  3 ++-
 Documentation/merge-config.txt|  8 +-
 Documentation/merge-strategies.txt| 11 +---
 diff.c|  2 +-
 diff.h|  1 +
 merge-recursive.c | 31 ++-
 merge-recursive.h |  8 +-
 t/t3034-merge-recursive-rename-options.sh | 18 +
 8 files changed, 68 insertions(+), 14 deletions(-)


base-commit: c5b761fb2711542073cf1906c0e86a34616b79ae
-- 
2.17.0.windows.1




[PATCH v4 1/3] merge: update documentation for {merge,diff}.renameLimit

2018-05-02 Thread Ben Peart
Update the documentation to better indicate that the renameLimit setting is
ignored if rename detection is turned off via command line options or config
settings.

Signed-off-by: Ben Peart 
---
 Documentation/diff-config.txt  | 3 ++-
 Documentation/merge-config.txt | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index 5ca942ab5e..77caa66c2f 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -112,7 +112,8 @@ diff.orderFile::
 
 diff.renameLimit::
The number of files to consider when performing the copy/rename
-   detection; equivalent to the 'git diff' option `-l`.
+   detection; equivalent to the 'git diff' option `-l`. This setting
+   has no effect if rename detection is turned off.
 
 diff.renames::
Whether and how Git detects renames.  If set to "false",
diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
index 12b6bbf591..48ee3bce77 100644
--- a/Documentation/merge-config.txt
+++ b/Documentation/merge-config.txt
@@ -35,7 +35,8 @@ include::fmt-merge-msg-config.txt[]
 merge.renameLimit::
The number of files to consider when performing rename detection
during a merge; if not specified, defaults to the value of
-   diff.renameLimit.
+   diff.renameLimit. This setting has no effect if rename detection
+   is turned off.
 
 merge.renormalize::
Tell Git that canonical representation of files in the
-- 
2.17.0.windows.1



[PATCH v4 2/3] merge: Add merge.renames config setting

2018-05-02 Thread Ben Peart
Add the ability to control rename detection for merge via a config setting.
This setting behaves the same as diff.renames and defaults to its value,
but applies only to merges.

Reviewed-by: Johannes Schindelin 
Helped-by: Elijah Newren 
Signed-off-by: Ben Peart 
---
 Documentation/merge-config.txt|  5 
 Documentation/merge-strategies.txt| 11 ++---
 diff.c|  2 +-
 diff.h|  1 +
 merge-recursive.c | 30 ++-
 merge-recursive.h |  8 +-
 t/t3034-merge-recursive-rename-options.sh | 18 ++
 7 files changed, 63 insertions(+), 12 deletions(-)

diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
index 48ee3bce77..662c2713ca 100644
--- a/Documentation/merge-config.txt
+++ b/Documentation/merge-config.txt
@@ -38,6 +38,11 @@ merge.renameLimit::
diff.renameLimit. This setting has no effect if rename detection
is turned off.
 
+merge.renames::
+   Whether and how Git detects renames.  If set to "false",
+   rename detection is disabled. If set to "true", basic rename
+   detection is enabled.  Defaults to the value of diff.renames.
+
 merge.renormalize::
Tell Git that canonical representation of files in the
repository has changed over time (e.g. earlier commits record
diff --git a/Documentation/merge-strategies.txt 
b/Documentation/merge-strategies.txt
index fd5d748d1b..30acc99232 100644
--- a/Documentation/merge-strategies.txt
+++ b/Documentation/merge-strategies.txt
@@ -23,8 +23,9 @@ recursive::
causing mismerges by tests done on actual merge commits
taken from Linux 2.6 kernel development history.
Additionally this can detect and handle merges involving
-   renames.  This is the default merge strategy when
-   pulling or merging one branch.
+   renames, but currently cannot make use of detected
+   copies.  This is the default merge strategy when pulling
+   or merging one branch.
 +
 The 'recursive' strategy can take the following options:
 
@@ -84,12 +85,14 @@ no-renormalize;;
`merge.renormalize` configuration variable.
 
 no-renames;;
-   Turn off rename detection.
+   Turn off rename detection. This overrides the `merge.renames`
+   configuration variable.
See also linkgit:git-diff[1] `--no-renames`.
 
 find-renames[=];;
Turn on rename detection, optionally setting the similarity
-   threshold.  This is the default.
+   threshold.  This is the default. This overrides the
+   'merge.renames' configuration variable.
See also linkgit:git-diff[1] `--find-renames`.
 
 rename-threshold=;;
diff --git a/diff.c b/diff.c
index fb22b19f09..e744b35cdc 100644
--- a/diff.c
+++ b/diff.c
@@ -177,7 +177,7 @@ static int parse_submodule_params(struct diff_options 
*options, const char *valu
return 0;
 }
 
-static int git_config_rename(const char *var, const char *value)
+int git_config_rename(const char *var, const char *value)
 {
if (!value)
return DIFF_DETECT_RENAME;
diff --git a/diff.h b/diff.h
index 7cf276f077..966fc8fce6 100644
--- a/diff.h
+++ b/diff.h
@@ -321,6 +321,7 @@ extern int git_diff_ui_config(const char *var, const char 
*value, void *cb);
 extern void diff_setup(struct diff_options *);
 extern int diff_opt_parse(struct diff_options *, const char **, int, const 
char *);
 extern void diff_setup_done(struct diff_options *);
+extern int git_config_rename(const char *var, const char *value);
 
 #define DIFF_DETECT_RENAME 1
 #define DIFF_DETECT_COPY   2
diff --git a/merge-recursive.c b/merge-recursive.c
index 5f42c677d5..372ffbbacc 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1524,7 +1524,15 @@ static struct diff_queue_struct *get_diffpairs(struct 
merge_options *o,
diff_setup(&opts);
opts.flags.recursive = 1;
opts.flags.rename_empty = 0;
-   opts.detect_rename = DIFF_DETECT_RENAME;
+   opts.detect_rename = merge_detect_rename(o);
+   /*
+* We do not have logic to handle the detection of copies.  In
+* fact, it may not even make sense to add such logic: would we
+* really want a change to a base file to be propagated through
+* multiple other files by a merge?
+*/
+   if (opts.detect_rename > DIFF_DETECT_RENAME)
+   opts.detect_rename = DIFF_DETECT_RENAME;
opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
o->diff_rename_limit >= 0 ? o->diff_rename_limit :
1000;
@@ -2564,7 +2572,7 @@ static int handle_renames(struct merge_options *o,
ri->head_renames = NULL;
ri->merge_renames = NULL;
 
-   if (!o->detect_rename)
+   if (!merge_detect_rename(o))
return 1;
 
head_pairs = get_diffpairs(o, common, head);
@@ 

[PATCH v4 3/3] merge: pass aggressive when rename detection is turned off

2018-05-02 Thread Ben Peart
Set aggressive flag in git_merge_trees() when rename detection is turned off.
This allows read_tree() to auto resolve more cases that would have otherwise
been handled by the rename detection.

Reviewed-by: Johannes Schindelin 
Signed-off-by: Ben Peart 
---
 merge-recursive.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 372ffbbacc..cea054cfd4 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -355,6 +355,7 @@ static int git_merge_trees(struct merge_options *o,
o->unpack_opts.fn = threeway_merge;
o->unpack_opts.src_index = &the_index;
o->unpack_opts.dst_index = &the_index;
+   o->unpack_opts.aggressive = !merge_detect_rename(o);
setup_unpack_trees_porcelain(&o->unpack_opts, "merge");
 
init_tree_desc_from_tree(t+0, common);
-- 
2.17.0.windows.1



Re: [PATCH v4 03/10] commit-graph: compute generation numbers

2018-05-02 Thread Jakub Narebski
Derrick Stolee  writes:
> On 4/29/2018 5:08 AM, Jakub Narebski wrote:
>> Derrick Stolee  writes:

[...]
>> It is a bit strange to me that the code uses get_be32 for reading, but
>> htonl for writing.  Is Git tested on non little-endian machines, like
>> big-endian ppc64 or s390x, or on mixed-endian machines (or
>> selectable-endian machines with data endianness set to non
>> little-endian, like ia64)?  If not, could we use for example openSUSE
>> Build Service (https://build.opensuse.org/) for this?
>
> Since we are packing two values into 64 bits, I am using htonl() here
> to arrange the 30-bit generation number alongside the 34-bit commit
> date value, then writing with hashwrite(). The other 32-bit integers
> are written with hashwrite_be32() to avoid translating this data
> in-memory.

O.K., so you are using what is more effective and easier to use.
Nice to know, thanks for the information.
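
For readers following along, the packing itself is roughly the
following (a sketch of the shape only; `f`, `commit_date` and
`generation` stand in for the writer's local state, and the exact code
in the series may differ):

    /* Hedged sketch: 34-bit commit date in the low bits, 30-bit
     * generation number above it, emitted as two big-endian words. */
    uint32_t packed[2];

    packed[0] = htonl(((commit_date >> 32) & 0x3) | (generation << 2));
    packed[1] = htonl((uint32_t)commit_date);
    hashwrite(f, packed, sizeof(packed));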

[...]
>>>   +static void compute_generation_numbers(struct commit** commits,
>>> +  int nr_commits)
>>> +{
[...]
>>> +   for (i = 0; i < nr_commits; i++) {
>>> +   if (commits[i]->generation != GENERATION_NUMBER_INFINITY &&
>>> +   commits[i]->generation != GENERATION_NUMBER_ZERO)
>>> +   continue;

[...]

>>>   + compute_generation_numbers(commits.list, commits.nr);
>>> +
>> Nice and simple.  All right.
>>
>> I guess that we do not pass "struct packed_commit_list commits" as
>> argument to compute_generation_numbers instead of "struct commit**
>> commits.list" and "int commits.nr" to compute_generation_numbers() to
>> keep the latter nice and generic?
>
> Good catch. There is no reason to not use packed_commit_list here.

Actually, now that v5 shows what using packed_commit_list looks like,
in my opinion it looks uglier.  And it might be easier to make a
mistake.

Also, depending on how well the compiler is able to optimize it, the
version passing packed_commit_list as an argument has one more
indirection (following two pointers) in the loop.

Best,
-- 
Jakub Narębski


Good news

2018-05-02 Thread Sgt Monica Lin Brown
I am Sgt. Monica Lin Brown, originally from Lake Jackson, Texas, USA. I have
personally carried out a special search on the internet and came across your
information. I am sending you this e-mail from the US military base in Kabul,
Afghanistan. I have a secured business proposal for you. Ask me if you are
interested in further information


Re: [PATCH 00/13] object store: alloc

2018-05-02 Thread Duy Nguyen
On Tue, May 1, 2018 at 11:33 PM, Stefan Beller  wrote:
> I also debated if it is worth converting alloc.c via this patch series
> or if it might make more sense to use the new mem-pool by Jameson[1].
>
> I vaguely wonder about the performance impact, as the object allocation
> code seemed to be relevant in the past.

If I remember correctly, alloc.c was added because malloc() has too
high an overhead per allocation (and we create millions of them). As
long as you keep allocation overhead low, it should be ok. Note that
we allocate a lot more than the mem-pool's main target (cache entries,
if I remember correctly). We may have a couple thousand cache
entries.  We already deal with a couple million struct objects.
-- 
Duy


Re: [PATCH 01/13] repository: introduce object parser field

2018-05-02 Thread Duy Nguyen
On Tue, May 1, 2018 at 11:33 PM, Stefan Beller  wrote:
> /*
> -* Holds any information related to accessing the raw object content.
> +* Holds any information needed to retrieve the raw content
> +* of objects. The object_parser uses this to get object
> +* content which it then parses.
>  */
> struct raw_object_store *objects;
>
> +   /*
> +* State for the object parser. This owns all parsed objects
> +* (struct object) so callers do not have to manage their
> +* lifetime.
> +*/
> +   struct object_parser *parsed_objects;

I like this name 'parsed_objects'. Should we rename the struct after
it (e.g. parsed_object_store as opposed to raw_object_store above)?

Another suggestion is object_pool, if we keep 'struct object' instead
of 'struct parsed_object' and also want to keep current allocation
behavior: no individual deallocation. If you free, you free the whole
pool (e.g. you could run rev-list --all --objects in a separate pool,
once you're done, you delete the whole pool).
-- 
Duy


Re: [PATCH v4 0/3] add additional config settings for merge

2018-05-02 Thread Elijah Newren
Hi Ben,

Thanks for your persistence.

On Wed, May 2, 2018 at 9:01 AM, Ben Peart  wrote:
> This version incorporates Elijah's fixup patch and is now based on
> en/rename-directory-detection in next.

en/rename-directory-detection was in master but reverted.
en/rename-directory-detection-reboot is the new series, and isn't
quite in next yet; Junio just said in the last "What's cooking" that
he intends to merge it down.  However, the two series are obviously
similar, so your rebasing even onto the old one does cut down on the
conflicts considerably.  Rebasing this latest series onto
en/rename-directory-detection-reboot leaves just one minor conflict.

Anyway, I've looked over this latest re-roll and it looks good to me:
Reviewed-by: Elijah Newren 


Re: [PATCH 01/13] repository: introduce object parser field

2018-05-02 Thread Stefan Beller
On Wed, May 2, 2018 at 10:17 AM, Duy Nguyen  wrote:
> On Tue, May 1, 2018 at 11:33 PM, Stefan Beller  wrote:
>> /*
>> -* Holds any information related to accessing the raw object content.
>> +* Holds any information needed to retrieve the raw content
>> +* of objects. The object_parser uses this to get object
>> +* content which it then parses.
>>  */
>> struct raw_object_store *objects;
>>
>> +   /*
>> +* State for the object parser. This owns all parsed objects
>> +* (struct object) so callers do not have to manage their
>> +* lifetime.
>> +*/
>> +   struct object_parser *parsed_objects;
>
> I like this name 'parsed_objects'. Should we rename the struct after
> it (e.g. parsed_object_store as opposed to raw_object_store above)?

I can rename it to parsed_object_store for consistency.

> Another suggestion is object_pool, if we keep 'struct object' instead
> of 'struct parsed_object' and also want to keep current allocation
> behavior: no individual deallocation. If you free, you free the whole
> pool (e.g. you could run rev-list --all --objects in a separate pool,
> once you're done, you delete the whole pool).

That is what the following patches will be about, you can
only free the whole set of parsed objects.

So if you want to do a separate rev walk, you may need to
instantiate a new repository for it (ideally you'd only need a
separate parsed object store).

I'd want to have the ability to have separate pools for submodules,
such that they can be free'd on a per-repo basis.

> --
> Duy


Re: [PATCH 13/13] alloc: allow arbitrary repositories for alloc functions

2018-05-02 Thread Duy Nguyen
On Tue, May 1, 2018 at 11:34 PM, Stefan Beller  wrote:
>  #include "cache.h"
>  #include "object.h"
> @@ -30,8 +31,25 @@ struct alloc_state {
> int count; /* total number of nodes allocated */
> int nr;/* number of nodes left in current allocation */
> void *p;   /* first free node in current allocation */
> +
> +   /* bookkeeping of allocations */
> +   void **slabs;

Another way to manage this is linked list: you could reserve one
"object" in each slab to store the "next" (or "prev") pointer to
another slab, then you can just walk through all slabs and free. It's
a bit cheaper than reallocating slabs[], but I guess we reallocate so
few times that readability matters more (whichever way is chosen).
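
A minimal sketch of that linked-list bookkeeping, reserving the first
node of each slab for a link to the previous one (the names, including
s->slab_head, are illustrative only; the series below uses a slabs[]
array instead, and this assumes node_size >= sizeof(void *), which
holds for all object types):

    static void alloc_new_slab(struct alloc_state *s, size_t node_size)
    {
            void *slab = xmalloc(BLOCKING * node_size);

            *(void **)slab = s->slab_head;  /* chain to the previous slab */
            s->slab_head = slab;
            s->nr = BLOCKING - 1;           /* one node spent on the link */
            s->p = (char *)slab + node_size;
    }

    static void clear_alloc_state(struct alloc_state *s)
    {
            void *slab = s->slab_head;

            while (slab) {
                    void *prev = *(void **)slab;
                    free(slab);
                    slab = prev;
            }
            s->slab_head = NULL;
    }

alloc_node() would then carve nodes from s->p exactly as it does today.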

> +   int slab_nr, slab_alloc;
>  };
>
> +void *allocate_alloc_state(void)
> +{
> +   return xcalloc(1, sizeof(struct alloc_state));
> +}
> +
> +void clear_alloc_state(struct alloc_state *s)
> +{
> +   while (s->slab_nr > 0) {
> +   s->slab_nr--;
> +   free(s->slabs[s->slab_nr]);

I think you're leaking memory here. Commit and tree objects may have
more allocations in them (especially trees, but I think we have
commit_list in struct commit too). Those need to be freed as well.

> +   }
> +}
> +
>  static inline void *alloc_node(struct alloc_state *s, size_t node_size)
>  {
> void *ret;
-- 
Duy


Re: [PATCH 01/13] repository: introduce object parser field

2018-05-02 Thread Duy Nguyen
On Wed, May 2, 2018 at 7:26 PM, Stefan Beller  wrote:
>> Another suggestion is object_pool, if we keep 'struct object' instead
>> of 'struct parsed_object' and also want to keep current allocation
>> behavior: no individual deallocation. If you free, you free the whole
>> pool (e.g. you could run rev-list --all --objects in a separate pool,
>> once you're done, you delete the whole pool).
>
> That is what the following patches will be about, you can
> only free the whole set of parsed objects.
>
> So if you want to do a separate rev walk, you may need to
> instantiate a new repository for it (ideally you'd only need a
> separate parsed object store).

I'm not sure if it's a good idea to create a separate struct
repository just because you want to free this parsed object store.
What if updates are made in both repositories? All the cache (well,
mostly the delta base cache in sha1_file.c) will double memory usage
as well. Yeah not ideal. But I guess making rev-list related code use
a separate parsed object store is no small task (and kinda risky as
well since we migrate from global lookup_* functions to local ones and
need to choose the correct parsed object store to look up from)

For your information, there is already a good use case for this
wholesale memory free: if we can free the rev-list related memory
early in pack-objects (e.g. part of repack operation) then it could
lower memory pressure significantly when running on large repos. This
has been discussed a bit lately.
-- 
Duy


Re: [PATCH] checkout & worktree: introduce a core.DWIMRemote setting

2018-05-02 Thread Eric Sunshine
On Wed, May 2, 2018 at 11:21 AM, Duy Nguyen  wrote:
> On Wed, May 2, 2018 at 12:54 PM, Ævar Arnfjörð Bjarmason
>  wrote:
>> Introduce a core.DWIMRemote setting which can be used to designate a
>> remote to prefer (via core.DWIMRemote=origin) when running e.g. "git
>> checkout master" to mean origin/master, even though there's other
>> remotes that have the "master" branch.
>
> Do we anticipate more dwimy customizations to justify a new dwim group
> in config (i.e. dwim.remote instead of core.dwimRemote)?

A few observations:

1. DWIM has broad meaning; while this certainly falls within DWIM,
it's also just a _default_, so should this instead be named
"defaultRemote"?

2. Building on #1: How well is the term "DWIM" understood by non-power
users? A term, such as "default" is more well known.

3. git-worktree learned --guess-remote and worktree.guessRemote in [1]
and [2], which, on the surface, sound confusingly similar to
"DWIMRemote". Even though I was well involved in the discussion and
repeatedly reviewed the patch series which introduced those, it still
took me an embarrassingly long time, and repeated re-reads of all
commit messages involved, to grasp (or re-grasp) how "guess remote"
and "DWIM remote" differ. If it was that difficult for me, as a person
involved in the patch series, to figure out "guess remote" vs. "DWIM
remote", then I worry that it may be too confusing in general. If the
new config option had been named "defaultRemote", I don't think I'd
have been confused at all.

It may sound as if I'm advocating the name "defaultRemote", but that
wasn't the intention, and I'm not wedded to it; that name simply fell
out naturally as I enumerated the above observations.

[1]: 71d6682d8c (worktree: add --guess-remote flag to add subcommand,
2017-11-29)
[2]: e92445a731 (add worktree.guessRemote config option, 2017-11-29)


RE: [PATCH 00/13] object store: alloc

2018-05-02 Thread Jameson Miller


> -Original Message-
> From: Duy Nguyen 
> Sent: Wednesday, May 2, 2018 1:02 PM
> To: Stefan Beller 
> Cc: Git Mailing List ; Jameson Miller
> 
> Subject: Re: [PATCH 00/13] object store: alloc
> 
> On Tue, May 1, 2018 at 11:33 PM, Stefan Beller  wrote:
> > I also debated if it is worth converting alloc.c via this patch series
> > or if it might make more sense to use the new mem-pool by Jameson[1].
> >
> > I vaguely wonder about the performance impact, as the object
> > allocation code seemed to be relevant in the past.
> 
> If I remember correctly, alloc.c was added because malloc() has too high
> overhead per allocation (and we create like millions of them). As long as you
> keep allocation overhead low, it should be ok. Note that we allocate a lot 
> more
> than the mem-pool's main target (cache entries if I remember correctly). We
> may have a couple thousands cache entries.  We already deal with a couple
> million of struct object.

The work to move cache entry allocation onto a memory pool was motivated
by the fact that we are looking at indexes with millions of entries. If
there is a scaling concern with the current version of mem-pool, we
would like to address it there as well. Or if there are improvements
that can be shared, that would be nice as well.

> --
> Duy


Re: [PATCH] checkout & worktree: introduce a core.DWIMRemote setting

2018-05-02 Thread Duy Nguyen
On Wed, May 2, 2018 at 8:00 PM, Eric Sunshine  wrote:
> 2. Building on #1: How well is the term "DWIM" understood by non-power
> users? A term, such as "default" is more well known.

I'm going off topic but I kinda dislike this term. First time I
encountered it in the code I didn't even know what it meant. Since it
has not been leaked to user documents (I only did a grep on
Documentation/) perhaps a better term should be used, preferably not
another acronym.
--
Duy


Re: [PATCH 00/13] object store: alloc

2018-05-02 Thread Duy Nguyen
On Wed, May 2, 2018 at 8:07 PM, Jameson Miller  wrote:
>
>
>> -Original Message-
>> From: Duy Nguyen 
>> Sent: Wednesday, May 2, 2018 1:02 PM
>> To: Stefan Beller 
>> Cc: Git Mailing List ; Jameson Miller
>> 
>> Subject: Re: [PATCH 00/13] object store: alloc
>>
>> On Tue, May 1, 2018 at 11:33 PM, Stefan Beller  wrote:
>> > I also debated if it is worth converting alloc.c via this patch series
>> > or if it might make more sense to use the new mem-pool by Jameson[1].
>> >
>> > I vaguely wonder about the performance impact, as the object
>> > allocation code seemed to be relevant in the past.
>>
>> If I remember correctly, alloc.c was added because malloc() has too high
>> overhead per allocation (and we create like millions of them). As long as you
>> keep allocation overhead low, it should be ok. Note that we allocate a lot 
>> more
>> than the mem-pool's main target (cache entries if I remember correctly). We
>> may have a couple thousands cache entries.  We already deal with a couple
>> million of struct object.
>
> The work to move cache entry allocation onto a memory pool was motivated by
> the fact that we are looking at indexes with millions of entries. If there is 
> scaling
> concern with the current version of mem-pool, we would like to address it 
> there
> as well. Or if there is improvements that can be shared, that would be nice 
> as well.

I think the two have quite different characteristics. alloc.c code is
driven by overhead. struct blob is only 24 bytes each and about 1/3 of
the repo is blobs, and each malloc has 16 bytes of overhead or so if I
remember correctly. struct cache_entry is at minimum 88 bytes, so the
relative overhead is not that big a deal (but sure, reducing it is
still very nice).

mem-pool is about allocation speed, but I think that's not a concern
for alloc.c because when we do a full rev walk, I think I/O is always
the bottleneck (maybe object lookup as well). I don't see a good way
to have one memory allocator that satisfies both, to be honest. If
you could allocate fixed-size cache entries most of the time (e.g.
larger entries would be allocated using malloc or something, and should
be a small number), then perhaps we could unify. Or if mem-pool can
have an option to allocate fixed-size pieces with no overhead, that
would be great (sorry, I still have not read that mem-pool series yet).
-- 
Duy


Re: [PATCH] checkout & worktree: introduce a core.DWIMRemote setting

2018-05-02 Thread Ævar Arnfjörð Bjarmason

On Wed, May 02 2018, Eric Sunshine wrote:

> On Wed, May 2, 2018 at 11:21 AM, Duy Nguyen  wrote:
>> On Wed, May 2, 2018 at 12:54 PM, Ævar Arnfjörð Bjarmason
>>  wrote:
>>> Introduce a core.DWIMRemote setting which can be used to designate a
>>> remote to prefer (via core.DWIMRemote=origin) when running e.g. "git
>>> checkout master" to mean origin/master, even though there's other
>>> remotes that have the "master" branch.
>>
>> Do we anticipate more dwimy customizations to justify a new dwim group
>> in config (i.e. dwim.remote instead of core.dwimRemote)?
>
> A few observations:
>
> 1. DWIM has broad meaning; while this certainly falls within DWIM,
> it's also just a _default_, so should this instead be named
> "defaultRemote"?
>
> 2. Building on #1: How well is the term "DWIM" understood by non-power
> users? A term, such as "default" is more well known.

I've got no love for the DWIM term, and I think I should change it in
v2; I just want some way to enable this functionality, since this
behavior has annoyed me for a long time.

I wonder though if something like "core.defaultRemote" might not bring
up connotations about whether this would e.g. affect "push" in the minds
of users (not that my initial suggestion is better).

So maybe something like checkout.implicitRemote would be better? And we
could just break the rule that only git-checkout would use it, since
git-worktree could be said to be doing something checkout-like, or just
also add a worktree.implicitRemote.
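
Whichever name wins, the consuming side would stay tiny. A sketch, with
"checkout.defaultRemote" standing in for the final key (both the key
name and the helper here are assumptions, not an existing interface):

    /* Hedged sketch: when several remotes have the requested branch,
     * prefer the one named by a (hypothetical) checkout.defaultRemote
     * setting instead of refusing to guess. */
    static const char *preferred_dwim_remote(void)
    {
            const char *name;

            if (git_config_get_string_const("checkout.defaultRemote", &name))
                    return NULL;    /* not configured: keep today's behavior */
            return name;
    }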

> 3. git-worktree learned --guess-remote and worktree.guessRemote in [1]
> and [2], which, on the surface, sound confusingly similar to
> "DWIMRemote". Even though I was well involved in the discussion and
> repeatedly reviewed the patch series which introduced those, it still
> took me an embarrassingly long time, and repeated re-reads of all
> commit messages involved, to grasp (or re-grasp) how "guess remote"
> and "DWIM remote" differ. If it was that difficult for me, as a person
> involved in the patch series, to figure out "guess remote" vs. "DWIM
> remote", then I worry that it may be too confusing in general. If the
> new config option had been named "defaultRemote", I don't think I'd
> have been confused at all.

I hadn't looked at this at all before today when I wrote the patch, and
I've never used git-worktree (but maybe I should...), but my
understanding of this --[no-]guess-remote functionality is that it
effectively splits up the functionality that "git checkout" does: we'll
unconditionally check out the branch, but the option controls whether
or not we'd set up the equivalent of remote tracking for git-worktree.

But maybe I've completely misunderstood it.

> It may sound as if I'm advocating the name "defaultRemote", but that
> wasn't the intention, and I'm not wedded to it; that name simply fell
> out naturally as I enumerated the above observations.
>
> [1]: 71d6682d8c (worktree: add --guess-remote flag to add subcommand,
> 2017-11-29)
> [2]: e92445a731 (add worktree.guessRemote config option, 2017-11-29)


RE: [PATCH 00/13] object store: alloc

2018-05-02 Thread Jameson Miller


> -Original Message-
> From: Duy Nguyen 
> Sent: Wednesday, May 2, 2018 2:23 PM
> To: Jameson Miller 
> Cc: Stefan Beller ; Git Mailing List 
> Subject: Re: [PATCH 00/13] object store: alloc
> 
> On Wed, May 2, 2018 at 8:07 PM, Jameson Miller 
> wrote:
> >
> >
> >> -Original Message-
> >> From: Duy Nguyen 
> >> Sent: Wednesday, May 2, 2018 1:02 PM
> >> To: Stefan Beller 
> >> Cc: Git Mailing List ; Jameson Miller
> >> 
> >> Subject: Re: [PATCH 00/13] object store: alloc
> >>
> >> On Tue, May 1, 2018 at 11:33 PM, Stefan Beller 
> wrote:
> >> > I also debated if it is worth converting alloc.c via this patch
> >> > series or if it might make more sense to use the new mem-pool by
> Jameson[1].
> >> >
> >> > I vaguely wonder about the performance impact, as the object
> >> > allocation code seemed to be relevant in the past.
> >>
> >> If I remember correctly, alloc.c was added because malloc() has too
> >> high overhead per allocation (and we create like millions of them).
> >> As long as you keep allocation overhead low, it should be ok. Note
> >> that we allocate a lot more than the mem-pool's main target (cache
> >> entries if I remember correctly). We may have a couple thousands
> >> cache entries.  We already deal with a couple million of struct object.
> >
> > The work to move cache entry allocation onto a memory pool was
> > motivated by the fact that we are looking at indexes with millions of
> > entries. If there is scaling concern with the current version of
> > mem-pool, we would like to address it there as well. Or if there is
> improvements that can be shared, that would be nice as well.
> 
> I think the two have quite different characteristics. alloc.c code is driven 
> by
> overhead. struct blob is only 24 bytes each and about 1/3 the repo is blobs, 
> and
> each malloc has 16 bytes overhead or so if I remember correctly. struct
> cache_entry at minimum in 88 bytes so relative overhead is not that a big deal
> (but sure reducing it is still very nice).
> 
> mem-pool is about allocation speed, but I think that's not a concern for 
> alloc.c
> because when we do full rev walk, I think I/O is always the bottleneck (maybe
> object lookup as well). I don't see a good way to have the one memory 
> allocator
> that satisfyies both to be honest. If you could allocate fixed-size cache 
> entries
> most of the time (e.g.
> larger entries will be allocated using malloc or something, and should be a 
> small
> number), then perhaps we can unify. Or if mem-pool can have an option to
> allocated fixed size pieces with no overhead, that would be great (sorry I 
> still
> have not read that mem-pool series yet)
> --
> Duy

Thank you for the extra details - the extra context was helpful,
especially the motivations for each of the areas. I agree with
your general analysis, but with the extra point that the memory
pool does allocate (variable-sized) memory without any
per-allocation overhead, except for possible alignment
considerations and the bookkeeping for the larger "blocks" of
memory from which small allocations are made - but I don't think
this would be enough to have a meaningful overall impact.

The mem-pool only tracks the pointer to the next available bit of
memory, and the end of the available memory range. It has a
similar constraint in that individual allocations cannot be freed
- you have to free the whole block.

It may be that the requirements are different enough (or the
gains worth it) to have another dedicated pooling allocator, but
I think the current design of the memory pool would satisfy both
consumers, even if the memory considerations are a bigger
motivation for blob structs. I would be interested in your
thoughts if you get the opportunity to read the mem-pool series.
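
To make that concrete, the fast path being described is just a pointer
bump. A simplified sketch of an allocator of that shape (the field and
function names are illustrative, not the actual mem-pool API; it
assumes len never exceeds block_size and ignores alignment for
brevity):

    struct pool_block {
            char *next_free;
            char *end;
            struct pool_block *prev;  /* blocks are only freed wholesale */
    };

    static void *pool_alloc(struct pool_block **head, size_t len,
                            size_t block_size)
    {
            struct pool_block *b = *head;

            if (!b || (size_t)(b->end - b->next_free) < len) {
                    b = xmalloc(sizeof(*b) + block_size);
                    b->next_free = (char *)(b + 1);
                    b->end = b->next_free + block_size;
                    b->prev = *head;
                    *head = b;
            }
            b->next_free += len;
            return b->next_free - len;
    }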


Re: [PATCH] git-p4 - Add option --sha1 to submit in p4

2018-05-02 Thread Luke Diamand
On 2 May 2018 at 16:51, Merland Romain  wrote:
> Thanks Luke,
>
> Following your comments, I will:
> - change the option name to --commit if it suits you

Seems like a good name.

> - add an option --force-rebase which defaults to false. Setting it to true
> will rebase even with --commit option used.

Or "--disable-rebase" ?  I think there's no reason you couldn't rebase
after doing a commit like this is there?

And "--disable-rebase" would be useful generally, not just with the
--commit option, but also with the regular options.

"--force-rebase" is a bit confusing since for the normal flow, it
always rebases anyway, and there's no need to force!

But you're the one who is using this in anger, so your opinion likely
counts for more than mine!

> it can be useful too if some commits have been skipped and user wants to
> rebase anyway
> - add an entry in the documentation
> - (learn how to create a test and) add tests for these specific cases

To create a test just look in t/ for the t98*.sh files. Cut-n-paste
one or add to an existing one. Send an email here if you get stuck
(but it's pretty straightforward).

You can run only the git-p4 tests with:

$ (cd t && make T=t98* -j)

>
> What is the best practice ? to send a new email with the new diff ? to
> continue this discussion ?

I think either, probably a new email is easiest. See
Documentation/SubmittingPatches for the general process.


Regards!
Luke


Re: [PATCH 01/13] repository: introduce object parser field

2018-05-02 Thread Jonathan Tan
On Tue,  1 May 2018 14:33:51 -0700
Stefan Beller  wrote:

> Git's object access code can be thought of as containing two layers:
> the raw object store provides access to raw object content, while the
> higher level obj_hash code parses raw objects and keeps track of
> parenthood and other object relationships using 'struct object'.
> Keeping these layers separate should make it easier to find relevant
> functions and to change the implementation of one without having to
> touch the other.

I only understood this after reading the code below. I think this
paragraph can be removed; I don't see its relevance - of course we need
to store metadata about how to load objects somewhere, and caching
objects we have already loaded is a good idea: and the metadata and
cache are two separate things before and after this patch anyway.

> Add an object_parser field to 'struct repository' to prepare obj_hash
> to be handled per repository.  Callers still only use the_repository
> for now --- later patches will adapt them to handle arbitrary
> repositories.

I think this is better reworded as:

  Convert the existing global cache for parsed objects (obj_hash) into
  repository-specific parsed object caches. Existing code that uses
  obj_hash are modified to use the parsed object cache of
  the_repository; future patches will use the parsed object caches of
  other repositories.

> +struct object_parser *object_parser_new(void)
> +{
> + struct object_parser *o = xmalloc(sizeof(*o));
> + memset(o, 0, sizeof(*o));
> + return o;
> +}

I'm not sure that I like this style of code (I prefer the strbuf style
with _INIT and release(), which I think is more flexible) but I don't
feel too strongly about it.
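
For comparison, the strbuf-flavoured interface would look roughly like
this (a sketch only; freeing the individual parsed objects, discussed
elsewhere in the thread, is left out of scope):

    #define OBJECT_PARSER_INIT { NULL, 0, 0 }

    void object_parser_release(struct object_parser *o)
    {
            FREE_AND_NULL(o->obj_hash);
            o->nr_objs = 0;
            o->obj_hash_size = 0;
    }

    /* usage: struct object_parser parsed = OBJECT_PARSER_INIT; */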

> +struct object_parser {
> + struct object **obj_hash;
> + int nr_objs, obj_hash_size;
> +};

object_parser is probably a bad name. I think Duy had some good
suggestions in [1].

[1] 
https://public-inbox.org/git/CACsJy8CgX6BME=ehidbfmrzboydbnhe6ouxv4fzc-gw7rsi...@mail.gmail.com/

>   /*
> -  * Holds any information related to accessing the raw object content.
> +  * Holds any information needed to retrieve the raw content
> +  * of objects. The object_parser uses this to get object
> +  * content which it then parses.
>*/
>   struct raw_object_store *objects;

I think the additional sentence ("The object_parser uses this...") is
unnecessary and confusing, especially if its identity is going to be one
of storage (like "parsed_objects" implies).

> + /*
> +  * State for the object parser. This owns all parsed objects
> +  * (struct object) so callers do not have to manage their
> +  * lifetime.
> +  */
> + struct object_parser *parsed_objects;

Even after all the commits in this patch set, this does not store any
state for parsing. Maybe just document as:

  All objects in this repository that have been parsed. This structure
  owns all objects it references, so users of "struct object *"
  generally do not need to free them; instead, when a repository is no
  longer used, call object_parser_clear() on this structure.

(And maybe say that the freeing method on struct repository will
automatically call object_parser_clear().)


Re: [PATCH 04/13] alloc: add repository argument to alloc_blob_node

2018-05-02 Thread Jonathan Tan
On Tue,  1 May 2018 14:33:54 -0700
Stefan Beller  wrote:

> Signed-off-by: Stefan Beller 

Add the same boilerplate explanation to this and subsequent commits. If
editing it for every new function name is cumbersome, maybe use this
shortened version:

  This is a small mechanical change; it doesn't change the
  implementation to handle repositories other than the_repository yet.
  Use a macro to catch callers passing a repository other than
  the_repository at compile time.


Re: [PATCH 12/13] object: allow create_object to handle arbitrary repositories

2018-05-02 Thread Jonathan Tan
On Tue,  1 May 2018 14:34:02 -0700
Stefan Beller  wrote:

> Signed-off-by: Jonathan Nieder 
> Signed-off-by: Stefan Beller 

Reviewed-by: Jonathan Tan 

Downloading this patch set and viewing the whole function modified in
this patch shows that globals are no longer referenced, so this is good.
Same comment for patch 11.


Re: [PATCH 13/13] alloc: allow arbitrary repositories for alloc functions

2018-05-02 Thread Jonathan Tan
On Tue,  1 May 2018 14:34:03 -0700
Stefan Beller  wrote:

> +void *allocate_alloc_state(void)
> +{
> + return xcalloc(1, sizeof(struct alloc_state));
> +}
> +
> +void clear_alloc_state(struct alloc_state *s)
> +{
> + while (s->slab_nr > 0) {
> + s->slab_nr--;
> + free(s->slabs[s->slab_nr]);
> + }
> +}

These functions are asymmetrical. I understand why it is this way
(because we use pointers, and we want to use FREE_AND_NULL), but would
still prefer _INIT and _release().

>  static inline void *alloc_node(struct alloc_state *s, size_t node_size)
>  {
>   void *ret;
> @@ -45,54 +63,56 @@ static inline void *alloc_node(struct alloc_state *s, 
> size_t node_size)
>   ret = s->p;
>   s->p = (char *)s->p + node_size;
>   memset(ret, 0, node_size);
> +
> + ALLOC_GROW(s->slabs, s->slab_nr + 1, s->slab_alloc);
> + s->slabs[s->slab_nr++] = ret;
> +
>   return ret;
>  }

This unconditionally grows the slabs for each node allocation. Shouldn't
more than one node fit in each slab?
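
In other words, the slabs[] entry only needs to be recorded when a new
block is carved out, which keeps the bookkeeping proportional to the
number of blocks. Roughly (a sketch around the existing alloc_node(),
with the surrounding logic paraphrased):

    static inline void *alloc_node(struct alloc_state *s, size_t node_size)
    {
            void *ret;

            if (!s->nr) {
                    s->count += BLOCKING;
                    s->nr = BLOCKING;
                    s->p = xmalloc(BLOCKING * node_size);

                    /* one slabs[] entry per block, not per node */
                    ALLOC_GROW(s->slabs, s->slab_nr + 1, s->slab_alloc);
                    s->slabs[s->slab_nr++] = s->p;
            }
            s->nr--;
            ret = s->p;
            s->p = (char *)s->p + node_size;
            memset(ret, 0, node_size);
            return ret;
    }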

> +extern struct alloc_state the_repository_blob_state;
> +extern struct alloc_state the_repository_tree_state;
> +extern struct alloc_state the_repository_commit_state;
> +extern struct alloc_state the_repository_tag_state;
> +extern struct alloc_state the_repository_object_state;

(Context: these were in alloc.h)

These seem to be used only in object.c, and only in one function - could
we declare them "static" inside that function instead?

> -struct object_parser *object_parser_new(void)
> +struct object_parser *object_parser_new(int is_the_repo)
>  {
>   struct object_parser *o = xmalloc(sizeof(*o));
>   memset(o, 0, sizeof(*o));
> +
> + if (is_the_repo) {
> + o->blob_state = &the_repository_blob_state;
> + o->tree_state = &the_repository_tree_state;
> + o->commit_state = &the_repository_commit_state;
> + o->tag_state = &the_repository_tag_state;
> + o->object_state = &the_repository_object_state;
> + } else {
> + o->blob_state = allocate_alloc_state();
> + o->tree_state = allocate_alloc_state();
> + o->commit_state = allocate_alloc_state();
> + o->tag_state = allocate_alloc_state();
> + o->object_state = allocate_alloc_state();
> + }
>   return o;
>  }

I don't think saving 5 allocations is worth the code complexity (of the
additional parameter).


Re: [PATCH 08/41] packfile: abstract away hash constant values

2018-05-02 Thread brian m. carlson
On Wed, May 02, 2018 at 05:26:25PM +0200, Duy Nguyen wrote:
> On Wed, May 2, 2018 at 2:11 AM, brian m. carlson
>  wrote:
> > On Tue, May 01, 2018 at 12:22:43PM +0200, Duy Nguyen wrote:
> >> While we're abstracting away 20. There's the only 20 left in
> >> sha1_file.c that should also be gone. But I guess you could do that
> >> later since you need to rename fill_sha1_path to
> >> fill_loose_object_path or something.
> >
> > I'm already working on knocking those out.
> 
> Yeah I checked out your part14 branch after writing this note :P You
> still need to rename the function though. I can remind that again when
> part14 is sent out.

I've made a note in my project notes.

> > Unfortunately, I can't, because it's not an object ID.  I think the
> > decision was made to not transform non-object ID hashes into struct
> > object_id, which makes sense.  I suppose we could have an equivalent
> > struct hash or something for those other uses.
> 
> I'm probably missing something: is hashcmp() supposed to stay after the
> conversion? And will it compare any hash (as configured in the_algo),
> or will it work for SHA-1 only?

Yes, it will stick around for the handful of places where we have hashes
like pack checksums.

> If hashcmp() will eventually compare the_algo->rawsz then yes this makes 
> sense.

That's my intention, yes.
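
Roughly speaking, I expect it to end up as something like this (a sketch,
not what is in the series yet; the_hash_algo here is the repository's
struct git_hash_algo):

static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
{
	/* compare however many bytes the configured hash algorithm uses */
	return memcmp(sha1, sha2, the_hash_algo->rawsz);
}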

> > I believe this is the pack hash, which isn't an object ID.  I will
> > transform it to be called something other than "sha1" and allocate more
> > memory for it in a future series, though.
> 
> Ah ok, it's the "sha1" in the name that bugged me. I'm all good then.

Also noted in my project notes.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: What's cooking in git.git (Apr 2018, #04; Mon, 30)

2018-05-02 Thread Brandon Williams
On 04/30, Junio C Hamano wrote:
> * bw/protocol-v2 (2018-03-15) 35 commits
>   (merged to 'next' on 2018-04-11 at 23ee234a2c)
>  + remote-curl: don't request v2 when pushing
>  + remote-curl: implement stateless-connect command
>  + http: eliminate "# service" line when using protocol v2
>  + http: don't always add Git-Protocol header
>  + http: allow providing extra headers for http requests
>  + remote-curl: store the protocol version the server responded with
>  + remote-curl: create copy of the service name
>  + pkt-line: add packet_buf_write_len function
>  + transport-helper: introduce stateless-connect
>  + transport-helper: refactor process_connect_service
>  + transport-helper: remove name parameter
>  + connect: don't request v2 when pushing
>  + connect: refactor git_connect to only get the protocol version once
>  + fetch-pack: support shallow requests
>  + fetch-pack: perform a fetch using v2
>  + upload-pack: introduce fetch server command
>  + push: pass ref prefixes when pushing
>  + fetch: pass ref prefixes when fetching
>  + ls-remote: pass ref prefixes when requesting a remote's refs
>  + transport: convert transport_get_remote_refs to take a list of ref prefixes
>  + transport: convert get_refs_list to take a list of ref prefixes
>  + connect: request remote refs using v2
>  + ls-refs: introduce ls-refs server command
>  + serve: introduce git-serve
>  + test-pkt-line: introduce a packet-line test helper
>  + protocol: introduce enum protocol_version value protocol_v2
>  + transport: store protocol version
>  + connect: discover protocol version outside of get_remote_heads
>  + connect: convert get_remote_heads to use struct packet_reader
>  + transport: use get_refs_via_connect to get refs
>  + upload-pack: factor out processing lines
>  + upload-pack: convert to a builtin
>  + pkt-line: add delim packet support
>  + pkt-line: allow peeking a packet line without consuming it
>  + pkt-line: introduce packet_read_with_status
>  (this branch is used by bw/server-options.)
> 
>  The beginning of the next-gen transfer protocol.
> 
>  Will merge to 'master'.

Awesome, I see you're already planning on merging this!  I was just
going to drop by and say that internal testing showed that this *should*
be ready to be merged to master :)  Thanks!

-- 
Brandon Williams


Re: [PATCH v2 00/42] object_id part 13

2018-05-02 Thread brian m. carlson
On Wed, May 02, 2018 at 05:32:24PM +0200, Duy Nguyen wrote:
> Interdiff for people who don't have time to read 42 patches yet

Thanks for this.  I had intended to include tbdiff output in my series
but had forgotten when I reformatted the patches.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Blame / annotate with mixed mac line endings?

2018-05-02 Thread Jason Pyeron
I think this file may have been from a mac at one point. Note the lone CR 
(0x0d) line endings, with CRLFs later.

$ cat src/htdocs/hr/ats/rpts/incl/slt_org.cfm | sed 's/[a-zA-Z]/x/g;s/[0-9]/8/g' | hexdump -C
[hexdump output trimmed; it shows lone CR (0x0d) bytes used as line
endings early in the file and CRLF (0x0d 0x0a) endings later on]

$ git annotate -L 13,17 --line-porcelain 
'src/htdocs/hr/ats/rpts/incl/slt_org.cfm'
fatal: file src/htdocs/hr/ats/rpts/incl/slt_org.cfm has only 10 lines

$ cat src/htdocs/hr/ats/rpts/incl/slt_org.cfm | wc -l
10

$ cat src/htdocs/hr/ats/rpts/incl/slt_org.cfm | mac2unix | wc -l
18

$ git -c core.autocrlf=false -c core.eol=cr annotate -L 13,17 --line-porcelain 
'src/htdocs/hr/ats/rpts/incl/slt_org.cfm'
fatal: file src/htdocs/hr/ats/rpts/incl/slt_org.cfm has only 10 lines

Any way to hit git with a stick to treat lone CR as a line break for 
blame/annotate?

-Jason

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-   -
- Jason Pyeron  PD Inc. http://www.pdinc.us -
- Principal Consultant  10 West 24th Street #100-
- +1 (443) 269-1555 x333Baltimore, Maryland 21218   -
-   -
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-



Re: [GSoC] Info: new blog series of my work on Git GSoC '18

2018-05-02 Thread Andrew Ardill
On 2 May 2018 at 17:12, Johannes Schindelin  wrote:
> Hi Pratik,
>
> On Wed, 2 May 2018, Pratik Karki wrote:
>
>> As promised in my proposal, I've started
>> to write a blog series of GSoC '18 with Git. The initial blog is up.
>> You can find it here[1]. The initial one is just to get started and
>> from next iterations, I'll start detailing of my work towards converting
>> rebase to builtin.
>>
>> [1]: https://prertik.github.io/categories/git/
>
> This is awesome! Thanks for doing this,
> Dscho

Agreed, was fun to read.

I'd encourage you to post to the list when you blog, or perhaps
include a link to the blog as part of any regular updates you give on
your project progress.

Would also make for an interesting addition to the newsletter.

I know it can be difficult to sit down and write about what you're
doing, especially when it feels like you could be focusing on 'real
work'. Hopefully you will find the process rewarding; I'm looking
forward to reading about what you find easy and hard, how you learn
the git developer processes, and the challenges you find in converting
shell scripts to a built-in. I'm sure other people are too, and I'll
bet the ones who have been there before will have feedback for you as
well.

I'd find it interesting even if it was a 5-line bullet list of what's
going through your mind with respect to the project! Looking forward
to following along.

Regards,

Andrew Ardill


[PATCH 1/5] submodule foreach: correct '$path' in nested submodules from a subdirectory

2018-05-02 Thread Stefan Beller
From: Prathamesh Chavan 

When running 'git submodule foreach --recursive' from a subdirectory of
your repository, nested submodules get a bogus value for $path:
For a submodule 'sub' that contains a nested submodule 'nested',
running 'git -C dir submodule foreach echo $path' from the root of the
superproject would report path='../nested' for the nested submodule.
The first part '../' is derived from the logic computing the relative
path from $pwd to the root of the superproject. The second part is the
submodule path inside the submodule. This value is of little use and is
hard to document.

There are three different possible solutions that have more value:
(a) The path value is documented as the path from the toplevel of the
superproject to the mount point of the submodule. If 'the' refers to
the superproject holding this submodule ('sub' holding 'nested'),
the path would be expected to be path='nested'.
(b) In case 'the' superproject is referring to the toplevel, which
is the superproject in which the original command was invoked,
then path is expected to be path='sub/nested'.
(c) The documentation explains $path as [...] "relative to the
superproject", following 091a6eb0fe (submodule: drop the
top-level requirement, 2013-06-16), such that the nested submodule
would be expected as path='../sub/nested', when "the" superproject
is the superproject, where the command was run from
(d) or the value of path='nested' is expected if we take the
intermediate superproject into account. [This is the same as
(a); it highlights that the documentation is not clear, but
technically correct if we were to revert 091a6eb0fe.]

The behavior for (c) was introduced in 091a6eb0fe (submodule: drop the
top-level requirement, 2013-06-16); the intent for $path seemed to be
the relative path from $cwd to the submodule worktree, but that did not
work for nested submodules, as the intermediate submodules were not
included in the path.

If we were to fix the meaning of the $path using (a), (d) such that "path"
is "the path from the intermediate superproject to the mount point of the
submodule", we would break any existing submodule user that runs foreach
from non-root of the superproject as the non-nested submodule
'../sub' would change its path to 'sub'.

If we were to fix the meaning of $path using (b) such that "path"
is "the path from the topmost superproject to the mount point of the
submodule", then we would break any user that uses nested submodules
(even from the root directory) as the 'nested' would become 'sub/nested'.

If we were to fix the meaning of $path using (c) such that "path"
is "the display path from where the original command was invoked to the
submodule", then we would break users that rely on the assumption that
"$toplevel / $path" is the absolute path of the submodule.

All groups can be found in the wild.  The author has no data on whether
one group outweighs the others by a large margin, and offending each one
seems equally bad at first.  However, in the author's estimation it is
better to go with (a), as running from a subdirectory sounds like it is
carried out by a human rather than by some automation task.  With a human
at the keyboard the feedback loop is short and the changed behavior can be
adapted to quickly, unlike some automation that can break silently.

Another argument in favor of (a) is the consistency of the variables
provided, "$toplevel / $path" gives the absolute path of the submodule,
with 'toplevel' (both the variable as well as the documentation) referring
to the immediate superproject of the submodule.

Documentation of the variable is adjusted in a follow-up patch.

Discussed-with: Ramsay Jones 
Signed-off-by: Prathamesh Chavan 
Signed-off-by: Stefan Beller 
Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 git-submodule.sh |  1 -
 t/t7407-submodule-foreach.sh | 36 ++--
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 24914963ca2..331d71c908b 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -345,7 +345,6 @@ cmd_foreach()
prefix="$prefix$sm_path/"
sanitize_submodule_env
cd "$sm_path" &&
-   sm_path=$(git submodule--helper relative-path 
"$sm_path" "$wt_prefix") &&
# we make $path available to scripts ...
path=$sm_path &&
if test $# -eq 1
diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
index 6ba5daf42ee..5144cc6926b 100755
--- a/t/t7407-submodule-foreach.sh
+++ b/t/t7407-submodule-foreach.sh
@@ -82,9 +82,9 @@ test_expect_success 'test basic "submodule foreach" usage' '
 
 cat >expect  expect <

[PATCH 0/5] Rebooting pc/submodule-helper-foreach

2018-05-02 Thread Stefan Beller
The "What's cooking" email carried this series for some time now:
> * pc/submodule-helper-foreach (2018-02-02) 5 commits
>  - submodule: port submodule subcommand 'foreach' from shell to C
> - submodule foreach: document variable '$displaypath'
>  - submodule foreach: clarify the '$toplevel' variable documentation
>  - submodule foreach: document '$sm_path' instead of '$path'
>  - submodule foreach: correct '$path' in nested submodules from a subdirectory
> 
>  Expecting a response to review comments
>  e.g. cf. <20180206150044.1bffbb573c088d38c8e44...@google.com>

I reworded the commit message of the first patch and nearly confused
myself again, as "toplevel" doesn't refer to the "topmost" superproject,
just the direct superproject of the submodule.

However I think the code of the first patch is correct as originally presented.
Just the wording of the commit message was changed to explain the reasoning
more extensively.

With this series, we get
* keep the invariant that $toplevel + $path is an absolute path that
  correctly points at the submodule; "git -C $toplevel config" plus $name
  allows asking for the configuration of the submodule.
* $displaypath will have the relative path from $pwd to the submodule root.
* better documentation.

In later patches we could add $topmost, that points at the superproject
in which the command was started from, and $path_from_topmost, that would
be the relative path from $topmost to the submodule, potentially skipping
intermediate superprojects.

Thanks,
Stefan

Prathamesh Chavan (5):
  submodule foreach: correct '$path' in nested submodules from a
subdirectory
  submodule foreach: document '$sm_path' instead of '$path'
  submodule foreach: clarify the '$toplevel' variable documentation
  submodule foreach: document variable '$displaypath'
  submodule: port submodule subcommand 'foreach' from shell to C

 Documentation/git-submodule.txt |  15 ++--
 builtin/submodule--helper.c | 148 
 git-submodule.sh|  40 +
 t/t7407-submodule-foreach.sh|  38 +++-
 4 files changed, 194 insertions(+), 47 deletions(-)

-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 2/5] submodule foreach: document '$sm_path' instead of '$path'

2018-05-02 Thread Stefan Beller
From: Prathamesh Chavan 

Using the variable '$path' may be harmful to users due to
capitalization issues; see 64394e3ae9 (git-submodule.sh: Don't
use $path variable in eval_gettext string, 2012-04-17). Adjust
the documentation to advocate using $sm_path, which contains
the same value. We still make the 'path' variable available and
document it as a deprecated synonym of 'sm_path'.

Discussed-with: Ramsay Jones 
Signed-off-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
Signed-off-by: Junio C Hamano 
---
 Documentation/git-submodule.txt | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 71c5618e82a..755ed695f08 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -183,12 +183,14 @@ information too.
 
 foreach [--recursive] <command>::
Evaluates an arbitrary shell command in each checked out submodule.
-   The command has access to the variables $name, $path, $sha1 and
+   The command has access to the variables $name, $sm_path, $sha1 and
$toplevel:
$name is the name of the relevant submodule section in `.gitmodules`,
-   $path is the name of the submodule directory relative to the
-   superproject, $sha1 is the commit as recorded in the superproject,
-   and $toplevel is the absolute path to the top-level of the superproject.
+   $sm_path is the path of the submodule as recorded in the superproject,
+   $sha1 is the commit as recorded in the superproject, and
+   $toplevel is the absolute path to the top-level of the superproject.
+   Note that to avoid conflicts with '$PATH' on Windows, the '$path'
+   variable is now a deprecated synonym of '$sm_path' variable.
Any submodules defined in the superproject but not checked out are
ignored by this command. Unless given `--quiet`, foreach prints the name
of each submodule before evaluating the command.
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 3/5] submodule foreach: clarify the '$toplevel' variable documentation

2018-05-02 Thread Stefan Beller
From: Prathamesh Chavan 

It does not contain the topmost superproject as the author assumed,
but the direct superproject, such that $toplevel/$sm_path is the
actual absolute path of the submodule.

Discussed-with: Ramsay Jones 
Signed-off-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
Signed-off-by: Junio C Hamano 
Signed-off-by: Stefan Beller 
---
 Documentation/git-submodule.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 755ed695f08..408d5a0387f 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -188,7 +188,8 @@ foreach [--recursive] <command>::
$name is the name of the relevant submodule section in `.gitmodules`,
$sm_path is the path of the submodule as recorded in the superproject,
$sha1 is the commit as recorded in the superproject, and
-   $toplevel is the absolute path to the top-level of the superproject.
+   $toplevel is the absolute path to its direct superproject, such that
+   $toplevel/$sm_path is the absolute path of the submodule.
Note that to avoid conflicts with '$PATH' on Windows, the '$path'
variable is now a deprecated synonym of '$sm_path' variable.
Any submodules defined in the superproject but not checked out are
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 4/5] submodule foreach: document variable '$displaypath'

2018-05-02 Thread Stefan Beller
From: Prathamesh Chavan 

It was observed that the variable '$displaypath' was accessible but
undocumented. Hence, document it.

Discussed-with: Ramsay Jones 
Signed-off-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
Signed-off-by: Junio C Hamano 
---
 Documentation/git-submodule.txt |  6 --
 t/t7407-submodule-foreach.sh| 22 +++---
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 408d5a0387f..4372d00c42e 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -183,10 +183,12 @@ information too.
 
 foreach [--recursive] <command>::
Evaluates an arbitrary shell command in each checked out submodule.
-   The command has access to the variables $name, $sm_path, $sha1 and
-   $toplevel:
+   The command has access to the variables $name, $sm_path, $displaypath,
+   $sha1 and $toplevel:
$name is the name of the relevant submodule section in `.gitmodules`,
$sm_path is the path of the submodule as recorded in the superproject,
+   $displaypath contains the relative path from the current working
+   directory to the submodules root directory,
$sha1 is the commit as recorded in the superproject, and
$toplevel is the absolute path to its direct superproject, such that
$toplevel/$sm_path is the absolute path of the submodule.
diff --git a/t/t7407-submodule-foreach.sh b/t/t7407-submodule-foreach.sh
index 5144cc6926b..77729ac4aa1 100755
--- a/t/t7407-submodule-foreach.sh
+++ b/t/t7407-submodule-foreach.sh
@@ -82,16 +82,16 @@ test_expect_success 'test basic "submodule foreach" usage' '
 
 cat >expect <../../actual
+   git submodule foreach "echo 
\$toplevel-\$name-\$sm_path-\$displaypath-\$sha1" >../../actual
) &&
test_i18ncmp expect actual
 '
@@ -206,25 +206,25 @@ submodulesha1=$(cd 
clone2/nested1/nested2/nested3/submodule && git rev-parse HEA
 
 cat >expect <../../actual
+   git submodule foreach --recursive "echo toplevel: \$toplevel 
name: \$name path: \$sm_path displaypath: \$displaypath hash: \$sha1" 
>../../actual
) &&
test_i18ncmp expect actual
 '
-- 
2.17.0.441.gb46fe60e1d-goog



[PATCH 5/5] submodule: port submodule subcommand 'foreach' from shell to C

2018-05-02 Thread Stefan Beller
From: Prathamesh Chavan 

This aims to make git-submodule foreach a builtin. 'foreach' is ported to
the submodule--helper, and submodule--helper is called from
git-submodule.sh.

Helped-by: Brandon Williams 
Mentored-by: Christian Couder 
Mentored-by: Stefan Beller 
Signed-off-by: Prathamesh Chavan 
Signed-off-by: Stefan Beller 
---
 builtin/submodule--helper.c | 148 
 git-submodule.sh|  39 +-
 2 files changed, 149 insertions(+), 38 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index a404df3ea49..bbbea5868de 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -439,6 +439,153 @@ static void for_each_listed_submodule(const struct 
module_list *list,
fn(list->entries[i], cb_data);
 }
 
+struct cb_foreach {
+   int argc;
+   const char **argv;
+   const char *prefix;
+   char *toplevel;
+   int quiet;
+   int recursive;
+};
+#define CB_FOREACH_INIT { 0 }
+
+static void runcommand_in_submodule_cb(const struct cache_entry *list_item,
+  void *cb_data)
+{
+   struct cb_foreach *info = cb_data;
+   const char *path = list_item->name;
+   const struct object_id *ce_oid = &list_item->oid;
+
+   const struct submodule *sub;
+   struct child_process cp = CHILD_PROCESS_INIT;
+   char *displaypath;
+
+   displaypath = get_submodule_displaypath(path, info->prefix);
+
+   sub = submodule_from_path(&null_oid, path);
+
+   if (!sub)
+   die(_("No url found for submodule path '%s' in .gitmodules"),
+   displaypath);
+
+   if (!is_submodule_populated_gently(path, NULL))
+   goto cleanup;
+
+   prepare_submodule_repo_env(&cp.env_array);
+
+   /*
+   * For the purpose of executing <command> in the submodule,
+   * a separate shell is used to run the child process.
+   */
+   cp.use_shell = 1;
+   cp.dir = path;
+
+   /*
+   * NEEDSWORK: the command currently has access to the variables $name,
+   * $sm_path, $displaypath, $sha1 and $toplevel only when the command
+   * contains a single argument. This is done for maintaining a faithful
+   * translation from shell script.
+   */
+   if (info->argc == 1) {
+   char *toplevel = xgetcwd();
+
+   argv_array_pushf(&cp.env_array, "name=%s", sub->name);
+   argv_array_pushf(&cp.env_array, "sm_path=%s", path);
+   argv_array_pushf(&cp.env_array, "displaypath=%s", displaypath);
+   argv_array_pushf(&cp.env_array, "sha1=%s",
+   oid_to_hex(ce_oid));
+   argv_array_pushf(&cp.env_array, "toplevel=%s", toplevel);
+
+   /*
+   * Since the path variable was accessible from the script
+   * before porting, it is also made available after porting.
+   * The environment variable "PATH" has a very special purpose
+   * on windows. And since environment variables are
+   * case-insensitive in windows, it interferes with the
+   * existing PATH variable. Hence, to avoid that, we expose
+   * path via the args argv_array and not via env_array.
+   */
+   argv_array_pushf(&cp.args, "path=%s; %s",
+   path, info->argv[0]);
+
+   free(toplevel);
+   } else {
+   argv_array_pushv(&cp.args, info->argv);
+   }
+
+   if (!info->quiet)
+   printf(_("Entering '%s'\n"), displaypath);
+
+   if (info->argv[0] && run_command(&cp))
+   die(_("run_command returned non-zero status for %s\n."),
+   displaypath);
+
+   if (info->recursive) {
+   struct child_process cpr = CHILD_PROCESS_INIT;
+
+   cpr.git_cmd = 1;
+   cpr.dir = path;
+   prepare_submodule_repo_env(&cpr.env_array);
+
+   argv_array_pushl(&cpr.args, "--super-prefix", NULL);
+   argv_array_pushf(&cpr.args, "%s/", displaypath);
+   argv_array_pushl(&cpr.args, "submodule--helper", "foreach", 
"--recursive",
+   NULL);
+
+   if (info->quiet)
+   argv_array_push(&cpr.args, "--quiet");
+
+   if (info->toplevel)
+   argv_array_pushf(&cpr.args, "--toplevel=%s", 
info->toplevel);
+
+   argv_array_pushv(&cpr.args, info->argv);
+
+   if (run_command(&cpr))
+   die(_("run_command returned non-zero status while"
+   "recursing in the nested submodules of %s\n."),
+   displaypath);
+   }
+
+cleanup:
+   free(displaypath);
+}
+
+static int module_foreach(int argc, const char **argv, const char *prefix)
+{
+   

Re: [PATCH 5/5] submodule: port submodule subcommand 'foreach' from shell to C

2018-05-02 Thread Stefan Beller
On Wed, May 2, 2018 at 5:53 PM, Stefan Beller  wrote:

> +struct cb_foreach {
> +   char *toplevel;
...
> +   OPT_STRING(0, "toplevel", &info.toplevel, N_("path"),
> +  N_("path from the top level of the invocation")),

This is a leftover from my experimentation that I hinted at in the cover letter
that would be used to implement $topmost.  I'll remove this in a reroll.


Re: Blame / annotate with mixed mac line endings?

2018-05-02 Thread Junio C Hamano
"Jason Pyeron"  writes:

> Any way to hit git with a stick to treat lone CR as a line break for 
> blame/annotate?

I highly suspect that you would get more help from those whose love
is Git if your inquiry were about a way to ask Git politely to do
what you want to achieve, rather than hitting it with a stick ;-)

Perhaps define a textconv filter to fix the incorrect line endings?
That way, you do not have to rewrite the history only to run blame,
but can pretend as if the contents were all using consistent line
endings (either LF or CRLF) retroactively.



Re: [GSoC] Info: new blog series of my work on Git GSoC '18

2018-05-02 Thread Pratik Karki
On Thu, May 3, 2018 at 6:31 AM, Andrew Ardill  wrote:
> On 2 May 2018 at 17:12, Johannes Schindelin  
> wrote:
>> Hi Pratik,
>>
>> On Wed, 2 May 2018, Pratik Karki wrote:
>>
>>> As promised in my proposal, I've started
>>> to write a blog series of GSoC '18 with Git. The initial blog is up.
>>> You can find it here[1]. The initial one is just to get started and
>>> from next iterations, I'll start detailing of my work towards converting
>>> rebase to builtin.
>>>
>>> [1]: https://prertik.github.io/categories/git/
>>
>> This is awesome! Thanks for doing this,
>> Dscho
>
> Agreed, was fun to read.
>
> I'd encourage you to post to the list when you blog, or perhaps
> include a link to the blog as part of any regular updates you give on
> your project progress.
>
> Would also make for an interesting addition to the newsletter.
>
> I know it can be difficult to sit down and write about what you're
> doing, especially when it feels like you could be focusing on 'real
> work'. Hopefully you will find the process rewarding; I'm looking
> forward to reading about what you find easy and hard, how you learn
> the git developer processes, and the challenges you find in converting
> shell scripts to a built-in. I'm sure other people are too, and I'll
> bet the ones who have been there before will have feedback for you as
> well.
>
> I'd find it interesting even if it was a 5-line bullet list of what's
> going through your mind with respect to the project! Looking forward
> to following along.
>
> Regards,
>
> Andrew Ardill

Thank you. Feedback means a lot to me.
I will be writing a weekly blog post about
the progress I make during each week. I will try my best to detail
the things I did and intend to do. I will surely notify
everyone as soon as I publish a post.

Cheers,
Pratik Karki


RE: Blame / annotate with mixed mac line endings?

2018-05-02 Thread Jason Pyeron
> -Original Message-
> From: Junio C Hamano
> Sent: Wednesday, May 2, 2018 21:22
> Subject: Re: Blame / annotate with mixed mac line endings?
> 
> "Jason Pyeron"  writes:
> 
> > Any way to hit git with a stick to treat lone CR as a line break for
> > blame/annotate?

> I highly suspect that you would get more help from those whose love
> is Git if your inquiry were about a way to ask Git politely to do
> what you want to achieve, rather than hitting it with a stick ;-)

No offense meant.

> Perhaps define a textconv filter to fix the incorrect line endings?

Worked perfectly! I read:

https://github.com/git/git/blob/master/t/t8006-blame-textconv.sh

Added to .git/config:

[diff "test"]
textconv = /usr/local/bin/helper.sh

Added to .gitattributes:
*.cfm diff=test

$ cat /usr/local/bin/helper.sh
cat "$1" | mac2unix

The important issue was to not run mac2unix on the file directly, because
that modifies the file in place.

Read: https://git.wiki.kernel.org/index.php/Textconv but it did not help so 
much.

Thanks,

Jason

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-   -
- Jason Pyeron  PD Inc. http://www.pdinc.us -
- Principal Consultant  10 West 24th Street #100-
- +1 (443) 269-1555 x333Baltimore, Maryland 21218   -
-   -
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- 



Re: Blame / annotate with mixed mac line endings?

2018-05-02 Thread Junio C Hamano
"Jason Pyeron"  writes:

> $ cat /usr/local/bin/helper.sh
> cat "$1" | mac2unix
>
> The important issue was to not use mac2unix directly, because it
> modifies the file itself.

I wonder if

[diff "test"]
textconv = sh -c 'mac2unix <"$1"' -IAmArgv0-

works without an extra helper.



Re: [PATCH v3 04/12] cache.h: add comment explaining the order in object_type

2018-05-02 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> The order in the enum might seem arbitrary, and isn't explained by
> 72518e9c26 ("more lightweight revalidation while reusing deflated
> stream in packing", 2006-09-03) which added it.
>
> Derrick Stolee suggested that it's ordered topologically in
> 5f8b1ec1-258d-1acc-133e-a7c248b40...@gmail.com. Makes sense to me, add
> that as a comment.

When referring to a message-id, please do not omit surrounding <>,
which is part of the message-id string.  That's like writing your
e-mail address as avarab gmail.com without at sign.

>  enum object_type {
>   OBJ_BAD = -1,
>   OBJ_NONE = 0,
> + /*
> +  * Why have our "real" object types in this order? They're
> +  * ordered topologically:
> +  *
> +  * tag(4)-> commit(1), tree(2), blob(3)
> +  * commit(1) -> tree(2)
> +  * tree(2)   -> blob(3)
> +  */

I am not sure if the above makes sense at all in explaining tag.
With all the others, a type with a smaller type id can refer to another
type with an equal or larger type id (a tree can refer to another
tree).  If tag had the smallest id among all, it would have made
sense, though.

Before anybody gets confused and raises noise: a gitlink records a commit
in a tree, which may seem to contradict the rule even more, but it
merely records a commit, without "referring" to it in the same sense
as other references that require connectivity.

>   OBJ_COMMIT = 1,
>   OBJ_TREE = 2,
>   OBJ_BLOB = 3,


Re: [PATCH v3 06/12] get_short_oid: sort ambiguous objects by type, then SHA-1

2018-05-02 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> + /*
> +  * Between object types show tags, then commits, and finally
> +  * trees and blobs.
> +  *
> +  * The object_type enum is commit, tree, blob, tag, but we
> +  * want tag, commit, tree blob. Cleverly (perhaps too

The missing comma between "tree blob"  on the second line made me
read this comment twice, which made me notice the lack of "and"
before "tag" on the previous line.

Assignment is "commit, tree, blob, and then tag" but we want "tag,
commit, tree and then blob", perhaps?


Re: [PATCH v2] wt-status: use rename settings from init_diff_ui_defaults

2018-05-02 Thread Eckhard Maaß
On Tue, May 01, 2018 at 05:08:27PM -0700, Elijah Newren wrote:
> Eckhard, can you add some comments to your commit message mentioning
> the email pointed to by Junio about break detection and rename
> detection being unsafe to use together, as well as the inconsistencies
> in break detection between different commands?

I will work on that.

Greetings,
Eckhard


Re: [PATCH v3 09/12] get_short_oid / peel_onion: ^{tree} should be tree, not treeish

2018-05-02 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> But ^{tree} shows just the trees, but would previously be equivalent
> to the above:
>
> $ git rev-parse e8f2^{tree}
> error: short SHA1 e8f2 is ambiguous
> hint: The candidates are:
> hint:   e8f2093055 tree
> hint:   e8f25a3a50 tree
> hint:   e8f28d537c tree
> hint:   e8f2cf6ec0 tree
> [...]

When a user says "git $cmd e8f2^{tree}", the user is telling Git
that s/he knows e8f2 *is* a tree-ish, but for whatever reason $cmd
wants a tree and does not accept an arbitrary tree-ish---that is the
whole point of appending ^{tree} as a suffix.  A useful hint in such
a case would be "oh, you said e8f2 is a tree-ish, but there is more
than one tree-ish, so let me show them to you to help you decide
which one among them is the one you meant".  When $cmd is rev-parse,
I would even say that the user is saying "I know e8f2 is a tree-ish,
and I know it is not a tree--it merely is a tree-ish.  I want the tree
that e8f2 thing points at".

Limiting that hint to show only real trees does not make any sense
to me.  I do not think we care _too_ deeply, because most of the
time, a command line location that expects a tree can be given a
tree-ish, so there is not much reason to use the ^{tree} suffix these
days.  But in a case where it _does_ matter, I think this change
makes the "hint" almost useless.

Or am I misreading what you wanted to achieve with this patch?


Re: [PATCH v3 00/12] get_short_oid UI improvements

2018-05-02 Thread Jacob Keller
On Wed, May 2, 2018 at 6:45 AM, Derrick Stolee  wrote:
> On 5/2/2018 8:42 AM, Derrick Stolee wrote:
>>
>> On 5/1/2018 2:40 PM, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> The biggest change in v3 is no change at all to the code, but a
>>> lengthy explanation of why I didn't go for Derrick's simpler
>>> implementation. Maybe I'm wrong about that, but I felt uneasy
>>> offloading undocumented (or if I documented it, it would only be for
>>> this one edge-case) magic on the oid_array API. Instead I'm just
>>> making this patch a bit more complex.
>>
>>
>> I think that's fair. Thanks for going along with me on the thought
>> experiment.
>
>
> Also, v3 looks good to me.
>
> Reviewed-by: Derrick Stolee 
>

I also reviewed this, and it looks good to me as well.

Thanks,
Jake


Re: Generate more revenue from Git

2018-05-02 Thread Michal Sapozhnikov
Hello,

Following up on my last email,

It would be great to set up a call this week.

Looking forward to your response.

Best Regards,
-- 
Michal Sapozhnikov | Business Manager, Luminati SDK | +972-50-2826778 | Skype:
live:michals_43
http://luminati.io/sdk

On 15-Apr-18 14:14, 7d (by eremind) wrote:
> Hi,
> 
> My name is Michal and I lead the SDK partnerships at Luminati. I assume
> your software earns money by charging users for a premium subscription
> or by showing ads - both models do not pay out much and harm the user
> experience.
> 
> We now offer you a third option.
> 
> Luminati’s monetization SDK for Windows desktop provides your users the option
> to use the software for free, and in exchange we pay you $3,000 USD per month,
> for every 100K daily active users.
> Luminati is the largest proxy network, having millions of IPs based on other
> developers who embedded our SDK. More information is available on
> http://luminati.io/sdk_win.
> 
> I’d like to schedule a 15 minute call to let you know how we can start.
> Are you available tomorrow at 12:30pm your local time?
> 
> Best Regards,
> Michal
>