>>>>> "PB" == Petr Baudis <[EMAIL PROTECTED]> writes:
PB> Bah, you outran me. ;-) Just being in a different timezone, I guess. PB> I'll change it to use the cool git-pasky stuff (commit-id etc) and its PB> style of committing - that is, it will merely record the update-caches PB> to be done upon commit, and it will read-tree the branch we are merging PB> to instead of the ancestor. (So that git diff gives useful output.) Sorry, I have not seen what you have been doing since pasky 0.3, and I have not even started to understand the mental model of the world your tool is building. That said, my gut feeling is that telling this script about git-pasky's world model might be a mistake. I'd rather see you consider the script as mere "part of the plumbing". Maybe adding an extra parameter to the script to let the user explicitly specify the common ancestor to use would be needed, but I would prefer git-pasky-merge to do its own magic (converting symbolic commit names into raw commit names and such) before calling this low level script. That way people like me who have not migrated to your framework can still keep using it. All the script currently needs is a bare git object database; i.e., nothing other than what is in .git/objects and a couple of commit record SHA1s as its parameters. No .git/heads/, no .git/HEAD.local, no .git/tags, are involved for it to work, and I would prefer to keep things that way if possible. >> * show-diff updates to add -r flag to squelch diffs for files not in >> the working directory. This is mainly useful when verifying the >> result of an automated merge. PB> -r traditionally means recursive - what's the reasoning behind the PB> choice of this letter? Well, '-r' is not necessarily recursive. "ls -r" is reverse, "sort -r" is reverse. "less -r" is raw. "cat -r" is reversible. "nethack -r" is race ;-). You are thinking as an SCM person so it may look that way. "diff -r" is recursive. "darcs add -r" is recursive. But even in the SCM world, "cvs add -r" is not (it means read-only) neither "co -r" (explicit revision) ;-). I would rather pick '-q' if I were doing the patch today, but I was too tired and did not think of a letter when I wrote it. I guess '-r' stood for removed, but I agree it is a bad choice. Any objections to '-q'? PB> use strict? Not in this iteration but eventually yes. PB> It'd be simpler to do just PB> my @common = (map { $common{$_} } PB> sort { $b <=> $a } PB> keys %common) Well, actually you spotted a bug between the implementation and what I wanted to do. It should have been: map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { [ $common{$_} => $_ ] } keys %common That is, sort [anscestor => number of times it appears] tuple by the "number of times it appears" in decreasing order, and project the resulting list to a list of ancestors. It is trying to deal with the following pattern in rev-tree output: TIMESTAMP1 EDGE1:1 ANCESTOR1:3 ANCESTOR2:3 TIMESTAMP2 EDGE2:2 ANCESTOR1:3 and when the above happens I wanted to pick up ANCESTOR1, but that was without no sound reason. PB> But I really think this is a horrible heuristic. I believe you should PB> take the latest commit in the --edges output, and from that choose the PB> base whose rev-tree --edges the_base merged_branch has the least lines PB> on output. (That is, the path to it is shortest - ideally it's already PB> part of the merged_branch.) I'll try something along that line. Honestly the ancestor selection part was what I had most trouble with. Thanks. PB> What about PB> sub OLDMODE { 0 } PB> sub NEWMODE { 1 } PB> sub NEWSHA { 2 } PB> and then using that when accessing the tuple? Would make the code PB> much more readable. Totally agreed; readability cleanup is needed, just as "use strict" you mentioned, before it is ready for public consumption. Remember, however, the primary purpose of the message was to share it with Linus so that I can ask his opinion while the script was still slushy; the contents that array contained was still changing then and was too early for symbolic constants. I'll do that in the next round. PB> It is a good idea to check merge's exit code and give a notice at the PB> end if there were any conflicts. In principle yes, but I noticed that merge already gave me a nice warning message when it found conflicts, so there was no need to do so myself in this case. See sample output: $ perl ./git-merge.perl \ 71796686221a0a56ccc25b02386ed8ea648da14d \ bb95843a5a0f397270819462812735ee29796fb4 Common ancestor: 9f02d4d233223462d3f6217b5837b786e6286ba4 O - COPYING O - README ... O - write-tree.c A M write-blob.c A M show-diff.c ... A M update-cache.c A M git-merge.perl B M merge-tree.c MRG Makefile merge: warning: conflicts during merge $ >> +# Create a temporary directory and go there. >> +system 'rm', '-rf', ',,merge-temp'; PB> Can't we call it just ,,merge? I'd rather have a command line option '-o' (scrapping the current '-o' and renaming it to something else; as you can see I am terrible at picking option names ;-)) to mean "output to this directory". I am not really an Arch person so I do not particulary care about /^,,/. How about "git~merge~$$"? >> +for ((',,merge-temp', '.git')) { mkdir $_; chdir $_; } >> +symlink "../../.git/objects", "objects"; >> +chdir '..'; >> + >> +my $ancestor_tree = read_commit_tree($common); >> +system 'read-tree', $ancestor_tree; >> + >> +my %tree0 = read_diff_tree($ancestor_tree, read_commit_tree($ARGV[0])); >> +my %tree1 = read_diff_tree($ancestor_tree, read_commit_tree($ARGV[1])); >> + >> +my @ancestor_file = read_show_files(); >> +my %ancestor_file = map { $_ => 1 } @ancestor_file; >> + >> +for (@ancestor_file) { >> + if (! exists $tree0{$_} && ! exists $tree1{$_}) { >> + if ($full_checkout) { >> + system 'checkout-cache', $_; >> + } >> + print STDERR "O - $_\n"; PB> Huh, what are you trying to do here? I think you should just record PB> remove, no? (And I wouldn't do anything with my read-tree. ;-) At this moment in the script, we have run "read-tree" the ancestor so the dircache has the original. %tree0 and %tree1 both did not touch the path ($_ here) so it is the same as ancestor. When '-f' is specified we are populating the output working tree with the merge result so that is what that 'checkout-cache' is about. "O - $path" means "we took the original". The idea is to populate the dircache of merge-temp with the merge result and leave uncertain stuff as in the common ancestor state, so that the user can fix them starting from there. Maybe it is a good time for me to summarize the output somewhere in a document. O - $path Tree-A and tree-B did not touch this; the result is taken from the ancestor (O for original). A D $path Only tree-A (or tree-B) deleted this and the other B D $path branch did not touch this; the result is to delete. A M $path Only tree-A (or tree-B) modified this and the other B M $path branch did not touch this; the result is to use one from tree-A (or tree-B). This includes file creation case. *DD $path Both tree-A and tree-B deleted this; the result is to delete. *DM $path Tree-A deleted while tree-B modified this (or *MD $path vice versa), and manual conflict resolution is needed; dircache is left as in the ancestor, and the modified file is saved as $path~A~ in the working directory. The user can rename it to $path and run show-diff to see what Tree-A wanted to do and decide before running update-cache. *MM $path Tree-A and tree-B did the exact same modification; the result is to use that. MRG $path Tree-A and tree-B have different modifications; run "merge" and the merge result is left as $path in the working directory. In cases other than *DM, *MD, and MRG, the result is trivial and is recorded in the dircache. Without '-o' (to be renamed ;-) nor '-f' there will not be a file checked out in the working directory for them. The three merge cases need human attention. The dircache is not touched in these cases and left as the ancestor version, and the working directory gets some file as described above. NOTE NOTE NOTE: I am not dealing with a case where both branches create the same file but with different contents. In such a case the current code falls into MRG path without having a common ancestor, which is nonsense---I can use /dev/null as the common ancestor, I guess. Also NOTE NOTE NOTE I need to detect the case where one branch creates a directory while the other creates a file. There is nothing an automated tool can do in that case but it needs to be detected and be told the user loudly. - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html