* Linus Torvalds <[EMAIL PROTECTED]> wrote:
> On Sun, 17 Apr 2005, Ingo Molnar wrote:
> >
> > in fact, this attack cannot even be proven to be malicious, purely via
> > the email from Malice: it could be incredible bad luck that caused that
> > good-looking patch to be mistakenly matching a da
> So until proven otherwise, I worry about accidental hashes, and in
> 160 bits of good hashing, that just isn't an issue either...[Going
> from 128 bits to 160 bits made it] so _unbelievably_ less likely to
> happen that it's not even funny.
You are right. Here's how I learnt to stop worrying an
Dear diary, on Mon, Apr 18, 2005 at 02:49:06AM CEST, I got a letter
where Herbert Xu <[EMAIL PROTECTED]> told me that...
> Therefore the only conclusion I can draw is that we're only calling
> update-cache on the set of changed files, or at most a small superset
> of them. In that case, the cost o
On Mon, Apr 18, 2005 at 01:34:41AM +0200, Petr Baudis wrote:
>
> No. The collision check is done in the opposite cache - when you want to
> write a blob and there is already a file of the same hash in the tree.
> So either the blob is already in the database, or you have a collision.
> Therefore, t
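The write-time check described in this snippet can be sketched as follows. This is only an illustration under stated assumptions: the object database is modelled as a plain dict, the id is a SHA-1 over the raw bytes, and `write_blob` and `HashCollision` are hypothetical names, not functions from git.

```python
import hashlib


class HashCollision(Exception):
    """Raised when two different contents map to the same object id."""


def write_blob(db: dict, data: bytes) -> str:
    # Assumption: the object id is simply the SHA-1 of the raw content.
    oid = hashlib.sha1(data).hexdigest()
    existing = db.get(oid)
    if existing is None:
        # First time this id is seen: store the blob.
        db[oid] = data
    elif existing != data:
        # Same id, different bytes: this is the collision case.
        raise HashCollision(oid)
    # Otherwise the blob is already in the database; nothing to do.
    return oid
```

The comparison only runs when an object with the same id already exists, so the extra cost is paid on blobs that are being written, not on the whole database.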
Petr Baudis wrote:
Dear diary, on Mon, Apr 18, 2005 at 01:29:05AM CEST, I got a letter
where Herbert Xu <[EMAIL PROTECTED]> told me that...
I get the feeling that it isn't that bad. For example, if we did it
at the points where the blobs actually entered the tree, then the cost
is always proportio
On Mon, 18 Apr 2005, Herbert Xu wrote:
>
> I wasn't disputing that of course. However, the same effect can be
> achieved in using a single hash with a bigger length, e.g., sha256
> or sha512.
No it cannot.
If somebody actually literally totally breaks that hash, length won't
matter. There ar
Dear diary, on Mon, Apr 18, 2005 at 01:29:05AM CEST, I got a letter
where Herbert Xu <[EMAIL PROTECTED]> told me that...
> I get the feeling that it isn't that bad. For example, if we did it
> at the points where the blobs actually entered the tree, then the cost
> is always proportional to the ch
On Sun, Apr 17, 2005 at 03:35:17PM -0700, Linus Torvalds wrote:
>
> Quite the reverse. Again, you bring up totally theoretical arguments. In
> _practice_ it has indeed been shown that using two hashes _does_ catch
> hash collisions.
>
> The trivial example is using md5 sums with a length. The "
On Mon, 18 Apr 2005, Herbert Xu wrote:
>
> Sorry, it has already been shown that combining two different hashes
> doesn't necessarily provide the security that you would hope.
Sorry, that's not true.
Quite the reverse. Again, you bring up totally theoretical arguments. In
_practice_ it has i
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> If we want to have any kind of confidence that the hash is really
> unbreakable, we should make it not just longer than 160 bits, we should
> make sure that it's two or more hashes, and that they are based on totally
> different principles.
Sorry, it
On Sun, 17 Apr 2005, Ingo Molnar wrote:
>
> in fact, this attack cannot even be proven to be malicious, purely via
> the email from Malice: it could be incredible bad luck that caused that
> good-looking patch to be mistakenly matching a dangerous object.
I really hate theoretical discussions
* Ingo Molnar <[EMAIL PROTECTED]> wrote:
> The compromise relies on you having reviewed something harmless, while
> in reality what happened within the DB was far less harmless. And the
> DB remains self-consistent: neither fsck, nor others importing your
> tree will be able to detect the comp
* Brad Roberts <[EMAIL PROTECTED]> wrote:
> While I agree that a hash collision is bad and certainly worth
> preventing during new object creation, for it to actually implant a
> trojan in a build successfully it'd have to meet even more criteria
> than you've laid out. It'd have to...
> -
<[EMAIL PROTECTED]>, git@vger.kernel.org
> Subject: Re: Re: Merge with git-pasky II.
>
>
> * Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> > Almost all attacks on sha1 will depend on _replacing_ a file with a
> > bogus new one. So guys, instead of using sha256 or goi
* Linus Torvalds <[EMAIL PROTECTED]> wrote:
> Almost all attacks on sha1 will depend on _replacing_ a file with a
> bogus new one. So guys, instead of using sha256 or going overboard,
> just make sure that when you synchronize, you NEVER import a file you
> already have.
here is a bit complex
On Sat, 2005-04-16 at 17:33 +0200, Johannes Schindelin wrote:
> > But if it can be done cheaply enough at a later date even though we end
> > up repeating ourselves, and if it can be done _well_ enough that we
> > shouldn't have just asked the user in the first place, then yes, OK I
> > agree.
>
>
On Sat, 16 Apr 2005, Linus Torvalds wrote:
Almost all attacks on sha1 will depend on _replacing_ a file with a bogus
new one. So guys, instead of using sha256 or going overboard, just make
sure that when you synchronize, you NEVER import a file you already have.
It's really that simple. Add "--igno
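A minimal sketch of the "never import a file you already have" rule, assuming both repositories are modelled as dicts from object id to content. The function name `import_objects` is hypothetical and this is not the actual synchronization code.

```python
import hashlib


def object_id(data: bytes) -> str:
    # Assumption: ids are SHA-1 digests of the raw content.
    return hashlib.sha1(data).hexdigest()


def import_objects(local: dict, incoming: dict) -> list:
    """Copy objects from `incoming` into `local`.

    Objects whose id already exists locally are never overwritten;
    their ids are returned so the caller can report them.
    """
    skipped = []
    for oid, data in incoming.items():
        if oid in local:
            # We already have this id: keep our copy, ignore theirs.
            skipped.append(oid)
            continue
        local[oid] = data
    return skipped
```

With this rule a remote side cannot replace an object the local side already trusts, even if it manages to produce different content under the same id.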
On Sat, 16 Apr 2005, Sanjoy Mahajan wrote:
>
> I like this elegant approach, but clever pattern matching can help even
> at commit time. Suppose hello.c is simply:
Here, what you're talking about is not "commit", but "merge".
The git model very much separates the two events. You first generat
> And that "where did this come from" decision should be done at _search_
> time, not commit time.
I like this elegant approach, but clever pattern matching can help even
at commit time. Suppose hello.c is simply:
printf ("Hello %d\n", year);
And then developer A updates hello.c to:
printf
On Sat, Apr 16, 2005 at 06:03:33PM +0200, Petr Baudis wrote:
> Dear diary, on Sat, Apr 16, 2005 at 05:55:37PM CEST, I got a letter
> where Simon Fowler <[EMAIL PROTECTED]> told me that...
> > On Sat, Apr 16, 2005 at 05:19:24AM -0700, David Lang wrote:
> > > Simon
> > >
> > > given that you have mu
On Sat, 16 Apr 2005, Petr Baudis wrote:
> Dear diary, on Sat, Apr 16, 2005 at 05:55:37PM CEST, I got a letter
> where Simon Fowler <[EMAIL PROTECTED]> told me that...
>
> > The id is a sha1 hash of the current time and the full path of the
> > file being added - the chances of that being replica
Dear diary, on Sat, Apr 16, 2005 at 05:55:37PM CEST, I got a letter
where Simon Fowler <[EMAIL PROTECTED]> told me that...
> On Sat, Apr 16, 2005 at 05:19:24AM -0700, David Lang wrote:
> > Simon
> >
> > given that you have multiple machines creating files, how do you deal with
> > the idea of the
On Sat, Apr 16, 2005 at 05:19:24AM -0700, David Lang wrote:
> Simon
>
> given that you have multiple machines creating files, how do you deal with
> the idea of the same 'unique id' being assigned to different files by
> different machines?
>
The id is a sha1 hash of the current time and the fu
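The id scheme mentioned here (a sha1 hash of the current time and the full path) might look roughly like this. The separator, the time format and the name `make_file_id` are assumptions for illustration only.

```python
import hashlib
import time
from typing import Optional


def make_file_id(path: str, now: Optional[float] = None) -> str:
    # Assumption: the timestamp is a float of seconds since the epoch
    # and is joined to the path with a ':' before hashing.
    if now is None:
        now = time.time()
    return hashlib.sha1(f"{now!r}:{path}".encode("utf-8")).hexdigest()
```

The sketch also shows the case David Lang asks about: two machines that add the same path at the same timestamp compute the same id, so uniqueness depends on how fine-grained the clock is.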
Hi,
On Fri, 15 Apr 2005, David Woodhouse wrote:
> But if it can be done cheaply enough at a later date even though we end
> up repeating ourselves, and if it can be done _well_ enough that we
> shouldn't have just asked the user in the first place, then yes, OK I
> agree.
The repetition could be
On Fri, Apr 15, 2005 at 08:32:46AM -0700, Linus Torvalds wrote:
In other words, I'm right. I'm always right, but sometimes I'm more right
than other times. And dammit, when I say "files don't matter", I'm really
really Right(tm).
You're right, of course (All Hail Linus!), if you can make it work
On Fri, Apr 15, 2005 at 08:32:46AM -0700, Linus Torvalds wrote:
> In other words, I'm right. I'm always right, but sometimes I'm more right
> than other times. And dammit, when I say "files don't matter", I'm really
> really Right(tm).
>
You're right, of course (All Hail Linus!), if you can make
On Fri, 15 Apr 2005, Barry Silverman wrote:
>
> The issue I am trying to come to grips with in the current design, is
> that the git repository of a number of interrelated projects will soon
> become the logical OR of all blobs, commits, and trees in ALL the
> projects.
Nope. I'm actually again
mbing
components.
Hence my question
-Original Message-
From: Linus Torvalds [mailto:[EMAIL PROTECTED]
Sent: Friday, April 15, 2005 4:31 PM
To: Barry Silverman
Cc: git@vger.kernel.org
Subject: RE: Merge with git-pasky II.
[ I'm cc'ing the git list even though Barry'
> "PB" == Petr Baudis <[EMAIL PROTECTED]> writes:
PB> I can't see the conflicts between what I want and what Linus wants.
PB> After all, Linus says that I can use the directory cache in any way I
PB> please (well, the user can, but I'm speaking for him ;-). So I'm doing
PB> so, and with your t
On Fri, 15 Apr 2005, Junio C Hamano wrote:
>
> I was looking at merge-tree.c last night to add recursive
> behaviour (my favorite these days ;-) to it [*1*].
>
> But then I started thinking.
Always good.
> LT> ... For each entry in the directory it says either
> LT> select path
> LT> or
>
Dear diary, on Fri, Apr 15, 2005 at 12:22:26PM CEST, I got a letter
where Junio C Hamano <[EMAIL PROTECTED]> told me that...
> After I re-read [*R1*], in which Linus talks about dircache,
> especially this section:
>
> - The "current directory cache" describes some baseline. In particular,
>n
[ I'm cc'ing the git list even though Barry's question wasn't cc'd.
Because I think his question is interesting and astute per se, even
if I disagree with the proposal ]
On Fri, 15 Apr 2005, Barry Silverman wrote:
>
> If git is totally project based, and each commit represents total state
>
> "LT" == Linus Torvalds <[EMAIL PROTECTED]> writes:
LT> In the meantime I wrote a very stupid "merge-tree" which
LT> does things slightly differently, but I really think your
LT> approach (aka my original approach) is actually a lot
LT> faster. I was just starting to worry that the ball didn'
Dear diary, on Fri, Apr 15, 2005 at 02:58:25AM CEST, I got a letter
where Junio C Hamano <[EMAIL PROTECTED]> told me that...
> > "PB" == Petr Baudis <[EMAIL PROTECTED]> writes:
> >> I think the above would result in what SCM person would call
> >> "merge upstream/sidestream changes into my work
These notions that one can always best answer questions by looking at
the content, and that "Individual files DO NOT EXIST" seem overstated,
to me.
Granted, overstated for a good reason. A couple sticks of dynamite are
needed to shake loose some old SCM thinking habits.
===
Ingo has a point wh
> intra file diffs: here are two versions of the same file.
Ah so. Linus faked me out.
I was _sure_ that by "file" he meant "file" -- as in a bucket of bits
with a unique identifying .
In that message, I guess by "file" he meant "a version controlled
file, consisting of a series of content ver
On Fri, 15 Apr 2005, C. Scott Ananian wrote:
>
> I think examining the rsync algorithms should convince you that finding
> common chunks can be fairly efficient.
Note that "efficient" really depends on how good a job you want to do, so
you can tune it to how much CPU you can afford to waste o
On Fri, 15 Apr 2005, David Woodhouse wrote:
given piece of content. Also because we actually have the developer's
attention at commit time, and we can get _real_ answers from the user
about what she was doing, instead of having to guess.
Yes, but it's still hard to get *accurate* information. And
On Fri, 15 Apr 2005, Paul Jackson wrote:
Um ah ... could you explain what you mean by inter and intra file diffs?
intra file diffs: here are two versions of the same file. what changed?
inter file diffs: here is a new file, and here are *all the files in the
current committed version*. Where di
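The distinction can be illustrated with Python's standard `difflib`. This is a rough sketch, not how any SCM discussed in the thread does it; `intra_file_diff` and `closest_source` are hypothetical names, and character-level similarity is just one possible measure.

```python
import difflib


def intra_file_diff(old: str, new: str) -> list:
    # Intra-file: two versions of the same file, what changed?
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(), "old", "new", lineterm=""))


def closest_source(new_text: str, tree: dict):
    # Inter-file: compare the new content against every file in the
    # committed tree and report the most similar one.
    best_path, best_ratio = None, 0.0
    for path, text in tree.items():
        ratio = difflib.SequenceMatcher(None, text, new_text).ratio()
        if ratio > best_ratio:
            best_path, best_ratio = path, ratio
    return best_path, best_ratio
```

The inter-file search touches every file in the tree, which is why its cost grows with the size of the tree while the intra-file diff only depends on the one file.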
On Fri, 2005-04-15 at 08:32 -0700, Linus Torvalds wrote:
> - you're doing the work at the wrong point. Doing it _well_ is quite
>   expensive. So if you do it at commit time, you cannot _afford_ to do it
>   well, and you'll always fall back to doing an ass-backwards job that
>   doesn't rea
Linus wrote:
> For example, just doing intra-file diffs is a lot _easier_ and less
> time-consuming than doing inter-file diffs.
Um ah ... could you explain what you mean by inter and intra file diffs?
Google found a three year old message by Andrew Morton, discussing
inter and intra file fragm
On Fri, 15 Apr 2005, David Woodhouse wrote:
>
> And when I'm looking for the change that broke something, I can almost
> always tell which file it's in and go looking in _that_ file.
Read my email about finding "what changed" that I sent out a minute ago.
I claim that my algorithm for finding
On Fri, 15 Apr 2005, David Woodhouse wrote:
>
> And you're right; it shouldn't have to be for renames only. There's no
> need for us to limit it to one "source" and one "destination"; the SCM
> can use it to track content as it sees fit.
Listen to yourself, and think about the problem for a sec
On Fri, 2005-04-15 at 07:53 -0700, Linus Torvalds wrote:
> Files DO NOT matter. Never have. It's an implementation limitation to
> think they do. You'll screw yourself up, and when somebody comes up with a
> half-way efficient way to generate inter-file diffs, your architecture is
> totally and
On Fri, 2005-04-15 at 16:53 +0200, Ingo Molnar wrote:
> but the specific scenario you described would require _Linus'_ tree to
> be in limbo for a long time, and have uncommitted half-done edits.
> I.e.:
>
>(A1B2)--(A2B2)--(A2'B3)
> / \ /\
>/\ / \
> (A1
* David Woodhouse <[EMAIL PROTECTED]> wrote:
> On Fri, 2005-04-15 at 11:36 +0200, Ingo Molnar wrote:
> > do such cases occur frequently? In the kernel at least it's not too
> > typical.
>
> Isn't it? I thought it was a fairly accurate representation of the
> process "I make a whole bunch of c
On Fri, 15 Apr 2005, David Woodhouse wrote:
>
> I suspect that finding the common commit is actually a per-file thing;
> it's not just something you do for the _commit_ graph, then use for
> merging each file in the two branches you're trying to merge.
I disagree.
Conceptually, you should neve
On Fri, Apr 15, 2005 at 02:03:08PM +0200, Johannes Schindelin wrote:
> I disagree. In order to be trusted, this thing has to catch the following
> scenario:
>
> Skywalker and Solo start from the same base. They commit quite a lot to
> their trees. In between, Skywalker commits a tree, where the fu
Hi,
On Fri, 15 Apr 2005, David Woodhouse wrote:
> On Thu, 2005-04-14 at 11:36 -0700, Linus Torvalds wrote:
> > And "merge these two trees" (which works on a _tree_ level)
> > or "find the common commit" (which works on a _commit_ level)
>
> I suspect that finding the common commit is actually a p
> "CL" == Christopher Li <[EMAIL PROTECTED]> writes:
CL> Then do you emit the entry for it's parents directory?
In GIT object model, directory modes do not matter. It is not
designed to record directories, and running "update-cache --add
foo" when foo is a directory fails.
The data model of
After I re-read [*R1*], in which Linus talks about dircache,
especially this section:
- The "current directory cache" describes some baseline. In particular,
note the "some" part. It's not tied to any special baseline, and you
can change your baseline any way you please.
So it does NOT
On Fri, 2005-04-15 at 11:36 +0200, Ingo Molnar wrote:
> do such cases occur frequently? In the kernel at least it's not too
> typical.
Isn't it? I thought it was a fairly accurate representation of the
process "I make a whole bunch of changes to files I maintain, pulling
from Linus while occasio
On Thu, 2005-04-14 at 17:42 -0700, Linus Torvalds wrote:
> I've not even been convinced that renames are worth it. Nobody has
> really given a good reason why.
>
> There are two reasons for renames I can think of:
>
> - space efficiency in delta-based trees.
> - "annotate".
Neither of those we
On Fri, Apr 15, 2005 at 12:43:47AM -0700, Junio C Hamano wrote:
> > "CL" == Christopher Li <[EMAIL PROTECTED]> writes:
>
> CL> Is that SHA1 for tree or the file object?
>
> I am talking about a single file here.
>
Then do you emit the entry for it's parents directory?
e.g. /foo/bar get creat
* David Woodhouse <[EMAIL PROTECTED]> wrote:
> Consider a simple repository which contains two files A and B. We
> start off with the first version of each ('A1B1'), and the owner of
> each file takes a branch and modifies their own file. There is
> cross-pulling between the two, and then each
On Thu, 2005-04-14 at 11:36 -0700, Linus Torvalds wrote:
> And "merge these two trees" (which works on a _tree_ level)
> or "find the common commit" (which works on a _commit_ level)
I suspect that finding the common commit is actually a per-file thing;
it's not just something you do for the _comm
> "CL" == Christopher Li <[EMAIL PROTECTED]> writes:
>> - Result is this object $SHA1 with mode $mode at $path (takes
>> one of the trees); you can do update-cache --cacheinfo (if
>> you want to muck with dircache) or cat-file blob (if you want
>> to get the file) or both.
CL> Is that SHA1 fo
iginal Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Linus Torvalds
Sent: Thursday, April 14, 2005 8:43 PM
To: Junio C Hamano
Cc: Petr Baudis; git@vger.kernel.org
Subject: Re: Merge with git-pasky II.
On Thu, 14 Apr 2005, Junio C Hamano wrote:
>
> You say "me
On Thu, Apr 14, 2005 at 05:58:25PM -0700, Junio C Hamano wrote:
>
> I do like, however, the idea of separating the step of doing any
> checkout/merge etc. and actually doing them. So the command set
> of parse-your-output needs to be defined. Based on what I have
> done so far, it would consist
> "PB" == Petr Baudis <[EMAIL PROTECTED]> writes:
>> I think the above would result in what SCM person would call
>> "merge upstream/sidestream changes into my working directory".
PB> And that's exactly what I'm doing now with git merge. ;-) In fact,
PB> ideally the whole change in my scripts
On Thu, 14 Apr 2005, Junio C Hamano wrote:
>
> You say "merge these two trees" above (I take it that you mean
> "merge these two trees, taking account of this tree as their
> common ancestor", so actually you are dealing with three trees),
Yes. We're definitely talking three trees.
> and I am t
BTW, I am not competing with Junio's script. If that is the way
we all agree on, it should be very easy for Junio to fix his
Perl script, right?
Chris
On Thu, Apr 14, 2005 at 04:37:17PM -0400, Christopher Li wrote:
> Is that some thing you want to see? Maybe clean up the error printing.
>
>
>
Is that some thing you want to see? Maybe clean up the error printing.
Chris
--- /dev/null 2003-01-30 05:24:37.0 -0500
+++ merge.py 2005-04-14 16:34:39.0 -0400
@@ -0,0 +1,76 @@
+#!/usr/bin/env python
+
+import re
+import sys
+import os
+from pprint import pprint
+
+def get_t
On Fri, Apr 15, 2005 at 01:31:59AM +0200, Petr Baudis wrote:
> > I am just trying to follow my understanding of what Linus
> > wanted. One of the guiding principle is to do as much things as
> > in dircache without ever checking things out or touching working
> > files unnecessarily.
>
> I'm just
Hi Junio,
I think if the merge tree belongs to plumbing, you can do
even less in the merge.perl. You can just print out the
instructions telling the upper level SCM what to do, without
actually doing it yourself.
So you don't have to touch anything in the tree.
That is the way I use in my previous py
Dear diary, on Fri, Apr 15, 2005 at 01:12:34AM CEST, I got a letter
where Junio C Hamano <[EMAIL PROTECTED]> told me that...
> > "PB" == Petr Baudis <[EMAIL PROTECTED]> writes:
>
> PB> What I would like your script to do is therefore just do the
> PB> merge in a given already prepared (includi
> "PB" == Petr Baudis <[EMAIL PROTECTED]> writes:
PB> What I would like your script to do is therefore just do the
PB> merge in a given already prepared (including built index)
PB> directory, with a passed base. The base should be determined
PB> by a separate tool (I already saw some patches);
On Thu, Apr 14, 2005 at 11:12:35AM -0700, Junio C Hamano wrote:
> > "PB" == Petr Baudis <[EMAIL PROTECTED]> writes:
>
> At this moment in the script, we have run "read-tree" the
> ancestor so the dircache has the original. %tree0 and %tree1
> both did not touch the path ($_ here) so it is the
Dear diary, on Thu, Apr 14, 2005 at 10:23:26PM CEST, I got a letter
where Erik van Konijnenburg <[EMAIL PROTECTED]> told me that...
> On Thu, Apr 14, 2005 at 09:35:07PM +0200, Petr Baudis wrote:
> > Hmm. I actually don't like this naming. I think it's not too consistent,
> > is irregular, therefore
On Thu, Apr 14, 2005 at 09:35:07PM +0200, Petr Baudis wrote:
> Hmm. I actually don't like this naming. I think it's not too consistent,
> is irregular, therefore parsing it would be ugly. What I propose:
>
> 12c\tname <- legend
> <- original file
> D <- tree #1 removed file
> D
Dear diary, on Thu, Apr 14, 2005 at 09:59:04PM CEST, I got a letter
where Junio C Hamano <[EMAIL PROTECTED]> told me that...
> > "LT" == Linus Torvalds <[EMAIL PROTECTED]> writes:
>
> LT> On Thu, 14 Apr 2005, Junio C Hamano wrote:
>
> >> Sorry, I have not seen what you have been doing since p
> "LT" == Linus Torvalds <[EMAIL PROTECTED]> writes:
LT> On Thu, 14 Apr 2005, Junio C Hamano wrote:
>> Sorry, I have not seen what you have been doing since pasky 0.3,
>> and I have not even started to understand the mental model of
>> the world your tool is building. That said, my gut feeli
Dear diary, on Thu, Apr 14, 2005 at 08:12:35PM CEST, I got a letter
where Junio C Hamano <[EMAIL PROTECTED]> told me that...
> > "PB" == Petr Baudis <[EMAIL PROTECTED]> writes:
>
> PB> Bah, you outran me. ;-)
>
> Just being in a different timezone, I guess.
>
> PB> I'll change it to use the
On Thu, 14 Apr 2005, Junio C Hamano wrote:
>
> Sorry, I have not seen what you have been doing since pasky 0.3,
> and I have not even started to understand the mental model of
> the world your tool is building. That said, my gut feeling is
> that telling this script about git-pasky's world mode
> "PB" == Petr Baudis <[EMAIL PROTECTED]> writes:
PB> Bah, you outran me. ;-)
Just being in a different timezone, I guess.
PB> I'll change it to use the cool git-pasky stuff (commit-id etc) and its
PB> style of committing - that is, it will merely record the update-caches
PB> to be done upon
Dear diary, on Thu, Apr 14, 2005 at 01:14:13PM CEST, I got a letter
where Junio C Hamano <[EMAIL PROTECTED]> told me that...
> Here is a diff to update the git-merge.perl script I showed you
> earlier today ;-). It contains the following updates against
> your HEAD (bb95843a5a0f397270819462812735e
Here is a diff to update the git-merge.perl script I showed you
earlier today ;-). It contains the following updates against
your HEAD (bb95843a5a0f397270819462812735ee29796fb4).
* git-merge.perl command we talked about on the git list. I've
covered the changed-to-the-same case etc. I still
On Thu, 14 Apr 2005, Junio C Hamano wrote:
>
> I have to handle the following cases. I think I currently do
> wrong things to them:
>
> 5.1a both head modify to the same thing.
> 5.1b one head removes, the other does not do anything.
> 5.1c both head remove.
> 5.3 one head removes, the
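The per-path cases listed above can be sketched as a small decision function over object ids. This is only an illustration: each argument is the SHA1 of the path in the ancestor and in the two heads (or `None` if the path is absent), `resolve_path` is a hypothetical name, and treating every remaining combination as a conflict is an assumption, not what git-merge.perl does.

```python
from typing import Optional, Tuple


def resolve_path(base: Optional[str],
                 head1: Optional[str],
                 head2: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (action, sha1) for one path in a three-way merge."""
    if head1 == head2:
        # Neither head changed it, both changed it to the same thing,
        # or both removed it: the heads agree.
        return ("remove", None) if head1 is None else ("keep", head1)
    if head1 == base:
        # Only head2 touched the path: take its result (maybe a removal).
        return ("remove", None) if head2 is None else ("keep", head2)
    if head2 == base:
        # Only head1 touched the path.
        return ("remove", None) if head1 is None else ("keep", head1)
    # Both heads changed the path in different ways (assumption: report
    # it and let a higher level decide).
    return ("conflict", None)
```

Because the decision only compares ids, no file has to be checked out to classify a path.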
> "LT" == Linus Torvalds <[EMAIL PROTECTED]> writes:
LT> But I'm really happy that you seem to have implemented my first
LT> suggestion and I seem to have been wasting my time.
Thanks for the kind words.
>> 5. for each path involved:
>>
>> 5.0 if neither heads change it, leave it as is;
>
On Thu, 14 Apr 2005, Junio C Hamano wrote:
>
> I now have a Perl script that uses rev-tree, cat-file,
> diff-tree, show-files (with one modification so that it can deal
> with pathnames with embedded newlines), update-cache (with one
> modification so that I can add an entry for a file that does
> "LT" == Linus Torvalds <[EMAIL PROTECTED]> writes:
LT> On that note - I've been avoiding doing the merge-tree thing, in the hope
LT> that somebody else does what I've described.
I now have a Perl script that uses rev-tree, cat-file,
diff-tree, show-files (with one modification so that it c
> Oh, my bad. I am not trying to start a language war here.
Neither am I - no problem whatsoever.
Besides, I think we'd be on the same side.
My point was only a gentle one -- as is often the case when dealing with
the strange species called human, whether or not you can get away with
somethin