Evgeny Kotkov via dev wrote on Sun, Jan 29, 2023 at 16:36:12 +0300:
> Daniel Shahaf <[email protected]> writes:
>
> > > That could happen after a public disclosure of a pair of executable
> > > files/scripts where the forged version allows for remote code execution.
> > > Or maybe something similar with a file format that is often stored in
> > > repositories and that can be executed or used by a build script, etc.
> > >
> >
> > Err, hang on. Your reference described a chosen-prefix attack, while
> > this scenario concerns a single public collision. These are two
> > different things.
>
> A chosen-prefix attack allows finding more meaningful collisions such as
> working executables/scripts. When such collisions are made public, they
> would have a greater exploitation potential than just a random collision.
>
Right. So we're assuming Mallory generates a chosen-prefix collision,
and then somehow pulls off steps #1 and #2-as-amended [both quoted
below], with Alice noticing none of that.
That still sounds like something we should assume Mallory can pull off.

> > Disclosure of a pair of executable files/scripts isn't by itself
> > a problem unless one of the pair ("file A") is in a repository
> > somewhere. Now, was the colliding file ("file B") generated _before_ or
> > _after_ file A was committed?
> >
> > - If _before_, then it would seem Mallory had somehow managed to:
> >
> > 1. get a file of his choosing committed to Alice's repository; and
> >
> > 2. get a wc of Alice's repository into one of the codepaths that
> > assume SHA-1 is one-to-one / collision-free (currently that's the
> > ra_serf optimization and the 1.15 wc status).
>
> Not only. There are cases when the working copy itself installs the working
> file with a hash lookup in the pristine store. This applies more to 1.14
> than to trunk, because in trunk we have the streamy checkout/update that avoids
> such lookups by writing straight to the working file. However, some of
> the code paths still install the contents from the pristine store by hash.
> Examples include reverting a file, copying an unmodified file, switching
> a file with keywords, the mentioned ra_serf optimization, etc.
>
Thanks. In terms of that step #2, all these are also candidates for
"one of the codepaths", then.
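
To make the failure mode concrete, here is a toy sketch of an install-by-hash
codepath. It is Python, not Subversion code; PristineStore, weak_sha1 and
find_collision are made-up names, and the hash is SHA-1 truncated to 8 bits
purely so that a collision can be brute-forced in a test. The only point is
that a store keyed by hash hands back file A's bytes when file B is reinstalled.

```python
import hashlib
from itertools import count


def weak_sha1(data: bytes) -> str:
    # Assumption for illustration only: SHA-1 truncated to 8 bits, so that
    # a collision can be found by brute force.
    return hashlib.sha1(data).hexdigest()[:2]


class PristineStore:
    """Hypothetical content store keyed by hash alone."""

    def __init__(self):
        self._by_hash = {}

    def add(self, data: bytes) -> str:
        key = weak_sha1(data)
        # The first content stored under a key wins; later content with the
        # same key is assumed to be identical and is not stored again.
        self._by_hash.setdefault(key, data)
        return key

    def install(self, key: str) -> bytes:
        # What a revert or copy would do: fetch the contents by hash.
        return self._by_hash[key]


def find_collision(target: bytes) -> bytes:
    # Brute-force a different input with the same (weakened) hash.
    want = weak_sha1(target)
    for i in count():
        candidate = b"file B variant %d\n" % i
        if candidate != target and weak_sha1(candidate) == want:
            return candidate


file_a = b"harmless build script\n"
file_b = find_collision(file_a)

store = PristineStore()
key_a = store.add(file_a)
key_b = store.add(file_b)

# "Reverting" file B reinstalls by hash and silently yields file A's bytes.
reverted_b = store.install(key_b)
```
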
> > Now, step #1 seems plausible enough. As to step #2, it's not clear to
> > me how file B would reach the wc in step #2…
>
> If Mallory has write access, she could commit both files, thus arranging for
> a possible content change if both files are checked out to a single working
> copy. This isn't the same as just directly modifying the target file, because
> file content isn't expected to change due to changes in other files (that can
> be of any type), so this attack has much better chances of going unnoticed.
>
Well, yes, but the write access requirement lowers severity.

> If Mallory doesn't have write access, there should be other vectors, such
> as distributing a pair of files (harmless in the context of their respective
> file formats) separately via two upstream channels. Then, if both of the
> upstream distributions are committed into a repository and their files are
> checked out together, the content will change, allowing for a malicious
> action.

I take it we're still under the assumption that someone's repository has
rep-sharing disabled (or unsupported, i.e., pre-1.6 format) despite the
recommendation in security/sha1-advisory.txt, since otherwise the commit
would be rejected.
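
Roughly the kind of commit-time rejection I mean, as a toy Python sketch. This
is not how FSFS implements it; RepCache, CollisionError and weak_sha1 are
made-up names, the "compare against the stored bytes" check is an assumption
for illustration, and the hash is SHA-1 truncated to 8 bits so that a
collision can be brute-forced.

```python
import hashlib
from itertools import count


class CollisionError(Exception):
    """Raised when new content shares a hash with different stored content."""


def weak_sha1(data: bytes) -> str:
    # Assumption for illustration only: SHA-1 truncated to 8 bits.
    return hashlib.sha1(data).hexdigest()[:2]


class RepCache:
    """Hypothetical rep-sharing cache: one stored representation per hash."""

    def __init__(self):
        self._reps = {}

    def commit(self, data: bytes) -> str:
        key = weak_sha1(data)
        existing = self._reps.get(key)
        if existing is None:
            self._reps[key] = data  # first time this hash is seen
        elif existing != data:
            # Same hash, different bytes: refuse rather than share the rep.
            raise CollisionError("hash collision on %s" % key)
        return key


def find_collision(target: bytes) -> bytes:
    # Brute-force a different input with the same (weakened) hash.
    want = weak_sha1(target)
    for i in count():
        candidate = b"file B variant %d\n" % i
        if candidate != target and weak_sha1(candidate) == want:
            return candidate


file_a = b"harmless build script\n"
file_b = find_collision(file_a)

cache = RepCache()
cache.commit(file_a)  # accepted
try:
    cache.commit(file_b)  # colliding content
    rejected = False
except CollisionError:
    rejected = True
```

Without such a cache there is nothing to compare against, which is why the
rep-sharing-disabled case is the one that matters here.
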
So, back to my question which you have snipped:
> > So, I agree it's a scenario we should address. What options do we
> > have to address it? (I grant that migrating away from SHA-1 is one
> > option.)

Care to address that?

Daniel
>
> Regards,
> Evgeny Kotkov