Jun MO <royclark...@gmail.com> writes:

> Russ Allbery <r...@debian.org> writes:
>> For the third purpose, I believe only weak intent information can be
>> derived from the uploader signature today.  It is common practice in
>> Debian to verify the Git tree that one wants to upload, run a package
>> build step, and then blindly sign the resulting source package.  [...]

> I feel this is somehow ... wrong. I think, *currently*, it should be a
> moral obligation for a DD to make sure the resulting source package is
> correct.

Yes, I think this is the initial reaction that most people have.  You're
uploading this thing and it's your responsibility to check that it's
correct.  This sounds correct in theory and doesn't work in practice, but
the reasons why are subtle and require getting into some of the
philosophy of security.  (Warning: this is kind of long.)

[snip the elaboration of this point, which I think correctly describes
the problem]

> However due to possible bug/change in the chaintools, malwares, mistakes
> or other things, the DD's intent may not present in the resulted
> source/binary package. And *currently* in buildd, binary packages are
> still built from source packages. So I think it should be a moral
> obligation for a DD to make sure his/her intent is present in the source
> packages (and finally present in the binary packages).[1] I know that
> DDs are volunteers and it is impossible for them to perform a thorough
> inspection of the source package. But I feel that it is lack of moral
> obligation that a DD blindly sign the resulting source package without
> even spend a few second look what is inside it, if he/she knows the
> resulting source package may differ from his/her intent. And for
> tag2upload, I think there is the same moral obligation for a DD even
> though he/she do not need to sign the source package.

The basic problem (and this may sound flippant but it isn't) is that DDs
are humans and humans are so bad at the type of validation that you would
want them to perform that there is essentially no chance that this will
happen systematically and effectively.

People, when trying to solve a security problem, will tend towards a
fairly straightforward approach at first: figure out where the attack
happened, figure out how to detect the attack, and then add a new process
to check for that attack.  When another attack is found, add another
process to detect that attack.  And so forth.

This is a very natural approach, and in some cases it can work, but in
the general case what you get is airport security.  It has precisely that
model: planes were hijacked with weapons or blown up with bombs, and
therefore we check every passenger's luggage for weapons or bombs.  Each
time we find a new weapon, we add a new check.  In this case, we have
gone far, far beyond anything that Debian would have the resources to do:
we hire an army of people whose entire job is to do these checks, we give
them a bunch of expensive equipment, and we regularly test them to see
how well they do at performing this screening.

In independent tests of TSA screening (the US airport security agency),
that screening misses 80-95% of weapons or bombs.  This is a very common
cautionary tale in the security community, and it's not because the
agency is incompetent, or at least any more incompetent than any group of
fallible humans told to do a boring job over and over again will be.
This is not a TSA problem, it's a human being problem.  The assumption on
which this security approach is based is not as true as people want it to
be.
You cannot simply tell people to check something and expect them to find
problems if those problems are extremely rare and 99% of the time they
will find nothing.  The human brain is literally hostile to this
activity.  It is not optimized for it and will not do it reliably.  To
get accuracy as high as airport security achieves already requires
training, testing, and a whole lot of assistance from technology that
doesn't get bored or inattentive, and it still fails more often than not.

You have probably run into a similar version of this problem if you have
proofread something that you have written.  If you are like most people
(there are some people who are an exception to this), you will miss
obvious, glaring errors that someone else will see immediately, because
you know what you intended to write and therefore that's what you see.
With a great deal of concentration and techniques to override your
brain's normal behavior, such as reading all the sentences in reverse
order, you can improve your accuracy rate, but you will still regularly
miss errors.

Security is even worse because it's adversarial.  You're not attempting
to spot static bugs that are sitting there waiting to be detected.  You
are trying to defeat an intelligent, creative, adaptive enemy who watches
what you do and adjusts their attack approach to focus on your weak
spots.  And in the case of computer security, *unlike* airport security,
catching the attacker once doesn't let you arrest them and prevent them
from ever attacking you again.  The attacker can just try again, and
again, and again until they succeed.

This doesn't mean that spot checks are useless.  Doing an occasional
manual check can catch unanticipated problems and snare a lot of
attackers, and it's part of a good overall security strategy.  But this
approach isn't *reliable* and shouldn't be the front line of the
strategy.

This problem is one of the reasons why there is so much emphasis in
computer security on looking at the whole system, including the humans
involved, and anticipating how the humans will fail.  Humans are
vulnerable to boredom, to inattentiveness, to alert fatigue, to seeing
what they expect to see, to laziness and haste, to emotional
manipulation, and to any number of other factors that interfere with
following processes.  The security design has to account for this.

Some processes are worse than others.  A process that requires a human to
check something for errors that will not be present 99% of the time, and
where it is overwhelmingly likely that no one will ever know whether the
human did the check properly, is nearly a worst-case scenario.  The check
just won't happen reliably, no matter the intentions of everyone
involved.  People will skip it "just this time."  People will open the
thing they're supposed to look at, and their eyes will point at the
screen, but their brain will not actually analyze anything they're
looking at because it doesn't believe it will find anything.

Systems that need better humans won't work reliably.  You can say that
it's a moral obligation and make people feel guilty for not following the
system properly, but this still doesn't get you reliable performance,
only a bunch of stressed, guilty people who are now demotivated on top of
all the problems that you already had.
This is why good security design is all about having humans do the things
that they're good at or at least okay at (patch review, for example,
which can still be tedious but which is substantially more novel and
interesting than reviewing source packages that will be exactly what you
expect nearly every time), and not relying on humans to do the things
they're bad at.  This is the whole philosophy behind reproducible builds:
don't make humans tediously compare things, set up mechanisms so that
computers can compare things.  Computers don't get bored or inattentive
or tired and do not care in the slightest about doing exactly the same
thing millions of times to get only one anomalous result.

This is also why, even for the case of finding bugs in source packages,
which is much less of a worst-case scenario than finding maliciously
injected code that is trying to hide, "test the package before you upload
it" is only a fallback strategy when you don't have anything better.  A
much better strategy is to encode the tests that you would perform as
programs and make the computer run them.  This is why we write test
suites and autopkgtests: then all the tests happen with every upload and
the computer finds problems and no human has to try to force themselves
(and likely fail) to perform the same tedious steps over and over again.

One final philosophical note: humans also have incredibly limited energy.
This is particularly a problem in a volunteer project, as you noted in
your footnote.  There is way more to do than we have resources to do.  I
want to use that energy as wisely as possible.  That means I
*particularly* do not want that energy to go into doing things that
humans are bad at and that probably won't be done well anyway.  This
means designing the whole upload system so that we can create mechanisms
like reproducible binary builds, reproducible source builds,
autopkgtests, and other ways to move the load onto computers and off of
humans and save that precious human attention for the things that only
humans can do.

--
Russ Allbery (r...@debian.org)             <https://www.eyrie.org/~eagle/>
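[Editorial illustration of the "set up mechanisms so that computers can
compare things" point above: a minimal sketch of what the mechanical
comparison amounts to.  This is not Debian's actual reproducible-builds
tooling (which does far richer comparisons, e.g. diffoscope); it just
hashes two independently built artifacts, with hypothetical file names,
and flags any difference.]

    #!/usr/bin/env python3
    # Minimal sketch: compare two independently built artifacts byte for
    # byte by hashing them.  This is the boring, repetitive check that
    # computers do perfectly and humans do badly.  Paths are hypothetical.

    import hashlib
    import sys


    def sha256(path: str) -> str:
        """Return the SHA-256 hex digest of a file, read in chunks."""
        digest = hashlib.sha256()
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
        return digest.hexdigest()


    def main() -> int:
        if len(sys.argv) != 3:
            print(f"usage: {sys.argv[0]} ARTIFACT-A ARTIFACT-B",
                  file=sys.stderr)
            return 2
        a, b = sys.argv[1], sys.argv[2]
        digest_a, digest_b = sha256(a), sha256(b)
        if digest_a == digest_b:
            print(f"identical: {digest_a}")
            return 0
        print(f"DIFFER:\n  {a}: {digest_a}\n  {b}: {digest_b}",
              file=sys.stderr)
        return 1


    if __name__ == "__main__":
        sys.exit(main())

[The point is not this particular script but the shape of the check: a
computer will run it identically on the millionth build, which is exactly
where a human would long since have stopped looking.]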
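[A similarly minimal sketch of the "encode the tests that you would
perform as programs" point: an autopkgtest is, in the end, just an
executable that the test infrastructure runs against the installed
package on every upload.  The package, the "frobnicate" command, and the
expected behavior below are all invented for illustration; a real package
would declare such a script in debian/tests/control.]

    #!/usr/bin/env python3
    # Hypothetical autopkgtest: exercise the installed package the way a
    # human would when "testing before upload", but automatically on every
    # upload.  The "frobnicate" command and its expected behavior are
    # invented for this example.

    import subprocess
    import sys


    def main() -> int:
        # Run the installed tool exactly as a user would.
        result = subprocess.run(
            ["frobnicate", "--version"],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            print("frobnicate --version failed:", result.stderr,
                  file=sys.stderr)
            return 1
        if not result.stdout.strip():
            print("frobnicate --version printed nothing", file=sys.stderr)
            return 1
        print("ok:", result.stdout.strip())
        return 0


    if __name__ == "__main__":
        sys.exit(main())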