i386 mp_machdep.c

Maxim Sobolev Fri, 09 Nov 2007 17:38:31 -0800

For what it is worth, I think Nate has the correct point. We should not
force this setting upon each and every user if it can realistically
affect only 0.0001% of our userbase. Leaving this setting on for a
production server that does a lot of sensitive crypto and has a lot of
untrusted remote users is like leaving open access to the server console
in the common room. At the same time, most of us have open access to
the consoles of our laptops/desktops at home/work.

By the way, I wonder how sun4v (aka Niagara) fares in this respect. As
far as I know, they use a similar concept, where 8 physical cores can run
32 threads. Should we disable it by default there as well? ;-)


-Maxim

Nate Lawson wrote:

Colin Percival wrote:

Nate Lawson wrote:

I'm still waiting for what will be done to prevent the attack on
uniprocessor or multi-core machines (shared L2).  Continuing to focus on
hyperthreading is like locking the screen door on your submarine.

Exploiting a cache collision channel through the L2 cache is much harder
than through the L1 cache, and is likely impossible under many circumstances
(OpenSSL has been fixed to prevent the most easily exploitable cache side
channel).  In addition, there are other attacks, e.g., using shared branch
prediction tables, to which hyperthreaded processors are vulnerable but which
do not affect multicore systems at all.


Even a uniprocessor is vulnerable to a BTB side channel if a context
switch occurs, so saying multicore systems are not affected at all is a
bit of a stretch.  I agree HTT gives you the best vantage point for
observing/affecting all of these microarchitectural details.  However,
if something leaves a state change in the CPU that is visible from HTT,
that state also survives a context switch.  The only question is how
many more samples are required than when using HTT.

Rather than locking the screen door on a submarine, I'd say that a more apt
comparison would be turning off a fire hydrant even though a garden hose is
still running.  I recommend the use of more sophisticated countermeasures
against side channel attacks where highly sensitive keying material is
concerned; but this does not invalidate the utility of applying such a very
simple countermeasure which prevents a very easy attack.


[I wrote the below privately but think it might be useful as part of
this thread]

Research since 2005 has confirmed that HTT is not the only way cache
timing behavior can be observed.  Multi-core and uniprocessor are both
vulnerable to the attack you publicized and L2 is vulnerable, albeit
with more samples required.

Further research into new side channels like the branch target buffer
(that cannot be turned off like HTT) has shown that the cryptographic
software itself must contain countermeasures in addition to operating
system support for a "stealth mode".  Continuing to disable HTT and
claiming this helps is dishonest to our users since it's not a true
stealth mode.  (Context switches can still occur, revealing intermediate
state and disabling HTT doesn't address multi-core.)

Fixes which address all of the cache-related threats to RSA [1] and
mitigate BTB attacks [2] were contributed to OpenSSL.  The first fix
involves "striping" the windowed exponent across cache lines so that use
of any of the exponents has the same cache access behavior.  The second
involves removing conditional branches from the modular arithmetic.
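
The striping idea can be illustrated with a small scatter/gather sketch.
This is not the OpenSSL code; the names (scatter, gather, NUM_ENTRIES,
ENTRY_BYTES) and the sizes are hypothetical, and it assumes the buffer is
aligned to at least NUM_ENTRIES bytes and that the cache line size is a
multiple of NUM_ENTRIES. Byte j of every table entry is stored in the same
NUM_ENTRIES-byte group, so reading any one entry touches the same set of
cache lines as reading any other.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_ENTRIES 16  /* hypothetical: 2^4 precomputed window values */
#define ENTRY_BYTES 64  /* hypothetical: size of one table entry in bytes */

/*
 * Interleaved ("striped") layout: byte j of entry idx lives at
 * buf[j * NUM_ENTRIES + idx].  buf must hold NUM_ENTRIES * ENTRY_BYTES bytes.
 */
static void scatter(uint8_t *buf, const uint8_t *entry, size_t idx)
{
    for (size_t j = 0; j < ENTRY_BYTES; j++)
        buf[j * NUM_ENTRIES + idx] = entry[j];
}

/*
 * Read entry idx back out.  The loop visits one byte in every
 * NUM_ENTRIES-byte group regardless of idx, so (under the alignment
 * assumption above) the set of cache lines touched does not depend on idx.
 */
static void gather(uint8_t *out, const uint8_t *buf, size_t idx)
{
    for (size_t j = 0; j < ENTRY_BYTES; j++)
        out[j] = buf[j * NUM_ENTRIES + idx];
}
```

Note that the offset within each cache line still depends on idx, so this
sketch hides the index only at cache-line granularity.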

I think the solution should be to document the security info for users
and developers.  Users should be notified that if they are deploying a
server, they should be sure their cryptography libraries address side
channel attacks.  FreeBSD's default configuration of included libraries
like OpenSSL should be noted.  Developers of cryptographic libraries
should be notified that they are responsible for avoiding data-dependent
behavior and being aware of microarchitectural side channels.
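
As a rough illustration of what "avoiding data-dependent behavior" means
in practice, here is a minimal sketch of two branch-free helpers.  The
names ct_select and ct_equal are hypothetical and are not taken from
OpenSSL or FreeBSD.  Neither function branches on, or indexes memory by,
a secret value.  A compiler is free to reintroduce branches, so the
generated code would still need to be inspected.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return a if bit == 1, b if bit == 0, without a conditional branch.
 * bit must be exactly 0 or 1.
 */
static uint32_t ct_select(uint32_t bit, uint32_t a, uint32_t b)
{
    uint32_t mask = (uint32_t)0 - bit;  /* all ones if bit == 1, else 0 */
    return (a & mask) | (b & ~mask);
}

/*
 * Return 1 if the two buffers are equal, 0 otherwise.  The loop always
 * runs len iterations; it never exits early on the first mismatch.
 */
static int ct_equal(const uint8_t *x, const uint8_t *y, size_t len)
{
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= (uint8_t)(x[i] ^ y[i]);
    /* diff == 0 wraps to 0xFFFFFFFF, so bit 8 is set only when equal. */
    return (int)((((uint32_t)diff - 1) >> 8) & 1);
}
```

The same masking pattern extends to conditionally swapping or copying
multi-word values, which is the kind of operation that replaces an
"if (key_bit)" branch in modular arithmetic.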

Careful coding can address most side channel attacks, but I still think
OSes need a standard API for a stealth mode where a privileged process
can request exclusive access to the CPU it is running on for a short
quantum, with a guarantee that they will not be preempted unless they
exceed that quantum.  Additional support for cleaning the
microarchitectural side effects (cache, BTB, etc.) would be a bonus.  I
don't know of any standards efforts in this area but it might be
interesting to note.  Fast implementations of AES are a good example
where such support is needed since it is impossible to eliminate cache
timing differences of the table lookups without such a mode.

[1] OpenSSL 0.9.7h, change 10/2005 by Matthew D. Wood of Intel,
http://www.openssl.org/news/changelog.html
[2] OpenSSL 0.9.8f, change 10/2007 by Matthew D. Wood of Intel,
http://www.openssl.org/news/changelog.html


Re: cvs commit: src/sys/amd64/amd64 mp_machdep.c src/sys/i386/i386 mp_machdep.c

Reply via email to