On 9/23/25 20:42, Adiletta, Andrew wrote:
[...]
However, on the practicality, I do believe that we did not
mischaracterize the attack in the paper, and as Alexander concisely
mentioned, we are really trying to emphasize the issues with simple
0/1 flag logic that leads down to sensitive execution flows. Also,
great point about the exit codes, in hindsight that would've been a
good point to address as well. But ultimately, we did make
synchronization an assumption as stated in the paper, citing that there
are other teams working on synchronization methods
(https://www.usenix.org/conference/usenixsecurity22/presentation/aldaya).
This seems to confirm my previous assessment that your attack from the
paper is a proof-of-concept.
Slight clarification on Peter's point - I agree with your point about
rad-hard fault protection on modern CPUs, although the threat model
for Mayhem was that registers, as a limited resource, need to
constantly swap back to DRAM, where register values can be corrupted
via Rowhammer and the corrupted values are then stored in the register
when they are brought back from DRAM.
The critical issue for exploiting Rowhammer to corrupt spilled register
values seems to be how long those spilled values remain live in DRAM
before they are reloaded into the register file and ultimately used.
As I understand it, Rowhammer is a slow, probabilistic attack,
requiring, at minimum, many milliseconds for a chance at success. If the
target program is going to reload the values within microseconds, the
probability of a successful attack rapidly approaches zero and there is
no practical risk. (A machine that vulnerable to Rowhammer is likely to
flip too many bits in normal operation and crash so frequently that it
gets replaced.)
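As a back-of-the-envelope sketch of the timing argument, assume (purely for
illustration) that flips in the target row arrive as a Poisson process with
some mean time per flip while hammering is underway. The function name and
the figures below are hypothetical, not measurements, and the model is
generous to the attacker because it ignores the activations that must
accumulate before a first flip is possible at all.

```python
import math


def flip_probability(window_s: float, mean_time_to_flip_s: float) -> float:
    """Chance of at least one flip while a spilled value is live in DRAM.

    Assumes flips follow a Poisson process. This is an illustrative model,
    not a measured property of any DRAM part.
    """
    # 1 - exp(-x), computed with expm1 to stay accurate for tiny windows.
    return -math.expm1(-window_s / mean_time_to_flip_s)


# Hypothetical figure: on average one flip per 10 ms of hammering.
MEAN_TIME_TO_FLIP_S = 10e-3

for label, window_s in [("1 us", 1e-6), ("100 us", 1e-4), ("10 ms", 1e-2)]:
    p = flip_probability(window_s, MEAN_TIME_TO_FLIP_S)
    print(f"spill live for {label:>6}: P(flip) ~ {p:.6f}")
```

Under these assumptions the probability scales almost linearly with the
live window while the window is short, which is why stretching that window
(for example, by getting the victim descheduled) matters so much.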
This may point towards a previously-unrecognized security risk in OS
schedulers, where delaying a privileged process is more than merely a
denial-of-service. Perhaps dynamic CPU affinities could be used to
reduce the risk by reserving a core for privileged tasks when any are
runnable?
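A userspace approximation of the affinity idea can be sketched with the
existing Linux affinity calls, here through Python's os module. The helper
name pin_to_cpu is hypothetical, and this only restricts the calling
process to one core; it does not keep other tasks off that core, which is
the part that would need scheduler support.

```python
import os


def pin_to_cpu(cpu: int) -> set[int]:
    """Restrict the calling process to a single CPU (Linux only).

    Returns the affinity mask that is in effect afterwards.
    """
    allowed = os.sched_getaffinity(0)  # pid 0 means the calling process
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    # Pick any CPU we are already allowed to run on.
    target = min(os.sched_getaffinity(0))
    print("now restricted to:", pin_to_cpu(target))
```

Actually reserving the core today means static configuration such as
cpusets or the isolcpus boot parameter; reserving it dynamically whenever
a privileged task is runnable would be new kernel work.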
Another solution could be a "scheduler yield" primitive that yields the
processor but guarantees a full timeslice when the program is next
resumed. As long as the critical window can fit within a single
scheduler timeslice, this would close the window on exploitation.
From a hardware architectural standpoint, it might make sense to do a
hash check before and after register values are pushed and popped to
prevent this type of attack, but that was a bit out-of-scope for the
paper. I also agree with your point about ECC being a potentially
unreliable mitigation.
Any practical hash check will be no better than ECC: the attacker need
only also flip a few more bits. Do not pretend that checksums are going
to fix this.
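A minimal sketch of why a cheap check fails, using an XOR checksum as a
stand-in for "any practical hash check" (the choice of XOR, the register
values, and the assumption that the checksum sits in the same attackable
DRAM are all illustrative):

```python
from functools import reduce


def xor_checksum(words: list[int]) -> int:
    """XOR of all words: a cheap, linear integrity check."""
    return reduce(lambda a, b: a ^ b, words, 0)


def verify(words: list[int], stored_checksum: int) -> bool:
    return xor_checksum(words) == stored_checksum


# Hypothetical spilled registers; the first one is a 0/1 flag.
spilled = [0x0000_0001, 0xDEAD_BEEF, 0x0000_0000]
stored = xor_checksum(spilled)

bit = 1 << 0
tampered = spilled.copy()
tampered[0] ^= bit                 # flip the flag in DRAM
tampered_checksum = stored ^ bit   # flip the matching checksum bit too

print("flag flip alone detected:", not verify(tampered, stored))
print("paired flip passes check:", verify(tampered, tampered_checksum))
```

Because the checksum is linear, one extra flip in the stored checksum
cancels the flip in the data, so the check only raises the number of bits
the attacker has to flip.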
If you think that a cryptographic digest is suitable for protecting
register values spilled to the stack, then you have no idea how often
registers are spilled and reloaded.
-- Jacob