On Tue, Nov 05, 2024 at 02:56:42PM +0100, Jan Beulich wrote:
> The original implementation has two issues: For one it doesn't preserve
> non-canonical-ness of inputs in the range 0x8000000000000000 through
> 0x80007fffffffffff. Bogus guest pointers in that range would not cause a
> (#GP) fault upon access, when they should.
>
> And then there is an AMD-specific aspect, where only the low 48 bits of
> an address are used for speculative execution; the architecturally
> mandated #GP for non-canonical addresses would be raised at a later
> execution stage. Therefore to prevent Xen controlled data to make it
> into any of the caches in a guest controllable manner, we need to
> additionally ensure that for non-canonical inputs bit 47 would be clear.
>
> See the code comment for how addressing both is being achieved.
>
> Fixes: 4dc181599142 ("x86/PV: harden guest memory accesses against
> speculative abuse")
> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> ---
> RFC: Two variants of part of the logic are being presented, both with
>      certain undesirable aspects: The first form is pretty large and
>      ugly (some improvement may be possible by introducing further
>      helper macros). The alternative form continues to use RCR, which
>      generally would be nice to do away with. Then again that's also
>      slightly smaller generated code.

Oh, I assume that's why there's a hardcoded .if 1, I was wondering
about that.

What's the specific issue with using rcr?  And why is the more complex
set of macros that use setc plus a shl better?

Why not use cmovc:

    mov $(1 << 63), \scratch1
    cmovc \scratch1, \scratch2

AFAICT \scratch1 is not used past the btr instruction, and hence can be
used for the cmovc?

Thanks, Roger.
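
For readers following along, the two address properties from the quoted
commit message can be sketched in C. This is only an illustration and not
code from the patch: it assumes 48-bit virtual addresses (4-level paging),
and the helper names is_canonical48() and low48_sign_extended() are
hypothetical. The second helper models the AMD behaviour described above,
where only the low 48 bits of an address take part in speculation.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses. An address is canonical when
 * bits 63..47 are all equal (all zero or all one).
 */
static bool is_canonical48(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* the 17 bits 63..47 */

    return top == 0 || top == 0x1ffff;
}

/*
 * Hypothetical model of "only the low 48 bits are used": keep bits 47..0
 * and sign-extend bit 47, giving the address a speculative access would
 * effectively reference.
 */
static uint64_t low48_sign_extended(uint64_t addr)
{
    uint64_t low = addr & 0x0000ffffffffffffULL;

    if (low & (1ULL << 47))
        low |= 0xffff000000000000ULL;

    return low;
}
```

With these helpers, 0x8000000000001000 (inside the range named in the
commit message) is non-canonical, but it becomes the canonical address
0x1000 if bit 63 alone is cleared, which is one way non-canonical-ness
can be lost. Likewise 0x7fff800000000000 is non-canonical, yet its low 48
bits sign-extend to 0xffff800000000000 in the upper half, whereas
0x7fff000000001000 (bit 47 clear) maps to 0x1000 in the lower half.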