On Mon, Oct 14, 2024 at 04:39:26PM +0100, Andrew Cooper wrote:
> On 14/10/2024 1:30 pm, Kirill A. Shutemov wrote:
> > +++ b/arch/x86/lib/getuser.S
> > @@ -37,9 +37,14 @@
> > +#define SHIFT_LEFT_TO_MSB ALTERNATIVE \
> > +	"shl $(64 - 48), %rdx", \
> > +	"shl $(64 - 57), %rdx", X86_FEATURE_LA57
> > +
> >  .macro check_range size:req
> >  .if IS_ENABLED(CONFIG_X86_64)
> >  	mov %rax, %rdx
> > +	SHIFT_LEFT_TO_MSB
> >  	sar $63, %rdx
> >  	or %rdx, %rax
> >  .else
>
> That looks like it ought to DTRT in some cases, but I'll definitely ask
> AMD for confirmation.
>
> But, I expect it will malfunction on newer hardware when
> CONFIG_X86_5LEVEL=n, because it causes Linux to explicitly ignore the
> LA57 bit.  That can be fixed by changing how CONFIG_X86_5LEVEL works.
>
> I also expect it will malfunction under virt on an LA57-capable system
> running a VM in LA48 mode (this time, Linux doesn't get to see the
> relevant uarch detail), and I have no good suggestion here.
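For reference, a rough user-space C rendering of the quoted
SHIFT_LEFT_TO_MSB + sar + or sequence (LA48 case) might look like the
below.  It's only a sketch: VADDR_BITS and mask_addr_la48() are made-up
names for illustration, not kernel code, and the LA57 variant would
just shift by (64 - 57) instead.

#include <stdint.h>
#include <stdio.h>

/* Illustrative only: 48 virtual address bits (LA48); 57 with LA57. */
#define VADDR_BITS 48

static uint64_t mask_addr_la48(uint64_t addr)
{
	uint64_t mask;

	/* shl $(64 - 48), %rdx: move bit (VADDR_BITS - 1) into bit 63 */
	mask = addr << (64 - VADDR_BITS);

	/* sar $63, %rdx: all-ones if that bit was set, else zero */
	mask = -(mask >> 63);

	/* or %rdx, %rax: any address with that bit set becomes ~0 */
	return addr | mask;
}

int main(void)
{
	/* user address: unchanged */
	printf("%#llx\n", (unsigned long long)mask_addr_la48(0x00007fffdeadbeefULL));
	/* kernel-half address: forced to all-ones */
	printf("%#llx\n", (unsigned long long)mask_addr_la48(0xffff8000deadbeefULL));
	return 0;
}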
BTW the paper [1] nonchalantly mentions: "All Intel CPUs that are
vulnerable to MDS attacks inherently have the same flaw described here."

Anyway, I'd really like to make forward progress on getting rid of the
LFENCEs in copy_from_user() and __get_user(), so until if/when we hear
back from both vendors, how about we avoid noncanonical exceptions
altogether (along with the edge cases mentioned above) and do something
like the below?

Sure, it could maybe be optimized by a few bytes if we were given more
concrete recommendations, but that can be done later if/when that
happens.  In the meantime we'd have no canonical exception worries and
can use a similar strategy to get rid of the LFENCEs.

[1] https://arxiv.org/ftp/arxiv/papers/2108/2108.10771.pdf

diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index 112e88ebd07d..dfc6881eb785 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -41,12 +41,21 @@
 	"shl $(64 - 57), %rdx", X86_FEATURE_LA57, \
 	"", ALT_NOT(X86_BUG_CANONICAL)
 
+#ifdef CONFIG_X86_5LEVEL
+#define LOAD_TASK_SIZE_MINUS_N(n) \
+	ALTERNATIVE __stringify(mov $((1 << 47) - 4096 - (n)),%rdx), \
+		    __stringify(mov $((1 << 56) - 4096 - (n)),%rdx), X86_FEATURE_LA57
+#else
+#define LOAD_TASK_SIZE_MINUS_N(n) \
+	mov $(TASK_SIZE_MAX - (n)),%_ASM_DX
+#endif
+
 .macro check_range size
 .if IS_ENABLED(CONFIG_X86_64)
+	/* If above TASK_SIZE_MAX, convert to all 1's */
+	LOAD_TASK_SIZE_MINUS_N(size - 1)
+	cmp %rax, %rdx
+	sbb %rdx, %rdx
 	or %rdx, %rax
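To spell out what the cmp/sbb sequence computes, here's a rough
user-space C equivalent.  Again just a sketch: PAGE_SIZE_4K,
TASK_SIZE_LA48 and clamp_addr_la48() are made-up names, and it
hard-codes the LA48 limit that the first ALTERNATIVE arm loads.

#include <stdint.h>
#include <stdio.h>

/* Illustrative constants for the LA48 case of LOAD_TASK_SIZE_MINUS_N(). */
#define PAGE_SIZE_4K	4096ULL
#define TASK_SIZE_LA48	((1ULL << 47) - PAGE_SIZE_4K)

static uint64_t clamp_addr_la48(uint64_t addr, unsigned int size)
{
	/* mov $((1 << 47) - 4096 - (size - 1)), %rdx */
	uint64_t limit = TASK_SIZE_LA48 - (size - 1);

	/*
	 * cmp %rax, %rdx; sbb %rdx, %rdx: the compare borrows exactly
	 * when the limit is below the address, so the mask is all-ones
	 * iff addr > limit, else zero.
	 */
	uint64_t mask = (addr > limit) ? ~0ULL : 0;

	/*
	 * or %rdx, %rax: out-of-range addresses become ~0, which is
	 * canonical and faults normally on access instead of raising a
	 * noncanonical #GP.
	 */
	return addr | mask;
}

int main(void)
{
	/* in range: printed unchanged */
	printf("%#llx\n", (unsigned long long)clamp_addr_la48(0x00007ffffffee000ULL, 8));
	/* kernel address: clamped to all-ones */
	printf("%#llx\n", (unsigned long long)clamp_addr_la48(0xffff888012345678ULL, 8));
	return 0;
}

The bad-address case is then handled purely by a data dependency on the
mask rather than by a speculation barrier, which is what would let the
LFENCEs go away.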