On 12/10/2024 4:44 pm, Linus Torvalds wrote:
> On Sat, 12 Oct 2024 at 01:49, Andrew Cooper <andrew.coop...@citrix.com> wrote:
>> You do realise mask_user_address() is unsafe under speculation on AMD
>> systems?
> That had *better* not be true.

Yeah I'd prefer it wasn't true either.

>> Had the mask_user_address() patch been put for review, this feedback
>> would have been given then.
> That's BS.
>
> Why? Look at commit b19b74bc99b1 ("x86/mm: Rework address range check
> in get_user() and put_user()").

That looks like 3 Intel tags and 0 AMD tags.

But ok, I didn't spot this one, and it looks unsafe too. It was not
reviewed by anyone that had a reasonable expectation to know AMD's
microarchitectural behaviour.

Previously, the STAC protected against bad prediction of the JAE and
prevented dereferencing the pointer if it was greater than TASK_SIZE.

Importantly for the issue at hand, the calculation against TASK_SIZE
excluded the whole non-canonical region.

> This mask_user_address() thing is how we've been doing a regular
> get/put_user() for the last 18 months. It's *exactly* the same
> pattern:
>
>    mov %rax, %rdx
>    sar $63, %rdx
>    or %rdx, %rax
>
> ie we saturate the sign bit.

This logic is asymmetric.

For an address in the upper half (canonical or non-canonical), it ORs
with -1 and fully replaces the prior address.

For an address in the lower half (canonical or non-canonical), it leaves
the value intact, as either canonical or non-canonical.

Then the pointer is architecturally dereferenced, relying on catching
#PF/#GP for the slow path. Architecturally, this is safe.

Micro-architecturally though, AMD CPUs use bit 47, not 63, in the TLB
lookup. This behaviour dates from the K8, and is exposed somewhat in
the virt extensions.

When userspace passes in a non-canonical pointer in the low half of the
address space but with bit 47 set, it will be considered a high-half
pointer when sent for TLB lookup, and the pagetables say it's a
supervisor mapping, so the memory access will be permitted to go ahead
speculatively. Only later does the pipeline realise the address was
non-canonical and raise #GP.

This lets userspace directly target and load anything cacheable in the
kernel mappings.
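
To make the asymmetry concrete, here is a minimal C sketch of the quoted
mov/sar/or sequence, together with the two checks that disagree. It is
an illustration only, not the kernel's code: the helper names
(mask_address, is_canonical_48, bit47_set) are hypothetical, and it
assumes 4-level paging, i.e. 48-bit virtual addresses where bits 63:47
must all match for an address to be canonical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Equivalent of "mov %rax,%rdx; sar $63,%rdx; or %rdx,%rax".
 * 0 - (addr >> 63) is all-ones when bit 63 is set and zero otherwise,
 * which matches the arithmetic shift without relying on signed-shift
 * behaviour in C.
 */
static uint64_t mask_address(uint64_t addr)
{
    uint64_t sign = 0 - (addr >> 63);
    return addr | sign;
}

/* Assumption: 48-bit virtual addresses, so bits 63:47 must all be equal. */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* the 17 bits 63:47 */
    return top == 0 || top == 0x1ffff;
}

/* The bit that, per the description above, selects the half in the TLB lookup. */
static bool bit47_set(uint64_t addr)
{
    return (addr >> 47) & 1;
}
```

With these helpers, 0xffff888000000000 is saturated to all-ones, and
0x00007fffdeadb000 passes through unchanged as an ordinary user address.
0x0000800000000000 also passes through unchanged, because bit 63 is
clear, yet it is non-canonical and has bit 47 set, which is the case
described above.
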

It's not as easy to exploit as Meltdown on Intel, but it is known
behaviour, and has been the subject of academic work for 4 years.

~Andrew