On Mon, 28 Oct 2024 at 01:29, Kirill A. Shutemov <kir...@shutemov.name> wrote:
>
> I think it is worth looking at this approach again as it gets better
> code: removes two instructions from get_user() and doesn't get flags
> involved.
The problem with the 'and' is that it's just re-introducing the same old bug.

> The mask only clears bit 47/56. It gets us stronger canonicality check on
> machines with LAM off.

But it means that you can just pass in non-canonical addresses, and
cause the same old AMD data leak.

And yes, it so happens that the AMD microarchitectures use bit 47/56
for "user address", and now it's only leaking user data, but that's a
random microarchitectural choice - and a really horribly bad one at
that.

IOW, the WHOLE BUG was because some AMD engineer decided to use bits
[0..48] for indexing, when they SHOULD have used [0..47,63], since
bits 63 and 48 *should* be the same, and the latter is the one that is
a hell of a lot more convenient and obvious, and would have made ALL
of this completely irrelevant, and we'd have just used the sign bit
and everything would be fine.

But instead of picking the sane bit, they picked a random bit.

And now I think your patch has two bugs in it:

 - it depends on that random undocumented microarchitectural mistake

 - it uses __VIRTUAL_MASK_SHIFT which probably doesn't even match what
   hardware does

The *hardware* presumably uses either bit 47 or 56 because that's the
actual hardware width of the TLB / cache matching. But
__VIRTUAL_MASK_SHIFT is hardcoded to 47 if 5-level page tables are
disabled.

So no. We're not replacing one simple microarchitectural assumption
(sign bit is sufficient for safety) with another more complicated one
(bit X is sufficient for safety).

              Linus