On Mon, 28 Oct 2024 at 10:31, Kirill A. Shutemov wrote:
>
> On Mon, Oct 28, 2024 at 08:44:10AM -1000, Linus Torvalds wrote:
> > The *hardware* presumably uses either bit 47/56 because that's the
> > actual hardware width of the TLB / cache matching.
>
> It is a guess, right?
Yes, I have no idea
On Mon, Oct 28, 2024 at 08:44:10AM -1000, Linus Torvalds wrote:
> The *hardware* presumably uses either bit 47/56 because that's the
> actual hardware width of the TLB / cache matching.
It is a guess, right?
The bug makes more sense to me if the bit number depends on the active
paging mode: thi
On Mon, 28 Oct 2024 at 01:29, Kirill A. Shutemov wrote:
>
> I think it is worth looking at this approach again as it gets better
> code: removes two instructions from get_user() and doesn't get flags
> involved.
The problem with the 'and' is that it's just re-introducing the same old bug.
> The
On Wed, Oct 16, 2024 at 03:34:11PM -0700, Linus Torvalds wrote:
> On Wed, 16 Oct 2024 at 15:13, Kirill A. Shutemov wrote:
> >
> > It is worse than that. If we get LAM_SUP enabled (there's a KASAN patchset
> > in the works) this check will allow arbitrary kernel addresses.
>
> Ugh. I haven't seen the LA
On Fri, 25 Oct 2024 at 01:56, David Laight wrote:
>
> > > Especially if there is always a (PAGE sized) gap between the highest
> > > user address and the lowest kernel address so the 'size' argument
> > > to access_ok() can be ignored on the assumption that the accesses
> > > are (reasonably) line
...
> access_ok() itself is so rarely used these days that we could out-line
> it. But the code cost of a function call is likely higher than
> inlining the 8-byte constant and a couple of instructions: not because of
> the call instruction itself, but because of the code generation pain
> around it
On Thu, 24 Oct 2024 at 02:22, David Laight wrote:
>
> Would it be better to make the 'bogus' constant one that makes
> all accesses fail?
Well, the bogus constant is actually used for other runtime cases too
(well, right now that's only the dentry pointer, but could be more),
it's not some kind o
From: Linus Torvalds
> Sent: 23 October 2024 21:08
>
> On Wed, 23 Oct 2024 at 12:17, Linus Torvalds wrote:
> >
> > NOTE! This is obviously untested and I didn't check that it does the
> > cmp/sbb/or the right way around.
>
> Well, it boots. The code generation (from strncpy_from_user()) seems
On Wed, 23 Oct 2024 at 16:32, Linus Torvalds wrote:
>
> And I guess I should make "__put_user()" do the same thing, just so
> that we only have one sequence.
No, I decided it's not worth it. The put_user side already also
doesn't do any other speculation barriers, simply because it has no
specula
On Wed, 23 Oct 2024 at 12:17, Linus Torvalds wrote:
>
> NOTE! This is obviously untested and I didn't check that it does the
> cmp/sbb/or the right way around.
Well, it boots. The code generation (from strncpy_from_user()) seems ok:
	movabs $0x123456789abcdef,%rcx
	cmp    %rsi,%rcx
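For reference, a minimal C sketch of the cmp/sbb/or sequence being tested
above, assuming a USER_PTR_MAX-style limit (the movabs constant above looks
like a placeholder that gets patched to the real limit); the helper name is
illustrative, not necessarily what the final patch uses:

	static inline void __user *mask_user_address(const void __user *ptr)
	{
		unsigned long mask;

		/* mask = (ptr > USER_PTR_MAX) ? -1UL : 0 */
		asm("cmp %1,%0\n\t"
		    "sbb %0,%0"
		    : "=r" (mask)
		    : "r" (ptr), "0" (USER_PTR_MAX));

		/*
		 * OR-ing with all-ones turns any out-of-range pointer into -1,
		 * an address guaranteed to fault, with no branch to mispredict.
		 */
		return (void __user *)(mask | (unsigned long)ptr);
	}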
On Wed, 23 Oct 2024 at 13:07, Linus Torvalds wrote:
>
> Well, it boots. The code generation (from strncpy_from_user()) seems ok:
Actually, doing some more sanity checking, that patch is wrong.
Not *badly* wrong, but for some reason I did the "sbb" in 32-bit
(quite intentionally, but it's very wr
On Wed, 23 Oct 2024 at 02:45, Borislav Petkov wrote:
>
> So I was able to get some info: CLAC/STAC in everything Zen4 and older is
> serializing so there's no issue there.
>
> Zen5 is a different story and AC is being renamed there and thus,
> non-serializing. So we'd need to do something there, p
On Sun, Oct 20, 2024 at 04:35:21PM -0700, Linus Torvalds wrote:
> So I think this time we push back on the hardware people and tell them
> it's *THEIR* damn problem, and if they can't even be bothered to say
> yay-or-nay, we just sit tight.
So I was able to get some info: CLAC/STAC in everything
On Mon, Oct 21, 2024 at 01:48:15PM +0300, Kirill A. Shutemov wrote:
> On Sun, Oct 20, 2024 at 03:59:25PM -0700, Linus Torvalds wrote:
> > On Sun, 20 Oct 2024 at 15:44, Josh Poimboeuf wrote:
> > >
> > > Anyway, I'd really like to make forward progress on getting rid of the
> > > LFENCEs in copy_fro
On Tue, Oct 22, 2024 at 01:16:58AM -0700, Pawan Gupta wrote:
> On Mon, Oct 21, 2024 at 01:48:15PM +0300, Kirill A. Shutemov wrote:
> > On Sun, Oct 20, 2024 at 03:59:25PM -0700, Linus Torvalds wrote:
> > > On Sun, 20 Oct 2024 at 15:44, Josh Poimboeuf wrote:
> > > >
> > > > Anyway, I'd really like t
On Mon, Oct 21, 2024 at 07:36:50PM -0700, Linus Torvalds wrote:
> On Mon, 21 Oct 2024 at 03:48, Kirill A. Shutemov wrote:
> >
> > LAM brings its own speculation issues[1] that are going to be addressed by
> > LASS[2]. There was a patch[3] to disable LAM until LASS is landed, but it
> > never got applie
On Mon, 21 Oct 2024 at 03:48, Kirill A. Shutemov wrote:
>
> LAM brings its own speculation issues[1] that are going to be addressed by
> LASS[2]. There was a patch[3] to disable LAM until LASS is landed, but it
> never got applied for some reason.
Bah. I think the reason was that nobody knew what - i
On Sun, Oct 20, 2024 at 03:59:25PM -0700, Linus Torvalds wrote:
> On Sun, 20 Oct 2024 at 15:44, Josh Poimboeuf wrote:
> >
> > Anyway, I'd really like to make forward progress on getting rid of the
> > LFENCEs in copy_from_user() and __get_user(), so until if/when we hear
> > back from both vendors
On Sun, 20 Oct 2024 at 16:11, Josh Poimboeuf wrote:
>
> Until it's s/think/know/ can we please put something in place?
Honestly, I'm pretty damn fed up with buggy hardware and completely
theoretical attacks that have never actually shown themselves to be
used in practice.
So I think this time we
On Sun, Oct 20, 2024 at 04:11:25PM -0700, Josh Poimboeuf wrote:
> #define FORCE_CANONICAL					\
> 	ALTERNATIVE_2						\
> 		"shl $(64 - 48), %rdx",
On Sun, Oct 20, 2024 at 03:59:25PM -0700, Linus Torvalds wrote:
> On Sun, 20 Oct 2024 at 15:44, Josh Poimboeuf wrote:
> >
> > Anyway, I'd really like to make forward progress on getting rid of the
> > LFENCEs in copy_from_user() and __get_user(), so until if/when we hear
> > back from both vendors
On Sun, 20 Oct 2024 at 15:44, Josh Poimboeuf wrote:
>
> Anyway, I'd really like to make forward progress on getting rid of the
> LFENCEs in copy_from_user() and __get_user(), so until if/when we hear
> back from both vendors, how about we avoid noncanonical exceptions
> altogether (along with the
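In plain C, the force-canonical idea in these previews (the FORCE_CANONICAL
macro and its shl alternatives) amounts to the following sketch, assuming the
usual shl/sar pair and 4-level paging; the 5-level alternative would shift by
64 - 57, and the function name is illustrative:

	/*
	 * Shift the non-address bits off the top, then sign-extend bit 47
	 * back through bits 63:48, so the result is always canonical.
	 * (gcc and clang implement signed right shift as arithmetic shift.)
	 */
	static inline unsigned long force_canonical_48(unsigned long addr)
	{
		return (unsigned long)((long)(addr << (64 - 48)) >> (64 - 48));
	}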
On Mon, Oct 14, 2024 at 04:39:26PM +0100, Andrew Cooper wrote:
> On 14/10/2024 1:30 pm, Kirill A. Shutemov wrote:
> > +++ b/arch/x86/lib/getuser.S
> > @@ -37,9 +37,14 @@
> > +#define SHIFT_LEFT_TO_MSB ALTERNATIVE \
> > +	"shl $(64 - 48), %rdx", \
> > +	"shl $(64 - 57), %rdx", X86_FEATURE_LA57
>
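A hedged guess at how a SHIFT_LEFT_TO_MSB-style macro gets used, based on the
existing getuser.S pattern of deriving an all-ones mask from the top bit; the
C helper and its name below are illustrative only:

	/*
	 * LAM forces bit 47 (or 56 under LA57) to equal bit 63, so after
	 * shifting that bit into the MSB, an arithmetic shift by 63 yields
	 * -1 for kernel addresses and 0 for user addresses; OR-ing the
	 * result into the pointer then poisons kernel addresses without
	 * a conditional branch.
	 */
	static inline unsigned long kernel_addr_mask(unsigned long addr)
	{
		return (unsigned long)((long)(addr << (64 - 48)) >> 63);
	}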
On Wed, Oct 16, 2024 at 03:32:32PM -0700, Linus Torvalds wrote:
> My preference would actually be to do nothing, on the assumption that
> the AMD issue is actually impossible to trigger (due to CLAC/STAC
> serializing memory address checks - which the timings certainly imply
> they do).
I'm verify
On Wed, 16 Oct 2024 at 15:13, Kirill A. Shutemov wrote:
>
> It is worse than that. If we get LAM_SUP enabled (there's a KASAN patchset
> in the works) this check will allow arbitrary kernel addresses.
Ugh. I haven't seen the LAM_SUP patches.
But yeah, I assume any LAM_SUP model would basically then ma
On Wed, 16 Oct 2024 at 15:03, Andrew Cooper wrote:
>
> That doesn't have the same semantics, does it?
Correct. It just basically makes all positive addresses be force-canonicalized.
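A sketch of those semantics; the mask value assumes 4-level paging and is an
illustrative reconstruction (the constant in the preview further down is
truncated):

	/*
	 * Keep bit 63 and bits 46:0, clear bits 62:47. A positive (user)
	 * address comes out canonical; a kernel address keeps bit 63 but
	 * loses bits 62:47 and becomes non-canonical, so any dereference
	 * faults instead of reaching kernel data.
	 */
	#define CANONICAL_MASK_SKETCH	0x80007fffffffffffUL

	static inline unsigned long canonicalize_user_ptr(unsigned long addr)
	{
		return addr & CANONICAL_MASK_SKETCH;
	}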
> If AMD think it's appropriate, then what you probably want is the real
> branch as per before (to maintain archi
On Wed, Oct 16, 2024 at 11:02:56PM +0100, Andrew Cooper wrote:
> On 16/10/2024 5:10 pm, Linus Torvalds wrote:
> > --- a/arch/x86/lib/getuser.S
> > +++ b/arch/x86/lib/getuser.S
> > @@ -37,11 +37,14 @@
> >
> > #define ASM_BARRIER_NOSPEC ALTERNATIVE "", "lfence", X86_FEATURE_LFENCE_RDTSC
On 16/10/2024 5:10 pm, Linus Torvalds wrote:
> --- a/arch/x86/lib/getuser.S
> +++ b/arch/x86/lib/getuser.S
> @@ -37,11 +37,14 @@
>
> #define ASM_BARRIER_NOSPEC ALTERNATIVE "", "lfence", X86_FEATURE_LFENCE_RDTSC
>
> +#define X86_CANONICAL_MASK ALTERNATIVE \
> + "movq $0x80007
On Mon, 14 Oct 2024 at 09:55, Linus Torvalds wrote:
>
> On Mon, 14 Oct 2024 at 05:30, Kirill A. Shutemov wrote:
> >
> > Given that LAM enforces bit 47/56 to be equal to bit 63 I think we can do
> > this unconditionally instead of masking:
> >
> > diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib
On Mon, Oct 14, 2024 at 04:39:26PM +0100, Andrew Cooper wrote:
> But, I expect it will malfunction on newer hardware when
> CONFIG_X86_5LEVEL=n, because it causes Linux to explicitly ignore the
> LA57 bit. That can be fixed by changing how CONFIG_X86_5LEVEL works.
https://lore.kernel.org/all/8073
On Mon, 14 Oct 2024 at 05:30, Kirill A. Shutemov wrote:
>
> Given that LAM enforces bit 47/56 to be equal to bit 63 I think we can do
> this unconditionally instead of masking:
>
> diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
> index d066aecf8aeb..86d4511520b1 100644
> --- a/arch/x
On 14/10/2024 1:30 pm, Kirill A. Shutemov wrote:
> On Sun, Oct 13, 2024 at 11:50:55PM -0700, Linus Torvalds wrote:
>> Anyway, the attached patch
>>
>> diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
>> index d066aecf8aeb..7d5730aa18b8 100644
>> --- a/arch/x86/lib/getuser.S
>> +++ b/arc
From: Linus Torvalds
> Sent: 14 October 2024 12:21
>
> On Mon, 14 Oct 2024 at 02:59, David Laight wrote:
> >
> > Isn't LAM just plain stupid unless the hardware validates the bits
> > against the TLB?
>
> What? No. You can't do that. At some point, the TLB isn't filled, and
> then you have to do
On Sun, Oct 13, 2024 at 11:50:55PM -0700, Linus Torvalds wrote:
> Anyway, the attached patch
>
> diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
> index d066aecf8aeb..7d5730aa18b8 100644
> --- a/arch/x86/lib/getuser.S
> +++ b/arch/x86/lib/getuser.S
> @@ -37,11 +37,17 @@
>
> #define
On Sat, Oct 12, 2024 at 05:53:19PM -0700, Linus Torvalds wrote:
> On Sat, 12 Oct 2024 at 10:44, Linus Torvalds wrote:
> >
> > Anyway, what's the speculation window size like?
>
> Note that this is important basically because we do *NOT* want to
> check the address against TASK_SIZE_MAX like we
On Mon, 14 Oct 2024 at 02:59, David Laight wrote:
>
> Isn't LAM just plain stupid unless the hardware validates the bits
> against the TLB?
What? No. You can't do that. At some point, the TLB isn't filled, and
then you have to do the access with lots of linear address bits
masked.
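For reference, the bits in question: under LAM_U57 the hardware ignores bits
62:57 of a user pointer, so the software untagging step looks roughly like
this (an illustrative helper, assuming the LAM_U57 layout):

	#include <linux/bits.h>

	/*
	 * Strip the software tag (bits 62:57) before comparing or masking
	 * a user address; bit 63 still selects the user/kernel half.
	 */
	static inline unsigned long lam_u57_untag(unsigned long addr)
	{
		return addr & ~GENMASK_ULL(62, 57);
	}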
> Doesn't ARM6
On 14/10/2024 7:50 am, Linus Torvalds wrote:
> On Sun, 13 Oct 2024 at 20:54, Josh Poimboeuf wrote:
>> If I understand correctly, LAM bits are for the benefit of SW and are
>> ignored by HW?
That is what is written in the ISE today.
But, I'll note that on ARM, MTE (Memory Tagging Extensions) use
...
> > If I understand correctly, LAM bits are for the benefit of SW and are
> > ignored by HW? Can we just mask those bits off?
>
> Yes. But then you waste time on the masking, particularly if it
> then causes silly extra overhead just to get the mask.
Isn't LAM just plain stupid unless th
On Sun, 13 Oct 2024 at 20:54, Josh Poimboeuf wrote:
>
> If I understand correctly, LAM bits are for the benefit of SW and are
> ignored by HW? Can we just mask those bits off?
Yes. But then you waste time on the masking, particularly if it
then causes silly extra overhead just to get the mas
On Sat, Oct 12, 2024 at 06:21:12PM -0700, Linus Torvalds wrote:
> On Sat, 12 Oct 2024 at 17:53, Linus Torvalds wrote:
> >
> > So no, the address masking can not depend on things like
> > __VIRTUAL_MASK_SHIFT, it would need to at least take LAM into account
> > too. Not that I know if there are a
On Sat, Oct 12, 2024 at 09:09:23AM -0500, Josh Poimboeuf wrote:
> On Sat, Oct 12, 2024 at 09:48:57AM +0100, Andrew Cooper wrote:
> > On 12/10/2024 5:09 am, Josh Poimboeuf wrote:
> > > For x86-64, the barrier_nospec() in copy_from_user() is overkill and
> > > painfully slow. Instead, use pointer ma
On Sat, 12 Oct 2024 at 17:53, Linus Torvalds wrote:
>
> So no, the address masking can not depend on things like
> __VIRTUAL_MASK_SHIFT, it would need to at least take LAM into account
> too. Not that I know if there are any CPU's out there that actually
> have LAM enabled.
Lunar Lake and Arrow L
On Sat, 12 Oct 2024 at 10:44, Linus Torvalds wrote:
>
> Anyway, what's the speculation window size like?
Note that this is important basically because we do *NOT* want to
check the address against TASK_SIZE_MAX like we used to, because not
only is TASK_SIZE_MAX not a compile-time constant, but wi
On Sat, 12 Oct 2024 at 10:23, Andrew Cooper wrote:
>
> This logic is asymmetric.
>
> For an address in the upper half (canonical or non-canonical), it ORs
> with -1 and fully replaces the prior address.
Right. The point is that non-canonical addresses will fault, and
kernel addresses are guarant
On 12/10/2024 4:44 pm, Linus Torvalds wrote:
> On Sat, 12 Oct 2024 at 01:49, Andrew Cooper wrote:
>> You do realise mask_user_address() is unsafe under speculation on AMD
>> systems?
> That had *better* not be true.
Yeah I'd prefer it wasn't true either.
>> Had the mask_user_address() patch been
On Sat, 12 Oct 2024 at 07:21, Borislav Petkov wrote:
>
> Commit
>
> 2865baf54077 ("x86: support user address masking instead of
> non-speculative conditional")
No.
This started long before. Again, see commit b19b74bc99b1 ("x86/mm:
Rework address range check in get_user() and put_user(
On Sat, 12 Oct 2024 at 01:49, Andrew Cooper wrote:
>
> You do realise mask_user_address() is unsafe under speculation on AMD
> systems?
That had *better* not be true.
> Had the mask_user_address() patch been put for review, this feedback
> would have been given then.
That's BS.
Why? Look at co
On 12/10/2024 3:09 pm, Josh Poimboeuf wrote:
> On Sat, Oct 12, 2024 at 09:48:57AM +0100, Andrew Cooper wrote:
>> On 12/10/2024 5:09 am, Josh Poimboeuf wrote:
>>> For x86-64, the barrier_nospec() in copy_from_user() is overkill and
>>> painfully slow. Instead, use pointer masking to force the user
On Sat, Oct 12, 2024 at 09:48:57AM +0100, Andrew Cooper wrote:
> On 12/10/2024 5:09 am, Josh Poimboeuf wrote:
> > For x86-64, the barrier_nospec() in copy_from_user() is overkill and
> > painfully slow. Instead, use pointer masking to force the user pointer
> > to a non-kernel value even in specul
On 12/10/2024 5:09 am, Josh Poimboeuf wrote:
> For x86-64, the barrier_nospec() in copy_from_user() is overkill and
> painfully slow. Instead, use pointer masking to force the user pointer
> to a non-kernel value even in speculative paths.
>
> Signed-off-by: Josh Poimboeuf
You do realise mask_us
For x86-64, the barrier_nospec() in copy_from_user() is overkill and
painfully slow. Instead, use pointer masking to force the user pointer
to a non-kernel value even in speculative paths.
While at it, harden the x86 implementations of raw_copy_to_user() and
clear_user(): a write in a mispredicte
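A sketch of where the masking slots into the copy path once the LFENCE is
gone; mask_user_address() is the cmp/sbb/or helper sketched earlier, and the
wrapper name is illustrative:

	static inline unsigned long
	copy_from_user_masked(void *dst, const void __user *src, unsigned long len)
	{
		/*
		 * The pointer is sanitized even on a mispredicted path, so
		 * no barrier_nospec() is needed before the user access.
		 */
		src = mask_user_address(src);
		return raw_copy_from_user(dst, src, len);
	}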