Re: [PATCH 5/5] powerpc: Implement masked user access

Segher Boessenkool Sun, 22 Jun 2025 17:47:40 -0700

Hi!

On Sun, Jun 22, 2025 at 06:13:51PM +0100, David Laight wrote:
> On Sun, 22 Jun 2025 11:52:43 +0200
> Christophe Leroy <christophe.le...@csgroup.eu> wrote:
> > e500 has the isel instruction which allows selecting one value or
> > the other without branch and that instruction is not speculative, so
> > use it. Although GCC usually generates code using that instruction,
> > it is safer to use inline assembly to be sure. The result is:


The instruction (which is a standard Power instruction since
architecture version 2.03, published in 2006) can in principle be
speculative, but there exist no Power implementations that do any data
speculation like this at all.

If you want any particular machine instructions to be generated you have
to write them manually, sure, in inline asm or preferably in actual asm.
But you can be sure that GCC will generate isel or similar (like the
v3.1 set[n]bc[r] insns, best instructions ever!), whenever appropriate,
i.e. when it is a) allowed at all, and b) advantageous.

> >   14:       3d 20 bf fe     lis     r9,-16386
> >   18:       7c 03 48 40     cmplw   r3,r9
> >   1c:       7c 69 18 5e     iselgt  r3,r9,r3
> > 
> > On other ones, when kernel space is over 0x80000000 and user space
> > is below, the logic in mask_user_address_simple() leads to a
> > 3 instruction sequence:
> > 
> >   14:       7c 69 fe 70     srawi   r9,r3,31
> >   18:       7c 63 48 78     andc    r3,r3,r9
> >   1c:       51 23 00 00     rlwimi  r3,r9,0,0,0
> > 
> > This is the default on powerpc 8xx.
> > 
> > When the limit between user space and kernel space is not 0x80000000,
> > mask_user_address_32() is used and a 6 instructions sequence is
> > generated:
> > 
> >   24:       54 69 7c 7e     srwi    r9,r3,17
> >   28:       21 29 57 ff     subfic  r9,r9,22527
> >   2c:       7d 29 fe 70     srawi   r9,r9,31
> >   30:       75 2a b0 00     andis.  r10,r9,45056
> >   34:       7c 63 48 78     andc    r3,r3,r9
> >   38:       7c 63 53 78     or      r3,r3,r10
> > 
> > The constraint is that TASK_SIZE be aligned to 128K in order to get
> > the most optimal number of instructions.
> > 
> > When CONFIG_PPC_BARRIER_NOSPEC is not defined, fallback on the
> > test-based masking as it is quicker than the 6 instructions sequence
> > but not necessarily quicker than the 3 instructions sequences above.
> 
> Doesn't that depend on whether the branch is predicted correctly?
> 
> I can't read ppc asm well enough to check the above.

[ PowerPC or Power (or Power Architecture, or Power ISA) ]

> And the C is also a bit tortuous.

I can read the code ;-)  All those instructions are normal simple
integer instructions.  Shifts, adds, logicals.
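
For those who do not read the asm, here is a rough C rendering of the
three quoted sequences.  This is only a sketch of my reading of the
disassembly, not the C from the patch: the function names are made up,
and the limits 0xbffe0000 and 0xb0000000 are simply what the "lis" and
"andis." immediates in the listings decode to.

```c
#include <stdint.h>

/* Hypothetical names; constants are decoded from the quoted listings. */

/* lis / cmplw / iselgt: clamp to the limit with a branchless select. */
static uint32_t clamp_isel(uint32_t addr)
{
    const uint32_t limit = 0xbffe0000u;   /* lis r9,-16386 */
    return addr > limit ? limit : addr;   /* cmplw + iselgt */
}

/* srawi / andc / rlwimi: user/kernel split at exactly 0x80000000. */
static uint32_t mask_simple(uint32_t addr)
{
    /* All ones if the top bit is set, else zero (what srawi ...,31 gives). */
    uint32_t mask = 0u - (addr >> 31);
    /* andc clears a kernel address, rlwimi then inserts the top bit. */
    return (addr & ~mask) | (mask & 0x80000000u);
}

/* srwi / subfic / srawi / andis. / andc / or: limit of 0xb0000000. */
static uint32_t mask_32(uint32_t addr)
{
    const uint32_t limit = 0xb0000000u;   /* andis. ...,45056 */
    /* ((limit >> 17) - 1) - (addr >> 17) wraps to a value with the top
     * bit set exactly when addr >= limit (limit is 128K aligned). */
    uint32_t diff = ((limit >> 17) - 1u) - (addr >> 17);
    uint32_t mask = 0u - (diff >> 31);
    return (addr & ~mask) | (limit & mask);
}
```

So mask_simple(0x12345678) returns its argument unchanged, while
mask_simple(0xc0001234) returns 0x80000000.  The shift by 17 is
presumably where the 128K alignment constraint comes from: it lets the
comparison constant fit in the 16-bit immediate of subfic.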

In general, correctly predicted non-taken branches cost absolutely
nothing.  Correctly predicted taken branches cost the same as any taken
branch, so a refetch, maybe resulting in a cycle or so of decode bubble.
And a mispredicted branch can be very expensive, say on the order of a
hundred cycles (but usually more like ten, which is still a lot of insns
worth).

So branches are great for predictable stuff, and "not so great" for
not so predictable stuff.
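
To put rough numbers on that, here is a back-of-the-envelope model.  It
is only a sketch: it assumes a correctly predicted branch is free, and
the cycle counts you feed it (say ten cycles per mispredict, and about
three cycles for a dependent three-insn masking sequence) are
illustrative assumptions, not measurements of any particular core.

```c
/* Expected extra cycles per execution of a branch that is mispredicted
 * with probability p_miss, for a given mispredict penalty in cycles.
 * Assumes a correctly predicted branch costs nothing. */
static double branch_expected_cycles(double p_miss, double penalty_cycles)
{
    return p_miss * penalty_cycles;
}

/* Mispredict rate at which the branch costs as much as a branchless
 * sequence taking branchless_cycles cycles. */
static double break_even_miss_rate(double branchless_cycles,
                                   double penalty_cycles)
{
    return branchless_cycles / penalty_cycles;
}
```

With those assumed numbers the branch only loses once it mispredicts
about 30% of the time; with a hundred-cycle penalty that drops to about
3%.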


Segher
