On Tue, 19 Mar 2024 13:05:54 PDT (-0700), Vineet Gupta wrote:


On 3/19/24 06:10, Jeff Law wrote:
On 3/19/24 12:48 AM, Andrew Waterman wrote:
On Mon, Mar 18, 2024 at 5:28 PM Vineet Gupta <vine...@rivosinc.com> wrote:
On 3/16/24 13:21, Jeff Law wrote:
|   59944:    add     s0,sp,2047  <----
|   59948:    mv      a2,a0
|   5994c:    mv      a3,a1
|   59950:    mv      a0,sp
|   59954:    li      a4,1
|   59958:    lui     a1,0x1
|   5995c:    add     s0,s0,1     <---
|   59960:    jal     59a3c

SP here becomes unaligned, even if transitively which is undesirable as
well as incorrect:
   - ABI requires stack to be 8 byte aligned

It's 16-byte aligned in the default ABI, for Q, but that doesn't really matter.

   - asm code looks weird and unexpected
   - to the user it might falsely seem like a compiler bug even when not,
     specially when staring at asm for debugging unrelated issue.
It's not ideal, but I think it's still ABI compliant as-is.  If it
wasn't, then I suspect things like virtual origins in Ada couldn't be
made ABI compliant.
To be clear are u suggesting ADD sp, sp, 2047 is ABI compliant ?
I'd still like to avoid it as I'm sure someone will complain about it.

With the patch, we get following correct code instead:

| ..
| 59944:     add     s0,sp,2032
| ..
| 5995c:     add     s0,s0,16
Alternately you could tighten the positive side of the range of the
splitter from patch 1/3 so that you could always use 2032 rather than
2047 on the first addi.   ie instead of allowing 2048..4094, allow
2048..4064.
2033..4064 vs. 2048..4094

Yeah I was a bit split about this as well. Since you are OK with either,
I'll keep them as-is and perhaps add this observation to commitlog.
There's a subset of embedded use cases where an interrupt service
routine continues on the same stack as the interrupted thread,
requiring sp to always have an ABI-compliant value (i.e. 16B aligned,
and with no important data on the stack at an address below sp).

Although not all use cases care about this property, it seems more
straightforward to maintain the invariant everywhere, rather than
selectively enforce it.
Just to be clear, the changes don't misalign the stack pointer at all.
They merely have the potential to create *another* pointer into the
stack which may or may not be aligned.  Which is totally normal, it's no
different than taking the address of a char on the stack.

IIRC the "always"-ness of the stack pointer alignment has come up before, and we decided to keep it aligned for these embedded interrupt-related reasons that Andrew points out. That's a bit different than most other ABI requirements, where we're just enforcing them on function boundaries.

That said, this one sounds like just a terminology issue: it's not SP that's misaligned, but intemediate SP-based addressing calculations. I'm not sure if there's a word for these SP-based intermediate values, during last week's team meeting we came up with "stack anchors".

Right I never saw any sp,sp,2047 getting generated - not even in the
first version of patch which lacked any filtering of stack regs via
riscv_reg_frame_related () and obviously didn't have the stack variant
of splitter. I don't know if that is just being lucky and not enough
testing exposure (I only spot checked buildroot libc, vmlinux) or
something somewhere enforces that.

I guess we could run the tests with something like

   diff --git a/target/riscv/translate.c b/target/riscv/translate.c
   index ab18899122..e87cc83067 100644
   --- a/target/riscv/translate.c
   +++ b/target/riscv/translate.c
   @@ -320,12 +320,18 @@ static void gen_goto_tb(DisasContext *ctx, int n, 
target_long diff)
     */
    static TCGv get_gpr(DisasContext *ctx, int reg_num, DisasExtend ext)
    {
   -    TCGv t;
   +    TCGv t, gpr;
if (reg_num == 0) {
            return ctx->zero;
        }
+ if (reg_num == 2) {
   +        gpr = tcg_gen_temp_new();
   +        tcg_gen_andi_tl(gpr, cpu_gpr[reg_num], 0xF);
   +    } else
   +        gpr = cpu_gpr[reg_num];
   +
        switch (get_ol(ctx)) {
        case MXL_RV32:
            switch (ext) {
   @@ -333,11 +339,11 @@ static TCGv get_gpr(DisasContext *ctx, int reg_num, 
DisasExtend ext)
                break;
            case EXT_SIGN:
                t = tcg_temp_new();
   -            tcg_gen_ext32s_tl(t, cpu_gpr[reg_num]);
   +            tcg_gen_ext32s_tl(t, gpr);
                return t;
            case EXT_ZERO:
                t = tcg_temp_new();
   -            tcg_gen_ext32u_tl(t, cpu_gpr[reg_num]);
   +            tcg_gen_ext32u_tl(t, gpr);
                return t;
            default:
                g_assert_not_reached();
   @@ -349,7 +355,7 @@ static TCGv get_gpr(DisasContext *ctx, int reg_num, 
DisasExtend ext)
        default:
            g_assert_not_reached();
        }
   -    return cpu_gpr[reg_num];
   +    return gpr;
    }
static TCGv get_gprh(DisasContext *ctx, int reg_num)

and see if anything blows up? I'm not sure it's worth opening that can of worms, though...

However given that misaligned pointer off of stack is a non-issue, I
think we can do the following:

1. keep just one splitter with 2047 based predicates and constraint (and
not 2032) for both stack-related and general regs.
2. gate the splitter on only operands[0] being not stack related
(currently it checks for either [0] or [1]) - this allows the prominent
case where SP is simply a src, and avoids when any potential shenanigans
to SP itself.

That seems reasonable to me.


-Vineet

Reply via email to