Hi,

When analyzing coremark build for RISC-V, noticed redundant constants not being eliminated. While this is a recurrent issue with RV, this specific instance is not unique to RV as I can trigger similar output on aarch64 with -fno-if-conversion, hence something which could be addressed in common passes.

-O3 -march=rv64gc_zba_zbb_zbc_zbs

crcu8:
        xor     a3,a0,a1
        andi    a3,a3,1
        srli    a4,a0,1
        srli    a5,a1,1
        beq     a3,zero,.L2

        li      a3,-24576       # 0xFFFF_A000
        addi    a3,a3,1         # 0xFFFF_A001
        xor     a5,a5,a3
        zext.h  a5,a5

.L2:
        xor     a4,a4,a5        
        andi    a4,a4,1 
        srli    a3,a0,2 
        srli    a5,a5,1 
        beq     a4,zero,.L3     

        li      a4,-24576       # 0xFFFF_A000
        addi    a4,a4,1         # 0xFFFF_A001
        xor     a5,a5,a4        
        zext.h  a5,a5           

.L3:
        xor     a3,a3,a5        
        andi    a3,a3,1 
        srli    a4,a0,3 
        srli    a5,a5,1 
        beq     a3,zero,.L4

        li      a3,-24576       # 0xFFFF_A000
        addi    a3,a3,1         # 0xFFFF_A001
...
...

I see that with small tests cse1 is able to substitute redundant constant reg with equivalent old reg.

cse_insn
   make_regs_equiv()
   ..
   canon_reg()

e.g.
void foo(void)
{
   bar(2072, 2096);
}

foo:
    li  a0,4096
    addi        a1,a0,-2000
    addi        a0,a0,-2024
    tail        bar

And it seems to even work across basic blocks.

e.g.

void foo(int x) # -O2
{
    bar1(2072);
    foo2();
    if (x)
        bar2(2096);
}

foo:
 ...
    li s1, 4096
    addi a0, s1, -2024
 ...
    addi a0, a1, -2000
    tail bar2

It seems cse redundancy elimination logic is scoped to "paths" created by cse_find_path() and those seems to chain only 2 basic blocks.

;; Following path with 17 sets: 2 3
;; Following path with 15 sets: 4 5
;; Following path with 15 sets: 6 7
;; Following path with 15 sets: 8 9
;; Following path with 15 sets: 10 11
;; Following path with 15 sets: 12 13
;; Following path with 15 sets: 14 15
;; Following path with 13 sets: 16 17
;; Following path with 2 sets: 18
...

While in this case the BBs containing the dups are 6, 8, ...

(note 36 35 37 6 [bb 6] NOTE_INSN_BASIC_BLOCK)
(insn 37 36 38 6 (set (reg:DI 196)
(const_int -24576 [0xffffffffffffa000])) "../../coremark-crc8.c":23:17 -1
     (nil))
...
(note 54 53 55 8 [bb 8] NOTE_INSN_BASIC_BLOCK)
(insn 55 54 56 8 (set (reg:DI 207)
(const_int -24576 [0xffffffffffffa000])) "../../coremark-crc8.c":23:17 -1
     (nil))

The path construction logic in cse seems non trivial for someone just starting to dig into that code so I'd appreciate some insight.

Also I'm wondering if cse is the right place to do this at all, or should it be done in gcse/PRE ?

TIA
-Vineet

P.S. With the proposed RV cond ops extension if-conversion could elide this specific instance of problem, however redundant constants are a common enough nuisance in RV codegen whcih we need to tackle, specially when it is typically a 2 instruction seq.
typedef unsigned short ee_u16;
typedef unsigned char ee_u8;

ee_u16
crcu8(ee_u8 data, ee_u16 crc)
{
    ee_u8 i = 0, x16 = 0, carry = 0;

    for (i = 0; i < 8; i++)
    {
        x16 = (ee_u8)((data & 1) ^ ((ee_u8)crc & 1));
        data >>= 1;

        if (x16 == 1)
        {
            crc ^= 0x4002;
            carry = 1;
        }
        else
            carry = 0;
        crc >>= 1;
        if (carry)
            crc |= 0x8000;	// set MSB bit
        else
            crc &= 0x7fff;	// clear MSB bit
    }
    return crc;
}

Reply via email to