On June 26, 2020 3:24:24 AM GMT+02:00, Alan Lehotsky <a...@alum.mit.edu> wrote:
>On Jun 25, 2020, at 6:37 PM, Jeff Law <l...@redhat.com> wrote:
>
>On Thu, 2020-06-25 at 15:46 -0400, Alan Lehotsky wrote:
>I’m working on a GCC 8.3 port to a load/store architecture with a
>32-bit data-path between registers and memory;
>
>looking at the gcc.dg/loop-9.c test, I fail to pass because I have
>split the move of a double constant to memory into multiple moves (4 in
>fact, because I only have a 16-bit immediate mode.)
>
>The (define_insn_and_split "movdf" …) is conditioned on
>"reload_completed".
>
>Is there some other trick I need to get the constant hoisted? I have
>already set the rtx cost of the CONST_DOUBLE ridiculously high (like 10
>insns)
>
>Hi Alan, it's been a long time...
>
>We'd probably need to see the RTL. A variety of things can get in the
>way of LICM. For example, I'd expect subregs to be problematical
>because they can look like RMW operations.
>
>jeff
>
>
>
>Hello to you too, Jeff…. I’ve been lurking for the last decade or so;
>the last port I actually did was GCC 4 based, so there is lots of new
>stuff to try and wrap my head around. I would certainly be grateful for
>any suggestions on how to track down this problem (I’m not terribly
>eager to step through an x86 gcc in parallel with my port to see where
>they diverge in the loop-invariant recognition.)
>
>Although in crafting this expanded email, I see that the x86 has
>already decided to store the constant 18.4242 in the .rodata section by
>the start of loop-invariance so there’s a
>
>   (set (reg:DF…. ) (mem:DF (symbol_ref ….)))
>
>and I bet that’s far easier to move out of the loop than it would be to
>split the original
>
>   (set (mem:DF…) (const_double:DF ….))

Immediate operands are never moved or CSEd by either RTL or GIMPLE, so if you do not have const_double immediates the best thing to do is not make them legitimate.

Richard.

>-- Al
>
>==========
>
>Source code is
>
>void f (double *a)
>{
>  int i;
>  for (i = 0; i < 100; i++)
>    a[i] = 18.4242;
>}
>==========
>
>Here’s the dump from loop-9.c.252r.loop2-invariant (compiled -O1)
>
>
>;; Function f (f, funcdef_no=0, decl_uid=1458, cgraph_uid=0,
>symbol_order=0)
>
>*****starting processing of loop 1 ******
>starting the processing of deferred insns
>ending the processing of deferred insns
>setting blocks to analyze 3, 5
>starting the processing of deferred insns
>ending the processing of deferred insns
>df_analyze called
>df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 6 count 2 (
>0.33)
>df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 6 count 2 (
>0.33)
>df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 6 count 3 (
>0.5)
>
>
>starting region dump
>
>
>f
>
>Dataflow summary:
>def_info->table_size = 3, use_info->table_size = 23
>;; invalidated by call 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6
>[d6] 7 [d7] 8 [d8] 9 [d9] 14 [d14] 15 [d15] 16 [a0] 19 [a3] 20 [a4] 24
>[acc0_hi] 25 [acc0_lo] 26 [acc1_hi] 27 [acc1_lo] 28 [source3] 30 [cc]
>31 [int_set0] 32 [int_set1] 33 [int_clr0] 34 [int_clr1] 35
>[scratchpad0] 36 [scratchpad1] 37 [scratchpad2] 38 [scratchpad3]
>;; hardware regs used 23 [sp] 29 [arg] 39 [sfp]
>;; regular block artificial uses 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; eh block artificial uses 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; entry block defs 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6 [d6] 7
>[d7] 8 [d8] 9 [d9] 21 [a5] 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; exit block uses 22 [a6] 23 [sp] 39 [sfp]
>;; regs ever live 0 [d0] 30 [cc]
>;; ref usage r0={1d,1u} r1={1d} r2={1d} r3={1d} r4={1d} r5={1d}
>r6={1d} r7={1d} r8={1d} r9={1d} r21={1d} r22={1d,5u} r23={1d,5u}
>r29={1d,4u} r30={3d,1u} r39={1d,5u} r46={2d,4u} r48={1d,1u}
>;; total ref usage 47{21d,26u,0e} in 6{6 regular + 0 call} insns.
>;; Reaching defs:
>;; sparse invalidated
>;; dense invalidated 0, 1
>;; reg->defs[] map: 30[0,1] 46[2,2]
>;; bb 3 artificial_defs: { }
>;; bb 3 artificial_uses: { u7(22){ }u8(23){ }u9(29){ }u10(39){ }}
>;; lr in 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
>;; lr use 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
>;; lr def 30 [cc] 46
>;; live in 46
>;; live gen 30 [cc] 46
>;; live kill 30 [cc]
>;; rd in (1) 46[2]
>;; rd gen (2) 30[1],46[2]
>;; rd kill (3) 30[0,1],46[2]
>;; UD chains for artificial uses at top
>
>(code_label 11 7 8 3 2 (nil) [0 uses])
>(note 8 11 9 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
>;; UD chains for insn luid 0 uid 9
>;; reg 46 { d2(bb 3 insn 10) }
>(insn 9 8 10 3 (set (mem:DF (reg:SI 46 [ ivtmp___6 ]) [0 MEM[base: _15,
>offset: 0B]+0 S8 A32])
>(const_double:DF 1.84241999999999990222931955941021442413330078125e+1
>[0x0.9364c2f837b4ap+5])) "loop-9.c":9 19 {movdf}
>     (nil))
>;; UD chains for insn luid 1 uid 10
>;; reg 46 { d2(bb 3 insn 10) }
>(insn 10 9 12 3 (parallel [
>            (set (reg:SI 46 [ ivtmp___6 ])
>                (plus:SI (reg:SI 46 [ ivtmp___6 ])
>                    (const_int 8 [0x8])))
>            (clobber (reg:CC 30 cc))
>        ]) 81 {addsi3_1v5}
>     (expr_list:REG_UNUSED (reg:CC 30 cc)
>        (nil)))
>;; UD chains for insn luid 2 uid 12
>;; reg 46 { d2(bb 3 insn 10) }
>;; reg 48 { }
>(insn 12 10 13 3 (set (reg:CCWZ 30 cc)
>        (compare:CCWZ (reg:SI 46 [ ivtmp___6 ])
>            (reg:SI 48 [ _17 ]))) "loop-9.c":8 57 {cmpsi_sub4}
>     (nil))
>;; UD chains for insn luid 3 uid 13
>;; reg 30 { d1(bb 3 insn 12) }
>(jump_insn 13 12 18 3 (set (pc)
>        (if_then_else (ne:CCWZ (reg:CCWZ 30 cc)
>                (const_int 0 [0]))
>            (label_ref:SI 18)
>            (pc))) "loop-9.c":8 177 {jcc}
>     (expr_list:REG_DEAD (reg:CCWZ 30 cc)
>        (int_list:REG_BR_PROB 1063004412 (nil)))
> -> 18)
>;; lr out 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
>;; live out 46
>;; rd out (1) 46[2]
>;; UD chains for artificial uses at bottom
>;; reg 22 { }
>;; reg 23 { }
>;; reg 29 { }
>;; reg 39 { }
>
>
>;; bb 5 artificial_defs: { }
>;; bb 5 artificial_uses: { u-1(22){ }u-1(23){ }u-1(29){ }u-1(39){ }}
>;; lr in 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
>;; lr use 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; lr def
>;; live in 46
>;; live gen
>;; live kill
>;; rd in (2) 30[1],46[2]
>;; rd gen (0)
>;; rd kill (0)
>;; UD chains for artificial uses at top
>
>(code_label 18 13 17 5 3 (nil) [1 uses])
>(note 17 18 14 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
>;; lr out 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
>;; live out 46
>;; rd out (1) 46[2]
>;; UD chains for artificial uses at bottom
>;; reg 22 { }
>;; reg 23 { }
>;; reg 29 { }
>;; reg 39 { }
>
>
>
>*****ending processing of loop 1 ******
>starting the processing of deferred insns
>ending the processing of deferred insns
>
>
>f
>
>Dataflow summary:
>;; invalidated by call 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6
>[d6] 7 [d7] 8 [d8] 9 [d9] 14 [d14] 15 [d15] 16 [a0] 19 [a3] 20 [a4] 24
>[acc0_hi] 25 [acc0_lo] 26 [acc1_hi] 27 [acc1_lo] 28 [source3] 30 [cc]
>31 [int_set0] 32 [int_set1] 33 [int_clr0] 34 [int_clr1] 35
>[scratchpad0] 36 [scratchpad1] 37 [scratchpad2] 38 [scratchpad3]
>;; hardware regs used 23 [sp] 29 [arg] 39 [sfp]
>;; regular block artificial uses 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; eh block artificial uses 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; entry block defs 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6 [d6] 7
>[d7] 8 [d8] 9 [d9] 21 [a5] 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; exit block uses 22 [a6] 23 [sp] 39 [sfp]
>;; regs ever live 0 [d0] 30 [cc]
>;; ref usage r0={1d,1u} r1={1d} r2={1d} r3={1d} r4={1d} r5={1d}
>r6={1d} r7={1d} r8={1d} r9={1d} r21={1d} r22={1d,5u} r23={1d,5u}
>r29={1d,4u} r30={3d,1u} r39={1d,5u} r46={2d,4u} r48={1d,1u}
>;; total ref usage 47{21d,26u,0e} in 6{6 regular + 0 call} insns.
>(note 4 0 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
>(insn 2 4 3 2 (set (reg:SI 46 [ ivtmp___6 ])
>        (reg:SI 0 d0 [ a ])) "loop-9.c":6 7 {movsi_internal}
>     (expr_list:REG_DEAD (reg:SI 0 d0 [ a ])
>        (nil)))
>(note 3 2 7 2 NOTE_INSN_FUNCTION_BEG)
>(insn 7 3 11 2 (parallel [
>            (set (reg:SI 48 [ _17 ])
>                (plus:SI (reg:SI 46 [ ivtmp___6 ])
>                    (const_int 800 [0x320])))
>            (clobber (reg:CC 30 cc))
>        ]) 81 {addsi3_1v5}
>     (expr_list:REG_UNUSED (reg:CC 30 cc)
>        (nil)))
>(code_label 11 7 8 3 2 (nil) [0 uses])
>(note 8 11 9 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
>(insn 9 8 10 3 (set (mem:DF (reg:SI 46 [ ivtmp___6 ]) [0 MEM[base: _15,
>offset: 0B]+0 S8 A32])
>(const_double:DF 1.84241999999999990222931955941021442413330078125e+1
>[0x0.9364c2f837b4ap+5])) "loop-9.c":9 19 {movdf}
>     (nil))
>(insn 10 9 12 3 (parallel [
>            (set (reg:SI 46 [ ivtmp___6 ])
>                (plus:SI (reg:SI 46 [ ivtmp___6 ])
>                    (const_int 8 [0x8])))
>            (clobber (reg:CC 30 cc))
>        ]) 81 {addsi3_1v5}
>     (expr_list:REG_UNUSED (reg:CC 30 cc)
>        (nil)))
>(insn 12 10 13 3 (set (reg:CCWZ 30 cc)
>        (compare:CCWZ (reg:SI 46 [ ivtmp___6 ])
>            (reg:SI 48 [ _17 ]))) "loop-9.c":8 57 {cmpsi_sub4}
>     (nil))
>(jump_insn 13 12 18 3 (set (pc)
>        (if_then_else (ne:CCWZ (reg:CCWZ 30 cc)
>                (const_int 0 [0]))
>            (label_ref:SI 18)
>            (pc))) "loop-9.c":8 177 {jcc}
>     (expr_list:REG_DEAD (reg:CCWZ 30 cc)
>        (int_list:REG_BR_PROB 1063004412 (nil)))
> -> 18)
>(code_label 18 13 17 5 3 (nil) [1 uses])
>(note 17 18 14 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
>(note 14 17 0 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
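
For reference, a rough source-level picture of the difference discussed in this thread. It is only a sketch in plain C, not GCC internals: f is the loop from loop-9.c, where the constant appears directly as the stored value, and f_hoisted is a hand-written equivalent in which the constant is read once from memory into a temporary before the loop, which is roughly the shape that the (set (reg:DF …) (mem:DF (symbol_ref …))) form permits once the load is moved out. The names N, k_const and f_hoisted are hypothetical and exist only for this illustration.

```c
#define N 100

/* Loop from loop-9.c: the constant is the stored value itself, so in
   RTL it shows up as an immediate operand of the store insn. */
void f (double *a)
{
  int i;
  for (i = 0; i < N; i++)
    a[i] = 18.4242;
}

/* Hypothetical stand-in for a constant-pool (.rodata) entry. */
static const double k_const = 18.4242;

/* Hand-hoisted equivalent: one load before the loop, then only
   register-to-memory stores inside it.  The local 'tmp' plays the
   role of the pseudo register that holds the loaded constant. */
void f_hoisted (double *a)
{
  double tmp = k_const;   /* the loop-invariant load, done once */
  int i;
  for (i = 0; i < N; i++)
    a[i] = tmp;
}
```

Both functions store the same values; the only difference is where the constant is materialized. An optimizing compiler is free to turn either one into the other, so this shows the intended shape of the code and does not force any particular code generation.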