On Thu, Aug 28, 2014 at 9:50 AM, Kugan <kugan.vivekanandara...@linaro.org> wrote: > > > On 27/08/14 20:07, Richard Biener wrote: >> On Wed, Aug 27, 2014 at 12:01 PM, Uros Bizjak <ubiz...@gmail.com> wrote: >>> Hello! >>> >>>> 2014-08-07 Kugan Vivekanandarajah <kug...@linaro.org> >>>> >>>> * calls.c (precompute_arguments): Check >>>> promoted_for_signed_and_unsigned_p and set the promoted mode. >>>> (promoted_for_signed_and_unsigned_p): New function. >>>> (expand_expr_real_1): Check promoted_for_signed_and_unsigned_p >>>> and set the promoted mode. >>>> * expr.h (promoted_for_signed_and_unsigned_p): New function definition. >>>> * cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if >>>> SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED. >>> >>> This patch regresses: >>> >>> Running target unix >>> FAIL: libgomp.fortran/simd7.f90 -O2 execution test >>> FAIL: libgomp.fortran/simd7.f90 -Os execution test >>> >>> on alphaev6-linux-gnu. >>> >>> The problem can be illustrated with attached testcase with a >>> crosscompiler to alphaev68-linux-gnu (-O2 -fopenmp). The problem is in >>> missing SImode extension after DImode shift of SImode subregs for this >>> part: >>> >>> --cut here-- >>> # test.23_12 = PHI <0(37), 1(36)> >>> _242 = ivtmp.181_73 + 2147483645; >>> _240 = _242 * 2; >>> _63 = (integer(kind=4)) _240; >>> if (ubound.6_99 <= 2) >>> goto <bb 39>; >>> else >>> goto <bb 40>; >>> ;; succ: 39 >>> ;; 40 >>> >>> ;; basic block 39, loop depth 1 >>> ;; pred: 38 >>> pretmp_337 = test.23_12 | l_76; >>> goto <bb 45>; >>> ;; succ: 45 >>> >>> ;; basic block 40, loop depth 1 >>> ;; pred: 38 >>> _11 = *c_208[0]; >>> if (_11 != _63) >>> goto <bb 45>; >>> else >>> goto <bb 42>; >>> --cut here-- >>> >>> this expands to: >>> >>> (code_label 592 591 593 35 "" [0 uses]) >>> >>> (note 593 592 0 NOTE_INSN_BASIC_BLOCK) >>> >>> ;; _63 = (integer(kind=4)) _240; >>> >>> (insn 594 593 595 (set (reg:SI 538) >>> (const_int 1073741824 [0x40000000])) -1 >>> (nil)) >>> >>> (insn 595 594 596 (set (reg:SI 539) >>> (plus:SI (reg:SI 538) >>> (const_int 1073741824 [0x40000000]))) -1 >>> (nil)) >>> >>> (insn 596 595 597 (set (reg:SI 537) >>> (plus:SI (reg:SI 539) >>> (const_int -3 [0xfffffffffffffffd]))) -1 >>> (expr_list:REG_EQUAL (const_int 2147483645 [0x7ffffffd]) >>> (nil))) >>> >>> (insn 597 596 598 (set (reg:SI 536 [ D.1700 ]) >>> (plus:SI (subreg/s/v/u:SI (reg:DI 144 [ ivtmp.181 ]) 0) >>> (reg:SI 537))) -1 >>> (nil)) >>> >>> (insn 598 597 599 (set (reg:DI 540) >>> (ashift:DI (subreg:DI (reg:SI 536 [ D.1700 ]) 0) >>> (const_int 1 [0x1]))) -1 >>> (nil)) >>> >>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) >>> (reg:DI 540)) -1 >>> (nil)) >>> >>> ... >>> >>> (note 610 609 0 NOTE_INSN_BASIC_BLOCK) >>> >>> ;; _11 = *c_208[0]; >>> >>> (insn 611 610 0 (set (reg:DI 120 [ D.1694 ]) >>> (sign_extend:DI (mem:SI (reg/v/f:DI 227 [ c ]) [7 *c_208+0 S4 >>> A128]))) simd7.f90:12 -1 >>> (nil)) >>> >>> ;; if (_11 != _63) >>> >>> (insn 612 611 613 40 (set (reg:DI 545) >>> (eq:DI (reg:DI 120 [ D.1694 ]) >>> (reg:DI 145 [ D.1694 ]))) simd7.f90:12 -1 >>> (nil)) >>> >>> (jump_insn 613 612 616 40 (set (pc) >>> (if_then_else (eq (reg:DI 545) >>> (const_int 0 [0])) >>> (label_ref 0) >>> (pc))) simd7.f90:12 -1 >>> (int_list:REG_BR_PROB 450 (nil))) >>> >>> which results in following asm: >>> >>> $L35: >>> addl $25,$7,$2 # 597 addsi3/1 [length = 4] >>> addq $2,$2,$2 # 598 ashldi3/1 [length = 4] <------ here >>> bne $24,$L145 # 601 *bcc_normal [length = 4] >>> lda $4,4($20) # 627 *adddi_internal/2 [length = 4] >>> ldl $8,0($20) # 611 *extendsidi2_1/2 [length = 4] >>> lda $3,3($31) # 74 *movdi/2 [length = 4] >>> cmpeq $8,$2,$2 # 612 *setcc_internal [length = 4] <-- compare >>> bne $2,$L40 # 613 *bcc_normal [length = 4] >>> br $31,$L88 # 2403 jump [length = 4] >>> .align 4 >>> ... >>> >>> Tracking the values with the debugger shows wrong calculation: >>> >>> 0x000000012000108c <+1788>: addl t10,t12,t1 >>> 0x0000000120001090 <+1792>: addq t1,t1,t1 >>> ... >>> 0x00000001200010a4 <+1812>: cmpeq t6,t1,t1 >>> 0x00000001200010a8 <+1816>: bne t1,0x1200010c0 <foo_+1840> >>> >>> (gdb) si >>> 0x000000012000108c 17 l = l .or. any (b /= 7 + i) >>> (gdb) i r t10 t12 >>> t10 0x7 7 >>> t12 0x7ffffffd 2147483645 >>> >>> (gdb) si >>> 0x0000000120001090 17 l = l .or. any (b /= 7 + i) >>> (gdb) i r t1 >>> t1 0xffffffff80000004 -2147483644 >>> >>> (gdb) si >>> 18 l = l .or. any (c /= 8 + 2 * i) >>> (gdb) i r t1 >>> t1 0xffffffff00000008 -4294967288 >>> >>> At this point, the calculation should zero-extend SImode value to full >>> DImode, since compare operates on DImode values. The problematic insn >>> is (insn 599), which is now a DImode assignment instead of >>> zero-extend, due to: >>> >>> --- a/gcc/cfgexpand.c >>> +++ b/gcc/cfgexpand.c >>> @@ -3309,7 +3309,13 @@ expand_gimple_stmt_1 (gimple stmt) >>> GET_MODE (target), temp, unsignedp); >>> } >>> >>> - convert_move (SUBREG_REG (target), temp, unsignedp); >>> + if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED) >>> + && (GET_CODE (temp) == SUBREG) >>> + && (GET_MODE (target) == GET_MODE (temp)) >>> + && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp)))) >>> + emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp)); >>> + else >>> + convert_move (SUBREG_REG (target), temp, unsignedp); >>> } >>> else if (nontemporal && emit_storent_insn (target, temp)) >>> ; >>> >>> When compiling this code, we have: >>> >>> lhs = _63 >>> target = (subreg/s/v/u:SI (reg:DI 145 [ D.1694 ]) 0) >>> temp = (subreg:SI (reg:DI 540) 0) >>> >>> So, the code assumes that it is possible to copy (reg:DI 540) directly >>> to (reg:DI 154). However, this is not the case, since we still have >>> garbage in the top 32bits. >>> >>> Reverting the part above fixes the runtime failure, since (insn 599) is now: >>> >>> (insn 599 598 0 (set (reg:DI 145 [ D.1694 ]) >>> (zero_extend:DI (subreg:SI (reg:DI 540) 0))) -1 >>> (nil)) >>> >>> It looks to me that we have also to check the temp with SUBREG_PROMOTED_*. >> >> Yeah, that makes sense. >> > > Thanks Richard for your comments. > > I added this part of the code (in cfgexpand.c) to handle binary/unary/.. > gimple operations and used the LHS value range to infer the assigned > value range. I will revert this part of the code as this is wrong. > > I dont think checking promoted_mode for temp will be necessary here as > convert_move will handle it correctly if promoted_mode is set for temp. > > Thus, I will reimplement setting promoted_mode to temp (in > expand_expr_real_2) based on the gimple statement content on RHS. i.e. > by looking at the RHS operands and its value ranges and by calculating > the resulting value range. Does this sound OK to you.
No, this sounds backward again and won't work because those operands again could be just truncated - thus you can't rely on their value-range. What you would need is VRP computing value-ranges in the promoted mode from the start (and it doesn't do that). Richard. > Thanks, > Kugan >