http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59999
--- Comment #10 from Paulo J. Matos <pa...@matos-sorge.com> ---
(In reply to Paulo J. Matos from comment #8)
> Made a mistake. With the attached test, the final gimple before expand for
> the loop basic block is:
>
>   ;; basic block 5, loop depth 0
>   ;;    pred:       5
>   ;;                4
>   # i_26 = PHI <i_1(5), 0(4)>
>   # ivtmp.24_18 = PHI <ivtmp.24_12(5), ivtmp.24_29(4)>
>   _28 = (void *) ivtmp.24_18;
>   _13 = MEM[base: _28, offset: 0B];
>   x.4_14 = x;
>   _15 = _13 ^ x.4_14;
>   MEM[base: _28, offset: 0B] = _15;
>   ivtmp.24_12 = ivtmp.24_18 + 4;
>   temp_ptr.5_17 = (Sample *) ivtmp.24_12;
>   _11 = (unsigned short) i_26;
>   _2 = _11 + 1;
>   i_1 = (short int) _2;
>   _10 = (int) i_1;
>   if (_10 < _25)
>     goto <bb 5>;
>   else
>     goto <bb 6>;
>   ;;    succ:       5
>   ;;                6
>
> However, the point is the same. IVOPTS should probably generate an int IV
> instead of a short int IV to avoid the sign extend, since removing the sign
> extend during RTL seems to be quite hard.
>
> What do you think?

For >= 4.8 the scalar evolution of _10 is deemed not simple, because it
looks like the following:

 <nop_expr 0x2aaaaacd9ee0
    type <integer_type 0x2aaaaab16690 int public SI
        size <integer_cst 0x2aaaaab12c60 constant 32>
        unit size <integer_cst 0x2aaaaab12c80 constant 4>
        align 32 symtab 0 alias set 3 canonical type 0x2aaaaab16690
        precision 32 min <integer_cst 0x2aaaaab12f80 -2147483648>
        max <integer_cst 0x2aaaaab12fa0 2147483647>
        context <translation_unit_decl 0x2aaaaab29c00 D.2881>
        pointer_to_this <pointer_type 0x2aaaaab23348>>
    arg 0 <polynomial_chrec 0x2aaaaacdb090
        type <integer_type 0x2aaaaab16540 short int sizes-gimplified public HI
            size <integer_cst 0x2aaaaab12f20 constant 16>
            unit size <integer_cst 0x2aaaaab12f40 constant 2>
            align 16 symtab 0 alias set 4 canonical type 0x2aaaaab16540
            precision 16 min <integer_cst 0x2aaaaab12ec0 -32768>
            max <integer_cst 0x2aaaaab12ee0 32767>
            pointer_to_this <pointer_type 0x2aaaaaca1f18>>
        arg 0 <integer_cst 0x2aaaaab1f260 constant 1>
        arg 1 <integer_cst 0x2aaaaacc9140 constant 1>
        arg 2 <integer_cst 0x2aaaaacc9140 1>>>

This is something like: (int) (short int) {1, +, 1}_1. Since these are
signed integers we can assume they don't overflow; can't we simplify the
scalar evolution to a polynomial_chrec over 32-bit integers and drop the
nop_expr that represents the sign extend?