Richard Biener <richard.guent...@gmail.com> writes: > On Wed, Jul 26, 2023 at 11:14 AM Richard Sandiford > <richard.sandif...@arm.com> wrote: >> >> Richard Biener <richard.guent...@gmail.com> writes: >> > On Wed, Jul 26, 2023 at 4:02 AM Hao Liu OS via Gcc-patches >> > <gcc-patches@gcc.gnu.org> wrote: >> >> >> >> > When was STMT_VINFO_REDUC_DEF empty? I just want to make sure that >> >> > we're not papering over an issue elsewhere. >> >> >> >> Yes, I also wonder if this is an issue in vectorizable_reduction. Below >> >> is the the gimple of "gcc.target/aarch64/sve/cost_model_13.c": >> >> >> >> <bb 3>: >> >> # res_18 = PHI <res_15(7), 0(6)> >> >> # i_20 = PHI <i_16(7), 0(6)> >> >> _1 = (long unsigned int) i_20; >> >> _2 = _1 * 2; >> >> _3 = x_14(D) + _2; >> >> _4 = *_3; >> >> _5 = (unsigned short) _4; >> >> res.0_6 = (unsigned short) res_18; >> >> _7 = _5 + res.0_6; <-- The current stmt_info >> >> res_15 = (short int) _7; >> >> i_16 = i_20 + 1; >> >> if (n_11(D) > i_16) >> >> goto <bb 7>; >> >> else >> >> goto <bb 4>; >> >> >> >> <bb 7>: >> >> goto <bb 3>; >> >> >> >> It looks like that STMT_VINFO_REDUC_DEF should be "res_18 = PHI >> >> <res_15(7), 0(6)>"? >> >> The status here is: >> >> STMT_VINFO_REDUC_IDX (stmt_info): 1 >> >> STMT_VINFO_REDUC_TYPE (stmt_info): TREE_CODE_REDUCTION >> >> STMT_VINFO_REDUC_VECTYPE (stmt_info): 0x0 >> > >> > Not all stmts in the SSA cycle forming the reduction have >> > STMT_VINFO_REDUC_DEF set, >> > only the last (latch def) and live stmts have at the moment. >> >> Ah, thanks. In that case, Hao, I think we can avoid the ICE by changing: >> >> if ((kind == scalar_stmt || kind == vector_stmt || kind == vec_to_scalar) >> && vect_is_reduction (stmt_info)) >> >> to: >> >> if ((kind == scalar_stmt || kind == vector_stmt || kind == vec_to_scalar) >> && STMT_VINFO_LIVE_P (stmt_info) >> && vect_is_reduction (stmt_info)) >> >> instead of using a null check. > > But as seen you will miss stmts that are part of the reduction then?
Yeah, but the code is doing a maximum of all the reductions in the loop: /* ??? Ideally we'd do COUNT reductions in parallel, but unfortunately that's not yet the case. */ ops->reduction_latency = MAX (ops->reduction_latency, base * count); So as it stands, we only need to see each reduction (as opposed to each reduction statement) once. But we do want to know the length of each reduction... > In principle we could put STMT_VINFO_REDUC_DEF to other stmts > as well. See vectorizable_reduction in the > > while (reduc_def != PHI_RESULT (reduc_def_phi)) > > loop. Nod. That's where I'd got the STMT_VINFO_LIVE_P thing from. >> I see that vectorizable_reduction calculates a reduc_chain_length. >> Would it be OK to store that in the stmt_vec_info? I suppose the >> AArch64 code should be multiplying by that as well. (It would be a >> separate patch from this one though.) > > I don't think that's too relevant here (it also counts noop conversions). Bah. I'm loath to copy that loop and just pick out the relevant statements though. I suppose if every statement had a STMT_VINFO_REDUC_DEF, aarch64 could maintain a hash map from STMT_VINFO_REDUC_DEF to total latencies, and then take the maximum of those total latencies. It sounds pretty complex though... Thanks, Richard