>> Hi,
>>
>> The patch has been updated against the newest trunk, and also contains
>> some minor changes.
>>
>> I am working on another new feature, which is meant to support pattern
>> recognition of lane-reducing operations in an affine closure originating
>> from a loop reduction variable, like:
>>
>>   sum += cst1 * dot_prod_1 + cst2 * sad_2 + ... + cstN * lane_reducing_op_N
>>
>> That work-in-progress feature depends on this patch. It has been quite a
>> while since the patch was posted; would you please take some time to
>> review it? Thanks.
> This seems to do multiple things so I wonder if you can split up the
> patch a bit?

OK. I will send out the split patches in new mails.

> For example adding lane_reducing_op_p can be split out, it also seems like
> the vect_transform_reduction change to better distribute work can be done
> separately? Likewise refactoring like splitting out
> vect_reduction_use_partial_vector.
>
> When we have
>
>   sum += d0[i] * d1[i];       // dot-prod <vector(16) char>
>   sum += w[i];                // widen-sum <vector(16) short>
>   sum += abs(s0[i] - s1[i]);  // sad <vector(8) short>
>   sum += n[i];                // normal <vector(4) int>
>
> the vector DOT_PROD and friend ops can end up mixing different lanes
> since it is not specified which lanes are reduced into which output lane.
> So, DOT_PROD might combine 0-3, 4-7, ... but SAD might combine
> 0,4,8,12; 1,5,9,13; ... I think this isn't worse than what one op itself
> is doing, but it's worth pointing out (it's probably unlikely a target
> mixes different reduction strategies anyway).

Yes. But even if, on a peculiar target, DOT_PROD and SAD use different
reduction strategies, that does not affect the correctness of the result,
at least for integer operations: every input lane is still added into the
accumulator exactly once, and the final reduction sums all output lanes.
Is there anything special that we need to consider?

> Can you make sure to add at least one SLP reduction example to show
> this works for SLP as well?

OK. The patches already contain cases for SLP reduction chains. I will add
one for SLP reduction; this should be a negative case.

Thanks,
Feng
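
To illustrate the point about lane grouping, here is a minimal C sketch. It is not taken from the patch: all function names are hypothetical, and the vector ops are modeled with scalar loops over a 16-lane char input and a 4-lane int accumulator. It only shows that mixing a "contiguous" and a "strided" reduction strategy on the same accumulator gives the same final integer sum as the scalar loop.

```c
#include <assert.h>
#include <stdint.h>

#define IN_LANES  16   /* e.g. vector(16) char inputs      */
#define OUT_LANES 4    /* e.g. vector(4) int accumulator   */

/* Hypothetical model of a dot-prod where output lane j accumulates the
   contiguous input lanes 4j .. 4j+3.  */
static void dot_prod_contiguous(const int8_t *a, const int8_t *b, int32_t *acc)
{
    for (int i = 0; i < IN_LANES; i++)
        acc[i / 4] += (int32_t)a[i] * b[i];
}

/* Hypothetical model of a dot-prod where output lane j accumulates the
   strided input lanes j, j+4, j+8, j+12.  */
static void dot_prod_strided(const int8_t *a, const int8_t *b, int32_t *acc)
{
    for (int i = 0; i < IN_LANES; i++)
        acc[i % 4] += (int32_t)a[i] * b[i];
}

/* Epilogue: reduce the vector accumulator to a scalar.  */
static int32_t reduce_lanes(const int32_t *acc)
{
    int32_t sum = 0;
    for (int j = 0; j < OUT_LANES; j++)
        sum += acc[j];
    return sum;
}

/* Two lane-reducing statements feeding one accumulator; the second one
   optionally uses a different lane grouping than the first.  */
static int32_t mixed_sum(const int8_t *a, const int8_t *b,
                         const int8_t *c, const int8_t *d,
                         int strided_second)
{
    int32_t acc[OUT_LANES] = {0};
    dot_prod_contiguous(a, b, acc);
    if (strided_second)
        dot_prod_strided(c, d, acc);
    else
        dot_prod_contiguous(c, d, acc);
    return reduce_lanes(acc);
}

/* Scalar reference: sum += a[i] * b[i]; sum += c[i] * d[i];  */
static int32_t scalar_sum(const int8_t *a, const int8_t *b,
                          const int8_t *c, const int8_t *d)
{
    int32_t sum = 0;
    for (int i = 0; i < IN_LANES; i++) {
        sum += (int32_t)a[i] * b[i];
        sum += (int32_t)c[i] * d[i];
    }
    return sum;
}

/* Both lane groupings must agree with the scalar loop.  */
static void check_order_independence(const int8_t *a, const int8_t *b,
                                     const int8_t *c, const int8_t *d)
{
    int32_t ref = scalar_sum(a, b, c, d);
    assert(mixed_sum(a, b, c, d, 0) == ref);
    assert(mixed_sum(a, b, c, d, 1) == ref);
}
```

Calling check_order_independence on any four 16-element int8_t arrays should pass, since integer addition is associative and commutative. The individual accumulator lanes generally do differ between the two strategies; only the fully reduced sum is guaranteed to match, and the same argument would not carry over to floating point without reassociation being allowed.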