On 11/09/15 14:19, Bill Schmidt wrote:
A secondary concern for powerpc is that REDUC_MAX_EXPR produces a scalar that has to be broadcast back to a vector, and the best way to implement it for us already has the max value in all positions of a vector. But that is something we should be able to fix with simplify-rtx in the back end.
Reading this thread again, this bit stands out as unaddressed. Yes, PowerPC can "fix" this with simplify-rtx, but the vector cost model will not take that into account: it will assume the broadcast back to a vector costs an extra operation after the reduction, when in fact it does not.
Does that suggest we should have a new entry in vect_cost_for_stmt for vec_to_scalar-and-back-to-vector (that defaults to vec_to_scalar+scalar_to_vec, but on some architectures e.g. PowerPC would be the same as vec_to_scalar)?
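To make the suggestion concrete, here is a rough sketch of the costing I have in mind. It is a standalone illustration, not GCC code: the enum is a cut-down stand-in for vect_cost_for_stmt, the entry name vec_to_scalar_to_vec and both cost functions are hypothetical, and the unit costs are placeholders.

```cpp
// Cut-down stand-in for GCC's vect_cost_for_stmt (illustration only).
enum vect_cost_kind {
  vec_to_scalar,        // existing kind: reduce a vector to a scalar
  scalar_to_vec,        // existing kind: broadcast a scalar to a vector
  vec_to_scalar_to_vec  // hypothetical new kind: reduce, then broadcast back
};

// Hypothetical generic default. With no target knowledge, the round trip
// is costed as a reduction followed by a broadcast. Unit costs are placeholders.
inline int default_cost(vect_cost_kind kind) {
  switch (kind) {
    case vec_to_scalar:
      return 1;
    case scalar_to_vec:
      return 1;
    case vec_to_scalar_to_vec:
      return default_cost(vec_to_scalar) + default_cost(scalar_to_vec);
  }
  return 0;
}

// Hypothetical override for a target where the reduction already leaves the
// result in every lane, so the broadcast back is free.
inline int powerpc_like_cost(vect_cost_kind kind) {
  if (kind == vec_to_scalar_to_vec)
    return default_cost(vec_to_scalar);
  return default_cost(kind);
}
```

With these placeholder numbers the generic default charges 2 for the round trip, while the PowerPC-like target charges 1, the same as a plain vec_to_scalar.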
(I agree that if that's the limit of how "different" conditional reductions may be between architectures, then we should not have a vect_cost_for_stmt entry for a whole conditional reduction.)
Cheers, Alan