Richard Sandiford <richard.sandif...@arm.com> writes:
> Final pattern statements (those not in DEF_SEQ) always have the same
> type and value as the original statements. We wouldn't see mismatched
> precisions if we were only looking at final pattern statements.
>
> Like you say, the 16-bit addition didn't exist before vectorisation
> (it was a 32-bit addition instead). So to make things type-correct,
> the 32-bit addition:
>
>   A: sum = a + b         (STMT_VINFO_RELATED_STMT == A2)
>
> is replaced with:
>
>   DEF_SEQ:
>     A1: tmp = a' + b'    (STMT_VINFO_RELATED_STMT == A)
>   A2: sum' = (int) tmp   (STMT_VINFO_RELATED_STMT == A)
>
> (using different notation from before, just to confuse things).
> Here, A2 is the final pattern statement for A and A1 is just a
> temporary result. sum == sum'.
>
> Later, we do a similar thing for the division itself. We have:
>
>   B: quotient = sum / 0xff               (STMT_VINFO_RELATED_STMT == B2)
>
> We realise that this can be a 16-bit division, so (IIRC) we use
> vect_look_through_possible_promotion on sum to find the best
> starting point. This should give:
>
>   DEF_SEQ:
>     B1: tmp2 = tmp / (uint16_t) 0xff     (STMT_VINFO_RELATED_STMT == B)
>   B2: quotient' = (int) tmp2             (STMT_VINFO_RELATED_STMT == B)
>
> Both changes are done by vect_widened_op_tree.

Eh, I meant vect_recog_over_widening_pattern.

Richard
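
For anyone following along, here is a rough scalar sketch in C of why the narrowing in the quoted A/B rewrite keeps the values the same. It is only an illustration, not what the vectoriser emits. It assumes that a and b start out as uint8_t values promoted to int, which the quoted text does not actually say. The function names are made up for the example. Note that C's integer promotions mean the "16-bit" arithmetic below is still done in int by the compiler, so the casts only model the truncation to 16 bits that the narrowed vector operations would perform.

```c
#include <assert.h>
#include <stdint.h>

/* Original statements A and B: 32-bit addition and 32-bit division.
   Assumption: a and b are uint8_t values promoted to int. */
static int quotient_wide(uint8_t a, uint8_t b)
{
    int sum = (int)a + (int)b;          /* A:  sum = a + b */
    return sum / 0xff;                  /* B:  quotient = sum / 0xff */
}

/* Pattern statements: the same computation narrowed to 16 bits, with
   the result converted back to int at the end. */
static int quotient_narrow(uint8_t a, uint8_t b)
{
    /* A1: tmp = a' + b'. The cast models truncation to 16 bits. */
    uint16_t tmp = (uint16_t)((uint16_t)a + (uint16_t)b);
    /* B1: tmp2 = tmp / (uint16_t) 0xff */
    uint16_t tmp2 = (uint16_t)(tmp / (uint16_t)0xff);
    /* B2: quotient' = (int) tmp2 */
    return (int)tmp2;
}

/* Exhaustively check that both forms agree for every pair of inputs.
   The largest sum is 255 + 255 = 510, which fits in 16 bits, so the
   truncation in A1 never loses information. */
static void check_all_inputs(void)
{
    for (int a = 0; a <= 0xff; a++)
        for (int b = 0; b <= 0xff; b++)
            assert(quotient_wide((uint8_t)a, (uint8_t)b)
                   == quotient_narrow((uint8_t)a, (uint8_t)b));
}
```

Calling check_all_inputs() from a test driver walks all 65536 input pairs. The point is that quotient' == quotient in the same way that sum' == sum: only the intermediate values (tmp, tmp2) have the narrower type.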