"Kewen.Lin" <li...@linux.ibm.com> writes: > @@ -626,6 +645,12 @@ public: > /* True if have decided to use a fully-masked loop. */ > bool fully_masked_p; > > + /* Records whether we still have the option of using a length access loop. > */ > + bool can_with_length_p; > + > + /* True if have decided to use length access for the loop fully. */ > + bool fully_with_length_p;

Rather than duplicate the flags like this, I think we should have
three bits of information:

(1) Can the loop operate on partial vectors?  Starts off optimistically
    assuming "yes", gets set to "no" when we find a counter-example.

(2) If we do decide to use partial vectors, will we need loop masks?

(3) If we do decide to use partial vectors, will we need lengths?

Vectorisation using partial vectors succeeds if (1) && ((2) != (3))

LOOP_VINFO_CAN_FULLY_MASK_P currently tracks (1) and LOOP_VINFO_MASKS
currently tracks (2).  In pathological cases it's already possible to
have (1) && !(2), see r9-6240 for an example.

With the new support, LOOP_VINFO_LENS tracks (3).

So I don't think we need the can_with_length_p.  What is now
LOOP_VINFO_CAN_FULLY_MASK_P can continue to track (1) for both
approaches, with the final choice of approach only being made
at the end.  Maybe it would be worth renaming it to something
more generic though, now that we have two approaches to partial
vectorisation.

I think we can assume for now that no arch will be asymmetrical,
and require (say) loop masks for loads and lengths for stores.
So if that does happen (i.e. if (2) && (3) ends up being true)
we should just be able to punt on partial vectorisation.

Some of the new length code looks like it's copied and adjusted from
the corresponding mask code.  It would be good to share the code
instead where possible, e.g. when deciding whether an IV can overflow.

Thanks,
Richard
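
For illustration only, the three-bit scheme described above could be
modelled roughly as follows.  This is a standalone sketch, not GCC code:
the struct, its field names, the enum and choose_partial_kind are all
hypothetical, with the two "needs" flags standing in for whether
LOOP_VINFO_MASKS and LOOP_VINFO_LENS have recorded any requirements.
It applies the condition (1) && ((2) != (3)) exactly as stated.

```cpp
#include <cassert>

// Hypothetical standalone model of the three bits of information.
// None of these names exist in GCC; they are assumptions for the sketch.
struct partial_vector_state
{
  // (1) Starts optimistically true; cleared on the first counter-example.
  bool can_use_partial_vectors_p = true;
  // (2) Stands in for "LOOP_VINFO_MASKS has recorded a requirement".
  bool needs_masks_p = false;
  // (3) Stands in for "LOOP_VINFO_LENS has recorded a requirement".
  bool needs_lens_p = false;
};

enum class partial_kind { none, masks, lengths };

// Make the final choice of approach only at the end of analysis.
partial_kind
choose_partial_kind (const partial_vector_state &s)
{
  if (!s.can_use_partial_vectors_p)
    return partial_kind::none;
  // (2) == (3): either both are needed (the asymmetrical case, where
  // we punt) or neither is, so the stated condition does not hold.
  if (s.needs_masks_p == s.needs_lens_p)
    return partial_kind::none;
  return s.needs_masks_p ? partial_kind::masks : partial_kind::lengths;
}
```

With a single "can use partial vectors" bit, a statement that cannot be
handled by either approach only has to clear one flag, and the
mask-versus-length decision stays in one place.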