2015-08-26 0:26 GMT+03:00 Jeff Law <l...@redhat.com>:
> On 08/21/2015 06:17 AM, Ilya Enkovich wrote:
>>>
>>> Hmm, I don't see how vector masks are more difficult to operate with.
>>
>> There are just no instructions for that, but you have to pretend you
>> have them to get code vectorized.
>>
>>>> Also, according to the vector ABI, an integer mask should be used
>>>> for the mask operand in case of a masked vector call.
>>>
>>> What ABI?  The function signature of the intrinsics?  How would that
>>> come into play here?
>>
>> Not intrinsics.  I mean OpenMP vector functions, which require an
>> integer arg for the mask in case of a 512-bit vector.
>
> That's what I assumed -- you can pass in a mask as an argument and it's
> supposed to be a simple integer, right?

Depending on the target, the ABI requires either a vector mask or a
simple integer value.
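For illustration, a minimal sketch of such a function (the clone
signatures in the trailing comment follow the x86 vector function ABI
naming as I understand it; they are illustrative, not verified compiler
output):

  /* An OpenMP vector function with a masked (inbranch) variant.  */
  #pragma omp declare simd inbranch
  float add_one (float x) { return x + 1.0f; }

  void
  apply (float *restrict a, const float *restrict b, int n)
  {
  #pragma omp simd
    for (int i = 0; i < n; i++)
      if (b[i] > 0.0f)          /* the condition becomes the call mask */
        a[i] = add_one (b[i]);
  }

  /* For the AVX2 clone the mask arrives as a vector of per-lane 0/-1
     values, e.g.
       __m256 _ZGVdM8v_add_one (__m256 x, __m256 mask);
     while for the AVX-512 clone it is a scalar integer with one bit
     per lane:
       __m512 _ZGVeM16v_add_one (__m512 x, __mmask16 mask);  */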
>
>>
>>>
>>>> The current implementation of masked loads, masked stores, and bool
>>>> patterns in the vectorizer just reflects SSE4 and AVX.  Can (and
>>>> should) we really call it a canonical representation for all
>>>> targets?
>>>
>>> No idea - we'll revisit when another target adds a similar capability.
>>
>> AVX-512 is such a target.  The current representation forces multiple
>> scalar mask -> vector mask and back transformations which are
>> artificially introduced by the current bool patterns and are hard to
>> optimize out.
>
> I'm a bit surprised they're so prevalent and hard to optimize away.
> ISTM PRE ought to handle this kind of thing with relative ease.

Most vector comparisons are UNSPECs.  And I doubt PRE would actually
help much even if we got rid of the UNSPECs somehow.  Is there really a
redundancy in:

  if ((v1 cmp v2) && (v3 cmp v4)) load

  v1 cmp v2 -> mask1
  select mask1 vec_cst_-1 vec_cst_0 -> vec_mask1
  v3 cmp v4 -> mask2
  select mask2 vec_mask1 vec_cst_0 -> vec_mask2
  vec_mask2 NE vec_cst_0 -> mask3
  load by mask3

It looks to me more like an i386-specific instruction selection problem.

Ilya

>
>>> Fact is GCC already copes with vector masks generated by vector
>>> compares just fine everywhere and I'd rather leave it at that.
>>
>> Nope.  Currently a vector mask is obtained from a vec_cond <A op B,
>> {0 .. 0}, {-1 .. -1}>.  AND and IOR on bools are also expressed via
>> additional vec_cond.  I don't think the vectorizer ever generates a
>> vector comparison.
>>
>> And I wouldn't say it's fine 'everywhere' because there is a single
>> target utilizing them.  Masked loads and stores for AVX-512 just
>> don't work now.  And if we extend the existing MASK_LOAD and
>> MASK_STORE optabs to 512-bit vectors then we get ugly, inefficient
>> code.  The question is where to fight this inefficiency: in RTL or in
>> GIMPLE.  I want to fight it where it appears, i.e. in GIMPLE, by
>> preventing the bool -> int conversions from being applied everywhere
>> even if the target doesn't need them.
>
> You should expect pushback anytime target dependencies are added to
> gimple, even if it's stuff in the vectorizer, which is infested with
> target dependencies.
>
>> If we don't want to support both types of masks in GIMPLE then it's
>> more reasonable to do the bool -> int conversion at expand time for
>> targets requiring it, rather than do it for everyone and then leave
>> it to the target to transform it back and try to get rid of all those
>> redundant transformations.  I'd give vector<bool> a chance to become
>> a canonical mask representation for that.
>
> Might be worth some experimentation.
>
> Jeff