On Wed, Aug 26, 2015 at 4:38 PM, Ilya Enkovich <enkovich....@gmail.com> wrote:
> 2015-08-26 16:02 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>> On Fri, Aug 21, 2015 at 2:17 PM, Ilya Enkovich <enkovich....@gmail.com>
>> wrote:
>>> 2015-08-21 14:00 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>>
>>>> Hmm, I don't see how vector masks are more difficult to operate with.
>>>
>>> There are just no instructions for that but you have to pretend you
>>> have to get code vectorized.
>>
>> Huh?  Bitwise ops should be readily available.
>
> Right bitwise ops are available, but there is no comparison into a
> vector and no masked loads and stores using vector masks (when we
> speak about 512-bit vectors).
>
>>
>>>>
>>>>> Also according to vector ABI integer mask should be used for mask
>>>>> operand in case of masked vector call.
>>>>
>>>> What ABI?  The function signature of the intrinsics?  How would that
>>>> come into play here?
>>>
>>> Not intrinsics. I mean OpenMP vector functions which require integer
>>> arg for a mask in case of 512-bit vector.
>>
>> How do you declare those?
>
> Something like this:
>
> #pragma omp declare simd inbranch
> int foo(int*);

The 'inbranch' is the thing that matters?  And all of foo is then
implicitly predicated?

>>
>>>>
>>>>> Current implementation of masked loads, masked stores and bool
>>>>> patterns in vectorizer just reflect SSE4 and AVX. Can (and should) we
>>>>> really call it a canonical representation for all targets?
>>>>
>>>> No idea - we'll revisit when another target adds a similar capability.
>>>
>>> AVX-512 is such target. Current representation forces multiple scalar
>>> mask -> vector mask and back transformations which are artificially
>>> introduced by current bool patterns and are hard to optimize out.
>>
>> I dislike the bool patterns anyway and we should try to remove those
>> and make the vectorizer handle them in other ways (they have single-use
>> issues anyway).  I don't remember exactly what caused us to add them
>> but one reason was there wasn't a vector type for 'bool' (but I don't see how
>> it should be necessary to ask "get me a vector type for 'bool'").
>>
>>>>
>>>>> Using scalar masks everywhere should probably cause the same conversion
>>>>> problem for SSE I listed above though.
>>>>>
>>>>> Talking about a canonical representation, shouldn't we use some
>>>>> special masks representation and not mixing it with integer and vector
>>>>> of integers then? Only in this case target would be able to
>>>>> efficiently expand it into a corresponding rtl.
>>>>
>>>> That was my idea of vector<bool> ... but I didn't explore it and see where
>>>> it will cause issues.
>>>>
>>>> Fact is GCC already copes with vector masks generated by vector compares
>>>> just fine everywhere and I'd rather leave it as that.
>>>
>>> Nope. Currently vector mask is obtained from a vec_cond <A op B, {0 ..
>>> 0}, {-1 .. -1}>. AND and IOR on bools are also expressed via
>>> additional vec_cond. I don't think vectorizer ever generates vector
>>> comparison.
>>
>> Ok, well that's an implementation detail then.  Are you sure about AND and
>> IOR?
>> The comment above vect_recog_bool_pattern says
>>
>>         Assuming size of TYPE is the same as size of all comparisons
>>         (otherwise some casts would be added where needed), the above
>>         sequence we create related pattern stmts:
>>         S1'  a_T = x1 CMP1 y1 ? 1 : 0;
>>         S3'  c_T = x2 CMP2 y2 ? a_T : 0;
>>         S4'  d_T = x3 CMP3 y3 ? 1 : 0;
>>         S5'  e_T = c_T | d_T;
>>         S6'  f_T = e_T;
>>
>> thus has vector mask |
>
> I think in practice it would look like:
>
> S4'  d_T = x3 CMP3 y3 ? 1 : c_T;
>
> Thus everything is usually hidden in vec_cond. But my concern is
> mostly about types used for that.
>
>>
>>> And I wouldn't say it's fine 'everywhere' because there is a single
>>> target utilizing them. Masked loads and stores for AVX-512 just don't
>>> work now. And if we extend existing MASK_LOAD and MASK_STORE optabs to
>>> 512-bit vector then we get an ugly inefficient code. The question is
>>> where to fight with this inefficiency: in RTL or in GIMPLE. I want to
>>> fight with it where it appears, i.e. in GIMPLE by preventing bool ->
>>> int conversions applied everywhere even if target doesn't need it.
>>>
>>> If we don't want to support both types of masks in GIMPLE then it's
>>> more reasonable to make bool -> int conversion in expand for targets
>>> requiring it, rather than do it for everyone and then leave it to
>>> target to transform it back and try to get rid of all those redundant
>>> transformations. I'd give vector<bool> a chance to become a canonical
>>> mask representation for that.
>>
>> Well, you are missing the case of
>>
>>    bool b = a < b;
>>    int x = (int)b;
>
> This case seems to require no changes and just be transformed into vec_cond.

Ok, the example was too simple but I meant that a bool has a
non-conditional use.

Ok, so I still believe we don't want two ways to express things on GIMPLE
if possible.  Yes, the vectorizer already creates only vector stmts that
are supported by the hardware.  So it's a matter of deciding on the GIMPLE
representation for the "mask".
I'd rather use vector<bool> (and the target assigning an integer
mode to it) than an 'int' in GIMPLE statements, because an 'int' makes
the type constraints on GIMPLE very weak and exposes those 'ints' to all
kinds of optimization passes.

Thus if we change the result type requirement of vector comparisons from
signed integer vectors to bool vectors, the vectorizer can still go for
promoting that bool vector to a vector of ints via a VEC_COND_EXPR,
and the expander can special-case that if the target has a vector
comparison producing a vector mask.

So, can you give that vector<bool> some thought?  Note that assigning
something other than a vector mode to it needs adjustments in
stor-layout.c.  I'm pretty sure we don't want vector BImodes.

Richard.

> Thanks,
> Ilya
>
>>
>> where the bool is used as integer (and thus an integer mask would have to be
>> "expanded").  When the bool is a mask in itself the integer use is either
>> free
>> or a matter of a widening/shortening operation.
>>
>> Richard.
>>
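
For readers following the thread, the two mask representations being compared can be sketched in plain scalar C. This is only an illustration under stated assumptions, not GCC code and not what the vectorizer emits: LANES, the function names and the 16-lane layout (16 x 32-bit elements, as in a 512-bit vector) are all hypothetical. The "vector mask" keeps one integer per lane (-1 for true, 0 for false), while the "scalar mask" packs one bit per lane into an integer; pack_mask and unpack_mask stand for the back-and-forth conversions the thread is worried about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16 /* hypothetical: 16 x 32-bit lanes, as in a 512-bit vector */

/* Vector-mask form: one int per lane, -1 (all ones) for true, 0 for false. */
void cmp_lt_vector_mask(const int32_t *a, const int32_t *b, int32_t *mask)
{
    for (size_t i = 0; i < LANES; i++)
        mask[i] = (a[i] < b[i]) ? -1 : 0;
}

/* Scalar-mask form: one bit per lane, packed into a 16-bit integer. */
uint16_t cmp_lt_scalar_mask(const int32_t *a, const int32_t *b)
{
    uint16_t k = 0;
    for (size_t i = 0; i < LANES; i++)
        if (a[i] < b[i])
            k |= (uint16_t)(1u << i);
    return k;
}

/* Vector mask -> scalar mask conversion. */
uint16_t pack_mask(const int32_t *mask)
{
    uint16_t k = 0;
    for (size_t i = 0; i < LANES; i++) {
        assert(mask[i] == 0 || mask[i] == -1); /* must be a canonical mask */
        if (mask[i])
            k |= (uint16_t)(1u << i);
    }
    return k;
}

/* Scalar mask -> vector mask conversion. */
void unpack_mask(uint16_t k, int32_t *mask)
{
    for (size_t i = 0; i < LANES; i++)
        mask[i] = ((k >> i) & 1u) ? -1 : 0;
}

/* Masked store driven by a scalar mask: only lanes with a set bit are written. */
void masked_store(int32_t *dst, const int32_t *src, uint16_t k)
{
    for (size_t i = 0; i < LANES; i++)
        if ((k >> i) & 1u)
            dst[i] = src[i];
}
```

With the scalar form, AND and IOR of two masks are single integer operations (k1 & k2, k1 | k2); with the vector form they are lane-wise operations on the -1/0 values. Every pack_mask or unpack_mask call in this sketch corresponds to a conversion that would ideally not be needed on a target that natively uses only one of the two forms.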