2015-12-15 19:41 GMT+03:00 Yuri Rumyantsev <ysrum...@gmail.com>:
> Hi Richard,
>
> I re-designed the patch to determine ability of loop masking on fly of
> vectorization analysis and invoke it after loop transformation.
> Test-case is also provided.
>
> what is your opinion?
>
> Thanks.
> Yuri.
>

Hi,

I'm going to start work on extending this patch to handle mixed mask
sizes, support vectorization of the peeled loop tail, and fix the
profitability estimation so that it chooses the proper loop tail
processing. Here is a short list of the planned changes:

1. Don't put any restriction on the mask type when checking whether a
statement can be masked. Instead, just store all required masks in
LOOP_VINFO_REQUIRED_MASKS. After all statements are checked, additionally
check that all required masks can be produced (i.e. we have proper
comparison, widening and narrowing support).

2. In vect_estimate_min_profitable_iters, compute the overhead of mask
creation, decide what to do with the loop tail (nothing, vectorize, or
combine with the loop body), and additionally return the number of tail
iterations required for the chosen tail processing to be profitable.

3. In vect_transform_loop, depending on the chosen strategy, either mask
the whole loop or produce a vectorized tail.

For now it's not fully clear to me what the best way to get a vectorized
tail is. The first option is to just peel one iteration after the loop is
vectorized. But our masking functions use the LOOP_VINFO and STMT_VINFO
structures, which we lose during peeling. Another option is to peel the
scalar loop and then just run the vectorizer one more time to vectorize
and mask it. We could also peel the vectorized loop and use the original
version (with all STMT_VINFO still available) as the tail and the peeled
version as the main loop.

Currently I think the best option is to peel the scalar loop and run the
vectorizer one more time on it. This option is simpler and can also be
used to vectorize the loop tail with a smaller vector size when the
target doesn't support masking or masking is not profitable.

Any comments?

Thanks,
Ilya
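
For illustration only, the three tail-processing choices listed in item 2
(nothing, vectorize, combine with loop body) might be modelled in plain C
as below. This is a minimal sketch and not GCC code: the vectorization
factor VF, the function names and the element-wise add are all
assumptions made for the example, and each inner "lane" loop stands in
for a single vector instruction.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4  /* hypothetical vectorization factor (lanes per vector) */

/* "Nothing": vector main loop, the remainder stays a scalar loop. */
void add_scalar_tail(int *dst, const int *a, const int *b, size_t n)
{
    size_t i = 0;
    for (; i + VF <= n; i += VF)
        for (size_t lane = 0; lane < VF; lane++)  /* one unmasked vector op */
            dst[i + lane] = a[i + lane] + b[i + lane];
    for (; i < n; i++)                            /* scalar tail */
        dst[i] = a[i] + b[i];
}

/* "Vectorize": vector main loop plus one masked vector iteration
   that covers the whole tail. */
void add_masked_tail(int *dst, const int *a, const int *b, size_t n)
{
    size_t i = 0;
    for (; i + VF <= n; i += VF)
        for (size_t lane = 0; lane < VF; lane++)  /* one unmasked vector op */
            dst[i + lane] = a[i + lane] + b[i + lane];
    if (i < n) {
        assert(n - i < VF);                       /* tail is shorter than one vector */
        for (size_t lane = 0; lane < VF; lane++)
            if (i + lane < n)                     /* mask bit for this lane */
                dst[i + lane] = a[i + lane] + b[i + lane];
    }
}

/* "Combine with loop body": every iteration is masked, so no
   separate tail exists. */
void add_fully_masked(int *dst, const int *a, const int *b, size_t n)
{
    for (size_t i = 0; i < n; i += VF)
        for (size_t lane = 0; lane < VF; lane++)
            if (i + lane < n)                     /* mask bit for this lane */
                dst[i + lane] = a[i + lane] + b[i + lane];
}
```

All three variants compute the same result. They differ in where the
masking cost is paid: never, once in the tail, or on every iteration,
which is the trade-off the profitability estimate has to weigh.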