Richard,

Here is updated patch which does not include your proposal related to
the target hook deletion.
You wrote:
> I still don't understand why you need the new target hook.  If we have a 
> masked
> load/store then the mask is computed by an assignment with a VEC_COND_EXPR
> (in your example) and thus a test for a zero mask is expressible as just
>
 > if (vect__ifc__41.17_167 == { 0, 0, 0, 0... })
>
> or am I missing something?
Such vector compare produces vector and does not set up cc flags
required for conditional branch (GIMPLE_COND).
If we use such comparison for GIMPLE_COND we got error message, so I
used target hook which does set up cc flags aka ptest instruction and
I left this part.
I moved new routines to loop-vectorizer.c file and both they are static.
I re-wrote is_valid_sink function to use def-use chain as you proposed.
I also add new parameter to control such transformation.
Few redundant tests have also been deleted.
Any comments will be appreciated.

Thanks.
Yuri.

2015-06-18  Yuri Rumyantsev  <ysrum...@gmail.com>

* config/i386/i386.c: Include files stringpool.h and tree-ssanames.h.
(ix86_vectorize_build_zero_vector_test): New function.
(TARGET_VECTORIZE_BUILD_ZERO_VECTOR_TEST): New target macro
* doc/tm.texi.in: Add @hook TARGET_VECTORIZE_BUILD_ZERO_VECTOR_TEST.
* doc/tm.texi: Updated.
* params.def (PARAM_ZERO_TEST_FOR_STORE_MASK): New DEFPARAM.
* params.h (ENABLE_ZERO_TEST_FOR_STORE_MASK): new macros.
* target.def (build_zero_vector_test): New DEFHOOK.
* tree-vect-stmts.c (vectorizable_mask_load_store): Initialize
has_mask_store field of vect_info.
* tree-vectorizer.c: Include files stringpool.h and tree-ssanames.h.
(is_valid_sink): New function.
(optimize_mask_stores): New function.
(vectorize_loops): Invoke optimaze_mask_stores for loops having masked
stores.
* tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
correspondent macros.

gcc/testsuite/ChangeLog:
* gcc.target/i386/avx2-vect-mask-store-move1.c: New test.



2015-06-09 15:13 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Wed, May 20, 2015 at 4:00 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>> Hi All,
>>
>> Here is updated patch to optimize mask stores. The main goal of it is
>> to avoid execution of mask store if its mask is zero vector since
>> loads that follow it can be blocked.
>> The following changes were done:
>>   1.  A test on sink legality was added - it simply prohibits to cross
>> statements with non-null vdef or vuse.
>>   2. New phi node is created in join block for moved MASK_STORE statements.
>>   3. Test was changed to check that 2 MASK_STORE statements are not
>> moved to the same block.
>>   4. New field was added to loop_vec_info structure to mark loops
>> having MASK_STORE's.
>>
>> Any comments will be appreciated.
>> Yuri.
>
> I still don't understand why you need the new target hook.  If we have a 
> masked
> load/store then the mask is computed by an assignment with a VEC_COND_EXPR
> (in your example) and thus a test for a zero mask is expressible as just
>
>  if (vect__ifc__41.17_167 == { 0, 0, 0, 0... })
>
> or am I missing something?  As we dont' want this transform on all targets
> (or always) we can control it by either a --param or a new flag which default
> targets can adjust.  [Is the hazard still present with Skylake or other
> AVX512 implementations for example?]
>
> +/* Helper for optimize_mask_stores: returns true if STMT sinking to end
> +   of BB is valid and false otherwise.  */
> +
> +static bool
> +is_valid_sink (gimple stmt)
> +{
>
> so STMT is either a masked store or a masked load?  You shouldn't
> walk all stmts here but instead rely on the factored use-def def-use
> chains of virtual operands.
>
> +void
> +optimize_mask_stores (struct loop *loop)
> +{
> +  basic_block bb = loop->header;
> +  gimple_stmt_iterator gsi;
> +  gimple stmt;
> +  auto_vec<gimple> worklist;
> +
> +  if (loop->dont_vectorize
> +      || loop->num_nodes > 2)
> +    return;
>
> looks like no longer needed given the place this is called from now
> (or does it guard against outer loop vectorization as well?)
> Also put it into tree-vect-loop.c and make it static.
>
> +      /* Loop was not vectorized if mask does not have vector type.  */
> +      if (!VECTOR_TYPE_P (TREE_TYPE (mask)))
> +       return;
>
> this should be always false
>
> +      store_bb->frequency = bb->frequency / 2;
> +      efalse->probability = REG_BR_PROB_BASE / 2;
>
> I think the == 0 case should end up as unlikely.
>
> +      if (dom_info_available_p (CDI_POST_DOMINATORS))
> +       set_immediate_dominator (CDI_POST_DOMINATORS, bb, store_bb);
>
> post dominators are not available in the vectorizer.
>
> +         /* Put definition statement of stored value in STORE_BB
> +            if possible.  */
> +         arg3 = gimple_call_arg (last, 3);
> +         if (TREE_CODE (arg3) == SSA_NAME && has_single_use (arg3))
> +           {
> +             def_stmt = SSA_NAME_DEF_STMT (arg3);
> +             /* Move def_stmt to STORE_BB if it is in the same bb and
> +                it is legal.  */
> +             if (gimple_bb (def_stmt) == bb
> +                 && is_valid_sink (def_stmt))
>
> ok, so you move arbitrary statements.  You can always move non-memory
> statements and if you keep track of the last store you moved you
> can check if gimple_vuse () is the same as on that last store and be
> done with that, or if you sink another store (under the same condition)
> then just update the last store.
>
> Otherwise this looks better now.
>
> Thus - get rid of the target hook and of is_valid_sink.
>
> Thanks,
> Richard.
>
>> 2015-05-20  Yuri Rumyantsev  <ysrum...@gmail.com>
>>
>> * config/i386/i386.c: Include files stringpool.h and tree-ssanames.h.
>> (ix86_vectorize_is_zero_vector): New function.
>> (TARGET_VECTORIZE_IS_ZERO_VECTOR): New target macro
>> * doc/tm.texi.in: Add @hook TARGET_VECTORIZE_IS_ZERO_VECTOR.
>> * doc/tm.texi: Updated.
>> * target.def (is_zero_vector): New DEFHOOK.
>> * tree-vect-stmts.c : Include tree-into-ssa.h.
>> (vectorizable_mask_load_store): Initialize has_mask_store field.
>> (is_valid_sink): New function.
>> (optimize_mask_stores): New function.
>> * tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for
>> loops having masked stores.
>> * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
>> correspondent macros.
>> (optimize_mask_stores): Update prototype.
>>
>> gcc/testsuite/ChangeLog:
>> * gcc.target/i386/avx2-vect-mask-store-move1.c: New test.

Attachment: patch.3
Description: Binary data

Reply via email to