> rdapp....@gmail.com writes:
>> From: Robin Dapp <rd...@ventanamicro.com>
>>
>> This adds zero else operands to masked loads and their intrinsics.
>> I needed to adjust more than initially thought because we rely on
>> combine for several instructions and a change in a "base" pattern
>> needs to propagate to all those.
>>
>> For the lack of a better idea I used a function call property to specify
>> whether a builtin needs an else operand or not.  Somebody with better
>> knowledge of the aarch64 target can surely improve that.
>
> This part no longer holds :)
>
>> [...]
>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
>> b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> index fe16d93adcd..406ceb13a4c 100644
>> --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> [...]
>> @@ -1537,11 +1538,14 @@ public:
>>    {
>>      insn_code icode;
>>      if (e.vectors_per_tuple () == 1)
>> -      icode = convert_optab_handler (maskload_optab,
>> -                                 e.vector_mode (0), e.gp_mode (0));
>> +      {
>> +    icode = convert_optab_handler (maskload_optab,
>> +                                   e.vector_mode (0), e.gp_mode (0));
>> +    e.args.quick_push (CONST0_RTX (e.vector_mode (0)));
>
> It looks like this should no longer be necessary, since
> use_contiguous_load_insn should add it instead.  (Let me know if that's
> wrong though.)
>
>> +      }
>>      else
>>        icode = code_for_aarch64 (UNSPEC_LD1_COUNT, e.tuple_mode (0));
>> -    return e.use_contiguous_load_insn (icode);
>> +    return e.use_contiguous_load_insn (icode, true);
>>    }
>>  };
>>  
>> @@ -1551,13 +1555,19 @@ class svld1_extend_impl : public extending_load
>>  public:
>>    using extending_load::extending_load;
>>  
>> +  unsigned int
>> +  call_properties (const function_instance &) const override
>> +  {
>> +    return CP_READ_MEMORY;
>> +  }
>> +
>
> It looks like this is a left-over from the previous version and
> could be removed.
>
>>    rtx
>>    expand (function_expander &e) const override
>>    {
>> -    insn_code icode = code_for_aarch64_load (UNSPEC_LD1_SVE, 
>> extend_rtx_code (),
>> +    insn_code icode = code_for_aarch64_load (extend_rtx_code (),
>>                                           e.vector_mode (0),
>>                                           e.memory_vector_mode ());
>> -    return e.use_contiguous_load_insn (icode);
>> +    return e.use_contiguous_load_insn (icode, true);
>>    }
>>  };
>>  
>> [...]
>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
>> b/gcc/config/aarch64/aarch64-sve-builtins.cc
>> index ef14f8cd39d..84c0a0caa50 100644
>> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
>> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
>> @@ -4229,7 +4229,7 @@ function_expander::use_vcond_mask_insn (insn_code 
>> icode,
>>     Extending loads have a further predicate (operand 3) that nominally
>>     controls the extension.  */
>>  rtx
>> -function_expander::use_contiguous_load_insn (insn_code icode)
>> +function_expander::use_contiguous_load_insn (insn_code icode, bool has_else)
>
> The comment should describe the new parameter.  Maybe add a new paragraph:
>
>    HAS_ELSE is true if the pattern has an additional operand that specifies
>    the values of inactive lanes.  This exists to match the general maskload
>    interface and is always zero for AArch64.  */
>
>>  {
>>    machine_mode mem_mode = memory_vector_mode ();
>>  
>> [...]
>> diff --git a/gcc/config/aarch64/aarch64-sve.md 
>> b/gcc/config/aarch64/aarch64-sve.md
>> index 06bd3e4bb2c..a2e9f52d024 100644
>> --- a/gcc/config/aarch64/aarch64-sve.md
>> +++ b/gcc/config/aarch64/aarch64-sve.md
>> [...]
>> @@ -1302,11 +1303,14 @@ (define_expand "vec_load_lanes<mode><vsingle>"
>>    [(set (match_operand:SVE_STRUCT 0 "register_operand")
>>      (unspec:SVE_STRUCT
>>        [(match_dup 2)
>> -       (match_operand:SVE_STRUCT 1 "memory_operand")]
>> +       (match_operand:SVE_STRUCT 1 "memory_operand")
>> +       (match_dup 3)
>> +      ]
>
> Formatting nit, sorry, but: local style is to put the closing ]
> on the previous line.
>
> OK with those changes.  Thanks a lot for doing this!

Incorporated the changes for the final test run.  Thanks for catching those.

-- 
Regards
 Robin

Reply via email to