> rdapp....@gmail.com writes:
>> From: Robin Dapp <rd...@ventanamicro.com>
>>
>> This adds zero else operands to masked loads and their intrinsics.
>> I needed to adjust more than initially thought because we rely on
>> combine for several instructions and a change in a "base" pattern
>> needs to propagate to all those.
>>
>> For the lack of a better idea I used a function call property to specify
>> whether a builtin needs an else operand or not.  Somebody with better
>> knowledge of the aarch64 target can surely improve that.
>
> This part no longer holds :)
>
>> [...]
>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> index fe16d93adcd..406ceb13a4c 100644
>> --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> [...]
>> @@ -1537,11 +1538,14 @@ public:
>>    {
>>      insn_code icode;
>>      if (e.vectors_per_tuple () == 1)
>> -      icode = convert_optab_handler (maskload_optab,
>> -				     e.vector_mode (0), e.gp_mode (0));
>> +      {
>> +	icode = convert_optab_handler (maskload_optab,
>> +				       e.vector_mode (0), e.gp_mode (0));
>> +	e.args.quick_push (CONST0_RTX (e.vector_mode (0)));
>
> It looks like this should no longer be necessary, since
> use_contiguous_load_insn should add it instead.  (Let me know if that's
> wrong though.)
>
>> +      }
>>      else
>>        icode = code_for_aarch64 (UNSPEC_LD1_COUNT, e.tuple_mode (0));
>> -    return e.use_contiguous_load_insn (icode);
>> +    return e.use_contiguous_load_insn (icode, true);
>>    }
>> };
>>
>> @@ -1551,13 +1555,19 @@ class svld1_extend_impl : public extending_load
>>  public:
>>    using extending_load::extending_load;
>>
>> +  unsigned int
>> +  call_properties (const function_instance &) const override
>> +  {
>> +    return CP_READ_MEMORY;
>> +  }
>> +
>
> It looks like this is a left-over from the previous version and
> could be removed.
>
>>    rtx
>>    expand (function_expander &e) const override
>>    {
>> -    insn_code icode = code_for_aarch64_load (UNSPEC_LD1_SVE, extend_rtx_code (),
>> +    insn_code icode = code_for_aarch64_load (extend_rtx_code (),
>> 					     e.vector_mode (0),
>> 					     e.memory_vector_mode ());
>> -    return e.use_contiguous_load_insn (icode);
>> +    return e.use_contiguous_load_insn (icode, true);
>>    }
>> };
>>
>> [...]
>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
>> index ef14f8cd39d..84c0a0caa50 100644
>> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
>> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
>> @@ -4229,7 +4229,7 @@ function_expander::use_vcond_mask_insn (insn_code icode,
>>     Extending loads have a further predicate (operand 3) that nominally
>>     controls the extension.  */
>>  rtx
>> -function_expander::use_contiguous_load_insn (insn_code icode)
>> +function_expander::use_contiguous_load_insn (insn_code icode, bool has_else)
>
> The comment should describe the new parameter.  Maybe add a new paragraph:
>
>    HAS_ELSE is true if the pattern has an additional operand that specifies
>    the values of inactive lanes.  This exists to match the general maskload
>    interface and is always zero for AArch64.  */
>
>>  {
>>    machine_mode mem_mode = memory_vector_mode ();
>>
>> [...]
>> diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
>> index 06bd3e4bb2c..a2e9f52d024 100644
>> --- a/gcc/config/aarch64/aarch64-sve.md
>> +++ b/gcc/config/aarch64/aarch64-sve.md
>> [...]
>> @@ -1302,11 +1303,14 @@ (define_expand "vec_load_lanes<mode><vsingle>"
>>    [(set (match_operand:SVE_STRUCT 0 "register_operand")
>> 	(unspec:SVE_STRUCT
>> 	  [(match_dup 2)
>> -	   (match_operand:SVE_STRUCT 1 "memory_operand")]
>> +	   (match_operand:SVE_STRUCT 1 "memory_operand")
>> +	   (match_dup 3)
>> +	   ]
>
> Formatting nit, sorry, but: local style is to put the closing ]
> on the previous line.
>
> OK with those changes.  Thanks a lot for doing this!
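
For the formatting nit, the operand list now keeps the closing ] on the
last operand line, i.e. roughly (sketch of the adjusted hunk only,
whitespace approximated):

	  [(match_dup 2)
	   (match_operand:SVE_STRUCT 1 "memory_operand")
	   (match_dup 3)]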
Incorporated the changes for the final test run.  Thanks for catching
those.

-- 
Regards
 Robin