"Kewen.Lin" <li...@linux.ibm.com> writes:
> Hi Richard,
>
> Thanks for the comments!
>
> on 2020/6/29 下午6:07, Richard Sandiford wrote:
>> Thanks for the update.  I agree with the summary of the IRC discussion
>> except for…
>> 
>> "Kewen.Lin" <li...@linux.ibm.com> writes:
>>> Hi Richard S./Richi/Jim/Segher,
>>>
>>> Thanks a lot for your comments to make this patch more solid.
>>>
>>> Based on our discussion, for the vector load/store with length
>>> optab, the length unit would be measured in lanes by default.
>>> For the targets which support length measured in bytes like Power,
>>> they should only define VnQI modes to wrap the other same size
>>> vector modes.  If the length is larger than total lane/byte count
>>> of the given mode, it's taken to load all lanes/bytes implicitly.
>> 
>> …this last bit.  IMO the behaviour of the optab should be undefined
>> when the supplied length is greater than the number of lanes.
>> 
>> I think that also makes things better for the lxvl implementation,
>> which ignores the upper 56 bits of the length.  It sounds like the
>> above semantics would instead require Power to saturate the value
>> at 255 before shifting it.
>> 
>
> Good catch, I just realized that this part is inconsistent to what I
> implemented in patch 5/7, where the function vect_gen_len still does
> the min operation between the given length and length_limit.
>
> This patch is updated accordingly to state the behavior to be undefined.
> The others aren't required to change.
>
> Could you have a further look? Thanks in advance!
>
> v6/v7: Updated optab descriptions.
>
> v5:
>   - Updated lenload/lenstore optab to len_load/len_store and the docs.
>   - Rename expand_mask_{load,store}_optab_fn to 
> expand_partial_{load,store}_optab_fn
>   - Added/updated macros for expand_mask_{load,store}_optab_fn
>     and expand_len_{load,store}_optab_fn
>
> v4: Update len_load_direct/len_store_direct to align with direct optab.
>
> v3: Get rid of length mode hook.

Thanks, mostly looks good, just some comments about the documentation…

>
> BR,
> Kewen
> -----
> gcc/ChangeLog:
>
> 2020-MM-DD  Kewen Lin  <li...@gcc.gnu.org>
>
>       * doc/md.texi (len_load_@var{m}): Document.
>       (len_store_@var{m}): Likewise.
>       * internal-fn.c (len_load_direct): New macro.
>       (len_store_direct): Likewise.
>       (expand_len_load_optab_fn): Likewise.
>       (expand_len_store_optab_fn): Likewise.
>       (direct_len_load_optab_supported_p): Likewise.
>       (direct_len_store_optab_supported_p): Likewise.
>       (expand_mask_load_optab_fn): New macro.  Original renamed to ...
>       (expand_partial_load_optab_fn): ... here.  Add handlings for
>       len_load_optab.
>       (expand_mask_store_optab_fn): New macro.  Original renamed to ...
>       (expand_partial_store_optab_fn): ... here. Add handlings for
>       len_store_optab.
>       (internal_load_fn_p): Handle IFN_LEN_LOAD.
>       (internal_store_fn_p): Handle IFN_LEN_STORE.
>       (internal_fn_stored_value_index): Handle IFN_LEN_STORE.
>       * internal-fn.def (LEN_LOAD): New internal function.
>       (LEN_STORE): Likewise.
>       * optabs.def (len_load_optab, len_store_optab): New optab.
>
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 2c67c818da5..c8d7bcc7f62 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5167,6 +5167,33 @@ mode @var{n}.
>  
>  This pattern is not allowed to @code{FAIL}.
>  
> +@cindex @code{len_load_@var{m}} instruction pattern
> +@item @samp{len_load_@var{m}}
> +Load the number of units specified by operand 2 from memory operand 1

s/units/vector elements/

> +into register operand 0, setting the other bytes of operand 0 to

s/bytes/elements/

Maybe s/register operand 0/vector register operand 0/ would be clearer,
now that we're explicitly measuring elements rather than bytes.

> +undefined values.  Operands 0 and 1 have mode @var{m}.  Operand 2 has

and maybe here “…@var{m}, which must be a vector mode”.

> +whichever integer mode the target prefers.  If operand 2 exceeds the
> +maximum units of mode @var{m}, the behavior is undefined.  For targets

Maybe s/maximum units of/number of elements in/

> +which support length measured in bytes, they should only define VnQI
> +mode to wrap the other vector modes with the same size.  Meanwhile,

How about:

  If the target prefers the length to be measured in bytes
  rather than elements, it should only implement this pattern
  for vectors of @code{QI} elements.

> +it's required that the byte count should be a multiple of the element
> +size (wrapped vector).

This last sentence doesn't apply now that the length is measured in
elements (lanes).

> +
> +This pattern is not allowed to @code{FAIL}.
> +
> +@cindex @code{len_store_@var{m}} instruction pattern
> +@item @samp{len_store_@var{m}}
> +Store the number of units specified by operand 2 from nonmemory operand 1
> +into memory operand 0, leaving the other bytes of operand 0 unchanged.
> +Operands 0 and 1 have mode @var{m}.  Operand 2 has whichever integer
> +mode the target prefers.  If operand 2 exceeds the maximum units of mode
> +@var{m}, the behavior is undefined.  For targets which support length
> +measured in bytes, they should only define VnQI mode to wrap the other
> +vector modes with the same size.  Meanwhile, it's required that the byte
> +count should be a multiple of the element size (wrapped vector).

Equivalent changes here too.

Thanks,
Richard

Reply via email to