"Kewen.Lin" <li...@linux.ibm.com> writes: > Hi Richard, > > Thanks for the comments! > > on 2020/6/29 下午6:07, Richard Sandiford wrote: >> Thanks for the update. I agree with the summary of the IRC discussion >> except for… >> >> "Kewen.Lin" <li...@linux.ibm.com> writes: >>> Hi Richard S./Richi/Jim/Segher, >>> >>> Thanks a lot for your comments to make this patch more solid. >>> >>> Based on our discussion, for the vector load/store with length >>> optab, the length unit would be measured in lanes by default. >>> For the targets which support length measured in bytes like Power, >>> they should only define VnQI modes to wrap the other same size >>> vector modes. If the length is larger than total lane/byte count >>> of the given mode, it's taken to load all lanes/bytes implicitly. >> >> …this last bit. IMO the behaviour of the optab should be undefined >> when the supplied length is greater than the number of lanes. >> >> I think that also makes things better for the lxvl implementation, >> which ignores the upper 56 bits of the length. It sounds like the >> above semantics would instead require Power to saturate the value >> at 255 before shifting it. >> > > Good catch, I just realized that this part is inconsistent to what I > implemented in patch 5/7, where the function vect_gen_len still does > the min operation between the given length and length_limit. > > This patch is updated accordingly to state the behavior to be undefined. > The others aren't required to change. > > Could you have a further look? Thanks in advance! > > v6/v7: Updated optab descriptions. > > v5: > - Updated lenload/lenstore optab to len_load/len_store and the docs. > - Rename expand_mask_{load,store}_optab_fn to > expand_partial_{load,store}_optab_fn > - Added/updated macros for expand_mask_{load,store}_optab_fn > and expand_len_{load,store}_optab_fn > > v4: Update len_load_direct/len_store_direct to align with direct optab. > > v3: Get rid of length mode hook.
Thanks, mostly looks good, just some comments about the documentation… > > BR, > Kewen > ----- > gcc/ChangeLog: > > 2020-MM-DD Kewen Lin <li...@gcc.gnu.org> > > * doc/md.texi (len_load_@var{m}): Document. > (len_store_@var{m}): Likewise. > * internal-fn.c (len_load_direct): New macro. > (len_store_direct): Likewise. > (expand_len_load_optab_fn): Likewise. > (expand_len_store_optab_fn): Likewise. > (direct_len_load_optab_supported_p): Likewise. > (direct_len_store_optab_supported_p): Likewise. > (expand_mask_load_optab_fn): New macro. Original renamed to ... > (expand_partial_load_optab_fn): ... here. Add handlings for > len_load_optab. > (expand_mask_store_optab_fn): New macro. Original renamed to ... > (expand_partial_store_optab_fn): ... here. Add handlings for > len_store_optab. > (internal_load_fn_p): Handle IFN_LEN_LOAD. > (internal_store_fn_p): Handle IFN_LEN_STORE. > (internal_fn_stored_value_index): Handle IFN_LEN_STORE. > * internal-fn.def (LEN_LOAD): New internal function. > (LEN_STORE): Likewise. > * optabs.def (len_load_optab, len_store_optab): New optab. > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > index 2c67c818da5..c8d7bcc7f62 100644 > --- a/gcc/doc/md.texi > +++ b/gcc/doc/md.texi > @@ -5167,6 +5167,33 @@ mode @var{n}. > > This pattern is not allowed to @code{FAIL}. > > +@cindex @code{len_load_@var{m}} instruction pattern > +@item @samp{len_load_@var{m}} > +Load the number of units specified by operand 2 from memory operand 1 s/units/vector elements/ > +into register operand 0, setting the other bytes of operand 0 to s/bytes/elements/ Maybe s/register operand 0/vector register operand 0/ would be clearer, now that we're explicitly measuring elements rather than bytes. > +undefined values. Operands 0 and 1 have mode @var{m}. Operand 2 has and maybe here “…@var{m}, which must be a vector mode”. > +whichever integer mode the target prefers. If operand 2 exceeds the > +maximum units of mode @var{m}, the behavior is undefined. For targets Maybe s/maximum units of/number of elements in/ > +which support length measured in bytes, they should only define VnQI > +mode to wrap the other vector modes with the same size. Meanwhile, How about: If the target prefers the length to be measured in bytes rather than elements, it should only implement this pattern for vectors of @code{QI} elements. > +it's required that the byte count should be a multiple of the element > +size (wrapped vector). This last sentence doesn't apply now that the length is measured in elements (lanes). > + > +This pattern is not allowed to @code{FAIL}. > + > +@cindex @code{len_store_@var{m}} instruction pattern > +@item @samp{len_store_@var{m}} > +Store the number of units specified by operand 2 from nonmemory operand 1 > +into memory operand 0, leaving the other bytes of operand 0 unchanged. > +Operands 0 and 1 have mode @var{m}. Operand 2 has whichever integer > +mode the target prefers. If operand 2 exceeds the maximum units of mode > +@var{m}, the behavior is undefined. For targets which support length > +measured in bytes, they should only define VnQI mode to wrap the other > +vector modes with the same size. Meanwhile, it's required that the byte > +count should be a multiple of the element size (wrapped vector). Equivalent changes here too. Thanks, Richard