On 12/20/18 5:44 PM, Segher Boessenkool wrote:
> On Thu, Dec 20, 2018 at 05:34:54PM -0600, Aaron Sawdey wrote:
>> On 12/20/18 3:51 AM, Segher Boessenkool wrote:
>>> On Wed, Dec 19, 2018 at 01:53:05PM -0600, Aaron Sawdey wrote:
>>>> Because of POWER9 dd2.1 issues with certain unaligned vsx instructions
>>>> to cache inhibited memory, here is a patch that keeps memmove (and memcpy)
>>>> inline expansion from doing unaligned vector or using vector load/store
>>>> other than lvx/stvx. More description of the issue is here:
>>>>
>>>> https://patchwork.ozlabs.org/patch/814059/
>>>>
>>>> OK for trunk if bootstrap/regtest ok?
>>>
>>> Okay, but see below.
>>>
>> [snip]
>>>
>>> This is extraordinarily clumsy :-)  Maybe something like:
>>>
>>> static rtx
>>> gen_lvx_v4si_move (rtx dest, rtx src)
>>> {
>>>   gcc_assert (!(MEM_P (dest) && MEM_P (src));
>>>   gcc_assert (GET_MODE (dest) == V4SImode && GET_MODE (src) == V4SImode);
>>>   if (MEM_P (dest))
>>>     return gen_altivec_stvx_v4si_internal (dest, src);
>>>   else if (MEM_P (src))
>>>     return gen_altivec_lvx_v4si_internal (dest, src);
>>>   else
>>>     gcc_unreachable ();
>>> }
>>>
>>> (Or do you allow VOIDmode for src as well?)  Anyway, at least get rid of
>>> the useless extra variable.
>>
>> I think this should be better:
>
> The gcc_unreachable at the end catches the non-mem to non-mem case.
>
>> static rtx
>> gen_lvx_v4si_move (rtx dest, rtx src)
>> {
>>   gcc_assert ((MEM_P (dest) && !MEM_P (src)) || (MEM_P (src) &&
>> !MEM_P(dest)));
>
> But if you prefer this, how about
>
> {
>   gcc_assert (MEM_P (dest) ^ MEM_P (src));
>   gcc_assert (GET_MODE (dest) == V4SImode && GET_MODE (src) == V4SImode);
>
>   if (MEM_P (dest))
>     return gen_altivec_stvx_v4si_internal (dest, src);
>   else
>     return gen_altivec_lvx_v4si_internal (dest, src);
> }
>
> :-)
>
>
> Segher
>

I like that even better, thanks!

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
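
For anyone reading this thread later, here is a small standalone sketch of the XOR check discussed above. It is not GCC code: `struct operand`, `is_mem`, and `pick_move` are hypothetical stand-ins for `rtx`, `MEM_P`, and the load/store selection in `gen_lvx_v4si_move`. The sketch assumes the memory predicate yields only 0 or 1, which is what makes `^` mean "exactly one operand is memory".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's rtx and MEM_P; not the real GCC types.  */
enum operand_kind { KIND_REG, KIND_MEM };

struct operand
{
  enum operand_kind kind;
};

/* Plays the role of MEM_P: yields only 0 or 1, so XOR is safe below.  */
static bool
is_mem (const struct operand *op)
{
  return op->kind == KIND_MEM;
}

enum move_kind { MOVE_LOAD, MOVE_STORE };

/* Pick a store when DEST is memory, otherwise a load.  The assert
   rejects both mem-to-mem and reg-to-reg, so once it has passed,
   "DEST is not memory" implies "SRC is memory" and no third branch
   is needed.  */
static enum move_kind
pick_move (const struct operand *dest, const struct operand *src)
{
  assert (is_mem (dest) ^ is_mem (src));

  if (is_mem (dest))
    return MOVE_STORE;
  else
    return MOVE_LOAD;
}
```

With one memory operand and one register operand, `pick_move (&mem, &reg)` selects the store and `pick_move (&reg, &mem)` selects the load; passing two operands of the same kind trips the assertion, provided assertions are enabled.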