On Mon, Feb 4, 2019 at 3:19 AM Jakub Jelinek <ja...@redhat.com> wrote:
>
> On Sun, Feb 03, 2019 at 08:07:22AM -0800, H.J. Lu wrote:
> > +  /* If the misalignment of __P > 8, subtract __P by 8 bytes.
> > +     Otherwise, subtract __P by the misalignment.  */
> > +  if (offset > 8)
> > +    offset = 8;
> > +  __P = (char *) (((__SIZE_TYPE__) __P) - offset);
> > +
> > +  /* Zero-extend __A and __N to 128 bits and shift right by the
> > +     adjustment.  */
> > +  unsigned __int128 __a128 = ((__v1di) __A)[0];
> > +  unsigned __int128 __n128 = ((__v1di) __N)[0];
> > +  __a128 <<= offset * 8;
> > +  __n128 <<= offset * 8;
> > +  __A128 = __extension__ (__v2di) { __a128, __a128 >> 64 };
> > +  __N128 = __extension__ (__v2di) { __n128, __n128 >> 64 };
>
> We have _mm_slli_si128/__builtin_ia32_pslldqi128, why can't you use that
> instead of doing the arithmetics in unsigned __int128 scalars?
>

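PSLLDQ only exists as "PSLLDQ xmm1, imm8": the byte count must be an
immediate.  Since offset is only known at run time, using _mm_slli_si128
would need a switch statement over the possible counts.  The unsigned
__int128 shift takes a variable count, so it doesn't need one.

-- 
H.J.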
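
For illustration, here is a minimal sketch of the trade-off discussed above.
It is not part of the patch.  The helper names (shift_bytes_int128,
shift_bytes_pslldq, same_bits, check_shift_equivalence) are hypothetical, and
it assumes an x86-64 target with GCC or Clang, since unsigned __int128 is a
compiler extension.  The first helper shifts by a run-time byte count
directly; the second gets the same result from _mm_slli_si128 only by
switching over every possible count, because each call needs a constant.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2: __m128i, _mm_slli_si128, _mm_set_epi64x */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: zero-extend LO to 128 bits and shift it left by
   NBYTES bytes (0..8) using scalar unsigned __int128 arithmetic.  The
   shift count may be a run-time value.  */
static __m128i
shift_bytes_int128 (uint64_t lo, unsigned int nbytes)
{
  unsigned __int128 v = lo;
  v <<= nbytes * 8;
  return _mm_set_epi64x ((long long) (uint64_t) (v >> 64),
                         (long long) (uint64_t) v);
}

/* Hypothetical helper: the same shift with PSLLDQ.  _mm_slli_si128 needs
   a compile-time constant count, hence one case per possible value.  */
static __m128i
shift_bytes_pslldq (uint64_t lo, unsigned int nbytes)
{
  __m128i v = _mm_set_epi64x (0, (long long) lo);
  switch (nbytes)
    {
    case 0: return v;
    case 1: return _mm_slli_si128 (v, 1);
    case 2: return _mm_slli_si128 (v, 2);
    case 3: return _mm_slli_si128 (v, 3);
    case 4: return _mm_slli_si128 (v, 4);
    case 5: return _mm_slli_si128 (v, 5);
    case 6: return _mm_slli_si128 (v, 6);
    case 7: return _mm_slli_si128 (v, 7);
    case 8: return _mm_slli_si128 (v, 8);
    /* Not expected here: the caller clamps the count to 8.  */
    default: return _mm_setzero_si128 ();
    }
}

/* Compare two vectors bit for bit.  */
static int
same_bits (__m128i a, __m128i b)
{
  return memcmp (&a, &b, sizeof a) == 0;
}

/* Check that both helpers agree for every count from 0 to 8.  */
static void
check_shift_equivalence (void)
{
  for (unsigned int n = 0; n <= 8; n++)
    assert (same_bits (shift_bytes_int128 (0x0102030405060708ULL, n),
                       shift_bytes_pslldq (0x0102030405060708ULL, n)));
}
```

Both helpers produce the same bits for counts 0 through 8; the difference
is only in how much code is needed to handle a variable count.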