On Wed, 15 Jan 2025 at 23:57, John Naylor <johncnaylo...@gmail.com> wrote:
>
> On Wed, Jan 15, 2025 at 2:14 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> > Compilers that inline memcpy() may arrive at the same machine code,
> > but why rely on the compiler to make that optimization?  If the
> > compiler fails to do so, an out-of-line memcpy() call will surely
> > be a loser.
>
> See measurements at the end. As for compilers, gcc 3.4.6 and clang
> 3.0.0 can inline the memcpy. The manual copy above only gets combined
> to a single word starting with gcc 12 and clang 15, and latest MSVC
> still can't do it (4A in the godbolt link below). Are there any
> buildfarm animals around that may not inline memcpy for word-sized
> input?
>
> > A variant could be
> >
> > +               const char *hexptr = &hextbl[2 * usrc];
> > +               *dst++ = hexptr[0];
> > +               *dst++ = hexptr[1];

I'd personally much rather see us using memcpy() for this sort of
stuff. If the compiler is too braindead to inline tiny
constant-and-power-of-two-sized memcpys then we'd probably also have
plenty of other performance issues with that compiler already. I don't
think contorting the code into something less human-readable and
something the compiler may struggle even more to optimise is a good
idea.  The nieve way to implement the above requires two MOVs of
single bytes and two increments of dst. I imagine it's easier for the
compiler to inline a small constant-sized memcpy() than to figure out
that it's safe to implement the above with a single word-sized MOV
rather than two byte-sized MOVs due to the "dst++" in between the two.

I agree that the evidence you (John) gathered is enough reason to use memcpy().

David


Reply via email to