On 11/17/24 7:59 PM, Maciej W. Rozycki wrote:
Expand coverage for `__builtin_memcpy', primarily for "cpymemM" block
copy pattern, although with smaller sizes open-coded sequences may be
produced instead.

This verifies block sizes in bytes from 1 to 64, across byte alignments
of 1, 2, 4, 8 and byte misalignments within from 0 up to 7 (there's some
redundancy there for the sake of simplicity of the test cases) both for
the source and the destination, making sure all data is copied and no
data is changed outside the area meant to be written.
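
As a rough illustration of the check described above, here is a minimal
sketch, not the code from the patch.  The names `check_copy' and
`check_all', the guard size, and the fill values are assumptions made
for this example.  It also uses run-time loops, so the compiler will
typically call the library `memcpy'; the actual tests need
compile-time constant sizes and offsets for "cpymemM" to be exercised.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SIZE   64   /* Largest block size covered.  */
#define MAX_OFFSET 7    /* Largest misalignment covered.  */
#define GUARD      8    /* Guard bytes before and after (assumed value).  */
#define BUF_SIZE   (GUARD + MAX_OFFSET + MAX_SIZE + GUARD)
#define FILL       0xa5 /* Marker for bytes that must stay untouched.  */

/* Hypothetical helper: copy SIZE bytes between buffers misaligned by
   SRC_OFF and DST_OFF, then return the number of destination bytes
   holding an unexpected value (0 means the copy was correct).  */
int
check_copy (size_t size, size_t src_off, size_t dst_off)
{
  static unsigned char src[BUF_SIZE] __attribute__ ((aligned (8)));
  static unsigned char dst[BUF_SIZE] __attribute__ ((aligned (8)));
  size_t start = GUARD + dst_off;
  size_t i;
  int bad = 0;

  for (i = 0; i < BUF_SIZE; i++)
    {
      src[i] = (unsigned char) (i + 1);  /* Never equal to FILL here.  */
      dst[i] = FILL;
    }

  memcpy (dst + start, src + GUARD + src_off, size);

  for (i = 0; i < BUF_SIZE; i++)
    {
      unsigned char expected = FILL;

      /* Inside the copied area expect the source data; elsewhere the
         fill marker must have survived.  */
      if (i >= start && i < start + size)
        expected = src[GUARD + src_off + (i - start)];
      if (dst[i] != expected)
        bad++;
    }
  return bad;
}

/* Walk all sizes 1..64 and all misalignments 0..7 for both buffers.  */
int
check_all (void)
{
  size_t size, s, d;
  int bad = 0;

  for (size = 1; size <= MAX_SIZE; size++)
    for (s = 0; s <= MAX_OFFSET; s++)
      for (d = 0; d <= MAX_OFFSET; d++)
        bad += check_copy (size, s, d);
  return bad;
}
```

A caller would treat a nonzero return from `check_all' as a failure,
e.g. by calling `abort ()' as torture tests conventionally do.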

The choice of the ranges for the parameters has come from the Alpha
backend, whose "cpymemM" pattern covers copies being made of up to 64
bytes and has various corner cases related to base alignment and the
misalignment within.

The test cases have proved invaluable in verifying changes to the Alpha
backend, but the functionality covered is generic, so I have concluded these
tests qualify for generic verification and do not have to be limited to
the Alpha-specific subset of the testsuite.

On the implementation side the tests turned out to be quite stressful to
GCC, and the original simpler version that just expanded all code inline
took a lot of time to complete compilation.  Depending on the target and
compilation options elapsed times up to 40 minutes (!) have been seen,
especially with GCC built at `-O0' for debugging purposes.

At the cost of increased complexity, where a pair of macros is required
per variant rather than just one, I have split the code into individual
functions forced not to be inlined, which improved compilation times
considerably without losing coverage.
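
The macro-pair approach can be sketched roughly as follows.  This is an
assumption-laden illustration rather than the contents of memcpy-ax.h:
the macro names `DEFINE_VARIANT' and `RUN_VARIANT', the helpers, and the
buffer layout are all invented for this example.  One macro defines a
non-inlined function per variant with compile-time constant size and
offsets, and the other calls it and verifies the result.

```c
#include <stddef.h>

#define BASE     8      /* Guard bytes ahead of the data (assumed).  */
#define BUF_SIZE 88     /* BASE + 7 + 64 + trailing guard, rounded up.  */
#define FILL     0xa5   /* Marker for bytes that must stay untouched.  */

static unsigned char src_buf[BUF_SIZE] __attribute__ ((aligned (8)));
static unsigned char dst_buf[BUF_SIZE] __attribute__ ((aligned (8)));

/* Hypothetical helper: put known patterns into both buffers.  */
static void
reset_buffers (void)
{
  size_t i;

  for (i = 0; i < BUF_SIZE; i++)
    {
      src_buf[i] = (unsigned char) (i + 1);  /* Never equal to FILL.  */
      dst_buf[i] = FILL;
    }
}

/* Hypothetical helper: count destination bytes with unexpected values.  */
static int
verify (size_t size, size_t soff, size_t doff)
{
  size_t start = BASE + doff;
  size_t i;
  int bad = 0;

  for (i = 0; i < BUF_SIZE; i++)
    {
      unsigned char expected = FILL;

      if (i >= start && i < start + size)
        expected = src_buf[BASE + soff + (i - start)];
      if (dst_buf[i] != expected)
        bad++;
    }
  return bad;
}

/* First macro of the pair: define one non-inlined function per variant,
   with size and offsets as compile-time constants.  */
#define DEFINE_VARIANT(size, soff, doff)				\
  static void __attribute__ ((noinline))				\
  copy_##size##_##soff##_##doff (void)					\
  {									\
    __builtin_memcpy (dst_buf + BASE + (doff),				\
		      src_buf + BASE + (soff), (size));			\
  }

/* Second macro of the pair: call the variant and check the outcome.  */
#define RUN_VARIANT(size, soff, doff)					\
  do									\
    {									\
      reset_buffers ();							\
      copy_##size##_##soff##_##doff ();					\
      failures += verify ((size), (soff), (doff));			\
    }									\
  while (0)

DEFINE_VARIANT (1, 0, 0)
DEFINE_VARIANT (13, 3, 5)
DEFINE_VARIANT (64, 7, 1)

/* Returns the number of mismatching bytes over the sample variants.  */
int
run_variants (void)
{
  int failures = 0;

  RUN_VARIANT (1, 0, 0);
  RUN_VARIANT (13, 3, 5);
  RUN_VARIANT (64, 7, 1);
  return failures;
}
```

Keeping each copy in its own small function limits how much code the
compiler has to optimize at once, which is presumably where the
compilation-time saving comes from.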

Example compilation times on a reasonably fast POWER9@2.166GHz host at `-O2'
optimization and with GCC built at `-O2', for various targets:

mips-linux-gnu:        23s
vax-netbsdelf:         29s
alphaev56-linux-gnu:   39s
alpha-linux-gnu:       43s
powerpc64le-linux-gnu: 48s

With GCC built at `-O0':

alphaev56-linux-gnu: 3m37s
alpha-linux-gnu:     3m54s

I have therefore set the timeout factor accordingly so as to take slower
test hosts into account.
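
For reference, the GCC testsuite lets a test scale its timeout with a
`dg-timeout-factor' directive placed in a comment in the test source.
The fragment below only shows the form of the directive; the value 4 is
an assumption for illustration and is not taken from the patch.

```c
/* { dg-timeout-factor 4 } */
```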

        gcc/testsuite/
        * gcc.c-torture/execute/memcpy-a1.c: New file.
        * gcc.c-torture/execute/memcpy-a2.c: New file.
        * gcc.c-torture/execute/memcpy-a4.c: New file.
        * gcc.c-torture/execute/memcpy-a8.c: New file.
        * gcc.c-torture/execute/memcpy-ax.h: New file.
OK. There's some chance for timing fallouts on things like qemu emulated targets, but I wouldn't let that get in the way of adding coverage. The total memory sizes don't look terrible, so I'm not too concerned about how the small embedded targets would respond from a testing standpoint.

jeff
