Yes, OK. However, have you been able to test performance? I am only
curious. There was a test program in Bugzilla that we used back when this
code was first implemented. I do not remember the PR number offhand.
Jerry
On 2/23/21 1:46 PM, Harald Anlauf via Fortran wrote:
Dear all,
under certain circumstances a call to MATMUL for rank-2 times rank-1
arguments would invoke a highly tuned rank-2 times rank-2 algorithm, which
could lead to invalid reads and writes. The solution is to check the rank
of the second argument to MATMUL and fall back to the regular algorithm
for rank-1. The invalid accesses showed up with valgrind.
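To illustrate the idea of the rank check, here is a minimal, self-contained
C sketch. It is not the libgfortran code: the array_t descriptor and the
functions matvec_plain, matmat_tuned and matmul_dispatch are hypothetical
names made up for this example, and the "tuned" routine is only a plain
loop standing in for the real blocked algorithm. The point it shows is
that the rank-2 path reads the second dimension of b, so it should only be
entered when rank(b) > 1.

```c
#include <stddef.h>

/* Hypothetical array descriptor (NOT libgfortran's); column-major storage. */
typedef struct {
    const double *data;
    int rank;      /* 1 for a vector, 2 for a matrix */
    size_t rows;   /* extent of the first dimension */
    size_t cols;   /* extent of the second dimension; 1 when rank == 1 */
} array_t;

/* Regular fallback: c = a * b for rank-2 a and rank-1 b. */
static void matvec_plain(const array_t *a, const array_t *b, double *c)
{
    for (size_t i = 0; i < a->rows; i++)
        c[i] = 0.0;
    for (size_t l = 0; l < a->cols; l++)
        for (size_t i = 0; i < a->rows; i++)
            c[i] += a->data[i + l * a->rows] * b->data[l];
}

/* Stand-in for the tuned rank-2 times rank-2 path.
   It indexes b with a second dimension, which a rank-1 b does not have. */
static void matmat_tuned(const array_t *a, const array_t *b, double *c)
{
    for (size_t j = 0; j < b->cols; j++) {
        for (size_t i = 0; i < a->rows; i++)
            c[i + j * a->rows] = 0.0;
        for (size_t l = 0; l < a->cols; l++)
            for (size_t i = 0; i < a->rows; i++)
                c[i + j * a->rows] +=
                    a->data[i + l * a->rows] * b->data[l + j * b->rows];
    }
}

/* Dispatch on the rank of the second argument.
   Returns 1 if the tuned path was taken, 0 for the fallback. */
int matmul_dispatch(const array_t *a, const array_t *b, double *c)
{
    if (b->rank > 1) {
        matmat_tuned(a, b, c);
        return 1;
    }
    matvec_plain(a, b, c);
    return 0;
}
```

In this sketch the guard costs a single integer comparison per call, so it
should not by itself change the behaviour of the rank-2 times rank-2 case.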
I have not been able to create a testcase that gives wrong results.
Regtested on x86_64-pc-linux-gnu, and verified with valgrind.
OK for master?
As this affects all open branches down to 8, ok for backports?
Thanks,
Harald
PR libfortran/99218 - matmul on temporary array accesses invalid memory
Do not invoke tuned rank-2 times rank-2 matmul if rank(b) == 1.
libgfortran/ChangeLog:
PR libfortran/99218
* m4/matmul_internal.m4: Invoke tuned matmul only for rank(b)>1.
* generated/matmul_c10.c: Regenerated.
* generated/matmul_c16.c: Likewise.
* generated/matmul_c4.c: Likewise.
* generated/matmul_c8.c: Likewise.
* generated/matmul_i1.c: Likewise.
* generated/matmul_i16.c: Likewise.
* generated/matmul_i2.c: Likewise.
* generated/matmul_i4.c: Likewise.
* generated/matmul_i8.c: Likewise.
* generated/matmul_r10.c: Likewise.
* generated/matmul_r16.c: Likewise.
* generated/matmul_r4.c: Likewise.
* generated/matmul_r8.c: Likewise.
* generated/matmulavx128_c10.c: Likewise.
* generated/matmulavx128_c16.c: Likewise.
* generated/matmulavx128_c4.c: Likewise.
* generated/matmulavx128_c8.c: Likewise.
* generated/matmulavx128_i1.c: Likewise.
* generated/matmulavx128_i16.c: Likewise.
* generated/matmulavx128_i2.c: Likewise.
* generated/matmulavx128_i4.c: Likewise.
* generated/matmulavx128_i8.c: Likewise.
* generated/matmulavx128_r10.c: Likewise.
* generated/matmulavx128_r16.c: Likewise.
* generated/matmulavx128_r4.c: Likewise.
* generated/matmulavx128_r8.c: Likewise.