Hi Dominique,

> IMO the inlining of MATMUL should be restricted to small matrices (less than
> 4x4, 9x9 or 16x16 depending of your field!-)
The problem with the library function we have is that it is quite general: it has to deal with all the complexity of assumed-shape array arguments. Inlining lets the compiler take advantage of contiguous memory, compile-time knowledge of array sizes, etc., which allows for vectorization or even complete unrolling.

Of course, if you use c=matmul(a,b) where all three are assumed-shape arrays, your advantage over the library function will be minimal at best.

It will be interesting to see at what size calling BLAS directly becomes faster.

Thomas
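
P.S. To make the two cases concrete, here is a minimal sketch. The module and procedure names (matmul_demo, mul_fixed, mul_assumed) are made up for illustration, and the 3x3 size is an arbitrary choice. In mul_fixed the shapes are known at compile time and the dummies are contiguous, so an inlined MATMUL could in principle be unrolled or vectorized. In mul_assumed the compiler sees only descriptors, so inlining should gain little over the library call. Whether a given compiler actually inlines either one depends on its version and options.

```fortran
module matmul_demo
  implicit none
contains

  ! Explicit-shape dummies: extents are compile-time constants and the
  ! arrays are contiguous, which is the favourable case for inlining.
  subroutine mul_fixed(a, b, c)
    real, intent(in)  :: a(3,3), b(3,3)
    real, intent(out) :: c(3,3)
    c = matmul(a, b)
  end subroutine mul_fixed

  ! Assumed-shape dummies: extents and strides are only known at run
  ! time, so this is the general case the library routine handles.
  subroutine mul_assumed(a, b, c)
    real, intent(in)  :: a(:,:), b(:,:)
    real, intent(out) :: c(:,:)
    c = matmul(a, b)
  end subroutine mul_assumed

end module matmul_demo

program demo
  use matmul_demo
  implicit none
  real :: a(3,3), b(3,3), c1(3,3), c2(3,3)

  call random_number(a)
  call random_number(b)

  call mul_fixed(a, b, c1)
  call mul_assumed(a, b, c2)

  ! Both routines compute the same product; only the information
  ! available to the compiler differs.
  print *, 'max difference:', maxval(abs(c1 - c2))
end program demo
```

Timing the two routines for growing sizes, and comparing against a direct BLAS call, would be one way to look for the crossover point.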