Martin,

Do you think Magma uses naive multiplication for its base case? This can be ridiculously fast, especially over GF(2). I note, for example, that Magma's base case is about 6 times faster than M4RI at 1000x1000. Is it possible that naive multiplication can simply be optimised to have a far better constant, so that its crossover with m4r lies above its crossover with Strassen-Winograd?
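
For concreteness, here is a minimal sketch of what a word-packed naive multiplication over GF(2) could look like. It is not Magma's or M4RI's code (Magma's base case is not known here); the `gf2_matrix` type and the `gf2_*` helpers are hypothetical names made up for this illustration. Each row is stored as 64-bit words, so adding one row of B into a row of C costs one XOR per 64 columns, which is where the small constant would come from.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packed layout (an assumption for this sketch):
   row i occupies words[i*width .. i*width + width - 1], and
   column j of a row is bit (j % 64) of word (j / 64). */
typedef struct {
    size_t rows, cols, width;
    uint64_t *words;
} gf2_matrix;

/* Allocate a zero matrix; words is NULL if allocation fails. */
static gf2_matrix gf2_alloc(size_t rows, size_t cols) {
    gf2_matrix m;
    m.rows = rows;
    m.cols = cols;
    m.width = (cols + 63) / 64;
    m.words = calloc(rows * m.width, sizeof(uint64_t));
    return m;
}

static void gf2_set(gf2_matrix *m, size_t i, size_t j, int bit) {
    uint64_t mask = (uint64_t)1 << (j % 64);
    uint64_t *w = &m->words[i * m->width + j / 64];
    if (bit) *w |= mask; else *w &= ~mask;
}

static int gf2_get(const gf2_matrix *m, size_t i, size_t j) {
    return (int)((m->words[i * m->width + j / 64] >> (j % 64)) & 1);
}

/* C = A * B over GF(2). C must not alias A or B.
   Row i of C is the XOR of those rows k of B for which A[i][k] == 1. */
static void gf2_mul_naive(gf2_matrix *c, const gf2_matrix *a,
                          const gf2_matrix *b) {
    assert(a->cols == b->rows);
    assert(c->rows == a->rows && c->cols == b->cols);
    memset(c->words, 0, c->rows * c->width * sizeof(uint64_t));
    for (size_t i = 0; i < a->rows; i++) {
        uint64_t *crow = &c->words[i * c->width];
        for (size_t k = 0; k < a->cols; k++) {
            if (!gf2_get(a, i, k))
                continue;
            const uint64_t *brow = &b->words[k * b->width];
            for (size_t w = 0; w < b->width; w++)
                crow[w] ^= brow[w];   /* 64 additions in GF(2) at once */
        }
    }
}
```

This still performs a cubic number of bit operations, but the inner loop touches only about n/64 words per row addition. Whether a tuned version of this would beat a Four Russians table lookup at 1000x1000 is exactly the open question above; the sketch does not settle it.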
I also note that the M4RI SSE2 code doesn't seem faster than the generic C code on the Opteron, in fact it seems the other way around. Very strange.

Bill.

On 15 May, 20:25, Martin Albrecht <[EMAIL PROTECTED]> wrote:
> On Thursday 15 May 2008, Bill Hart wrote:
> > Hi Martin,
> >
> > Here is a run that illustrates the problem. Am I doing something
> > wrong?
>
> No, I was stupid. The cpucycles are printed as %u but they should be printed
> as %llu since they are longer than an int. I've attached the fixed C file
> (since it is < 1KB).
>
> --
> name: Martin Albrecht
> _pgp:http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
> _www:http://www.informatik.uni-bremen.de/~malb
> _jab: [EMAIL PROTECTED]
>
> bench_multiplication.c
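
The format-specifier fix described in the quoted message can be illustrated with a short sketch. This is not the attached bench_multiplication.c; the helper name `format_cycles` and the sample value are hypothetical, and the counter is assumed to be an `unsigned long long`. Passing such a value to `%u` is undefined behaviour in C and in practice tends to print a truncated or wrong number, whereas `%llu` matches the argument type.

```c
#include <stdio.h>

/* Hypothetical helper: writes a cycle count into buf using the
   conversion that matches unsigned long long. Returns what
   snprintf returns (the number of characters needed). */
static int format_cycles(char *buf, size_t n, unsigned long long cycles) {
    /* Wrong: "%u" expects unsigned int, but cycles is wider than that. */
    /* Right: "%llu" expects unsigned long long. */
    return snprintf(buf, n, "cpu cycles: %llu", cycles);
}
```

Calling `format_cycles(buf, sizeof buf, 5000000000ULL)` fills `buf` with the full count, which would not fit in a 32-bit `unsigned int`. Compiling with `-Wall` (or `-Wformat`) on GCC or Clang normally flags the mismatched `%u` at build time.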