On Friday 16 May 2008, Bill Hart wrote:
> Here are the times I get for Magma vs M4RI now. Note the crossover
> between the two programs is now above about 5000 and M4RI beats Magma
> below that point. This suggests the remaining factor of 2 is in the
> Strassen-Winograd function. Probably Winograd-Strassen is falling out
> of L2 cache (the previous adjustments I made were to prevent the M4R
> algorithm falling out of L1 cache).
>
> The other possibility is that Magma combines the two algorithms so
> that there is even greater usage of the Gray code tables. This would
> be an ugly hack, but could work.

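To put a rough number on the L2 hypothesis, here is a minimal sketch that estimates how much memory the three operands of C = A*B occupy for dense matrices over GF(2). It assumes one bit per entry with each row padded to whole 64-bit words; that layout and the helper names `packed_bytes` and `working_set_mib` are illustrative assumptions, not taken from the M4RI sources. Strassen-Winograd temporaries are not counted, so the figures are a lower bound.

```python
def packed_bytes(rows, cols, word_bits=64):
    """Bytes for a bit-packed rows x cols matrix, rows padded to whole words (assumed layout)."""
    words_per_row = -(-cols // word_bits)  # ceiling division
    return rows * words_per_row * (word_bits // 8)

def working_set_mib(n):
    """Lower bound in MiB for holding A, B and C of an n x n product at once."""
    return 3 * packed_bytes(n, n) / 2**20

if __name__ == "__main__":
    for n in (2500, 5000, 10000, 20000):
        print(f"{n:>6} x {n:<6} {working_set_mib(n):8.1f} MiB")
```

Comparing these figures with the L2 size your machine reports gives a first idea of where a multiplication stops fitting in cache.
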
Are you suggesting Magma uses M4RM too? I'd doubt that, since they don't
state that anywhere. Probably I'm just misunderstanding you.

> 40000x40000:
> Magma: 112.6s
> M4RI: 232.4s
>
> 20000x20000:
> Magma: 16.40s
> M4RI: 32.34s
>
> 10000x10000:
> Magma: 2.750s
> M4RI: 4.529s
>
> 5000x5000:
> Magma: 0.700s
> M4RI: 0.672s
>
> 2500x2500:
> Magma: 0.13s
> M4RI: 0.079s
>
> 1250x1250:
> Magma: 0.015s
> M4RI: 0.012s
>
> 625x625:
> Magma: 0.0030s
> M4RI: 0.0023s
>
> 312x312:
> Magma: 0.0014s
> M4RI: 0.00032s
>
> 156x156:
> Magma: 0.001s
> M4RI: 0.0001s
>
> If I get some time I'll look into this.
>
> Did those changes work for you Martin?
>
> Bill.

Yes, the change worked like a charm. I made some changes (the fixed k is
replaced by a k that depends on the new block dimensions etc.) and it is
much much faster now. I'm working on re-introducing SSE2 now to see if it
makes the world a better place, at least on the Core2Duo.

Btw. all the stuff I wrote about the L2 cache size C2D vs. Opteron was
bollocks. The reason I beat Magma so badly earlier was that I used a 32-bit
Magma and compared it with a 64-bit version of M4RI. To state it clearly:
We don't beat Magma anywhere. Anyhow, here are the times:

64-bit Debian/GNU Linux, 2.33GHz Core2Duo

Matrix Dimension    Magma 2.14-13 (64-bit)    M4RI-20080517 (64-bit)
10,000 x 10,000      2.920                      4.130
16,384 x 16,384     11.140                     15.740
20,000 x 20,000     20.370                     28.950

64-bit Debian/GNU Linux, 1.8GHz Opteron (sage.math)

Matrix Dimension    Magma 2.13-5 (64-bit)     M4RI-20080517 (64-bit)
10,000 x 10,000      3.930                      7.860
16,384 x 16,384     16.230                    104.77???
20,000 x 20,000     27.080                     56.420

(My university today finally granted me access to Magma 2.14 for my
notebook.)

As you can see there is an odd (reproducible) spike at 2^14 x 2^14. I
cannot explain that for now, it might just be a bug.

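The spike is easier to see as a ratio. The sketch below divides the M4RI time by the Magma time for each row of the two tables above and flags sizes whose ratio is far above the best ratio on the same machine. The numbers are copied from the tables; the function names and the factor-of-two threshold are arbitrary choices made for illustration.

```python
# (n, Magma time, M4RI time), copied from the two tables above.
core2duo = [(10000, 2.920, 4.130), (16384, 11.140, 15.740), (20000, 20.370, 28.950)]
opteron = [(10000, 3.930, 7.860), (16384, 16.230, 104.77), (20000, 27.080, 56.420)]

def slowdown(rows):
    """Map n -> M4RI time divided by Magma time (above 1 means Magma is faster)."""
    return {n: m4ri / magma for n, magma, m4ri in rows}

def outliers(ratios, factor=2.0):
    """Sizes whose ratio exceeds `factor` times the smallest ratio on that machine."""
    baseline = min(ratios.values())
    return [n for n, r in ratios.items() if r > factor * baseline]

if __name__ == "__main__":
    for name, rows in (("Core2Duo", core2duo), ("Opteron", opteron)):
        ratios = slowdown(rows)
        for n, r in ratios.items():
            print(f"{name:9} n={n:6d}  M4RI/Magma = {r:.2f}")
        print(f"{name:9} outliers: {outliers(ratios)}")
```

On these figures M4RI is roughly 1.4 times slower than Magma on the Core2Duo and roughly 2 times slower on the Opteron, except at 16,384 where the ratio jumps to about 6.5.
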
I have a similar spike in another run on another AMD (Athlon X2) machine:

    n:  2048, cutoff: 1024, speedup: 1.06, m4rm:  0.01 strassen: 0.01
    n:  3072, cutoff: 1536, speedup: 0.87, m4rm:  0.04 strassen: 0.05
    n:  4096, cutoff: 2048, speedup: 1.13, m4rm:  0.15 strassen: 0.13
    n:  5120, cutoff: 2560, speedup: 1.35, m4rm:  0.39 strassen: 0.29
    n:  6144, cutoff: 3072, speedup: 1.20, m4rm:  0.66 strassen: 0.55
    n:  7168, cutoff: 3584, speedup: 1.87, m4rm:  1.64 strassen: 0.88
    n:  8192, cutoff: 4096, speedup: 4.48, m4rm:  6.07 strassen: 1.35
>>> n:  9216, cutoff: 4608, speedup: 2.94, m4rm:  8.90 strassen: 3.02 <<<
    n: 10240, cutoff: 5120, speedup: 4.68, m4rm: 12.99 strassen: 2.78
    n: 11264, cutoff: 5632, speedup: 4.84, m4rm: 18.78 strassen: 3.88
    n: 12288, cutoff: 6144, speedup: 4.65, m4rm: 24.40 strassen: 5.24
    n: 13312, cutoff: 6656, speedup: 3.32, m4rm: 30.92 strassen: 9.33

I'm investigating this one too and playing around with the parameters.
Since the C2D times are considerably better I also bet it is L2 related,
we'll see.

Thanks again!
Martin

-- 
name: Martin Albrecht
_pgp: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
_www: http://www.informatik.uni-bremen.de/~malb
_jab: [EMAIL PROTECTED]

To post to this group, send email to sage-devel@googlegroups.com
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://www.sagemath.org
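A quick way to locate the anomaly in the Athlon X2 run above is to look for a size whose Strassen time is larger than the time at the next bigger size, and to compare each step's growth with the asymptotic n^log2(7) rate. This is only a sketch: the timings are copied from the listing, the helper names are made up for illustration, and the n^log2(7) model ignores the cutoff, so it is a rough yardstick only.

```python
import math

# (n, strassen time) copied from the Athlon X2 run above.
strassen_times = [
    (2048, 0.01), (3072, 0.05), (4096, 0.13), (5120, 0.29),
    (6144, 0.55), (7168, 0.88), (8192, 1.35), (9216, 3.02),
    (10240, 2.78), (11264, 3.88), (12288, 5.24), (13312, 9.33),
]

def non_monotone_points(samples):
    """Return the n values whose time exceeds the time at the next larger n."""
    return [n for (n, t), (_, t_next) in zip(samples, samples[1:]) if t > t_next]

def growth_vs_model(samples):
    """Yield (n_from, n_to, observed growth, growth predicted by n**log2(7))."""
    exponent = math.log2(7)
    for (n1, t1), (n2, t2) in zip(samples, samples[1:]):
        yield n1, n2, t2 / t1, (n2 / n1) ** exponent

suspects = non_monotone_points(strassen_times)

if __name__ == "__main__":
    print("non-monotone sizes:", suspects)
    for n1, n2, observed, predicted in growth_vs_model(strassen_times):
        print(f"{n1:6d} -> {n2:6d}  observed x{observed:.2f}  model x{predicted:.2f}")
```

The only size flagged by the monotonicity check is 9216, the row marked with `>>> <<<` in the listing.
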