On Saturday 17 May 2008, Martin Albrecht wrote:
> > I think a better idea would be to explicitly force all matrices and
> > all rows to be 128 bit aligned if the matrices are wide enough to
> > benefit from SSE2, Then the combine function can always use SSE2 and
> > there will be no need to check for alignment.
>
> That doesn't seem to make a noticeable difference for me (on C2D). However,
> I realised that the multiplications where the target matrix is a real
> matrix rather than a window (which has bad data locality). Copying
> everything over seems not like a good idea but it at least indicates an
> area for improvements.
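
For readers following the alignment suggestion quoted above, here is a minimal sketch of what "force all rows to be 128 bit aligned" could look like in C. It is not M4RI's actual code: the names padded_width, alloc_rows, combine and combine_demo are hypothetical, and the sketch assumes a C11 compiler (for aligned_alloc) and rows stored as 64-bit words. Each row is padded to an even number of words, so every row starts on a 16-byte boundary and the XOR loop can use aligned SSE2 loads without any alignment check.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* Hypothetical helper: round a row width (in 64-bit words) up to an even
 * count, so that each row spans a whole number of 128-bit blocks. */
static size_t padded_width(size_t width) {
    return (width + 1) & ~(size_t)1;
}

/* Hypothetical helper: allocate nrows zeroed rows in one 16-byte-aligned
 * block. Row i starts at base + i * padded_width(width), which is again
 * 16-byte aligned because the padded width is even. */
static uint64_t *alloc_rows(size_t nrows, size_t width) {
    size_t bytes = nrows * padded_width(width) * sizeof(uint64_t);
    uint64_t *base = aligned_alloc(16, bytes); /* bytes is a multiple of 16 */
    if (base != NULL)
        memset(base, 0, bytes);
    return base;
}

/* dst ^= src over one padded row. Both pointers are assumed to be
 * 16-byte-aligned rows from alloc_rows, so no alignment check is needed. */
static void combine(uint64_t *dst, const uint64_t *src, size_t width) {
    size_t n = padded_width(width);
#ifdef __SSE2__
    for (size_t i = 0; i < n; i += 2) {
        __m128i a = _mm_load_si128((const __m128i *)(dst + i));
        __m128i b = _mm_load_si128((const __m128i *)(src + i));
        _mm_store_si128((__m128i *)(dst + i), _mm_xor_si128(a, b));
    }
#else
    /* Portable fallback when SSE2 is not available. */
    for (size_t i = 0; i < n; i++)
        dst[i] ^= src[i];
#endif
}

/* Usage: add row 1 to row 0 of a 2-row matrix that is 3 words wide.
 * Returns 1 on success, 0 if the allocation failed. */
int combine_demo(void) {
    size_t width = 3, stride = padded_width(width);
    uint64_t *m = alloc_rows(2, width);
    if (m == NULL)
        return 0;
    m[0] = 0xF0;          /* row 0, word 0 */
    m[stride] = 0x0F;     /* row 1, word 0 */
    combine(m, m + stride, width);
    assert(m[0] == 0xFF); /* 0xF0 XOR 0x0F */
    free(m);
    return 1;
}
```

The cost of this layout is at most one padding word per row. A window into a larger matrix would not get the same guarantee unless its column offset is also a multiple of 128 bits.
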

Okay, if I only copy when we cross over to M4RM, then the memory overhead is
constant (~ cutoff^2) and the performance still improves.

Old: 64-bit Debian/GNU Linux, 2.33GHz Core2Duo

Matrix Dimension    Magma 2.14-13 (64-bit)    M4RI-20080517 (64-bit)
10,000 x 10,000     2.920                     3.610
16,384 x 16,384     11.140                    12.120
20,000 x 20,000     20.370                    24.390
32,000 x 32,000     74.290                    94.910

New: 64-bit Debian/GNU Linux, 2.33GHz Core2Duo

Matrix Dimension    Magma 2.14-13 (64-bit)    M4RI-20080517 (64-bit)
10,000 x 10,000     2.920                     2.990
16,384 x 16,384     11.140                    11.750
20,000 x 20,000     20.370                    21.180
32,000 x 32,000     74.290                    86.570

On Opteron things don't look this way, but I think sage.math is pretty heavily
used right now, so my benchmarks there are not very telling.

Martin

-- 
name: Martin Albrecht
_pgp: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
_www: http://www.informatik.uni-bremen.de/~malb
_jab: [EMAIL PROTECTED]