Here are the times I get with the different cutoffs.

                 Magma     M4RI:7200   M4RI:2048
10000x10000:     2.940s    3.442s      4.132s
16384x16384:     9.250s    11.47s      11.80s
20000x20000:     16.57s    19.3s       26.05s
32000x32000:     59.05s    71.9s       71.8s

So it seems that when there is not an exact cut, the higher cutoff is
substantially better. Don't know why that is.

Tomorrow I'll see if there is anything I have that speeds up your code.
I'm hopeful we'll be within about 5% on the Opteron by then.

The other ideas I outlined above should push us 10-15% ahead of Magma if
we end up implementing them, I think. Of course one can go too crazy
with optimisation.

Bill.

On 18 May, 00:40, Martin Albrecht <[EMAIL PROTECTED]> wrote:
> On Sunday 18 May 2008, Bill Hart wrote:
>
> > I don't have the two Gray code tables, so it would be good to get your
> > version. Also my code is currently a mess, so it would be good to
> > clean it up by merging with a cleaner version (yours). Tomorrow I'll
> > check carefully what I've changed and try and merge the ideas if there
> > are any you don't have which definitely improve performance on the
> > Opteron.
>
> > The speedups I am seeing from the ifs are possibly a feature of the
> > Opteron cache algorithms. It is very sensitive when things just begin
> > to fall out of cache, as they certainly are here. Not combining with
> > the zero row just nudges things closer in to the cache boundary since
> > it never has to read that row.
>
> > I have checked and the speedups are quite reproducible, and they
> > definitely come from the ifs, though I am now using a crossover with
> > Strassen of 7200!!
>
> I'm using a crossover of 2048 here, so maybe our improvements are orthogonal?
> Even more puzzling, I'd expect that my crossover should be bigger than yours.
> (On a side note: my code changes how the crossover is used. Your
> version: 'size < cutoff', my version: '|cutoff - size| is minimal', which
> should give actual cutoffs closer to the desired values.)
>
> My version is here:
>
> http://sage.math.washington.edu/home/malb/spkgs/libm4ri-20080516.p1.spkg
>
> (this needs an updated patch for Sage)
>
> and here:
>
> http://sage.math.washington.edu/home/malb/m4ri-20080516.tar.gz
>
> (which is the raw source). Those don't have SSE2 yet but it doesn't seem to
> make that much of a difference anyway. I'll add that back before doing an
> official release. However, unfortunately I'll probably have limited/no time
> tomorrow to commit.
>
> Martin
>
> PS: To give at least some indication that my code still does the right thing,
> a 'known answer' test:
>
> sage: A = random_matrix(GF(2), 10^3, 10^3)
> sage: B = random_matrix(GF(2), 10^3, 10^3)
> sage: (A*B)._magma_() == A._magma_() * B._magma_()
> True
> sage: (A._multiply_strassen(B,cutoff=256))._magma_() == A._magma_() * B._magma_()
> True
>
> --
> name: Martin Albrecht
> _pgp:http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
> _www:http://www.informatik.uni-bremen.de/~malb
> _jab: [EMAIL PROTECTED]
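
For reference, the two crossover rules mentioned in the quoted message
('size < cutoff' versus '|cutoff - size| is minimal') can be compared with
a rough sketch like the one below. This is not M4RI's code: the function
names are made up for illustration, and it assumes the dimension is simply
halved at each Strassen level, ignoring any rounding to word boundaries
that a real implementation would probably need.

```python
def base_size_below(n, cutoff):
    """Hypothetical 'size < cutoff' rule: keep halving until the
    dimension drops strictly below the cutoff."""
    size = n
    while size >= cutoff:
        size //= 2
    return size


def base_size_closest(n, cutoff):
    """Hypothetical '|cutoff - size| is minimal' rule: among
    n, n//2, n//4, ... pick the dimension nearest to the cutoff."""
    best = n
    size = n
    while size > 0:
        if abs(cutoff - size) < abs(cutoff - best):
            best = size
        size //= 2
    return best


if __name__ == "__main__":
    # Dimensions and cutoffs taken from the timings in this thread.
    for n in (10000, 16384, 20000, 32000):
        for cutoff in (2048, 7200):
            print(n, cutoff,
                  base_size_below(n, cutoff),
                  base_size_closest(n, cutoff))
```

Under these assumptions the first rule can stop well short of the requested
cutoff (for example 1250 for n = 10000 and cutoff 2048), while the second
stops at 2500, which is closer to the value asked for.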