Hi Martin,

This version works great. Here are the times on my unburdened 2.8GHz Opteron. First the Magma times, then the times for an older version of m4ri and now, for the first time ever, the new Magma-beating times:

10000x10000: 2.940s 3.13s 2.25s
16384x16384: 9.250s 12.96s 8.80s
20000x20000: 16.57s 22.43s 15.48s
32000x32000: 59.1s 90.2s 57.8s

The first three times pretty much agree with what I had before and are only marginally better. The fourth time is 4.3s faster than the time I had (before you fixed the bug). I am not so surprised that a bug fix could make this much difference. I've seen that sort of thing before.

I suppose we should do a comparison at 32000x32000 with Magma just to verify, but other than that, well done Martin!!

Bill.

On 21 May, 19:57, Martin Albrecht <[EMAIL PROTECTED]> wrote:
> On Wednesday 21 May 2008, Bill Hart wrote:
> > Hi Martin,
> >
> > I downloaded the clean tarball and added an extra test, but I get:
> >
> > mul: m: 4096, l: 3528, n: 4096, k: 0, cutoff: 1024
> > FAIL: Strassen != M4RM
> > FAIL: Strassen != Naive
> >
> > :-(
> >
> > Also I later replaced the following lines of strassen.c:
> >
> > a -= a%RADIX;
> > b -= b%RADIX;
> > c -= c%RADIX;
> >
> > with
> >
> > unsigned long mult = 1;
> > unsigned long width = a;
> > while (width > 2*cutoff)
> > {
> >   width/=2;
> >   mult*=2;
> > }
> > a -= a%(RADIX*mult);
> > b -= b%(RADIX*mult);
> > c -= c%(RADIX*mult);
> >
> > and this sped up the 32000x32000 multiplication by a further 5% or so.
> > The other times didn't change (the 10000x10000 may have been slightly
> > quicker).
> >
> > :-)
> >
> > Bill.
>
> I've uploaded a new version to
>
> http://m4ri.sagemath.org/downloads/m4ri-20080521.tar.gz
>
> which
> - fixes the bug described above
> - adds your code snippet
> - fixes a couple of other problems reported by Valgrind.
>
> It seems the last bugfix also speeds up the thing (on my notebook) since we
> computed too much stuff and wrote it to unallocated memory. Updated
> performance figures:
>
> 64-bit Debian/GNU Linux, 2.33GHz Core2Duo (Macbook Pro, 2nd Gen.)
> Matrix Dimension    Magma     GAP       M4RI
> 10,000 x 10,000     2.920     6.691     1.760
> 16,384 x 16,384     11.140    36.063    6.760
> 20,000 x 20,000     20.370    -         12.200
> 32,000 x 32,000     74.260    -         51.510
>
> I'm having trouble believing those figures but all tests pass (including some
> equality checks with Magma via Sage). On sage.math things don't seem to get
> any better but it is hard to tell since it is often so loaded. Also, this
> version only enables SSE2 on Intel CPUs and has a configure switch for OpenMP
> (--enable-openmp).
>
> Martin
>
> PS: One thing that seems significant on sage.math: The systime is
> considerable, e.g.:
>
> sage: A = random_matrix(GF(2),32000,32000)
> sage: B = random_matrix(GF(2),32000,32000)
> sage: time C = A._multiply_strassen(B,cutoff=2^11)
> CPU times: user 108.02 s, sys: 5.03 s, total: 113.05 s
> Wall time: 113.05
>
> --
> name: Martin Albrecht
> _pgp: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
> _www: http://www.informatik.uni-bremen.de/~malb
> _jab: [EMAIL PROTECTED]
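
For anyone following the thread, here is a rough sketch of the dimension-alignment arithmetic from the strassen.c snippet quoted above, written in Python so it can be tried in a Sage session. It only mirrors the rounding of a, b and c. The function name `strassen_aligned_dims` is made up for illustration, RADIX = 64 is an assumption (one 64-bit word per block), and the reading that the extra factor `mult` keeps every halved block a multiple of RADIX down to the cutoff is my interpretation, not something stated in the thread.

```python
RADIX = 64  # assumption: number of bits packed into one machine word


def strassen_aligned_dims(a, b, c, cutoff, radix=RADIX):
    """Round a, b, c down to multiples of radix * mult.

    mult doubles for every halving of `a` needed to bring the width
    down to at most 2 * cutoff, mirroring the loop in the quoted snippet.
    This is an illustrative helper, not a function from M4RI.
    """
    mult = 1
    width = a
    while width > 2 * cutoff:
        width //= 2
        mult *= 2
    step = radix * mult
    return a - a % step, b - b % step, c - c % step
```

Under these assumptions, the failing test case from the thread (m: 4096, l: 3528, n: 4096, cutoff: 1024) gives mult = 2, so `strassen_aligned_dims(4096, 3528, 4096, 1024)` returns `(4096, 3456, 4096)`. For the 32000x32000 case with cutoff 2^11, mult = 8 and every dimension is rounded down to 31744; presumably the leftover rows and columns are then handled outside the Strassen recursion.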