On Wednesday 21 May 2008, Bill Hart wrote:
> Hi Martin,
>
> I downloaded the clean tarball and added an extra test, but I get:
>
>    mul: m: 4096, l: 3528, n: 4096, k:  0, cutoff: 1024
> FAIL: Strassen != M4RM
> FAIL: Strassen != Naive
>
> :-(
>
> Also I later replaced the following lines of strassen.c:
>
>   a -= a%RADIX;
>   b -= b%RADIX;
>   c -= c%RADIX;
>
> with
>
>   unsigned long mult = 1;
>   unsigned long width = a;
>   while (width > 2*cutoff)
>   {
>     width/=2;
>     mult*=2;
>   }
>   a -= a%(RADIX*mult);
>   b -= b%(RADIX*mult);
>   c -= c%(RADIX*mult);
>
> and this sped up the 32000x32000 multiplication by a further 5% or so.
> The other times didn't change (the 10000x10000 may have been slightly
> quicker).
>
> :-)
>
> Bill.

I've uploaded a new version to

     http://m4ri.sagemath.org/downloads/m4ri-20080521.tar.gz

which
 - fixes the bug described above
 - adds your code snippet
 - fixes a couple of other problems reported by Valgrind.
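
As a rough sketch of what the quoted snippet does: it rounds each dimension
down to a multiple of RADIX * 2^j, where j is the number of times the width
can be halved before it drops to 2*cutoff or below. The function names
below are hypothetical, and RADIX = 64 (bits per machine word) is an
assumption rather than something stated in this thread.

```c
#include <assert.h>

/* Assumption: RADIX is the number of bits packed into one machine word. */
#define RADIX 64UL

/* Hypothetical helper: granularity to which the dimensions are rounded.
 * Mirrors the quoted loop: double mult once per halving of the width
 * until the width is at most 2*cutoff. */
unsigned long strassen_granularity(unsigned long a, unsigned long cutoff)
{
    unsigned long mult = 1;
    unsigned long width = a;
    while (width > 2 * cutoff) {
        width /= 2;
        mult *= 2;
    }
    return RADIX * mult;
}

/* Round x down to the nearest multiple of granularity (x -= x % g). */
unsigned long round_down(unsigned long x, unsigned long granularity)
{
    assert(granularity > 0); /* guard against a modulus by zero */
    return x - x % granularity;
}
```

For example, with a = 32000 and cutoff = 2048 the loop halves the width
three times, so the granularity is 8 * 64 = 512 and 32000 is rounded down
to 31744. A dimension smaller than the granularity rounds down to 0.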

It seems the last bug fix also speeds things up (on my notebook), since we
were computing more than necessary and writing the excess to unallocated
memory. Updated performance figures:

64-bit Debian GNU/Linux, 2.33 GHz Core2Duo (MacBook Pro, 2nd Gen.)
Matrix Dimension        Magma           GAP             M4RI
10,000 x 10,000         2.920           6.691           1.760
16,384 x 16,384         11.140          36.063          6.760
20,000 x 20,000         20.370          -               12.200
32,000 x 32,000         74.260          -               51.510

I'm having trouble believing those figures but all tests pass (including some 
equality checks with Magma via Sage). On sage.math things don't seem to get 
any better but it is hard to tell since it is often so loaded. Also, this 
version only enables SSE2 on Intel CPUs and has a configure switch for OpenMP 
(--enable-openmp).

Martin

PS: One thing that seems significant on sage.math: the system time is 
considerable, e.g.:

sage: A = random_matrix(GF(2),32000,32000)
sage: B = random_matrix(GF(2),32000,32000)
sage: time C = A._multiply_strassen(B,cutoff=2^11)
CPU times: user 108.02 s, sys: 5.03 s, total: 113.05 s
Wall time: 113.05

-- 
name: Martin Albrecht
_pgp: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
_www: http://www.informatik.uni-bremen.de/~malb
_jab: [EMAIL PROTECTED]

