Martin,

On the M4RI website you say that M4R inversion is asymptotically fast
with time complexity n^3/log_2(n), but Strassen's method has
complexity n^log2(7), which I would call asymptotically fast.

Do you just mean asymptotically faster than the classical algorithm?
By the way, I wrote some code ages ago for computing the row echelon
form of sparse matrices over GF2 which essentially inverts 6000x6000
sparse matrices in about a second.

Bill.

On 19 May, 19:58, Martin Albrecht <[EMAIL PROTECTED]>
wrote:
> On Monday 19 May 2008, Bill Hart wrote:
>
> > You seemed to be getting up to 8% at points there. That's definitely
> > worth it. I'll be interested to see this evening how it comes out,
> > though I recommend optimising my combine3 function (which I suppose
> > should now be combine8), even including it inline rather than have it
> > in a separate file.
>
> Bill,
>
> some progress report for the C2D:
>
> I incorporated your changes with the following modifications:
> - if (x1==0 & x2==0 & x3==0 .... x8==8) ... I removed this one since it seems
> rather unlikely that this is true
> - I added #define called GRAY8 which defines whether 8 or 4 tables ought to be
> used. I figured this might be handy for machines with a smaller L1.
> - I added the correct a_nc%k values back in
> - We don't need to make sure that the allocated buffers are correctly aligned,
> since we try allocate with _mm_malloc. If that is not available we should
> probably just use posix_memalign.
>
> The speed-up on the C2D (similar to the Opteron) is considerable (the last
> column is a parallel toy implementation):
>
> 64-bit Debian/GNU Linux, 2.33Ghz Core2Duo, cutoff=2^12 (*)
> Matrix Dimension        Magma   M4RI            M4RI (OpenMP), walltime
> 10,000 x 10,000         2.920           2.270           1.470
> 16,384 x 16,384         11.140          9.130           5.540
> 20,000 x 20,000         20.370          16.110          11.800
> 32,000 x 32,000         75.510          64.340          40.040
>
> Amazingly M4RM alone (w.o. Strassen-Winograd) now beats Magma up to 2*10^4 x
> 2*10^4 in this configuration:
>
> sage: A = random_matrix(GF(2),10^3, 10^3);
> sage: B = random_matrix(GF(2),10^3, 10^3);
> sage: magma(A._multiply_m4rm(B)) == magma(A)*magma(B)
> True
> sage: A = random_matrix(GF(2),2*10^4, 2*10^4);
> sage: B = random_matrix(GF(2),2*10^4, 2*10^4);
> sage: time A._multiply_m4rm(B)
> CPU times: user 18.32 s, sys: 0.10 s, total: 18.42 s
> Wall time: 18.65
>
> 64-bit Debian/GNU Linux, 1.8Ghz Opteron (sage.math), cutoff=2^11
> Matrix Dimension        Magma 2.13-5 (64-bit)           M4RI-20080518 (64-bit)
> 10,000 x 10,000         4.190                           4.290
> 16,384 x 16,384         15.360                          15.230
> 20,000 x 20,000         29.530                          28.640
> 32,000 x 32,000         103.970                 114.620
>
> That does seem to roughly match what you reported. I'll now look into SSE2
> support.
>
> Cheers,
> Martin
>
> (*) Note: Magma is 64-bit but not optimised for C2D.
>
> --
> name: Martin Albrecht
> _pgp:http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
> _www:http://www.informatik.uni-bremen.de/~malb
> _jab: [EMAIL PROTECTED]
--~--~---------~--~----~------------~-------~--~----~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://www.sagemath.org
-~----------~----~----~----~------~----~------~--~---

Reply via email to