Martin, On the M4RI website you say that M4R inversion is asymptotically fast with time complexity n^3/log_2(n), but Strassen's method has complexity n^log2(7), which I would call asymptotically fast.
Do you just mean asymptotically faster than the classical algorithm? By the way, I wrote some code ages ago for computing the row echelon form of sparse matrices over GF2 which essentially inverts 6000x6000 sparse matrices in about a second. Bill. On 19 May, 19:58, Martin Albrecht <[EMAIL PROTECTED]> wrote: > On Monday 19 May 2008, Bill Hart wrote: > > > You seemed to be getting up to 8% at points there. That's definitely > > worth it. I'll be interested to see this evening how it comes out, > > though I recommend optimising my combine3 function (which I suppose > > should now be combine8), even including it inline rather than have it > > in a separate file. > > Bill, > > some progress report for the C2D: > > I incorporated your changes with the following modifications: > - if (x1==0 & x2==0 & x3==0 .... x8==8) ... I removed this one since it seems > rather unlikely that this is true > - I added #define called GRAY8 which defines whether 8 or 4 tables ought to be > used. I figured this might be handy for machines with a smaller L1. > - I added the correct a_nc%k values back in > - We don't need to make sure that the allocated buffers are correctly aligned, > since we try allocate with _mm_malloc. If that is not available we should > probably just use posix_memalign. > > The speed-up on the C2D (similar to the Opteron) is considerable (the last > column is a parallel toy implementation): > > 64-bit Debian/GNU Linux, 2.33Ghz Core2Duo, cutoff=2^12 (*) > Matrix Dimension Magma M4RI M4RI (OpenMP), walltime > 10,000 x 10,000 2.920 2.270 1.470 > 16,384 x 16,384 11.140 9.130 5.540 > 20,000 x 20,000 20.370 16.110 11.800 > 32,000 x 32,000 75.510 64.340 40.040 > > Amazingly M4RM alone (w.o. Strassen-Winograd) now beats Magma up to 2*10^4 x > 2*10^4 in this configuration: > > sage: A = random_matrix(GF(2),10^3, 10^3); > sage: B = random_matrix(GF(2),10^3, 10^3); > sage: magma(A._multiply_m4rm(B)) == magma(A)*magma(B) > True > sage: A = random_matrix(GF(2),2*10^4, 2*10^4); > sage: B = random_matrix(GF(2),2*10^4, 2*10^4); > sage: time A._multiply_m4rm(B) > CPU times: user 18.32 s, sys: 0.10 s, total: 18.42 s > Wall time: 18.65 > > 64-bit Debian/GNU Linux, 1.8Ghz Opteron (sage.math), cutoff=2^11 > Matrix Dimension Magma 2.13-5 (64-bit) M4RI-20080518 (64-bit) > 10,000 x 10,000 4.190 4.290 > 16,384 x 16,384 15.360 15.230 > 20,000 x 20,000 29.530 28.640 > 32,000 x 32,000 103.970 114.620 > > That does seem to roughly match what you reported. I'll now look into SSE2 > support. > > Cheers, > Martin > > (*) Note: Magma is 64-bit but not optimised for C2D. > > -- > name: Martin Albrecht > _pgp:http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99 > _www:http://www.informatik.uni-bremen.de/~malb > _jab: [EMAIL PROTECTED] --~--~---------~--~----~------------~-------~--~----~ To post to this group, send email to sage-devel@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sage-devel URLs: http://www.sagemath.org -~----------~----~----~----~------~----~------~--~---