[sage-devel] Re: SSE2 not so useless after all

2008-05-22 Thread Clement Pernet
Hi, > > Bill, I suppose that also means that now we actually beat (or are close to > beating) Magma on the C2D "for real". My M4RI times are quite similar on the > C2D as your times on your Opteron. But my version of Magma (on the C2D) is > much worse than your version of Magma (on the Optero

[sage-devel] Re: SSE2 not so useless after all

2008-05-22 Thread Bill Hart
The asymptotics really only kick in when we are using Strassen. The starting point is thus not 0, but the crossover point between M4RM and Strassen. So I don't think you can read too much into the asymptotic statement. But once the asymptotics do kick in, we aren't too far off. The only thing thro

[sage-devel] Re: SSE2 not so useless after all

2008-05-22 Thread Martin Albrecht
On Thursday 22 May 2008, Bill Hart wrote:
> Hi Martin,
>
> This version works great. Here are the times on my unburdened 2.8 GHz
> Opteron. First the Magma times, then the times for an older version of
> m4ri and now, for the first time ever, the new Magma-beating times:
>
> 1x1: 2.940s 3.1

[sage-devel] Re: SSE2 not so useless after all

2008-05-21 Thread Bill Hart
Hi Martin,

This version works great. Here are the times on my unburdened 2.8 GHz Opteron. First the Magma times, then the times for an older version of m4ri and now, for the first time ever, the new Magma-beating times:

1x1: 2.940s 3.13s 2.25s
16384x16384: 9.250s 12.96s 8.80s
2x2000

[sage-devel] Re: SSE2 not so useless after all

2008-05-21 Thread Martin Albrecht
On Wednesday 21 May 2008, Bill Hart wrote:
> Hi Martin,
>
> I downloaded the clean tarball and added an extra test, but I get:
>
> mul: m: 4096, l: 3528, n: 4096, k: 0, cutoff: 1024
> FAIL: Strassen != M4RM
> FAIL: Strassen != Naive
>
> :-(
>
> Also I later replaced the following lines of stra

[sage-devel] Re: SSE2 not so useless after all

2008-05-21 Thread Martin Albrecht
On Wednesday 21 May 2008, Bill Hart wrote:
> Hi Martin,
>
> I downloaded the clean tarball and added an extra test, but I get:
>
> mul: m: 4096, l: 3528, n: 4096, k: 0, cutoff: 1024
> FAIL: Strassen != M4RM
> FAIL: Strassen != Naive
>
> :-(

Same here, I'll look into it right away. "Only" Stra

[sage-devel] Re: SSE2 not so useless after all

2008-05-20 Thread Bill Hart
Hi Martin,

I downloaded the clean tarball and added an extra test, but I get:

mul: m: 4096, l: 3528, n: 4096, k: 0, cutoff: 1024
FAIL: Strassen != M4RM
FAIL: Strassen != Naive

:-(

Also I later replaced the following lines of strassen.c:

a -= a%RADIX;
b -= b%RADIX;
c -= c%RADIX;

wit
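
The three `x -= x%RADIX;` lines quoted from strassen.c round each dimension down to a multiple of RADIX. A minimal sketch of that arithmetic follows. It assumes RADIX is 64 (the number of bits packed per word is an assumption here), and the helper names are hypothetical, not m4ri functions:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: RADIX is the number of matrix entries packed into one word. */
#define RADIX 64

/* Round n down to the nearest multiple of RADIX; same effect as `n -= n % RADIX;`. */
static size_t round_down_to_radix(size_t n) {
    size_t r = n - (n % RADIX);
    assert(r % RADIX == 0 && r <= n);
    return r;
}

/* Number of trailing rows/columns that the rounded-down dimension leaves uncovered. */
static size_t radix_remainder(size_t n) {
    return n % RADIX;
}
```

Under that assumption, the dimension l = 3528 from the failing test rounds down to 3520 and leaves 8 entries over, while 4096 is already a multiple of 64 and is unchanged.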

[sage-devel] Re: SSE2 not so useless after all

2008-05-19 Thread Bill Hart
Hi Clement, I heard you had a big smile on your face today. Well done. Regarding your suggestion about copying into blocks, that is a very good idea. The problem at present is that we break up into blocks vertically, not horizontally. But we absolutely should be doing it horizontally. The reason

[sage-devel] Re: SSE2 not so useless after all

2008-05-19 Thread William Stein
On Mon, May 19, 2008 at 6:14 PM, Clement Pernet <[EMAIL PROTECTED]> wrote:
>
> Hi guys,
>
> I am finally up to date with this discussion (I was being interviewed,
> and then flying when it started).
> First, congrats on the great job you have done. I have started to
> dive into m4ri, and I re

[sage-devel] Re: SSE2 not so useless after all

2008-05-19 Thread Clement Pernet
Hi guys,

I am finally up to date with this discussion (I was being interviewed, and then flying when it started). First, congrats on the great job you have done. I have started to dive into m4ri, and I really like the quality of the code. I have a few remarks:

* the loop unrolling techniq

[sage-devel] Re: SSE2 not so useless after all

2008-05-19 Thread Bill Hart
Yep, that's exactly the same thing as what M4RM does. Thanks for the explanation.

Bill.

On 20 May, 00:22, Robert Miller <[EMAIL PROTECTED]> wrote:
> > I can't tell exactly what GAP does. It is beautifully documented, but
> > it talks about "grease units", which is terminology I don't
> > understa

[sage-devel] Re: SSE2 not so useless after all

2008-05-19 Thread Robert Miller
> I can't tell exactly what GAP does. It is beautifully documented, but
> it talks about "grease units", which is terminology I don't
> understand. It does look like M4RM though.

Grease is a concept for speeding up certain things using caching. For example, suppose I have the permutation group S_
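
Bill's reply in this thread says this kind of caching is the same thing M4RM does. A minimal sketch of the idea for matrices over GF(2): cache every XOR combination of K rows of B in a table, then reuse that table for every row of A. This is an illustration only, not the GAP or m4ri code. The names, the grease width K = 4, and the packing of one matrix row (at most 64 columns) into a `uint64_t` are all assumptions:

```c
#include <assert.h>
#include <stdint.h>

#define K 4u                   /* grease width: rows of B combined per table (assumed) */
#define TABLE_SIZE (1u << K)

/* table[s] = XOR of rows[i] for every bit i set in s (all 2^K combinations). */
static void build_grease_table(const uint64_t *rows, uint64_t table[TABLE_SIZE]) {
    table[0] = 0;
    for (unsigned s = 1; s < TABLE_SIZE; s++) {
        unsigned low = s & (~s + 1u);      /* lowest set bit of s */
        unsigned i = 0;
        while ((1u << i) != low) i++;      /* index of that bit */
        table[s] = table[s ^ low] ^ rows[i];
    }
}

/* C = A * B over GF(2). Bit j of A[i] is entry (i, j); B[j] is row j of B.
   A is m x n, B has n rows; n must be a multiple of K and at most 64. */
static void greased_mat_mul(uint64_t *C, const uint64_t *A, const uint64_t *B,
                            unsigned m, unsigned n) {
    uint64_t table[TABLE_SIZE];
    assert(n % K == 0 && n <= 64);
    for (unsigned i = 0; i < m; i++) C[i] = 0;
    for (unsigned j = 0; j < n; j += K) {
        build_grease_table(B + j, table);  /* computed once per block of K rows */
        for (unsigned i = 0; i < m; i++)   /* one lookup replaces up to K row XORs */
            C[i] ^= table[(A[i] >> j) & (TABLE_SIZE - 1u)];
    }
}

/* Reference version: add row j of B whenever entry (i, j) of A is 1. */
static void naive_mat_mul(uint64_t *C, const uint64_t *A, const uint64_t *B,
                          unsigned m, unsigned n) {
    for (unsigned i = 0; i < m; i++) {
        C[i] = 0;
        for (unsigned j = 0; j < n; j++)
            if ((A[i] >> j) & 1u) C[i] ^= B[j];
    }
}
```

The table costs 2^K - 1 XORs per block of K rows, so the caching only pays off when A has many rows to share it.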

[sage-devel] Re: SSE2 not so useless after all

2008-05-19 Thread Bill Hart
I can't tell exactly what GAP does. It is beautifully documented, but it talks about "grease units", which is terminology I don't understand. It does look like M4RM though. One trick they use is to handle the case where the bits they get from the A matrix equal 1. But I think they only do this t

[sage-devel] Re: SSE2 not so useless after all

2008-05-19 Thread Martin Albrecht
On Monday 19 May 2008, Bill Hart wrote:
> Martin,
>
> That's all excellent news!! So on the C2D we are caning Magma. But we
> should try and figure out if your Magma version is optimised for C2D
> or for amd64, since that will make a big difference. Is your machine
> some kind of 64-bit Intel OSX

[sage-devel] Re: SSE2 not so useless after all

2008-05-19 Thread Bill Hart
Ha, GAP isn't fast at everything. I just found timings for their multiple polynomial quadratic sieve. It takes 2 hours to factor a 60-digit number. My sieve takes about 9 seconds. But what's a factor of 800 between friends?

Bill.

On 19 May, 22:23, Bill Hart <[EMAIL PROTECTED]> wrote:
> Martin,
>
> That's

[sage-devel] Re: SSE2 not so useless after all

2008-05-19 Thread Bill Hart
Martin,

That's all excellent news!! So on the C2D we are caning Magma. But we should try and figure out if your Magma version is optimised for C2D or for amd64, since that will make a big difference. Is your machine some kind of 64-bit Intel OSX machine? I don't see a specific Core 2 version of M