Somehow I screwed up and put the wrong times for the Opteron for giac. Here are all the 1-d times again:
FLINT (2.4Ghz Opteron): 1d: 0.00018s, 0.0055s, 0.021s mod-1d: 0.00046s, 0.00419s, 0.00979 Magma (2.4Ghz Opteron): 1d: 0.00168s, 0.01692s and 0.0453s mod-1d: 0.00067s, 0.00981s, 0.0244s Giac (***2.2***Ghz Opteron): 1d: 0.00075, 0.044, 0.17 mod-1d: 0.00018, 0.009, 0.034 Now I believe the times!! It looks like your basecase Z/pZ[x] time is much better than FLINT. I think that will be due to your assembly optimisation and the special case for division which you described. It may also be due in part to the 32 vs 64 bit thing. Your other times would of course be much better if you had half gcd. I'll now try to time Magma for the higher dimensional problems (if I can figure out how). If you like I can also time giac on the 2.4GHz Opteron. If you provide me with a program or tarball or something I can just run it. Bill. On 23 Jan, 16:58, Bill Hart <goodwillh...@googlemail.com> wrote: > On 23 Jan, 07:44, parisse <bernard.pari...@ujf-grenoble.fr> wrote: > > > > > > Your timings are surprisingly low. Can you tell me which files perform > > > the Z/pZ[x] GCD? I'd be very interested to see your code. > > > The division code in Z/pZ is in the file modpoly.cc, lines 1621->1723. > > It depends on invmod(int,int) for modular inverse (in gen.cc) and the > > asm multiplication followed by reduction in monomial.h (full speed > > requires -D_I386_ compilation flag). The ordering of coefficients is > > from high to low degree. > > > > You also mention that you made improvements. Is there anything that > > > might be considered new, i.e. improvements I should consider > > > incorporating in my own code. I sincerely doubt that FLINT can match > > > those timings (though I haven't checked). > > > > The use of int* probably makes some difference. I use unsigned long *. > > > Perhaps, since on a 64 bit processor, the size of the data is divided > > by 2 and you must check for underflow when doing a substraction. > > Moreover the integer division might cost more even if the 32 high bits > > of the divisor are 0. Is your max modulo 2^31? > > No, in my case the maximum modulus is 2^63 for now on a 64 bit > machine. > > > The gcd code using the rem function is gcdsmallmodpoly with > > vector<int> as argument. It converts the two vector<int> to two int * > > allocated on the stack, and then all divisions are done in place > > trashing the initial contents (but avoiding further memory > > allocation). --~--~---------~--~----~------------~-------~--~----~ To post to this group, send email to sage-devel@googlegroups.com To unsubscribe from this group, send email to sage-devel-unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URLs: http://www.sagemath.org -~----------~----~----~----~------~----~------~--~---