On the other hand, I am unable to replicate the very sparse benchmark
unless I assume the result will fit in 2 limbs, allocate all the
output mpz's in advance, etc. Then I can basically replicate it. If I
use my generic, no-assumptions code it takes about 3s. I don't think I
can improve on that much, if at all.

Bill.

On 14 May, 17:09, Bill Hart <goodwillh...@googlemail.com> wrote:
> With a bit of fiddling I can get the Fateman benchmark down to 53.5s
> on sage.math (2.66 GHz Core2/Penryn), making no assumptions about the
> size of the output coefficients. I've checked that at least the output
> poly has the right length and coefficients of the right size.
>
> Adjusting for the clock speed, that seems to match what is reported
> for sdmp in Roman and Peter's paper.
>
> Time to clean up my code and commit. I didn't implement the heap trick
> I mentioned.
>
> Bill.
>
> On 14 May, 14:28, Bill Hart <goodwillh...@googlemail.com> wrote:
>
> > Thanks all for the very interesting comments and links to publications
> > and CAS's.
>
> > I've implemented the algorithm using flint2's fmpz (multiprecision)
> > integer type for coefficients and, at this stage, 62-bit integers
> > for exponents only. (However, it should be trivial to lift this
> > restriction.)
>
> > If I make a couple of simplifications, namely assuming that the
> > output fits into two limbs and that none of the polynomials has
> > length > 2^32 - 1, etc., I get pretty good times, certainly better
> > than those reported in Francesco's paper. I also don't know whether
> > any of the benchmarks assume the coefficients are unsigned; that
> > would further reduce the times.
>
> > If I am a bit more general with the assumptions, allowing
> > coefficients of any multiprecision size and polys of any length up
> > to 2^64 - 1, then it takes a little longer than reported in the
> > various papers. My comparison is on sage.math, which is a Core2/
> > Penryn running at 2.66 GHz (I adjust for the faster clock speed).
> > Performance is also good on a K102.
>
> > I might yet implement a heap speedup I thought of last year, which
> > didn't actually speed anything up at the time. This year, now that I
> > have taken Roman's suggestion of not trying to store too much
> > information in the heap, I seem to be getting good times. (Last year
> > I couldn't get under 133% of the sdmp time.) The speedup might work
> > this time around!
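The heap method under discussion here (merge the len(f) sorted streams f[i]*g, and, per Roman's suggestion, keep only stream coordinates in the heap rather than precomputed products) can be sketched roughly as follows. This is an illustrative Python sketch of the general technique, not the flint2 or sdmp code:

```python
import heapq

def heap_mul(f, g):
    """Multiply sparse polynomials given as lists of (exponent,
    coefficient) pairs sorted by increasing exponent, by merging
    the len(f) sorted product streams f[i]*g with a heap."""
    if not f or not g:
        return []
    # One entry per term of f: (exponent, i, j) stands for f[i]*g[j].
    # Only coordinates live in the heap; products are computed on pop.
    heap = [(f[i][0] + g[0][0], i, 0) for i in range(len(f))]
    heapq.heapify(heap)
    result = []
    while heap:
        exp, i, j = heapq.heappop(heap)
        coeff = f[i][1] * g[j][1]
        # Accumulate into the last output term if exponents match.
        if result and result[-1][0] == exp:
            result[-1] = (exp, result[-1][1] + coeff)
        else:
            result.append((exp, coeff))
        # Advance stream i to its next term of g.
        if j + 1 < len(g):
            heapq.heappush(heap, (f[i][0] + g[j + 1][0], i, j + 1))
    # Drop terms that cancelled to zero.
    return [(e, c) for e, c in result if c]
```

Keeping only (exponent, i, j) in each heap entry keeps the entries small and the heap operations cheap; the coefficient product is computed only when the entry is popped.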
>
> > I want to publicly thank Roman Pearce for sending me a copy of his
> > reference implementation. It was a great help. I didn't use any of
> > the goto tricks you did to help gcc along. I believe you probably
> > have better performance in sdmp on account of your very careful
> > implementation. But my heap implementation already looks too much
> > like yours! I seem to recall you saying you accumulate in 3 limbs
> > rather than 2 in sdmp, and I'm not convinced that, when I do that,
> > I'm not losing quite a few cycles.
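The 2-vs-3 limb question above is about how wide the running coefficient accumulator is. A rough Python sketch of what three-limb (192-bit) accumulation of two-limb products looks like in mpn-style limb arithmetic, with the carries propagated by hand; this is a hypothetical illustration, not sdmp's or flint2's actual code:

```python
LIMB = 1 << 64  # one 64-bit machine word

def acc3_add(acc, prod):
    """Add a product of up to two 64-bit limbs into a three-limb
    accumulator [lo, mid, hi], propagating carries manually."""
    lo, mid, hi = acc
    lo += prod & (LIMB - 1)
    carry = lo >> 64
    lo &= LIMB - 1
    mid += (prod >> 64) + carry     # extra add+carry vs. a 2-limb accumulator
    carry = mid >> 64
    mid &= LIMB - 1
    hi = (hi + carry) & (LIMB - 1)  # third limb absorbs the overflow
    return [lo, mid, hi]
```

The extra limb buys headroom (no overflow until 2^192), at the cost of one more add-with-carry per product accumulated, which is where the "quite a few cycles" concern comes from.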
>
> > I think I might not implement Richard Fateman's trick, but instead
> > just duplicate the heap code to handle the case where the exponents
> > don't fit in a single limb. My implementation should not slow down
> > in the other case if I do this, and in some sense the cases where
> > the exponents don't fit in a single limb are not the interesting
> > ones for my work. I'd be better off putting time into a multivariate
> > Kronecker segmentation implementation.
>
> > Bill.
>
> > On May 14, 7:35 am, parisse <bernard.pari...@ujf-grenoble.fr> wrote:
>
> > > On 13 mai, 19:03, Roman Pearce <rpear...@gmail.com> wrote:
>
> > > > On May 13, 2:45 am, parisse <bernard.pari...@ujf-grenoble.fr> wrote:
>
> > > > > In my own experience, coding with a univariate polynomial is not
> > > > > efficient, especially if the polynomial is sparse.
>
> > > > There must be some kind of inefficiency. If you use word
> > > > operations for all monomial operations then it should be fast.
> > > > See http://portal.acm.org/citation.cfm?id=281657
>
> > > Sorry, I confused things. I meant to say that multiplying using big
> > > integers was inefficient. I'm of course using univariate
> > > multiplication by packing monomials into an integer (if they fit).
> > > Looking at Biscani's paper, it seems he gets good timings by
> > > preallocating an array for the resulting monomials in the "semi-
> > > dense" case (by semi-dense I mean dense with respect to total
> > > degree).
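Packing monomials into an integer, as described above, can be illustrated like this. The 16-bit field width is a hypothetical choice for the sketch; real implementations pick field widths from degree bounds so everything fits in one machine word:

```python
def pack(exps, bits=16):
    """Pack a tuple of multivariate exponents into one integer,
    one fixed-width field per variable."""
    word = 0
    for e in exps:
        assert 0 <= e < (1 << bits)  # field must not overflow
        word = (word << bits) | e
    return word

def unpack(word, nvars, bits=16):
    """Recover the exponent tuple from a packed word."""
    mask = (1 << bits) - 1
    exps = []
    for _ in range(nvars):
        exps.append(word & mask)
        word >>= bits
    return tuple(reversed(exps))
```

As long as no field overflows, multiplying monomials becomes a single addition of packed words, and comparing monomials becomes a single word comparison; this is what makes the univariate encoding fast.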
>
> > > --
> > > To post to this group, send an email to sage-devel@googlegroups.com
> > > To unsubscribe from this group, send an email to
> > > sage-devel+unsubscr...@googlegroups.com
> > > For more options, visit this group
> > > at http://groups.google.com/group/sage-devel
> > > URL: http://www.sagemath.org

