-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

David Harvey made some interesting suggestions on sagetrac about #1014
(implementing a number-of-digits function).  I have some comments about
this and I decided it might be better to bring the discussion to
sage-devel; I'll try to summarize what we end up with back on trac.

Basically, the gmp function mpz_sizeinbase returns either the correct
number of digits, or 1+ this number (unless the base is 2, when the answer
is always correct.  We want to use this to figure out the correct number
of digits, as fast as possible.  

Let *self* denote the integer we're working with and let *guess* denote
the number returned by mpz_sizeinbase.  What I am currently doing is
comparing *self* and *compare=base^(guess-1)*.  If *self* is bigger or
equal, then *guess* is right, otherwise we need to subtract one.

This can become somewhat slow when *self* is humongous, due to the fact
that we need to compute *compare*.

David's suggestion was:
- -------------
  * instead of computing the whole power, just estimate the top couple of
 digits using MPFR (much much much faster than computing the whole power)
  * keep increasing precision until we can distinguish the input from the
 power.

 Maybe this would be easiest to implement using interval arithmetic.

 The point is, for uniform random input, the top couple of digits will
 almost always give you the right answer straightaway. It's very rare that
 you need to compute everything. I wish I had more time to think about
 this, it sounds like a fun problem.
- -------------

We actually know what the first few digits (or, actually, all of them)
of *compare* are: 10000000...
The trick is that to do things as suggested, we need to figure out what
the first few digits of *self* in the given base are -- in fact, I
think it's enough to know whether its first digit is a one or not.  But
I don't know how to do this efficiently.  Everything I can think of
basically comes down to comparing/dividing/etc *self* and *compare*,
and that's precisely what we were trying to avoid.  If anyone has a
smart quick way of computing the most significant digit of an arbitrary
integer with respect to an arbitrary basis, I'd love to hear about it.

Thinking about all this made me realize that for a majority of numbers
(if base=10, for a set of density 6/7) it is possible to decide that
the guess is right without involving *compare*, and very very quickly.
But for the remaining case I still need to check *self* versus *compare*.

I'm attaching the current version of the ndigits function so you can
see what it looks like at the moment.  Suggestions are very much welcome!

Cheers,
Alex

- --
Alexandru Ghitza
Assistant Professor
Department of Mathematics
Colby College
Waterville, ME 04901    
http://bayes.colby.edu/~ghitza/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.7 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHnU+qdZTaNFFPILgRAo/nAKCCI5VGc8KbzYkZv7zCdbNZFg3EpACeIt5/
Pw9669ytHNrUDy9rip4cSd0=
=DYGS
-----END PGP SIGNATURE-----


--~--~---------~--~----~------------~-------~--~----~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://www.sagemath.org
-~----------~----~----~----~------~----~------~--~---

    def ndigits(self, int base=10):
        """
        Return the number of digits of self expressed in the given base.

        INPUT:
            base -- integer (default: 10)

        EXAMPLES:
            sage: n = 52
            sage: n.ndigits()
            2
            sage: n = -10003
            sage: n.ndigits()
            5
            sage: n = 15
            sage: n.ndigits(2)
            4
            sage: n=1000**1000000+1
            sage: n.ndigits()
            3000001
            sage: n=1000**1000000-1
            sage: n.ndigits()
            3000000
        """
        if base < 2:
            raise ValueError, "base=%s should be an integer greater than or 
equal to 2" %base
        if self == 0:
            return 1
        # the following is either the number of digits, or 1+ the number
        # of digits
        guess = Integer(mpz_sizeinbase(self.value, base))
        # if the base is 2, the guess is always correct
        if guess == one or base == 2:
            return guess

        # from the behavior of mpz_sizeinbase, we know that self is
        # somewhere between 2^floor(log2(base^(guess-1))) and
        # 2^floor(log2(base^guess))
        threshold = (guess - one)*RR(base).log2() + 1
        # floor(threshold) is the number of bits required to write down
        # base^(guess-1).  If self needs more bits than that, then it is
        # clearly bigger and guess is correct.
        if self.bits() > (guess - one)*RR(base).log2() + 1:
            return guess
        else:
            # otherwise, we need to "manually" compare self and base^(guess-1)
            compare = Integer(base)**(guess - one)
            if self.abs() >= compare:
                return guess
            else:
                return guess - one

Reply via email to