-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi,
David Harvey made some interesting suggestions on sagetrac about #1014 (implementing a number-of-digits function). I have some comments about this and I decided it might be better to bring the discussion to sage-devel; I'll try to summarize what we end up with back on trac. Basically, the gmp function mpz_sizeinbase returns either the correct number of digits, or 1+ this number (unless the base is 2, when the answer is always correct. We want to use this to figure out the correct number of digits, as fast as possible. Let *self* denote the integer we're working with and let *guess* denote the number returned by mpz_sizeinbase. What I am currently doing is comparing *self* and *compare=base^(guess-1)*. If *self* is bigger or equal, then *guess* is right, otherwise we need to subtract one. This can become somewhat slow when *self* is humongous, due to the fact that we need to compute *compare*. David's suggestion was: - ------------- * instead of computing the whole power, just estimate the top couple of digits using MPFR (much much much faster than computing the whole power) * keep increasing precision until we can distinguish the input from the power. Maybe this would be easiest to implement using interval arithmetic. The point is, for uniform random input, the top couple of digits will almost always give you the right answer straightaway. It's very rare that you need to compute everything. I wish I had more time to think about this, it sounds like a fun problem. - ------------- We actually know what the first few digits (or, actually, all of them) of *compare* are: 10000000... The trick is that to do things as suggested, we need to figure out what the first few digits of *self* in the given base are -- in fact, I think it's enough to know whether its first digit is a one or not. But I don't know how to do this efficiently. Everything I can think of basically comes down to comparing/dividing/etc *self* and *compare*, and that's precisely what we were trying to avoid. If anyone has a smart quick way of computing the most significant digit of an arbitrary integer with respect to an arbitrary basis, I'd love to hear about it. Thinking about all this made me realize that for a majority of numbers (if base=10, for a set of density 6/7) it is possible to decide that the guess is right without involving *compare*, and very very quickly. But for the remaining case I still need to check *self* versus *compare*. I'm attaching the current version of the ndigits function so you can see what it looks like at the moment. Suggestions are very much welcome! Cheers, Alex - -- Alexandru Ghitza Assistant Professor Department of Mathematics Colby College Waterville, ME 04901 http://bayes.colby.edu/~ghitza/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.7 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHnU+qdZTaNFFPILgRAo/nAKCCI5VGc8KbzYkZv7zCdbNZFg3EpACeIt5/ Pw9669ytHNrUDy9rip4cSd0= =DYGS -----END PGP SIGNATURE----- --~--~---------~--~----~------------~-------~--~----~ To post to this group, send email to sage-devel@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/sage-devel URLs: http://www.sagemath.org -~----------~----~----~----~------~----~------~--~---
def ndigits(self, int base=10): """ Return the number of digits of self expressed in the given base. INPUT: base -- integer (default: 10) EXAMPLES: sage: n = 52 sage: n.ndigits() 2 sage: n = -10003 sage: n.ndigits() 5 sage: n = 15 sage: n.ndigits(2) 4 sage: n=1000**1000000+1 sage: n.ndigits() 3000001 sage: n=1000**1000000-1 sage: n.ndigits() 3000000 """ if base < 2: raise ValueError, "base=%s should be an integer greater than or equal to 2" %base if self == 0: return 1 # the following is either the number of digits, or 1+ the number # of digits guess = Integer(mpz_sizeinbase(self.value, base)) # if the base is 2, the guess is always correct if guess == one or base == 2: return guess # from the behavior of mpz_sizeinbase, we know that self is # somewhere between 2^floor(log2(base^(guess-1))) and # 2^floor(log2(base^guess)) threshold = (guess - one)*RR(base).log2() + 1 # floor(threshold) is the number of bits required to write down # base^(guess-1). If self needs more bits than that, then it is # clearly bigger and guess is correct. if self.bits() > (guess - one)*RR(base).log2() + 1: return guess else: # otherwise, we need to "manually" compare self and base^(guess-1) compare = Integer(base)**(guess - one) if self.abs() >= compare: return guess else: return guess - one