[issue3439] math.frexp and obtaining the bit size of a large integer
New submission from Fredrik Johansson <[EMAIL PROTECTED]>: Python 3.0b2 (r30b2:65106, Jul 18 2008, 18:44:17) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import math >>> math.frexp(10**100) (0.5714936956411375, 333) >>> math.frexp(10**1000) Traceback (most recent call last): File "", line 1, in OverflowError: Python int too large to convert to C double >>> (Same behavior in Python 2.5.2, and presumably 2.6 although I haven't checked the latter.) I think it should be easy to make frexp work for large integers by calling PyLong_AsScaledDouble and adding the exponents. It would be logical to fix this since math.log(n) already works for large integers. My reason for requesting this change is that math.frexp is the fastest way I know of to accurately count the number of bits in a Python integer (it is more robust than math.log(n,2) and makes it easy to verify that the computed size is exact) and this is something I need to do a lot. Actually, it would be even more useful to have a standard function to directly obtain the bit size of an integer. If I'm not mistaken, PyLong_NumBits does precisely this, and would just have to be wrapped. Aside from my own needs (which don't reflect those of the Python community), there is at least one place in the standard library where this would be useful: decimal.py contains an inefficient implementation (_nbits) that could removed. -- components: Library (Lib) messages: 70216 nosy: fredrikj severity: normal status: open title: math.frexp and obtaining the bit size of a large integer type: feature request versions: Python 2.6, Python 3.0 ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3439> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: Raymond, yes, I think that a separate numbits function would better, although exposing this functionality would not prevent also changing the behavior of frexp. As I said, math.log already knows about long integers, so handling long integers similarly in frexp would not be all that unnatural. (On the other hand, it is true that math.sqrt, math.pow, math.cos, etc could all theoretically be "fixed" to work with larger-than-double input, and that slippery slope is probably better avoided.) Good point about roundtripping, but the problem with that argument is that frexp already accepts integers that cannot be represented exactly, e.g.: >>> ldexp(*frexp(10**100)) == 10**100 False Anyway, if there is support for exposing _PyLong_Numbits, should it be a method or a function? (And if a function, placed where? Should it accept floating-point input?) I'm attaching a patch (for the trunk) that adds a numbits method to the int and long types. My C-fu is limited, and I've never hacked on Python before, so the code is probably broken or otherwise bad in several ways (but in that case you can tell me about it and I will learn something :-). I did not bother to optimize the implementation for int, and the tests may be redundant/incomplete/placed wrongly. A slight problem is that _PyLong_NumBits is restricted to a size_t, so it raises an OverflowError on 32-bit platforms for some easily physically representable numbers: >>> (1<<3*10**9).numbits() Traceback (most recent call last): File "", line 1, in OverflowError: long int too large to convert to int This would need to be fixed somehow. If numbits becomes a method, should it also be added to the Integral ABC? GMPY's mpz type, by the way, defines a method numdigits(self, base). This generalization would possibly be overkill, but it's worth considering. If it's too late to add this method/function for 2.6 and 3.0, please update the issue version field as appropriate. -- keywords: +patch Added file: http://bugs.python.org/file10972/numbits.diff ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3439> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3451] Asymptotically faster divmod and str(long)
New submission from Fredrik Johansson <[EMAIL PROTECTED]>: A few weeks ago, I blogged about taking advantage of Karatsuba multiplication and Newton's method to divide big integers quickly (some of you may have read it, as it was posted to Daily Python-URL among other places): http://fredrik-j.blogspot.com/2008/07/making-division-in-python-faster.html To summarize, this method blows the builtin division out of the water already at ~(2000 digits)/(1000 digits). The primary catch is that the result of the Newton division may be slightly wrong (typically 1 ulp). However, a single extra multiplication and a subtraction at the end allows one to compute a remainder, and since the remainder must satisfy 0 <= r < q, the error is easily corrected. From a quick test, the cost of the extra multiplication seems to move the break-even point with the builtin division up to around 5000/2500 digits. A pure Python implementation of divmod, with error correction based on the remainder, can be found in this file: http://www.dd.chalmers.se/~frejohl/code/libarith/libarith.py (See the function idivmod) Of particular note is that fast divmod gives a fast way to do radix conversion, by recursively splitting the number in half. The function numeral (see same .py file) does this, e.g: >>> from time import clock >>> a = 2**1257787-1 >>> t1=clock(); s1=str(a); t2=clock(); t2-t1 109.08065159119386 >>> t1=clock(); s2=numeral(a); t2=clock(); t2-t1 7.1394886780303182 >>> s1 == s2 True (This recursive algorithm, by the way, is actually faster than str() even with the slow builtin divmod.) Would there be any interest in porting these algorithms to C and using them in the standard Python long implementation? There are likely some problems that I have overlooked. A critical review will be most welcome. -- components: Interpreter Core messages: 70309 nosy: fredrikj severity: normal status: open title: Asymptotically faster divmod and str(long) ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3451> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3451] Asymptotically faster divmod and str(long)
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: > Yes, definitely! Though it depends a little bit how much complication is involved. Not that much, I think, the pure Python version being just a few lines of code (plus a few more lines of boilerplate). But I'm not too familiar with converting Python to C myself, so I may underestimate the difficulties. It might get a little more complicated if you want to minimize temporary allocations, for example. > Now I'm just waiting for you to propose an implementation of integer square root :-). Yes, there is also code for fast (O(M(n)) integer square roots in libarith.py. This function would be much less useful overall as a builtin, though. ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3451> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3451] Asymptotically faster divmod and str(long)
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: For your convenience, I have split the division and numeral code off to a standalone .py file which I'm attaching here. I also updated the remainder logic to use the default divmod instead of repeated subtraction, which ensures worst-case performance similar to the builtin divmod in the very unlikely case that div_newton would return junk. Added file: http://bugs.python.org/file10990/div.py ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3451> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3451] Asymptotically faster divmod and str(long)
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: And here some timings on my laptop. Both int->str and balanced division become faster somewhere between 1000 and 2000 digits. This is after editing the division benchmark to use random numbers instead of exact powers of ten (interestingly causing a bit of slowdown). String conversion might be a little slower for lengths that aren't close to powers of two. Added file: http://bugs.python.org/file10991/div_timings.txt ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3451> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3470] Wrong name for getrandbits in docstring and documentation
New submission from Fredrik Johansson <[EMAIL PROTECTED]>: The docstring for random.Random mentions a method getrandombits(). Surely this should be getrandbits()? This ghost method is also mentioned in the Library Reference page for the random module. -- assignee: georg.brandl components: Documentation, Library (Lib) messages: 70421 nosy: fredrikj, georg.brandl severity: normal status: open title: Wrong name for getrandbits in docstring and documentation versions: Python 2.4, Python 2.5, Python 2.6, Python 2.7, Python 3.0, Python 3.1 ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3470> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3451] Asymptotically faster divmod and str(long)
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: Indeed, that seems to be much faster. In the mean time, do you mind if I steal the code? :-) ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3451> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: Wow, newbie error. Thanks for spotting! In every case I can think of, I've wanted (0).numbits() to be 0. The explanation in the docstring can probably be improved. What other documentation is needed (where)? ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3439> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3944] faster long multiplication
Changes by Fredrik Johansson <[EMAIL PROTECTED]>: -- nosy: +fredrikj ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3944> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue2487] ldexp(x,n) misbehaves when abs(n) is large
New submission from Fredrik Johansson <[EMAIL PROTECTED]>: Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> from math import ldexp >>> from sys import maxint >>> ldexp(1.234, maxint//2) Traceback (most recent call last): File "", line 1, in OverflowError: math range error >>> ldexp(1.234, maxint) 0.0 >>> ldexp(1.234, -maxint) 0.0 >>> ldexp(1.234, -maxint-2) Traceback (most recent call last): File "", line 1, in OverflowError: long int too large to convert to int The first and third cases seem right. The second is clearly a bug. In the fourth case, it would be more correct to return 0.0 than to raise an exception IMHO. -- components: Library (Lib) messages: 64527 nosy: fredrikj severity: normal status: open title: ldexp(x,n) misbehaves when abs(n) is large type: behavior versions: Python 2.5 __ Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue2487> __ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue2487] ldexp(x,n) misbehaves when abs(n) is large
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: I'm not qualified to comment on the implementation. The test cases all look right. __ Tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue2487> __ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] math.frexp and obtaining the bit size of a large integer
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: Some elaboration (that perhaps could be adapted into the documentation or at least source comments). There are two primary uses for numbits, both of which justify (0).numbits() == 0. The first is that for positive k, n = k.numbits() gives the minimum width of a register that can hold k, where a register can hold the 2**n integers 0, 1, ..., 2**n-1 (inclusive). This definition continues to make sense for k = 0, n = 0 (the empty register holds the 2**0 = 1 values 0). In Python terms, one could say that self.numbits() "returns the smallest n such that abs(self) is in range(2**n)". Perhaps this would make a clearer docstring? Second, k.numbits() (plus/minus 1, or perhaps multiplied by a constant factor) measures the number of steps required to solve a problem of size k using various divide-and-conquer algorithms. The problem of size k = 0 is trivial and therefore requires (0).numbits() == 0 steps. In particular, if L is a sorted list, then len(L).numbits() exactly gives the maximum number of comparisons required to find an insertion point in L using binary search. Finally, the convention (-k).numbits() == k.numbits() is useful in contexts where the number k itself is the input to a mathematical function. For example, in a function for multiplying two integers, one might want to choose a different algorithm depending on the sizes of the inputs, and this choice is likely to be independent of signs (if not, one probably needs to check signs anyway.) ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3439> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] create a numbits() method for int and long types
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: > Another tack is to notice that numbits is the length of the bit sequence > representation of an int (excepting 0) and give ints a .__len__ method > ;-). I would not expect that suggestion to fly very far, though. FWIW, I'm one of the people who'd additionally find indexing and slicing of the bits of integers very useful. It's not going to happen, though! ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3439> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] create a numbits() method for int and long types
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: > One other note: in Fredrik's patch there's commented out code for a > numbits *property* (rather than a method). Is there any good reason to > make this a property? Aesthetically, I think numbits as a function would make more sense. (Maybe if the hypothetical imath module comes along...) > Since numbits() cost is O(n) with n: number of digits. I prefer a > method than a property because, IMHO, reading a property should be > O(1) (*read* an attribute is different than *compute* a value). Unless I missed something, numbits() is O(1). Only the topmost word in a number needs to be examined. > reading a property should be O(1) (*read* an attribute is different > than *compute* a value). O(1) is necessary but not sufficient. My sense is that an attribute should access an existing "part" of an object while an operation that involves creating a "new" object should be a method. Compare complex.real/.imag and complex.conjugate(). ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3439> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4128] Performance regression in long division in 2.6
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: > The speed difference comes from different compiler options. I figured as much. I'm using the binaries from python.org (see the .txt file; it includes version headers). The question is why the compilation changes for 2.6 slowed down division but not e.g. multiplication, and if there is a workaround. ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue4128> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4128] Performance regression in long division in 2.6
New submission from Fredrik Johansson <[EMAIL PROTECTED]>: On my laptop (Windows XP, 32-bit), long division is about 15% slower in 2.6 compared to 2.5. See the attached .txt for timings. I noticed this when comparing the unit tests for mpmath (http://code.google.com/p/mpmath/) under 2.5 and 2.6. It appears that most tests run 10-20% faster under 2.6 (good work all Python developers!), but the tests performing very high precision arithmetic run noticeably slower. >From some quick benchmarking, addition and multiplication are not the culprits (they both actually seem to be quite a bit faster in 2.6). There could be other factors involved, but from what I've found out so far, it is only division that has become slower in 2.6. Division in Python 2.4 is also the same speed as 2.5. -- components: Interpreter Core files: division_speed.txt messages: 74798 nosy: fredrikj severity: normal status: open title: Performance regression in long division in 2.6 versions: Python 2.6 Added file: http://bugs.python.org/file11804/division_speed.txt ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue4128> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4128] Performance regression in long division in 2.6
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: > I propose to close this as "won't fix"; I'm not interested in 150ms > speed differences when dividing 10 digit numbers. Sure. (I care about differences like this, because they have a tendency to add up. But it's a minor issue, and a faster algorithm in Python 2.7 will indeed solve the problem.) ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue4128> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] create a numbits() method for int and long types
Fredrik Johansson <[EMAIL PROTECTED]> added the comment: In stdtypes.rst, x.numbits should be listed in the table under "Bit-string Operations on Integer Types" and not in the table of operations supported by all numeric types. > (1) the number of bits should be computed first directly using C > arithmetic, and only recomputed using PyLong arithmetic if the C > computations overflow. +1 > (4) I quite like the idea of having numbits be a property rather than a > method---might still be worth considering? I'm not against. ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue3439> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] create a numbits() method for int and long types
Fredrik Johansson added the comment: FYI, Brent & Zimmermann call this function nbits(n) in "Modern Computer Arithmetic": http://www.loria.fr/~zimmerma/mca/pub226.html I don't really care much about the name though. In the documentation, should "same value as ``x.bit_length``" not be "same value as ``x.bit_length()``"? ___ Python tracker <http://bugs.python.org/issue3439> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3439] create a numbits() method for int and long types
Fredrik Johansson added the comment: When did the name change back to numbits? Anyway, I think this reference implementation is better: def numbits(x): x = abs(x) n = 0 while x: n += 1 x //= 2 return n (//= 2 could be changed to >>= 1) ___ Python tracker <http://bugs.python.org/issue3439> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5139] Add combinatoric counting functions to the math module.
Fredrik Johansson added the comment: I understand the connection with itertools, but why not just call a binomial coefficient a binomial coefficient? Probably 90% of all math libraries call this function 'binomial' or 'bincoef' and I suspect that's the name most people would search for. -- nosy: +fredrikj ___ Python tracker <http://bugs.python.org/issue5139> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue36228] Support coercion of complex to float/int
Fredrik Johansson added the comment: I can think of two reasons to extend floor() and ceil() to complex numbers, and they lead to different extensions. The first is as a way to map complex numbers to nearby Gaussian integers by defining floor(z) = floor(z.real) + floor(z.imag)*1j, etc. Definition in mpmath borrowed from Mathematica. Conceivably handy for data quantization, or discrete plane geometry... but I honestly never used it myself and can't remember ever seeing it used. The second is to extend piecewise analytic functions on R to piecewise holomorphic functions on C so that the real analytic segments extend to complex analytic neighborhoods, most easily achieved by defining floor(z) = floor(z.real). This one I've actually had use for (think complex step differentiation, contour integration), but it's a bit esoteric. My opinion? If a Python user calls floor() and ceil() with a complex input, it's probably because of a bug in their code, and TypeError is appropriate. It's a one-line lambda to define your own complex extension if you really need it. On the other hand: if it exists, someone will eventually find a way to use it for code golf ;-) -- nosy: +fredrikj ___ Python tracker <https://bugs.python.org/issue36228> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue2706] datetime: define division timedelta/timedelta
Fredrik Johansson added the comment: I think datetime division would be a fine application for Fractions. -- message_count: 18.0 -> 19.0 nosy: +fredrikj nosy_count: 7.0 -> 8.0 ___ Python tracker <http://bugs.python.org/issue2706> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com