Richard Damon <rich...@damon-family.org> writes:

> On 9/4/21 9:40 AM, Hope Rouselle wrote:
>> Chris Angelico <ros...@gmail.com> writes:
>>
>>> On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle <hrouse...@jevedi.com> wrote:
>>>>
>>>> Hope Rouselle <hrouse...@jevedi.com> writes:
>>>>
>>>>> Just sharing a case of floating-point numbers.  Nothing needs to be
>>>>> solved or figured out.  Just bringing up conversation.
>>>>>
>>>>> (*) An introduction to me
>>>>>
>>>>> I don't understand floating-point numbers from the inside out, but I
>>>>> do know how to work with base 2 and scientific notation.  So the idea
>>>>> of expressing a number as
>>>>>
>>>>>   mantissa * base^{power}
>>>>>
>>>>> is not foreign to me.  (If that helps you to perhaps instruct me on
>>>>> what's going on here.)
>>>>>
>>>>> (*) A presentation of the behavior
>>>>>
>>>>> >>> import sys
>>>>> >>> sys.version
>>>>> '3.8.10 (tags/v3.8.10:3d8993a, May  3 2021, 11:48:03) [MSC v.1928 64
>>>>> bit (AMD64)]'
>>>>>
>>>>> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>>> >>> sum(ls)
>>>>> 39.599999999999994
>>>>>
>>>>> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>>> >>> sum(ls)
>>>>> 39.60000000000001
>>>>>
>>>>> All I did was to take the first number, 7.23, and move it to the last
>>>>> position in the list.  (So we have a violation of the commutativity
>>>>> of addition.)
>>>>
>>>> Suppose these numbers are prices in dollars, never going beyond cents.
>>>> Would it be safe to multiply each one of them by 100 and therefore
>>>> work with cents only?  For instance
>>>
>>> Yes and no.  It absolutely *is* safe to always work with cents, but to
>>> do that, you have to be consistent: ALWAYS work with cents, never with
>>> floating-point dollars.
>>>
>>> (Or whatever other unit you choose to use.  Most currencies have a
>>> smallest normally-used unit, with other currency units (where present)
>>> being whole-number multiples of that minimal unit.  Only in forex do
>>> you need to concern yourself with fractional cents or fractional yen.)
>>>
>>> But multiplying a set of floats by 100 won't necessarily solve your
>>> problem; you may have already fallen victim to the flaw of assuming
>>> that the numbers are represented accurately.
>>
>> Hang on a second.  I see it's always safe to work with cents, but I'm
>> only confident to say that when one gives me cents to start with.  In
>> other words, if one gives me integers from the start.  (Because then,
>> of course, I don't even have floats to worry about.)  If I'm given
>> 1.17, say, I am not confident that I could turn this number into 117
>> by multiplying it by 100.  And that was the question.  Can I always
>> multiply such IEEE 754 dollar amounts by 100?
>>
>> Considering your last paragraph above, I should say: if one gives me
>> an accurate floating-point representation, can I assume a
>> multiplication of it by 100 remains accurately representable in
>> IEEE 754?
>
> Multiplication by 100 might not be accurate if the number you are
> starting with is close to the limit of precision, because 100 is
> 1.1001 (binary) x 64, so multiplying by 100 adds about 5 more 'bits' to
> the representation of the number.  In your case, the numbers are well
> below that point.
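Before replying: the point about a literal not being stored accurately can
be tested directly in the REPL with exact rational arithmetic from the
standard library.  This is just a sketch, not from the thread, and the
helper name `stores_exactly` is mine:

```python
from fractions import Fraction

def stores_exactly(x: float, num: int, den: int) -> bool:
    """True if the float x holds exactly the value num/den."""
    # as_integer_ratio() gives the exact binary value the float stores,
    # so comparing Fractions involves no rounding at all.
    return Fraction(*x.as_integer_ratio()) == Fraction(num, den)

print(stores_exactly(0.25, 1, 4))      # True: 1/4 has a power-of-two denominator
print(stores_exactly(1.17, 117, 100))  # False: 117/100 does not
print(stores_exactly(7.23, 723, 100))  # False: neither does 723/100
```

So multiplying 1.17 by 100 starts from a number that is already not
exactly 1.17; any rounding in the product lands on top of that initial
error.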
Alright.  That's clear now.  Thanks so much!

>>>> --8<---------------cut here---------------start------------->8---
>>>> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>> >>> sum(map(lambda x: int(x*100), ls)) / 100
>>>> 39.6
>>>>
>>>> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>> >>> sum(map(lambda x: int(x*100), ls)) / 100
>>>> 39.6
>>>> --8<---------------cut here---------------end--------------->8---
>>>>
>>>> Or is multiplication by 100 not quite ``safe'' to do with
>>>> floating-point numbers either?  (It worked in this case.)
>>>
>>> You're multiplying and then truncating, which risks a round-down
>>> error.  Try adding a half onto them first:
>>>
>>>   int(x * 100 + 0.5)
>>>
>>> But that's still not a perfect guarantee.  Far safer would be to
>>> consider monetary values to be a different type of value, not just a
>>> raw number.  For instance, the value $7.23 could be stored internally
>>> as the integer 723, but you also know that it's a value in USD, not a
>>> simple scalar.  It makes perfect sense to add USD+USD, it makes
>>> perfect sense to multiply USD*scalar, but it doesn't make sense to
>>> multiply USD*USD.
>>
>> Because of the units?  That would be USD squared?  (Nice analysis.)
>>
>>>> I suppose that if I multiply it by a power of two, that would be an
>>>> operation that I can be sure will not bring about any precision loss
>>>> with floating-point numbers.  Do you agree?
>>>
>>> Assuming you're nowhere near 2**53, yes, that would be safe.  But so
>>> would multiplying by a power of five.  The problem isn't precision
>>> loss from the multiplication - the problem is that your input numbers
>>> aren't what you think they are.  That number 7.23, for instance, is
>>> really....
>>
>> Hm, I think I see what you're saying.  You're saying multiplication
>> and division in IEEE 754 are perfectly safe --- so long as the numbers
>> you start with are accurately representable in IEEE 754 and assuming
>> no overflow or underflow would occur.  (Addition and subtraction are
>> not safe.)
>
> Addition and subtraction are just as safe, as long as you stay within
> the precision limits.  Multiplication and division by powers of two
> are the safest, not needing to add any precision, until you hit the
> limits of the magnitude of numbers that can be expressed.
>
> The problem is that a number like 0.1 isn't precisely represented, so
> it ends up using ALL available precision to get the closest value to
> it, so ANY operations on it run the danger of precision loss.

Got it.  That's clear now.  It should have been before, but my attention
is that of a beginner, so some extra iterations turn up.  As long as the
numbers involved are accurately representable, floating point has no
other problems.  I may, then, conclude that the whole difficulty with
floating point is nothing but going beyond the space reserved for the
number.

However, I still lack an easy method to detect when a number is not
accurately representable by the floating-point datatype in use.  For
instance, 0.1 is not representable accurately in IEEE 754.  But I don't
know how to check that:

>>> 0.1
0.1                  # no clue
>>> 0.1 + 0.1
0.2                  # no clue
>>> 0.1 + 0.1 + 0.1
0.30000000000000004  # there is the clue

How can I get clearer and quicker evidence that 0.1 is not accurately
representable --- using the REPL?

I know 0.1 = 1/10 = 1 * 10^-1, and in base 2 that would have to be
represented as...  Let me calculate it with my sophisticated skills:

  0.1: 0 + 0.2 --> 0 + 0.4 --> 0 + 0.8 --> 1 + 0.6 --> 1 + 0.2,
  closing a cycle.

So 0.1 can only be represented poorly, as the repeating binary fraction
0.000110011...  In other words, 1/10 in base 10 equals
1/2^4 + 1/2^5 + 1/2^8 + 1/2^9 + 1/2^12 + 1/2^13 + ...

The same question in other words --- what's a trivial way for the REPL
to show me that such cycles occur?

>>> 7.23.as_integer_ratio()
(2035064081618043, 281474976710656)

Here's what I did in this case.  The REPL is telling me that

  7.23 = 2035064081618043/281474976710656

If that were true, then 7.23 * 281474976710656 would have to equal
2035064081618043.  So I typed:

>>> 7.23 * 281474976710656
2035064081618043.0

That seems to confirm the falsehood: I'm getting no evidence of the
problem.  When I take control of my life out of the hands of misleading
computers, I calculate 723 * 281474976710656 by hand:

       844424930131968
      5629499534213120
    197032483697459200
    ==================
    203506408161804288  =/=  203506408161804300

How can I save the energy spent on manual verification?

Thanks very much.
--
https://mail.python.org/mailman/listinfo/python-list
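The hand arithmetic above can be delegated to the standard library's
exact types.  A sketch, not from the thread, using `fractions.Fraction`
and `decimal.Decimal` (both exact, unlike float):

```python
from decimal import Decimal
from fractions import Fraction

# The exact value actually stored for the literal 7.23:
num, den = (7.23).as_integer_ratio()

# Exact integer arithmetic replaces the manual column addition.
# If 7.23 really were 723/100, these two products would be equal:
print(723 * den)   # 723 * 281474976710656
print(100 * num)   # 100 * 2035064081618043

# Fraction compares the stored ratio against the intended one exactly:
print(Fraction(num, den) == Fraction(723, 100))  # False: 7.23 is rounded

# Decimal(float) prints the stored value in its full decimal expansion,
# which is the quickest REPL evidence that a literal was rounded:
print(Decimal(0.1))
```

The `Decimal(0.1)` call answers the "clearer and quicker evidence"
question directly: it prints the long decimal tail of the nearest binary
fraction, with no arithmetic needed.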