A good part of the problem in the specific case you initially presented is that some non-integer numbers have an exact representation in the binary floating point arithmetic being used, while others do not. Basically, if the fractional part is of the form 1/2^k for some integer k > 0 (or a finite sum of such terms, such as 3/4 = 1/2 + 1/4), there is an exact representation in the binary floating point scheme, provided the number fits within the available precision.
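To make that distinction concrete, here is a minimal sketch in R. It assumes the usual IEEE 754 double precision arithmetic, and the variable names are only illustrative.

```r
# Fractions that are sums of powers of 1/2 are stored exactly,
# so arithmetic on them is exact and == behaves as expected.
dyadic_ok <- (0.5 + 0.25 == 0.75)    # TRUE

# 0.1, 0.2 and 0.3 have no finite binary expansion; each is rounded
# when stored, and the rounding errors do not cancel.
decimal_ok <- (0.1 + 0.2 == 0.3)     # FALSE

# The example from this thread: 2300 and 40 are exact and so is 57.5,
# but 23/40 = 0.575 is rounded before it is multiplied by 100.
first  <- (100 * 23) / 40
second <- 100 * (23 / 40)

print(first == 57.5)     # TRUE
print(second == 57.5)    # FALSE
print(second < first)    # TRUE: the second result falls just below 57.5

# Show the stored values with more digits than the default display.
print(sprintf("%.20f", c(first, second)))
```

The comparisons with `==` are used here only to expose the stored values; they are not a recommendation for comparing doubles in real code.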
> options(digits=20)
> (100*23)/40
[1] 57.5
> 100*(23/40)
[1] 57.499999999999992895

So the two operations give slightly different results. The fractional part of (100*23)/40 is 0.5, so the first operation gives exactly 57.5, while the second operation does not, because 23/40 has no exact binary representation.

But change the example's divisor from 40 to 30 [the fractional part from 1/2 to 2/3]:

> (100*23)/30
[1] 76.666666666666671404
> 100*(23/30)
[1] 76.666666666666671404

Now the two operations give the same answer to the full precision available. So it isn't "generally true in R that (100*x)/y is more accurate than 100*(x/y), if x > y." The key (in your example) is a property of the way that floating point arithmetic is implemented.

---JRG

On 04/21/2017 08:19 AM, Paul Johnson wrote:
> We all agree it is a problem with digital computing, not unique to R. I
> don't think that is the right place to stop.
>
> What to do? The round example arose in a real funded project where 2 R
> programs differed in results and cause was that one person got 57 and
> another got 58. The explanation was found, but it's less clear how to
> prevent similar in future. Guidelines, anyone?
>
> So far, these are my guidelines.
>
> 1. Insert L on numbers to signal that you really mean INTEGER. In R,
> forgetting the L in a single number will usually promote whole calculation
> to floats.
> 2. S3 variables are called 'numeric' if they are integer or double storage.
> So avoid "is.numeric" and prefer "is.double".
> 3. == is a total fail on floats
> 4. Run print with digits=20 so we can see the less rounded number. Perhaps
> start sessions with "options(digits=20)"
> 5. all.equal does what it promises, but one must be cautious.
>
> Are there math habits we should follow?
>
> For example, Is it generally true in R that (100*x)/y is more accurate than
> 100*(x/y), if x > y? (If that is generally true, couldn't the R
> interpreter do it for the user?)
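
On guidelines 1, 3 and 5 quoted above, a minimal sketch of how they might look in practice. The helper `round_ratio` is hypothetical (it is not part of R), and it uses a round-half-up rule, which is an assumption that can differ from `round()` on exact ties.

```r
aa <- 100 * (23 / 40)
bb <- (100 * 23) / 40

# Guideline 3: exact comparison sees the last-bit difference.
same_exact <- (aa == bb)                    # FALSE

# Guideline 5: all.equal() allows a small relative tolerance; wrap it in
# isTRUE() because it returns a character description when values differ.
same_tol <- isTRUE(all.equal(aa, bb))       # TRUE

# Guideline 1: every literal needs the L suffix to stay in integer storage.
all_int   <- is.integer(100L * 23L)         # TRUE
mixed_int <- is.integer(100L * 23)          # FALSE: one double promotes the product

# Hypothetical helper: round num/den to the nearest integer, ties upward,
# using only integer arithmetic. Assumes num >= 0, den > 0, and that
# 2 * num + den does not overflow R's 32-bit integers.
round_ratio <- function(num, den) {
  stopifnot(is.integer(num), is.integer(den), num >= 0L, den > 0L)
  (2L * num + den) %/% (2L * den)
}

r40 <- round_ratio(100L * 23L, 40L)         # 58
r30 <- round_ratio(100L * 23L, 30L)         # 77
print(c(r40, r30))
```

Because the division is deferred and done with `%/%`, the result does not depend on how the multiplication and division were grouped.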
>
> I've seen this problem before. In later editions of the game theory program
> Gambit, extraordinary effort was taken to keep values symbolically as
> integers as long as possible. Avoid division until the last steps. Same in
> Swarm simulations. Gary Polhill wrote an essay about the Ghost in the
> Machine along those lines, showing accidents from trusting floats.
>
> I wonder now if all uses of > or < with numeric variables are suspect.
>
> Oh well. If everybody posts their advice, I will write a summary.
>
> Paul Johnson
> University of Kansas
>
> On Apr 21, 2017 12:02 AM, "PIKAL Petr" <petr.pi...@precheza.cz> wrote:
>
>> Hi
>>
>> The problem is that people using Excel or probably other such spreadsheets
>> do not encounter this behaviour as Excel silently rounds all your
>> calculations and makes approximate comparison without telling it does so.
>> Therefore most people usually do not have any knowledge of floating point
>> numbers representation.
>>
>> Cheers
>> Petr
>>
>> -----Original Message-----
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Paul
>> Johnson
>> Sent: Thursday, April 20, 2017 11:56 PM
>> To: R-help <r-help@r-project.org>
>> Subject: [R] Interesting quirk with fractions and rounding
>>
>> Hello, R friends
>>
>> My student unearthed this quirk that might interest you.
>>
>> I wondered if this might be a bug in the R interpreter. If not a bug, it
>> certainly stands as a good example of the dangers of floating point numbers
>> in computing.
>>
>> What do you think?
>>
>>> 100*(23/40)
>> [1] 57.5
>>> (100*23)/40
>> [1] 57.5
>>> round(100*(23/40))
>> [1] 57
>>> round((100*23)/40)
>> [1] 58
>>
>> The result in the 2 rounds should be the same, I think. Clearly some
>> digital number devil is at work. I *guess* that when you put in whole
>> numbers and group them like this (100*23), the interpreter does integer
>> math, but if you group (23/40), you force a fractional division and a
>> floating point number.
>> The results from the first 2 calculations are not
>> actually 57.5, they just appear that way.
>>
>> Before you close the books, look at this:
>>
>>> aa <- 100*(23/40)
>>> bb <- (100*23)/40
>>> all.equal(aa,bb)
>> [1] TRUE
>>> round(aa)
>> [1] 57
>>> round(bb)
>> [1] 58
>>
>> I'm putting this one in my collection of "difficult to understand"
>> numerical calculations.
>>
>> If you have seen this before, I'm sorry to waste your time.
>>
>> pj
>> --
>> Paul E. Johnson http://pj.freefaculty.org
>> Director, Center for Research Methods and Data Analysis
>> http://crmda.ku.edu
>>
>> To write to me directly, please address me at pauljohn at ku.edu.
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.