[See at end]

On 09-Apr-2013 16:11:18 Jorge Fernando Saraiva de Menezes wrote:
> Dear list,
>
> I have found an unusual behavior and would like to check if it is a
> possible bug, and if updating R would fix it. I am not sure if should post
> it in this mail list but I don't where is R bug tracker. The only mention I
> found that might relate to this is "If times is a computed quantity it is
> prudent to add a small fuzz." in rep() help, but not sure if it is related
> to this particular problem
>
> Here it goes:
>
>> rep(TRUE,29)
> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> [28] TRUE TRUE
>> rep(TRUE,0.29*100)
> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> [28] TRUE
>> length(rep(TRUE,29))
> [1] 29
>> length(rep(TRUE,0.29*100))
> [1] 28
>
> Just to make sure:
>> 0.29*100
> [1] 29
>
> This behavior seems to be independent of what is being repeated (rep()'s
> first argument)
>> length(rep(1,0.29*100))
> [1] 28
>
> Also it occurs only with the 0.29.
>> length(rep(1,0.291*100))
> [1] 29
>> for(a in seq(0,1,0.01)) {print(sum(rep(TRUE,a*100)))} #also shows correct
> values in values from 0 to 1 except for 0.29.
>
> I have confirmed that this behavior happens in more than one machine
> (though I only have session info of this one)
>
>
>> sessionInfo()
> R version 2.15.3 (2013-03-01)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> [1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252
> LC_MONETARY=Portuguese_Brazil.1252
> [4] LC_NUMERIC=C LC_TIME=Portuguese_Brazil.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] spatstat_1.31-1 deldir_0.0-21 mgcv_1.7-22
>
> loaded via a namespace (and not attached):
> [1] grid_2.15.3 lattice_0.20-13 Matrix_1.0-11 nlme_3.1-108
> tools_2.15.3

The basic issue is, believe it or not, that despite apparently:

  0.29*100
  # [1] 29

in "reality":

  0.29*100 == 29
  # [1] FALSE

(The first result displays as 29 only because R prints 7 significant
digits by default.)

In other words, as computed by R, 0.29*100 is not exactly equal
to 29:

  29 - 0.29*100
  # [1] 3.552714e-15

The difference is tiny, but it is sufficient to make 0.29*100
slightly smaller than 29, so rep(TRUE,0.29*100) truncates "times"
to the largest integer not exceeding 0.29*100, i.e. 28.

Hence the recommendation to "add a small fuzz".

On the other hand, when you use rep(1,0.291*100) you will be OK.
This is because:

  29 - 0.291*100
  # [1] -0.1

so 0.291*100 is comfortably greater than 29 (but well clear of 30).

The reason for the small inaccuracy (compared with "mathematical
truth") is that R performs numerical calculations using binary
representations of numbers, and there is no exact binary
representation of 0.29, so the result of 0.29*100 will be
slightly inaccurate.

If you do need to do this sort of thing (e.g. the value of "times"
will be the result of a calculation) then one useful precaution
could be to round the result:

  round(0.29*100)
  # [1] 29
  29-round(0.29*100)
  # [1] 0
  length(rep(TRUE,0.29*100))
  # [1] 28
  length(rep(TRUE,round(0.29*100)))
  # [1] 29

(The default for round() is 0 decimal places, i.e. it rounds to
an integer).

So, compared with:

  0.29*100 == 29
  # [1] FALSE

we have:

  round(0.29*100) == 29
  # [1] TRUE

Hoping this helps,
Ted.

-------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@wlandres.net>
Date: 09-Apr-2013  Time: 17:56:33
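
As a footnote to the above, here is a minimal sketch of how one might guard a computed "times" before passing it to rep(). The function safe_times() and its "tol" argument are hypothetical names invented for this illustration (they are not part of base R), and the tolerance of 1e-8 is an arbitrary assumption. The idea is simply to round only when the value is within a small tolerance of a whole number, and to stop otherwise rather than truncate silently.

```r
# Exact comparison fails because 0.29 has no exact binary representation,
# but comparison within a tolerance succeeds.
x <- 0.29 * 100
x == 29                      # FALSE
isTRUE(all.equal(x, 29))     # TRUE

# rep() truncates a non-integer 'times' towards zero, hence 28 not 29.
length(rep(TRUE, x))         # 28
length(rep(TRUE, round(x)))  # 29

# Hypothetical helper (not in base R): accept 'times' only if it lies
# within 'tol' of a whole number, then return that whole number.
safe_times <- function(times, tol = 1e-8) {
  nearest <- round(times)
  if (any(abs(times - nearest) > tol)) {
    stop("'times' is not close to a whole number")
  }
  as.integer(nearest)
}

length(rep(TRUE, safe_times(0.29 * 100)))  # 29
```

With this sketch, a value such as 0.291*100 (i.e. 29.1) would raise an error instead of being cut down to 29, which may or may not be what you want; plain round() is the simpler choice when any computed value is acceptable.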