On Fri, Aug 24, 2012 at 7:29 PM, Jennifer Sabatier
<plessthanpointohf...@gmail.com> wrote:
> AAAAAHHHHHHH I GOT IT!!!!!!!!!!
>
> And I *think* I understand about floating point arithmetic..

Well then you're doing much better than the rest of us: it's quite a
difficult subject and only gets trickier as you think about it more.
(Numerical analysis generally, not the definition of an IEEE 754 /
IEC 60559 double.) You even get such fun as -1 * 0 != 1 * 0 under some
interpretations.

>
> In this case vn$PM.DIST.TOT is the sum of proportions. So, it should
> be anywhere between 0 and 1.
>
> In our case, if it's anything other than 1 when vn$PM.EXP is greater
> than 0 then it means something is wrong with one of the variables
> used to sum vn$PM.DIST.TOT.
>
> I was worried making it an integer will cause cases of 0.4 to be 0 and
> look legal, when it's not (though it doesn't actually seem to be a
> problem).
>
> So, I just did what Michael and Peter suggested, after reading up on
> floating points.
>
> fpf <- 1e-05 # fpf = floating point fuzz

Though I suggested 1e-05 here, usually one uses slightly more stringent
testing: a general rule of thumb is the square root of machine
precision. In R terms, sqrt(.Machine$double.eps)

>
> vn$PM.DIST_flag<-ifelse(vn$PM.EXP > 0 & abs(vn$PM.DIST.TOT - 1) > fpf , 1, 0)
>
> YAAAAAYYYYY!!!!
>
> Thanks, solved AND I learned something new.
>
> Thanks, all, and have a GREAT weekend!
>
> Jen

Just for the "macro-take-away": this is the reason we don't really like
console printout instead of dput() to show a problem: if you dput the
original not-yet-ifelse-d numbers, you'll see that they really aren't
1's, but that they are rounded upon regular printing.

Cheers and don't forget the old adage: 0.1*10 is hardly ever 1,
Michael

>
>
> On Fri, Aug 24, 2012 at 6:27 PM, Peter Ehlers <ehl...@ucalgary.ca> wrote:
>> I see that you got other responses while I was composing an answer.
>> Your 'example.csv' did come through for me, but I still can't
>> replicate your PM.DIST_flag variable. Specifically, observations
>> 30, 33, 36 and 40 are wrong.
>>
>> I agree with Rui, that there's something else going on.
>> The data you've sent can't be the data that yielded the 'flag'
>> variable or you didn't use the ifelse() function in the way that
>> you've shown.
>>
>> I would start with a clean R session and I would use the 'convert
>> logical to numeric' idea (or keep a logical rather than numeric
>> flag):
>>
>> vn <- transform(vn,
>>     my_flag = ( (PM.EXP > 0) & (PM.DIST.TOT != 1) ) * 1 )
>>
>> It looks as though your PM.DIST.TOT variable is meant to be
>> integer. If so, you might want to ensure that it is that type.
>> Otherwise, you might want to use Michael's suggestion of using
>> abs(... - 1) < 1e-05.
>>
>> Peter Ehlers
>>
>>
>> On 2012-08-24 14:56, Jennifer Sabatier wrote:
>>>
>>> Hi Michael,
>>>
>>> Thanks for letting me know how to post data. I will try to upload it
>>> that way in a second.
>>>
>>> I can usually use code to make a reproducible dataset but this time
>>> with the ifelse behaving strangely (perhaps, it's probably me) I
>>> didn't think I could do it easily so I figured I would just put my
>>> data up.
>>>
>>> I will check out the R FAQ you mentioned.
>>>
>>> Thanks, again,
>>>
>>> Jen
>>>
>>>
>>>
>>> On Fri, Aug 24, 2012 at 5:50 PM, R. Michael Weylandt
>>> <michael.weyla...@gmail.com> wrote:
>>>>
>>>> On Fri, Aug 24, 2012 at 4:46 PM, Jennifer Sabatier
>>>> <plessthanpointohf...@gmail.com> wrote:
>>>>>
>>>>> Hi Michael,
>>>>>
>>>>> No, I never use attach(), exactly for the reasons you state. To do
>>>>> due diligence I did a search of code for the function and it didn't
>>>>> come up (I would have been shocked because I never use it!).
>>>>>
>>>>> Now that real data is up, does your suggestion still apply? I am
>>>>> reading it now.
>>>>>
>>>>
>>>> If you mean the data you sent to Peter, it got scrubbed by the list
>>>> servers as well (they are somewhat draconian, but appropriately so in
>>>> the long run).
>>>> The absolute best way to send R data via email (esp on
>>>> this list) is to use the dput() function which will create a plain
>>>> text representation of your data _exactly_ as R sees it. It's a little
>>>> hard for the untrained eye to parse (I can usually get about 90% of
>>>> what it all means but there's some stuff with rownames = NA I've never
>>>> looked into) but it's perfectly reproducible to a different R session.
>>>> Then us having the same data is a simple copy+paste away.
>>>>
>>>> For more on dput() and reproducibility generally, see
>>>>
>>>> http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
>>>>
>>>> It could be the floating point thing (it's hard to say without knowing
>>>> how your data was calculated), but Rui seems to think not.
>>>>
>>>> M
>>>>
>>>>> Thanks,
>>>>>
>>>>> Jen
>>>>>
>>>>> On Fri, Aug 24, 2012 at 5:38 PM, R. Michael Weylandt
>>>>> <michael.weyla...@gmail.com> wrote:
>>>>>>
>>>>>> Off the wall / wild guess, do you use attach() frequently? Not
>>>>>> entirely sure how it would come up, but it tends to make weird errors
>>>>>> like this occur.
>>>>>>
>>>>>> M
>>>>>>
>>>>>> On Fri, Aug 24, 2012 at 4:36 PM, Jennifer Sabatier
>>>>>> <plessthanpointohf...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Rui,
>>>>>>>
>>>>>>> Thanks so much for responding but I think with my HTML problem the vn
>>>>>>> data you made must not be the same. I tried running your code on the
>>>>>>> data (I uploaded a copy) and I got the same thing I had before.
>>>>>>>
>>>>>>> Jen
>>>>>>>
>>>>>>> On Fri, Aug 24, 2012 at 5:28 PM, Rui Barradas <ruipbarra...@sapo.pt>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> 165114 1 0 0 0 0 417313 1 0 3546 1 0 4613 1 0 225460 1 0 6417 1
>>>>>>>> 1 23
>>>
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
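
For anyone finding this thread later, the take-away can be sketched
as below. This is only an illustration, not Jen's actual data (which
never made it to the list): the toy vn data frame, tot_fuzzy and
exact_flag are made-up names, and the "almost 1" total is produced by
adding 0.1 ten times in double precision. The tolerance is the
sqrt(.Machine$double.eps) rule of thumb mentioned in the thread;
sprintf("%.17g", ...) is used here as one assumed way to look at more
digits than ordinary printing shows.

```r
# Hypothetical stand-in for the real data, which was never posted.
# Adding 0.1 ten times, one step at a time, lands just below 1.
tot_fuzzy <- Reduce("+", rep(0.1, 10))

vn <- data.frame(
  PM.EXP      = c(5, 5, 5, 0),
  PM.DIST.TOT = c(1, tot_fuzzy, 0.4, 0.4)
)

# Ordinary printing rounds for display, so rows 1 and 2 look alike.
print(vn$PM.DIST.TOT)

# Asking for 17 significant digits shows what is actually stored.
print(sprintf("%.17g", vn$PM.DIST.TOT))

# Exact comparison: row 2 is flagged although it is 1 "in spirit".
exact_flag <- as.integer(vn$PM.EXP > 0 & vn$PM.DIST.TOT != 1)

# Tolerance-based comparison, using the rule-of-thumb fuzz.
fpf <- sqrt(.Machine$double.eps)  # fpf = floating point fuzz
vn$PM.DIST_flag <- as.integer(vn$PM.EXP > 0 &
                              abs(vn$PM.DIST.TOT - 1) > fpf)

print(cbind(vn, exact_flag))
```

With the tolerance in place only the genuinely wrong total (0.4 with
PM.EXP > 0) is flagged; the row with PM.EXP equal to 0 is left alone
in both versions.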