Got it. Thank you all.

On Mon, Mar 16, 2009 at 4:39 PM, Stavros Macrakis <macra...@alum.mit.edu> wrote:

> The factor approach is horrifically ugly and dangerous.
>
> Even if it didn't have the extraordinarily poor behavior documented
> below, it simply isn't well-defined what it should do.  The explicit
> approximation route is far far preferable in every way: more
> predictable, more controllable, and even (though it hardly matters
> usually) faster.
>
> Let's look at the extraordinarily poor behavior I was mentioning. Consider:
>
> nums <- (.3 + 2e-16 * c(-2,-1,1,2)); nums
> [1] 0.3 0.3 0.3 0.3
>
> Though they all print as .3 with the default precision (which is
> normal and expected), they are all different from .3:
>
> nums - .3 =>  -3.885781e-16 -2.220446e-16  2.220446e-16  3.885781e-16
>
> When we convert nums to a factor, we get:
>
> fact <- as.factor(nums); fact
> [1] 0.300000000000000 0.3               0.3               0.300000000000000
> Levels: 0.300000000000000 0.3 0.3 0.300000000000000
>
> Not clear what the difference between 0.300000000000000 and 0.3 is
> supposed to be, nor why some 0.300000000000000 are < .3 and others are
> > .3, but let's put that aside for the moment.
>
> Now let's look at the relations among the factor values:
>
> fact[1]==fact[2]
> [1] FALSE
> fact[1]==fact[4]
> [1] TRUE
>
> So though nums[1] < nums[2] < nums[3] < nums[4], fact[1] compares
> *unequal* to fact[2] though it compares *equal* to fact[4].
> Apparently R is comparing the *names* of the levels rather than the
> indexes in the factor.  This would be weird even if it didn't lead to
> this very bad case.
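
To illustrate the label collision described above, here is a minimal sketch. It assumes that level labels come from rounding each value to about 15 significant digits, and it uses sprintf() as a stand-in for that rounding. It does not reproduce the exact labels printed above, which likely depend on the R version; the names labels15 and labels17 are only for illustration.

```r
# Same values as in the message above: four distinct doubles near 0.3.
nums <- .3 + 2e-16 * c(-2, -1, 1, 2)

# Labels rounded to 15 significant digits (assumed stand-in for the
# rounding applied when doubles are turned into level labels).
labels15 <- sprintf("%.15g", nums)

# Labels with 17 significant digits, which is enough to tell any two
# distinct doubles apart.
labels17 <- sprintf("%.17g", nums)

n_distinct_values   <- length(unique(nums))      # compares the numbers
n_distinct_labels15 <- length(unique(labels15))  # compares 15-digit text
n_distinct_labels17 <- length(unique(labels17))  # compares 17-digit text

print(c(values = n_distinct_values,
        labels15 = n_distinct_labels15,
        labels17 = n_distinct_labels17))
```

Any comparison done on the 15-digit text cannot distinguish these four numbers, while a comparison on the numbers themselves (or on 17-digit text) can.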
>
> Hope this helps,
>
>             -s
>
>
> On Mon, Mar 16, 2009 at 6:53 PM, Daniel Murphy <chiefmur...@gmail.com>
> wrote:
> > I have a matrix whose columns were filled with values which were functions
> > of cvseq<-seq(.2,.3,by=.1) (and a row value of mode integer). To do a lookup
> > for cv=.3 later, I wanted to match(.3,cvseq), which gave me NA, hence my
> > question. I thought R would match .3 in cvseq within .Machine$double.eps,
> > but I can understand it if .3 and the second element of cvseq would not have
> > identical bits.
> > Besides the helpful suggestions below, I also tried
> >> cvseqf <- as.factor(cvseq)
> >> match(.3,cvseqf)
> > [1] 2
> > which worked.
> > In general, would it be better to go the enumeration route via as.factor or
> > the approximation route?
> > Thanks for the help.
> > -Dan
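
On the expectation above that match() would work within .Machine$double.eps, a small sketch may help. The sequence is built explicitly as 0.2 + (0:1) * 0.1 rather than with seq(), since what seq() returns at the endpoint may differ between R versions; that construction and the variable names are assumptions for illustration.

```r
# Explicit construction: the second element is 0.2 + 0.1, which is not
# the same double as the literal 0.3.
cvseq <- 0.2 + (0:1) * 0.1

# Distance between the computed element and the literal 0.3.
gap <- abs(cvseq[2] - 0.3)

# match() compares exactly, so a nonzero gap means no match.
exact_hit <- match(0.3, cvseq)

# The gap is nonzero but smaller than .Machine$double.eps.
within_eps <- gap > 0 && gap < .Machine$double.eps

print(sprintf("%.17g", cvseq))
print(exact_hit)
print(within_eps)
```

So the two values really are within double.eps of each other, but match() does not apply any tolerance; the tolerance has to be supplied explicitly.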
> >
> > On Mon, Mar 16, 2009 at 8:24 AM, Stavros Macrakis <macra...@alum.mit.edu> wrote:
> >>
> >> Well, first of all, seq(from=.2,to=.3) gives c(0.2), so I assume you
> >> really mean something like seq(from=.2,to=.3,by=.1), which gives
> >> c(0.2, 0.3).
> >>
> >> %in% tests for exact equality, which is almost never a good idea with
> >> floating-point numbers.
> >>
> >> You need to define what exactly you mean by "in" for floating-point
> >> numbers.  What sort of tolerance are you willing to allow?
> >>
> >> Some possibilities would be for example:
> >>
> >> approxin <- function(x,list,tol) any(abs(list-x)<tol)   # absolute tolerance
> >>
> >> rapproxin <- function(x,list,tol) (x==0 && 0 %in% list) ||
> >> any(abs((list-x)/x)<=tol,na.rm=TRUE)
> >>     # relative tolerance; only exact 0 will match 0
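
A usage sketch for the two functions above, applied to the lookup from this thread. The definitions are copied from the message; approx_match() is a hypothetical helper (not from the thread) that returns an index the way match() does, and the default tolerance of 1e-8 is an arbitrary assumption. The sequence is built explicitly as 0.2 + (0:1) * 0.1 so that the second element is 0.2 + 0.1.

```r
# Definitions as given in the message above.
approxin  <- function(x, list, tol) any(abs(list - x) < tol)
rapproxin <- function(x, list, tol) (x == 0 && 0 %in% list) ||
  any(abs((list - x) / x) <= tol, na.rm = TRUE)

# Explicit construction; the second element is 0.2 + 0.1, not exactly 0.3.
cvseq <- 0.2 + (0:1) * 0.1

# Hypothetical helper: index of the first element within tol of x,
# or NA if there is none (mirrors what match() returns).
approx_match <- function(x, table, tol = 1e-8) {
  hits <- which(abs(table - x) < tol)
  if (length(hits) > 0) hits[1] else NA_integer_
}

in_exact    <- 0.3 %in% cvseq                # exact equality
in_abs      <- approxin(0.3, cvseq, 1e-8)    # absolute tolerance
in_rel      <- rapproxin(0.3, cvseq, 1e-8)   # relative tolerance
idx         <- approx_match(0.3, cvseq)      # position of the near match
idx_missing <- approx_match(0.25, cvseq)     # nothing within tolerance

print(c(in_exact, in_abs, in_rel))
print(c(idx, idx_missing))
```

The tolerance is a modelling decision: it should be much larger than rounding error but much smaller than the spacing between the values being looked up.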
> >>
> >> Hope this helps,
> >>
> >>          -s
> >>
> >> On Mon, Mar 16, 2009 at 9:36 AM, Daniel Murphy <chiefmur...@gmail.com>
> >> wrote:
> >> > Hello: I am trying to match the value 0.3 in the sequence seq(.2,.3). I get
> >> >> 0.3 %in% seq(from=.2,to=.3)
> >> > [1] FALSE
> >> > Yet
> >> >> 0.3 %in% c(.2,.3)
> >> > [1] TRUE
> >> > For arbitrary sequences, this "invisible .3" has been problematic. What is
> >> > the best way to work around this?
> >
> >
>


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
