Hello Andrew,
But "cut" generates factors. With real data, one usually expects the ends of the interval to be included as well; the argument "include.lowest" is both ugly and too long.

[The test code in the ftable thread contains this error! I have run into this error a couple of times.]

The only real situation that I can imagine being problematic:
- if the interval extends to +Inf (or -Inf): I do not know whether including +Inf (or -Inf) would have any effect.

Leonard

On 9/18/2021 1:14 AM, Andrew Simmons wrote:
> While it is not explicitly mentioned anywhere in the documentation for
> .bincode, I suspect 'include.lowest = FALSE' is the default to keep
> the definitions of the bins consistent. For example:
>
> x <- 0:20
> breaks1 <- seq.int(0, 16, 4)
> breaks2 <- seq.int(0, 20, 4)
> cbind(
>     .bincode(x, breaks1, right = FALSE, include.lowest = TRUE),
>     .bincode(x, breaks2, right = FALSE, include.lowest = TRUE)
> )
>
> by having 'include.lowest = TRUE' with different ends, you can get
> inconsistent behaviour. While this probably wouldn't be an issue with
> 'real' data, this would seem like something you'd want to avoid by
> default. The definitions of the bins are
>
> [0, 4)
> [4, 8)
> [8, 12)
> [12, 16]
>
> and
>
> [0, 4)
> [4, 8)
> [8, 12)
> [12, 16)
> [16, 20]
>
> so you can see where the inconsistent behaviour comes from. You might
> be able to get R-core to add argument 'warn', but probably not to
> change the default of 'include.lowest'. I hope this helps
>
> On Fri, Sep 17, 2021 at 6:01 PM Leonard Mada <leo.m...@syonic.eu> wrote:
>
>     Thank you Andrew.
>
>     Is there any reason not to make: include.lowest = TRUE the default?
>
>     Regarding the NA:
>
>     The user still has to suspect that some values were not included
>     and run that test.
>
>     Leonard
>
>     On 9/18/2021 12:53 AM, Andrew Simmons wrote:
>>     Regarding your first point, argument 'include.lowest' already
>>     handles this specific case, see ?.bincode
>>
>>     Your second point, maybe it could be helpful, but since both
>>     'cut.default' and '.bincode' return NA if a value isn't within a
>>     bin, you could make something like this on your own.
>>     Might be worth pitching to R-bugs on the wishlist.
>>
>>     On Fri, Sep 17, 2021, 17:45 Leonard Mada via R-help
>>     <r-help@r-project.org> wrote:
>>
>>         Hello List members,
>>
>>         the following improvements would be useful for function cut
>>         (and .bincode):
>>
>>         1.) Argument: Include extremes
>>         extremes = TRUE
>>         if(right == FALSE) {
>>             # include also right for last interval;
>>         } else {
>>             # include also left for first interval;
>>         }
>>
>>         2.) Argument: warn = TRUE
>>
>>         Warn if any values are not included in the intervals.
>>
>>         Motivation:
>>         - reduce risk of errors when using function cut();
>>
>>         Sincerely,
>>
>>         Leonard
>>
>>         ______________________________________________
>>         R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>         https://stat.ethz.ch/mailman/listinfo/r-help
>>         PLEASE do read the posting guide
>>         http://www.R-project.org/posting-guide.html
>>         and provide commented, minimal, self-contained, reproducible
>>         code.
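
For illustration only, here is a minimal sketch of the two proposals discussed in this thread. The helper name bincode_checked and its arguments 'extremes' and 'warn' are hypothetical and not part of base R. The helper only wraps .bincode() and relies on the behaviour mentioned above, namely that values outside every bin come back as NA. The last call also probes the +Inf case raised in the reply.

```r
# Hypothetical helper (not part of base R): wraps .bincode() and adds the
# 'extremes' and 'warn' arguments proposed in this thread.
bincode_checked <- function(x, breaks, right = TRUE,
                            extremes = TRUE, warn = TRUE) {
    # 'extremes' is passed on as include.lowest, so the outermost break
    # that would otherwise be excluded is counted as part of its bin.
    codes <- .bincode(x, breaks, right = right, include.lowest = extremes)

    # .bincode() returns NA for values outside every bin; values that were
    # already NA in the input are not counted as missed.
    missed <- is.na(codes) & !is.na(x)
    if (warn && any(missed)) {
        warning(sum(missed), " value(s) not included in any interval")
    }
    codes
}

x <- 0:20
breaks1 <- seq.int(0, 16, 4)
breaks2 <- seq.int(0, 20, 4)

# The inconsistency described in the thread: x = 16 falls in the closed
# last bin [12, 16] with breaks1, but in [16, 20] with breaks2.
codes1 <- bincode_checked(x, breaks1, right = FALSE, warn = FALSE)
codes2 <- bincode_checked(x, breaks2, right = FALSE)
codes1[x == 16]  # 4
codes2[x == 16]  # 5

# The +Inf question: an infinite last break is matched like any other
# closed end point.
bincode_checked(c(1, Inf), c(0, 10, Inf), right = FALSE)  # 1 2
```

With warn = TRUE and breaks1, the same call would warn that 4 values (17 to 20) are not included in any interval, which is the check the user otherwise has to remember to run by hand.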