I disagree: I don't really think 'include.lowest' is too long or ugly, but if you think it is, you could abbreviate it as 'i', since R matches argument names partially.

    x <- 0:20
    breaks1 <- seq.int(0, 16, 4)
    breaks2 <- seq.int(0, 20, 4)
    data.frame(
        cut(x, breaks1, right = FALSE, i = TRUE),
        cut(x, breaks2, right = FALSE, i = TRUE),
        check.names = FALSE
    )

I hope this helps.

On Fri, Sep 17, 2021 at 6:26 PM Leonard Mada <leo.m...@syonic.eu> wrote:

> Hello Andrew,
>
> But "cut" generates factors. In most cases with real data one expects to
> have also the ends of the interval: the argument "include.lowest" is both
> ugly and too long.
>
> [The test-code on the ftable thread contains this error! I have run
> through this error a couple of times.]
>
> The only real situation that I can imagine to be problematic:
>
> - if the interval goes to +Inf (or -Inf): I do not know if there would be
> any effects when including +Inf (or -Inf).
>
> Leonard
>
> On 9/18/2021 1:14 AM, Andrew Simmons wrote:
>
> While it is not explicitly mentioned anywhere in the documentation for
> .bincode, I suspect 'include.lowest = FALSE' is the default to keep the
> definitions of the bins consistent. For example:
>
>     x <- 0:20
>     breaks1 <- seq.int(0, 16, 4)
>     breaks2 <- seq.int(0, 20, 4)
>     cbind(
>         .bincode(x, breaks1, right = FALSE, include.lowest = TRUE),
>         .bincode(x, breaks2, right = FALSE, include.lowest = TRUE)
>     )
>
> by having 'include.lowest = TRUE' with different ends, you can get
> inconsistent behaviour. While this probably wouldn't be an issue with
> 'real' data, this would seem like something you'd want to avoid by default.
> The definitions of the bins are
>
>     [0, 4)
>     [4, 8)
>     [8, 12)
>     [12, 16]
>
> and
>
>     [0, 4)
>     [4, 8)
>     [8, 12)
>     [12, 16)
>     [16, 20]
>
> so you can see where the inconsistent behaviour comes from. You might be
> able to get R-core to add argument 'warn', but probably not to change the
> default of 'include.lowest'. I hope this helps
>
> On Fri, Sep 17, 2021 at 6:01 PM Leonard Mada <leo.m...@syonic.eu> wrote:
>
>> Thank you Andrew.
>>
>> Is there any reason not to make: include.lowest = TRUE the default?
>>
>> Regarding the NA:
>>
>> The user still has to suspect that some values were not included and run
>> that test.
>>
>> Leonard
>>
>> On 9/18/2021 12:53 AM, Andrew Simmons wrote:
>>
>> Regarding your first point, argument 'include.lowest' already handles
>> this specific case, see ?.bincode
>>
>> Your second point, maybe it could be helpful, but since both
>> 'cut.default' and '.bincode' return NA if a value isn't within a bin, you
>> could make something like this on your own.
>> Might be worth pitching to R-bugs on the wishlist.
>>
>> On Fri, Sep 17, 2021, 17:45 Leonard Mada via R-help <r-help@r-project.org>
>> wrote:
>>
>>> Hello List members,
>>>
>>> the following improvements would be useful for function cut (and
>>> .bincode):
>>>
>>> 1.) Argument: Include extremes
>>> extremes = TRUE
>>> if(right == FALSE) {
>>>     # include also right for last interval;
>>> } else {
>>>     # include also left for first interval;
>>> }
>>>
>>> 2.) Argument: warn = TRUE
>>>
>>> Warn if any values are not included in the intervals.
>>>
>>> Motivation:
>>> - reduce risk of errors when using function cut();
>>>
>>> Sincerely,
>>>
>>> Leonard
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
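
For what it's worth, the "make something like this on your own" suggestion in the quoted thread could look roughly like the sketch below. It relies only on the fact, stated in the thread, that 'cut' returns NA for values outside every bin. The name 'cut_warn' and the wording of the warning are my own assumptions; nothing like this exists in base R.

```r
# Hypothetical helper, not part of base R: wraps cut() and warns when
# non-missing values fall outside every interval.
cut_warn <- function(x, breaks, ...) {
    out <- cut(x, breaks, ...)
    # Values that were not NA on input but became NA were not binned.
    dropped <- is.na(out) & !is.na(x)
    if (any(dropped)) {
        warning(sum(dropped), " value(s) fell outside the intervals",
                call. = FALSE)
    }
    out
}

x <- 0:20
# 17:20 lie beyond the last break (16), so this warns about 4 values.
res <- cut_warn(x, seq.int(0, 16, 4), right = FALSE, include.lowest = TRUE)
```

Extra arguments are passed straight through to 'cut', so 'right', 'include.lowest' (or its abbreviation 'i') and 'labels' behave as usual. The only added step is the comparison of NA positions before and after binning.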