Wow, I'll have to study this one for a bit. Thanks!

Ben

On Sat, May 12, 2012 at 3:09 PM, William Dunlap <wdun...@tibco.com> wrote:

> Here is some code that I've been fiddling with for years
> (since I wanted to provide evidence that our main office
> needed more modems and wanted to show how often
> both of them were busy).  It does set operations and a
> bit more on collections of half-open intervals.  (Hence
> it drops zero-length intervals).
>
> Several of the functions could be defined as methods
> of standard set operators.
>
> To see what it does try
>
>   r1 <- as.Ranges(bottoms=c(1,3,5,7), tops=c(2, 4, 9, 8))
>   r2 <- as.Ranges(bottoms=c(1.5,4,6,7), tops=c(1.7,5,7,9))
>   setdiffRanges( as.Ranges(1, 5), as.Ranges(c(2, 3.5), c(3, 4.5)) )
>   plot(r1, r2, setdiffRanges(r1,r2), intersectRanges(r1,r2),
>            unionRanges(r1,r2), c(r1,r2), inNIntervals(c(r1,r2), n=2))
>
> You can use Date and POSIXct objects for the endpoints of
> the intervals as well.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
> # An object of S3-class "Ranges" is a 2-column
> # data.frame(bottoms, tops), describing a
> # set of half-open intervals, (bottoms[i], tops[i]].
> # inRanges is the only function that cares about
> # the direction of the half-openness of those intervals,
> # but the other rely on half-openness (so 0-width intervals
> # are not allowed).
>
> # Use as.Ranges to create a Ranges object from
> #   * a matrix whose rows are intervals
> #   * a data.frame whose rows are intervals
> #   * a vector of interval starts and a vector of interval ends
> # The endpoints must be of a class which supports the comparison (<,<=)
> # operators and which can be concatenated with the c() function.
> # That class must also be able to be in a data.frame and be subscriptable.
> # That covers at least numeric, Data, and POSIXct.
> # (The plot method only works for numeric endpoints).
> # You may input a zero-width interval (with bottoms[i]==tops[i]),
> # but the constructors will silently remove it.
> as.Ranges <- function(x, ...) UseMethod("as.Ranges")
>
> as.Ranges.matrix <- function(x, ...) {
>    # each row of x is an interval
>    stopifnot(ncol(x)==2, all(x[,1] <= x[,2]))
>    x <- x[x[,1] < x[,2], , drop=FALSE]
>    Ranges <- data.frame(bottoms = x[,1], tops = x[,2])
>    class(Ranges) <- c("Ranges", class(Ranges))
>    Ranges
> }
>
> as.Ranges.data.frame <- function(x, ...) {
>    # each row of x is an interval
>    stopifnot(ncol(x)==2, all(x[,1] <= x[,2]))
>    x <- x[x[,1] < x[,2], , drop=FALSE]
>    Ranges <- data.frame(bottoms = x[,1], tops = x[,2])
>    class(Ranges) <- c("Ranges", class(Ranges))
>    Ranges
> }
>
> as.Ranges.default <- function(bottoms, tops, ...) {
>    # vectors of bottoms and tops of intervals
>    stopifnot(all(bottoms <= tops))
>    Ranges <- data.frame(bottoms=bottoms, tops=tops)[bottoms < tops, ,
> drop=FALSE]
>    class(Ranges) <- c("Ranges", class(Ranges))
>    Ranges
> }
>
> c.Ranges <- function(x, ...) {
>    # combine several Ranges objects into one which lists all the intervals.
>    RangesList <- list(x=x, ...)
>    Ranges <- x
>    for (r in list(...)) {
>        Ranges <- rbind(Ranges, r)
>    }
>    class(Ranges) <- unique(c("Ranges", class(Ranges)))
>    Ranges
> }
>
> inNIntervals <- function(Ranges, n)
> {
>    # return Ranges object that describes points that are
>    # in at least n intervals in the input Ranges object
>    stopifnot(n>0)
>    u <- c(Ranges[,1], Ranges[,2])
>    o <- order(u)
>    u <- u[o]
>    jumps <- rep(c(+1L,-1L), each=nrow(Ranges))[o]
>    val <- cumsum(jumps)
>    as.Ranges(u[val==n & jumps==1], u[val==n-1 & jumps==-1])
> }
>
> unionIntervals <- function(Ranges) {
>    # combine overlapping and adjacent intervals to create a
>    # possibly smaller and simpler, but equivalent, Ranges object
>    inNIntervals(Ranges, 1)
> }
>
> intersectIntervals <- function(Ranges) {
>    # return 0- or 1-row Ranges object containing describing points
>    # that are in all the intervals in input Ranges object.
>    u <- unname(c(Ranges[,1], Ranges[,2]))
>    o <- order(u)
>    u <- u[o]
>    jumps <- rep(c(+1L,-1L), each=nrow(Ranges))[o]
>    val <- cumsum(jumps)
>    as.Ranges(u[val==nrow(Ranges) & jumps==1], u[val==nrow(Ranges)-1 &
> jumps==-1])
> }
>
> unionRanges <- function(x, ...) {
>    unionIntervals(rbind(x, ...))
> }
>
> setdiffRanges <- function (x, y)
> {
>    # set difference: return Ranges object describing points that are in x
> but not y
>    x <- unionIntervals(x)
>    y <- unionIntervals(y)
>    nx <- nrow(x)
>    ny <- nrow(y)
>    u <- c(x[, 1], y[, 1], x[, 2], y[, 2])
>    o <- order(u)
>    u <- u[o]
>    vx <- cumsum(jx <- rep(c(1, 0, -1, 0), c(nx, ny, nx, ny))[o])
>    vy <- cumsum(jy <- rep(c(0, -1, 0, 1), c(nx, ny, nx, ny))[o])
>    as.Ranges(u[vx == 1 & vy == 0], u[(vx == 1 & jy == -1) | (jx == -1 & vy
> == 0)])
> }
>
> intersectRanges <- function(x, y)
> {
>    # return Ranges object describing points that are in both x and y
>    x <- unionIntervals(x)
>    y <- unionIntervals(y)
>    nx <- nrow(x)
>    ny <- nrow(y)
>    u <- c(x[, 1], y[, 1], x[, 2], y[, 2])
>    o <- order(u)
>    u <- u[o]
>    vx <- cumsum(jx <- rep(c(1, 0, -1, 0), c(nx, ny, nx, ny))[o])
>    vy <- cumsum(jy <- rep(c(0, 1, 0, -1), c(nx, ny, nx, ny))[o])
>    as.Ranges(u[vx == 1 & vy == 1], u[(vx == 1 & jy == -1) | (jx == -1 & vy
> == 1)])
> }
>
> inRanges <- function(x, Ranges)
> {
>    if (length(x) == 1) {
>        any(x > Ranges[,1] & x <= Ranges[,2])
>    } else {
>        Ranges <- unionIntervals(Ranges)
>        (findInterval(-x, rev(-as.vector(t(Ranges)))) %% 2) == 1
>    }
> }
>
> plot.Ranges <- function(x, ...)
> {
>    # mainly for debugging - no plotting controls, all ... must be Ranges
> objects.
>    RangesList <- list(x=x, ...)
>    labels <- vapply(as.list(substitute(list(x, ...)))[-1],
> function(x)deparse(x)[1], "")
>    oldmar <- par(mar = replace(par("mar"), 2, max(nchar(labels)/2, 10)))
>    on.exit(par(oldmar))
>    xlim <- do.call("range", c(unlist(RangesList, recursive=FALSE),
> list(finite=TRUE)))
>    ylim <-  c(0, length(RangesList)+1)
>    plot(type="n", xlim, ylim, xlab="", ylab="", axes=FALSE)
>    grid(ny=0)
>    axis(side=1)
>    axis(side=2, at=seq_along(RangesList), lab=labels, las=1, tck=0)
>    box()
>    incr <- 0.45 / max(vapply(RangesList, nrow, 0))
>    xr <- par("usr")[1:2] # for intervals that extend to -Inf or Inf.
>    for(i in seq_along(RangesList)) {
>        r <- RangesList[[i]]
>        if (nrow(r)>0) {
>            y <- i + seq(0, by=incr, len=nrow(r))
>            r <- r[order(r[,1]),,drop=FALSE]
>            segments(pmax(r[,1], xr[1]), y, pmin(r[,2], xr[2]), y)
>         }
>    }
> }
>
>
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf
> > Of Ben quant
> > Sent: Saturday, May 12, 2012 10:54 AM
> > To: r-help@r-project.org
> > Subject: [R] range segment exclusion using range endpoints
> >
> > Hello,
> >
> > I'm posting this again (with some small edits). I didn't get any replies
> > last time...hoping for some this time. :)
> >
> > Currently I'm only coming up with brute force solutions to this issue
> > (loops). I'm wondering if anyone has a better way to do this. Thank you
> for
> > your help in advance!
> >
> > The problem: I have endpoints of one x range (x_rng) and an unknown
> number
> > of s ranges (s[#]_rng) also defined by the range endpoints. I'd like to
> > remove the x ranges that overlap with the s ranges. The examples below
> > demonstrate what I mean.
> >
> > What is the best way to do this?
> >
> > Ex 1.
> > For:
> > x_rng = c(-100,100)
> >
> > s1_rng = c(-25.5,30)
> > s2_rng = c(0.77,10)
> > s3_rng = c(25,35)
> > s4_rng = c(70,80.3)
> > s5_rng = c(90,95)
> >
> > I would get:
> > -100,-25.5
> > 35,70
> > 80.3,90
> > 95,100
> >
> > Ex 2.
> > For:
> > x_rng = c(-50.5,100)
> >
> > s1_rng = c(-75.3,30)
> >
> > I would get:
> > 30,100
> >
> > Ex 3.
> > For:
> > x_rng = c(-75.3,30)
> >
> > s1_rng = c(-50.5,100)
> >
> > I would get:
> > -75.3,-50.5
> >
> > Ex 4.
> > For:
> > x_rng = c(-100,100)
> >
> > s1_rng = c(-105,105)
> >
> > I would get something like:
> > NA,NA
> > or...
> > NA
> >
> > Ex 5.
> > For:
> > x_rng = c(-100,100)
> >
> > s1_rng = c(-100,100)
> >
> > I would get something like:
> > -100,-100
> > 100,100
> > or just...
> > -100
> >  100
> >
> > PS - You may have noticed that in all of the examples I am including the
> s
> > range endpoints in the desired results, which I can deal with later in my
> > program so its not a problem...  I think leaving in the s range endpoints
> > simplifies the problem.
> >
> > Thanks!
> > Ben
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to