Hi Aditya, I think straddle would be a great addition to plyranges. Happy for you to put in a PR and add you as a contributor.
Maybe instead of specifying the start etc. we could dispatch on anchored ranges instead? So we’d follow the anchor_start(gr) %>% straddle(). We could also have the directed version for considering strands. https://github.com/sa-lee/plyranges Thanks, Stuart --- Stuart Lee Visiting PhD Student - Ritchie Lab On 13 Sep 2019, at 22:38, Michael Lawrence <lawrence.mich...@gene.com<mailto:lawrence.mich...@gene.com>> wrote: Thanks for these suggestions; I think they're worth considering. I've never been totally satisfied with (my function) flank(), because it's limited and its arguments are somewhat obscure in meaning. You can check out what we did in plyranges: https://rdrr.io/bioc/plyranges/man/flank-ranges.html. Your functions are more flexible, because they are two-way about the endpoint, like promoters(). Sometimes I've solved that with resize(flank()), but that's not ideal. Maybe a better name is "straddle" for when ranges straddle one of the endpoints? In keeping with the current pattern of Ranges API, there would be a single function: straddle(x, side, left, right, ignore.strand=FALSE). So straddle(x, "start", -100, 10) would be like promoters(x, 100, 10) for a positive or "*" strand range. That brings up strandedness, which needs to be considered here. For unstranded ranges, it may be that direct start() and end() manipulation is actually more transparent than a special verb. I wonder what Stuart Lee thinks? The functions that involve reduce() wouldn't fit into the intrarange operations, as they are summarizing ranges, not transforming them. They may be going too far. Michael On Fri, Sep 13, 2019 at 4:48 AM Bhagwat, Aditya <aditya.bhag...@mpi-bn.mpg.de<mailto:aditya.bhag...@mpi-bn.mpg.de>> wrote: Dear bioc-devel, The ?GenomicRanges::`intra-range-methods` are very useful for range arithmetic<https://genomicsclass.github.io/book/pages/figure/bioc1_igranges-unnamed-chunk-6-1.png> Feedback request: would it be of general use to add the methods below to the GenomicRanges::`intra-range-methods` palette (after properly S4-ing them)? Or shall I keep them in multicrispr<https://gitlab.gwdg.de/loosolab/software/multicrispr>? Additional feedback welcome as well (e.g. re-implementation of already existing functionality). 1) Left flank #' Left flank #' @param gr \code{\link[GenomicRanges]{GRanges-class}} #' @param leftstart number: flank start (relative to range start) #' @param leftend number: flank end (relative to range start) #' @return a \code{\link[GenomicRanges]{GRanges-class}} #' @export #' @examples #' bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr') #' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus #' gr <- read_bed(bedfile, bsgenome) #' left_flank(gr) left_flank <- function(gr, leftstart = -200, leftend = -1){ # Assert assert_is_identical_to_true(is(gr, 'GRanges')) assert_is_a_number(leftstart) assert_is_a_number(leftend) # Flank newranges <- gr end(newranges) <- start(gr) + leftend start(newranges) <- start(gr) + leftstart # Return newranges } 2) Right flank #' Right flank #' @param gr \code{\link[GenomicRanges]{GRanges-class}} #' @param rightstart number: flank start (relative to range end) #' @param rightend number: flank end (relative to range end) #' @return \code{\link[GenomicRanges]{GRanges-class}} #' @export #' @examples #' bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr') #' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus #' gr <- read_bed(bedfile, bsgenome) #' right_flank(gr) #' @export right_flank <- function(gr, rightstart = 1, rightend = 200){ # Assert assert_is_identical_to_true(is(gr, 'GRanges')) assert_is_a_number(rightstart) assert_is_a_number(rightend) assert_is_a_bool(verbose) # Flank newranges <- gr start(newranges) <- end(newranges) + rightstart end(newranges) <- end(newranges) + rightend # Plot if (plot) plot_intervals(GRangesList(sites = gr, rightflanks = newranges)) # Return cmessage('\t\t%d right flanks : [end%s%d, end%s%d]', length(newranges), csign(rightstart), abs(rightstart), csign(rightend), abs(rightend)) newranges } 3) Slop #' Slop (i.e. extend left/right) #' @param gr \code{\link[GenomicRanges]{GRanges-class}} #' @param leftstart number: flank start (relative to range start) #' @param rightend number: flank end (relative to range end) #' @return \code{\link[GenomicRanges]{GRanges-class}} #' @export #' @examples #' bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr') #' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus #' gr <- read_bed(bedfile, bsgenome) #' slop(gr) #' @export slop <- function(gr, leftstart = -22, rightend = 22){ # Assert assert_is_identical_to_true(methods::is(gr, 'GRanges')) assert_is_a_number(leftstart) assert_is_a_number(rightend) assert_is_a_bool(verbose) # Slop newranges <- gr start(newranges) <- start(newranges) + leftstart end(newranges) <- end(newranges) + rightend # Return newranges } 4) Flank fourways #' Flank fourways #' #' Flank left and right, for both strands, and merge overlaps #' @param gr \code{\link[GenomicRanges]{GRanges-class}} #' @param leftstart number: left flank start (relative to range start) #' @param leftend number: left flank end (relative to range start) #' @param rightstart number: right flank start (relative to range end) #' @param rightend number: right flank end (relative to range end) #' @return \code{\link[GenomicRanges]{GRanges-class}} #' @examples #' bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr') #' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus #' granges <- read_bed(bedfile, bsgenome) #' flank_fourways(granges) #' @export flank_fourways <- function(gr, leftstart = -200, leftend = -1, rightstart = 1, rightend = 200){ # Comply . <- NULL # Flank left <- left_flank( gr, leftstart, leftend) right <- right_flank(gr,rightstart, rightend) newranges <- c(left, right) # Complement newranges %<>% c(invertStrand(.)) # Merge overlaps newranges %<>% reduce() # GenomicRanges::reduce # Return newranges } 5) Slop fourways #' Slop granges for both strands, merging overlaps #' @param gr \code{\link[GenomicRanges]{GRanges-class}} #' @param leftstart number #' @param rightend number #' @return \code{\link[GenomicRanges]{GRanges-class}} #' @examples #' bedfile <- system.file('extdata/SRF.bed', package = 'multicrispr') #' bsgenome <- BSgenome.Mmusculus.UCSC.mm10::Mmusculus #' gr <- read_bed(bedfile, bsgenome) #' gr #' slop_fourways(gr) #' @export slop_fourways <- function(gr, leftstart = -22, rightend = 22){ # Comply . <- NULL # Slop if (verbose) cmessage('\tSlop fourways') newranges <- slop(gr, leftstart, rightend, verbose = verbose) # Complement newranges %<>% c(invertStrand(.)) # Merge overlaps newranges %<>% reduce() # Return newranges } Thankyou! Aditya [[alternative HTML version deleted]] _______________________________________________ Bioc-devel@r-project.org<mailto:Bioc-devel@r-project.org> mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel -- Michael Lawrence Scientist, Bioinformatics and Computational Biology Genentech, A Member of the Roche Group Office +1 (650) 225-7760 micha...@gene.com<mailto:micha...@gene.com> Join Genentech on LinkedIn | Twitter | Facebook | Instagram | YouTube _______________________________________________ The information in this email is confidential and intended solely for the addressee. You must not disclose, forward, print or use it without the permission of the sender. The Walter and Eliza Hall Institute acknowledges the Wurundjeri people of the Kulin Nation as the traditional owners of the land where our campuses are located and the continuing connection to country and community. _______________________________________________ [[alternative HTML version deleted]] _______________________________________________ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel