Dear Valerie,

Great improvement. Thank you very much for your work; I greatly appreciate it.

Yours sincerely,

Jianhong Ou

LRB 670A
Program in Gene Function and Expression
364 Plantation Street
Worcester, MA 01605

On 8/27/13 4:49 PM, "Valerie Obenchain" <voben...@fhcrc.org> wrote:

>Thanks Jianhong for reporting this.
>
>Changes implemented in IRanges 1.19.27:
>- RleList() constructor now has default 'compress=TRUE'.
>- seqselect,Vector-method lapply() loop was replaced with direct subset.
>
>New timings:
>
>## generic subset function
>fun0 <- function(x) x[500:1]
>
>## GRangesList with RleList as metadata col
>grll <- GRanges(seqnames="chr1",
>                IRanges(start=1:500, width=2),
>                someInfo=rep(RleList("*"), 500))
>grr <- split(grll, 1:500)
>
> > microbenchmark(fun0(grr), times=10)
>Unit: milliseconds
>      expr      min       lq   median      uq      max neval
> fun0(grr) 28.88062 29.31157 30.58494 31.4393 32.26367    10
>
>Median is now 0.031 seconds compared to the previous 1.635.
>
>>> > system.time(grr<- grr[500:1])
>>>    user  system elapsed
>>>   1.622   0.013   1.635
>
>Valerie
>
>
>On 08/23/2013 11:17 AM, Michael Lawrence wrote:
>>
>> On Fri, Aug 23, 2013 at 8:41 AM, Valerie Obenchain <voben...@fhcrc.org>
>> wrote:
>>
>>     Hi Michael,
>>
>>     Martin and I have been discussing this. In addition to the fix you
>>     suggest, what do you think of changing the default to
>>     compress=TRUE for the RleList constructor? RleList is the only one
>>     of the AtomicLists with default FALSE. Was there a reason for this
>>     when it was first implemented?
>>
>> I'm guessing Patrick did that because we always used Rles for coverage,
>> and RleList for per-chromosome coverage. Also, there might be some
>> overhead in that Rle runs in the unlistData can cross list elements.
>>
>> About my fix, the only downside would be if the range widths were much
>> larger than the size of the vector, e.g., a highly compressed Rle,
>> selected with chromosome-size ranges. Then the as.integer(ir) is big
>> compared to the data. Otherwise, it's way faster.
>>
>>     Val
>>
>>     On 08/22/2013 07:34 PM, Maintainer wrote:
>>
>>         Hi,
>>
>>         SimpleLists are slow in this situation, basically because the
>>         underlying seqselect is slow, due to this loop:
>>
>>         x <- do.call(c, lapply(seq_len(length(ir)),
>>                                function(i)
>>                                    window(x, start = start(ir)[i],
>>                                           width = width(ir)[i])))
>>
>>         Am I missing something or could this become a simple
>>         x[as.integer(ir)]?
>>
>>         In the meantime, using CompressedLists is the way to go. So for
>>         an RleList, you need to pass compress=TRUE to the constructor.
>>
>>         On Wed, Aug 21, 2013 at 8:30 AM, Ou, Jianhong
>>         <jianhong...@umassmed.edu> wrote:
>>
>>             Hi,
>>
>>             When I use a big set of GRangesList, I find it becomes very
>>             slow when the metadata contain an AtomicList, e.g.
>>
>>             > grll <- GRanges(seqnames="chr1", ranges=IRanges(start=1:500, width=2), someInfo=rep(RleList("*"), 500))
>>             > grr <- split(grll, 1:500)
>>             > grl <- as.list(grr)
>>             > system.time(grl<- grl[500:1])
>>                user  system elapsed
>>                   0       0       0
>>             > system.time(grr<- grr[500:1])
>>                user  system elapsed
>>               1.622   0.013   1.635
>>             > grll <- GRanges(seqnames="chr1", ranges=IRanges(start=1:500, width=2))
>>             > grr <- split(grll, 1:500)
>>             > grl <- as.list(grr)
>>             > system.time(grl<- grl[500:1])
>>                user  system elapsed
>>                   0       0       0
>>             > system.time(grr<- grr[500:1])
>>                user  system elapsed
>>               0.029   0.001   0.030
>>             > sessionInfo()
>>             R Under development (unstable) (2013-07-23 r63392)
>>             Platform: x86_64-apple-darwin12.4.0 (64-bit)
>>
>>             locale:
>>             [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>>             attached base packages:
>>             [1] parallel  stats     graphics  grDevices utils     datasets
>>             [7] methods   base
>>
>>             other attached packages:
>>             [1] GenomicRanges_1.13.36 XVector_0.1.0
>>             [3] IRanges_1.19.24       BiocGenerics_0.7.3
>>
>>             loaded via a namespace (and not attached):
>>             [1] stats4_3.1.0 tools_3.1.0
>>
>>             Is there any way to improve this?
>>
>>             Yours sincerely,
>>
>>             Jianhong Ou

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
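
For readers following the seqselect discussion in this thread, the difference between the per-range loop and the single direct subset can be illustrated with a minimal base-R sketch. This is not the IRanges implementation: the vector `x`, the integer vectors `starts` and `widths` (standing in for `start(ir)` and `width(ir)`), and the helper names `subset_by_loop()`, `expand_ranges()` and `subset_direct()` are all hypothetical and exist only for illustration.

```r
# Base-R stand-ins (assumptions, not IRanges objects):
# 'x' plays the role of the vector being subset,
# 'starts'/'widths' play the role of start(ir)/width(ir).
x <- letters
starts <- c(3L, 10L, 20L)
widths <- c(2L, 4L, 1L)

# Loop approach: extract one window per range, then concatenate the pieces.
subset_by_loop <- function(x, starts, widths) {
  pieces <- lapply(seq_along(starts), function(i) {
    x[seq.int(starts[i], length.out = widths[i])]
  })
  do.call(c, pieces)
}

# Direct approach: expand all ranges into one position vector
# (the analogue of as.integer(ir)), then subset once.
expand_ranges <- function(starts, widths) {
  unlist(Map(function(s, w) seq.int(s, length.out = w), starts, widths))
}
subset_direct <- function(x, starts, widths) {
  x[expand_ranges(starts, widths)]
}

loop_result <- subset_by_loop(x, starts, widths)
direct_result <- subset_direct(x, starts, widths)

# Both approaches select the same elements in the same order.
stopifnot(identical(loop_result, direct_result))
```

The position vector built by `expand_ranges()` has length `sum(widths)`, which is the trade-off raised above: when the ranges are very wide relative to a highly compressed vector, the expanded index can be large compared to the data itself.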