Hi Thomas,

In some particular situations seqselect<- was using some tricks to be fast. In IRanges 1.20.6, I've ported these same tricks to [<- so the performance regression you report below should be gone.

Let me know if you run into other issues with the subsetting code.

Thanks,
H.

On 11/11/2013 05:06 PM, Thomas Sandmann wrote:

Hi Herve,

thanks a lot for re-enabling the subsetting functionality for CompressedRleList with List-like objects.

While things work now, I noticed a big difference in execution time for the following operations:

With IRanges_1.18.2,

    rles <- RleList(Rle(values=TRUE, lengths=10000),
                    Rle(values=TRUE, lengths=10000),
                    Rle(values=TRUE, lengths=10000),
                    Rle(values=TRUE, lengths=10000),
                    Rle(values=TRUE, lengths=10000),
                    Rle(values=TRUE, lengths=10000),
                    Rle(values=TRUE, lengths=10000),
                    Rle(values=TRUE, lengths=10000),
                    compress=TRUE)

    system.time(seqselect(rles,
                          unname(list(a=20:108, b=41:131, c=21:105, d=1:1234,
                                      e=4:5, f=1223:1243, g=432:5234,
                                      h=444:5555))) <- TRUE)

clocks ca. 0.040s on my system.

R 3.0.2 with other attached packages:
 [1] Rsamtools_1.12.2     Biostrings_2.28.0    devtools_1.3
 [4] GenomicRanges_1.12.4 IRanges_1.18.2       BiocGenerics_0.6.0
 [7] Defaults_1.1-1       BiocInstaller_1.10.3 roxygen2_2.2.2
[10] digest_0.6.3

With IRanges_1.20.5, the same operation is much slower:

    system.time(rles[unname(list(a=20:108, b=41:131, c=21:105, d=1:1234,
                                 e=4:5, f=1223:1243, g=432:5234,
                                 h=444:5555))] <- TRUE)

takes about 0.45s, more than 10x longer.

R 3.0.0 with other attached packages:
 [1] devtools_1.3         rtracklayer_1.22.0   Rsamtools_1.14.1
 [4] Biostrings_2.30.0    GenomicRanges_1.14.3 XVector_0.2.0
 [7] IRanges_1.20.5       BiocGenerics_0.8.0   Defaults_1.1-1
[10] BiocInstaller_1.12.0 roxygen2_2.2.2       digest_0.6.3

I noticed even larger speed degradation with real-life, longer datasets, so the decrease appears to be non-linear.

Can you reproduce this difference in performance? If so, would it be possible to reinstate the old seqselect method for the sake of efficiency?

Thomas

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel