Hi bioconductor devs, The BSgenomeViews class has been very useful in efficiently propagating metadata for running Biostring operations. I noticed something unexpected when iterating over views - it seems to return the Biostrings object instead of a single length Views object, and thus loses the associated view metadata. Is this intentional? Below is some example code, the output and sessionInfo(). Yes, I also confirmed this happens in the development version of R / bioconductor 3.5.
On a side note, for unit testing it's been difficult to mock a BSgenome object due to the link to physical files, and as a workaround I use a small, arbitrary BSgenome package. Can one construct a BSgenome from its package bundled extdata? The man page examples use packaged genomes. Code to reproduce the issue: ---------------------------------------------------------------------- library(BSgenome) genome <- getBSgenome("BSgenome.Hsapiens.UCSC.hg19") gr <- GRanges(c("chr1:25001-28000", "chr2:30001-31000")) views <- Views(genome, gr) views lapply(views, class) ---------------------------------------------------------------------- Result: ---------------------------------------------------------------------- > views BSgenomeViews object with 2 views and 0 metadata columns: seqnames ranges strand dna <Rle> <IRanges> <Rle> <DNAStringSet> [1] chr1 [25001, 28000] * [GCTTCAGCCT...TTATTTATTG] [2] chr2 [30001, 31000] * [GACCCTCCTG...AGCAGGTGGT] ------- seqinfo: 93 sequences (1 circular) from hg19 genome > lapply(views, class) [[1]] [1] "DNAString" attr(,"package") [1] "Biostrings" [[2]] [1] "DNAString" attr(,"package") [1] "Biostrings" > ---------------------------------------------------------------------- Tested against these configurations: 1) R 3.3.2 + BSgenome 1.42.0 (stable bioconductor 3.4) 2) R 2017-04-05 installed via llnl/spack + BSgenome 1.43.7 (devel bioconductor 3.5) sessionInfo for configuration #2 above: ---------------------------------------------------------------------- > sessionInfo() R Under development (unstable) (2017-04-05 r72488) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 16.04.2 LTS Matrix products: default BLAS: /share/apps/spack/opt/spack/linux-ubuntu16-x86_64/gcc-5.4.0/r-2017-04-05-4tkzhsu6sdpwmlvnv275jf6x766gwnpy/rlib/R/lib/libRblas.so LAPACK: /share/apps/spack/opt/spack/linux-ubuntu16-x86_64/gcc-5.4.0/r-2017-04-05-4tkzhsu6sdpwmlvnv275jf6x766gwnpy/rlib/R/lib/libRlapack.so locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets [8] methods base other attached packages: [1] BSgenome.Hsapiens.UCSC.hg19_1.4.0 BSgenome_1.43.7 [3] rtracklayer_1.35.10 Biostrings_2.43.7 [5] XVector_0.15.2 GenomicRanges_1.27.23 [7] GenomeInfoDb_1.11.10 IRanges_2.9.19 [9] S4Vectors_0.13.15 BiocGenerics_0.21.3 loaded via a namespace (and not attached): [1] zlibbioc_1.21.0 GenomicAlignments_1.11.12 [3] BiocParallel_1.9.5 lattice_0.20-35 [5] tools_3.5.0 SummarizedExperiment_1.5.7 [7] grid_3.5.0 Biobase_2.35.1 [9] matrixStats_0.52.1 Matrix_1.2-9 [11] GenomeInfoDbData_0.99.0 bitops_1.0-6 [13] RCurl_1.95-4.8 DelayedArray_0.1.7 [15] compiler_3.5.0 Rsamtools_1.27.15 [17] XML_3.98-1.6 > BiocInstaller::biocValid() [1] TRUE > --- Pariksheet Nanda PhD Candidate in Genetics and Genomics System Administrator, Storrs HPC Cluster University of Connecticut [[alternative HTML version deleted]] _______________________________________________ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel