It would be nice (for a number of reasons) to have chromosome lengths readily available in a foundational package like GenomeInfoDb, so that, say,

    R> data(seqinfo.hg19)
    R> seqinfo(myResults) <- seqinfo.hg19[ seqlevels(myResults) ]

would work without issues.

Is there any particular reason this couldn't happen for the supported/available BSgenomes? It would seem like a simple matter to do

    R> library(BSgenome.Hsapiens.UCSC.hg19)
    R> seqinfo.hg19 <- seqinfo(Hsapiens)
    R> save(seqinfo.hg19, file="~/bioc-devel/GenomeInfoDb/data/seqinfo.hg19.rda")

and be done with it until (say) the next release or the next released BSgenome.

I considered looping through the following BSgenomes myself... and unless everyone is strongly opposed, I may still do exactly that. Seems useful, no? E.g., for the following 42 builds:

    R> grep("(UCSC|NCBI)", unique(gsub(".masked", "", available.genomes())), value=TRUE)
     [1] "BSgenome.Amellifera.UCSC.apiMel2"   "BSgenome.Btaurus.UCSC.bosTau3"
     [3] "BSgenome.Btaurus.UCSC.bosTau4"      "BSgenome.Btaurus.UCSC.bosTau6"
     [5] "BSgenome.Btaurus.UCSC.bosTau8"      "BSgenome.Celegans.UCSC.ce10"
     [7] "BSgenome.Celegans.UCSC.ce2"         "BSgenome.Celegans.UCSC.ce6"
     [9] "BSgenome.Cfamiliaris.UCSC.canFam2"  "BSgenome.Cfamiliaris.UCSC.canFam3"
    [11] "BSgenome.Dmelanogaster.UCSC.dm2"    "BSgenome.Dmelanogaster.UCSC.dm3"
    [13] "BSgenome.Dmelanogaster.UCSC.dm6"    "BSgenome.Drerio.UCSC.danRer5"
    [15] "BSgenome.Drerio.UCSC.danRer6"       "BSgenome.Drerio.UCSC.danRer7"
    [17] "BSgenome.Ecoli.NCBI.20080805"       "BSgenome.Gaculeatus.UCSC.gasAcu1"
    [19] "BSgenome.Ggallus.UCSC.galGal3"      "BSgenome.Ggallus.UCSC.galGal4"
    [21] "BSgenome.Hsapiens.NCBI.GRCh38"      "BSgenome.Hsapiens.UCSC.hg17"
    [23] "BSgenome.Hsapiens.UCSC.hg18"        "BSgenome.Hsapiens.UCSC.hg19"
    [25] "BSgenome.Hsapiens.UCSC.hg38"        "BSgenome.Mfascicularis.NCBI.5.0"
    [27] "BSgenome.Mfuro.UCSC.musFur1"        "BSgenome.Mmulatta.UCSC.rheMac2"
    [29] "BSgenome.Mmulatta.UCSC.rheMac3"     "BSgenome.Mmusculus.UCSC.mm10"
    [31] "BSgenome.Mmusculus.UCSC.mm8"        "BSgenome.Mmusculus.UCSC.mm9"
    [33] "BSgenome.Ptroglodytes.UCSC.panTro2" "BSgenome.Ptroglodytes.UCSC.panTro3"
    [35] "BSgenome.Rnorvegicus.UCSC.rn4"      "BSgenome.Rnorvegicus.UCSC.rn5"
    [37] "BSgenome.Rnorvegicus.UCSC.rn6"      "BSgenome.Scerevisiae.UCSC.sacCer1"
    [39] "BSgenome.Scerevisiae.UCSC.sacCer2"  "BSgenome.Scerevisiae.UCSC.sacCer3"
    [41] "BSgenome.Sscrofa.UCSC.susScr3"      "BSgenome.Tguttata.UCSC.taeGut1"

Am I insane for suggesting this? It would make things a little easier for rtracklayer, most SummarizedExperiment and SE-derived objects, blah, blah, blah...

Best,

--t

Statistics is the grammar of science.
Karl Pearson <http://en.wikipedia.org/wiki/The_Grammar_of_Science>

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel