On 09/08/2014 11:41 AM, Michael Lawrence wrote:
I might have requested the genome annotation, but I'm pretty sure it wasn't me who decide on tracking it on a per-sequence basis.
OK, maybe. Don't trust my memory too much on this. No regrets though. I think it was the right thing to do ;-) Just because the SAM/BAM format does it is a good enough reason for us to do it too. According to the SAM Spec, the header can look like: @HD VN:1.3 SO:coordinate @SQ SN:chr1_hg19 LN:45 AS:hg19 @SQ SN:chr1_mm10 LN:42 AS:mm10 The only problem is that seqinfo(BamFile("test.bam")) seems to ignore the AS tag (genome assembly identifier) at the moment: > seqinfo(BamFile("test.bam")) Seqinfo of length 2 seqnames seqlengths isCircular genome chr1_hg19 45 NA <NA> chr1_mm10 42 NA <NA> Hopefully that can be addressed. But that's a separate issue... Cheers, H.
I could imagine use cases for that though, e.g., when diagnosing sequencing contamination (like human vs. mouse). But most other tools and file formats expect a single genome per "track", so, for example, rtracklayer has an internal function singleGenome() to take care of this. On Mon, Sep 8, 2014 at 10:50 AM, Hervé Pagès <hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>> wrote: Hi Vince, Yes it would make sense to have the "show" method report the genome when genome(x) contains a unique non-NA value. I think the main use case for having the genome defined at the sequence level instead of the whole object level is metagenomics. Maybe Michael has some other good use cases to share since IIRC he requested the addition of the genome field a couple of years ago and made the case for having it defined at the sequence level. Cheers, H. On 09/08/2014 07:21 AM, Vincent Carey wrote: For GRanges x, my naive expectation is that genome(x) returns a length- one tag identifying the genome to which chromosomal coordinates correspond. The genome() method seems to have sequence-specific semantics, which makes sense, but when we identify sequence with chromosome, it seems too complicated. Is there a use case for a GRanges with sequences from several different genomes? One reason I am inquiring is that I feel it would be nice to have the GRanges show() method report, prominently, the genome in use (or NA if unspecified). This could be accomplished by reporting unique(genome(x)), and perhaps that would be satisfactory. after example(genome) : seqinfo(txdb) Seqinfo of length 15 seqnames seqlengths isCircular genome CH2L 23011544 FALSE dm3 CH2R 21146708 FALSE dm3 CH3L 24543557 FALSE dm3 CH3R 27905053 FALSE dm3 CH4 1351857 FALSE dm3 ... ... ... ... CH3LHet 2555491 FALSE dm3 CH3RHet 2517507 FALSE dm3 CHXHet 204112 FALSE dm3 CHYHet 347038 FALSE dm3 CHUextra 29004656 FALSE dm3 genome(seqinfo(txdb)) CH2L CH2R CH3L CH3R CH4 CHX CHU M "dm3" "dm3" "dm3" "dm3" "dm3" "dm3" "dm3" "dm3" CH2LHet CH2RHet CH3LHet CH3RHet CHXHet CHYHet CHUextra "dm3" "dm3" "dm3" "dm3" "dm3" "dm3" "dm3" [[alternative HTML version deleted]] _________________________________________________ Bioc-devel@r-project.org <mailto:Bioc-devel@r-project.org> mailing list https://stat.ethz.ch/mailman/__listinfo/bioc-devel <https://stat.ethz.ch/mailman/listinfo/bioc-devel> -- Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpa...@fhcrc.org <mailto:hpa...@fhcrc.org> Phone: (206) 667-5791 <tel:%28206%29%20667-5791> Fax: (206) 667-1319 <tel:%28206%29%20667-1319> _________________________________________________ Bioc-devel@r-project.org <mailto:Bioc-devel@r-project.org> mailing list https://stat.ethz.ch/mailman/__listinfo/bioc-devel <https://stat.ethz.ch/mailman/listinfo/bioc-devel>
-- Hervé Pagès Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M1-B514 P.O. Box 19024 Seattle, WA 98109-1024 E-mail: hpa...@fhcrc.org Phone: (206) 667-5791 Fax: (206) 667-1319 _______________________________________________ Bioc-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel