Works for me.

  Marc

On Tue, Sep 22, 2015 at 6:03 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:

> Hi Marc,
>
> On 09/22/2015 05:39 PM, Marc Carlson wrote:
>
>> Herve is right. UCSC doesn't give us this information, And actually, I
>> think it's pretty rare to see exon names from anybody. So it's weird
>> to me that they are a default return value for this method.
>>
>
> Ensembl does provide exon names/ids so any TxDb object that was created
> with makeTxDbFromBiomart("ensembl", ...) should have them:
>
>   library(GenomicFeatures)
>   txdb <- makeTxDbFromBiomart("ensembl", dataset="celegans_gene_ensembl")
>   exonsBy(txdb, use.names=TRUE)$Y74C9A.2a.2
>   # GRanges object with 4 ranges and 3 metadata columns:
>   #       seqnames         ranges strand |   exon_id          exon_name exon_rank
>   #          <Rle>      <IRanges>  <Rle> | <integer>        <character> <integer>
>   #   [1]        I [10413, 10585]      + |         1  WBGene00022276.e1         1
>   #   [2]        I [11618, 11689]      + |         6  WBGene00022276.e6         2
>   #   [3]        I [14951, 15160]      + |        11 WBGene00022276.e11         3
>   #   [4]        I [16473, 16842]      + |        14 WBGene00022276.e14         4
>   #   -------
>   #   seqinfo: 7 sequences (1 circular) from an unspecified genome
>
> Note that the *By() extractors don't let the user choose which column
> to return at the moment so that's why it was decided (a long time ago)
> to return exon internal ids *and* names (better more than less).
>
> H.
>
>
>>    Marc
>>
>> On Tue, Sep 22, 2015 at 5:29 PM, Hervé Pagès <hpa...@fredhutch.org> wrote:
>>
>>     Hi Sonali,
>>
>>     UCSC doesn't provide names for the exons of their gene models.
>>     See the table where this data is coming from:
>>
>>     https://genome.ucsc.edu/cgi-bin/hgTables?db=hg19&hgta_group=genes&hgta_track=knownGene&hgta_table=knownGene&hgta_doSchema=describe+table+schema
>>
>>     The exon information is all coming from the exonStarts and exonEnds
>>     columns. No exon names!
>>
>>     H.
>>
>>     PS: Maybe this would better be asked on the support site.
>>
>>
>>     On 09/22/2015 04:44 PM, Arora, Sonali wrote:
>>
>>         Hi everyone,
>>
>>         I was trying to get the exons by gene from a txdb object but I
>>         get NA's for all exon_name's. Please advise.
>>
>>          > library(TxDb.Hsapiens.UCSC.hg19.knownGene)
>>          > txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
>>          > ebg2 <- exonsBy(txdb, by="gene")
>>          >
>>          > ebg2
>>         GRangesList object of length 23459:
>>         $1
>>         GRanges object with 15 ranges and 2 metadata columns:
>>                 seqnames               ranges strand |   exon_id
>>                    <Rle>            <IRanges>  <Rle> | <integer>
>>             [1]    chr19 [58858172, 58858395]      - |    250809
>>             [2]    chr19 [58858719, 58859006]      - |    250810
>>             [3]    chr19 [58859832, 58860494]      - |    250811
>>             [4]    chr19 [58860934, 58862017]      - |    250812
>>             [5]    chr19 [58861736, 58862017]      - |    250813
>>             ...      ...                  ...    ... ...       ...
>>            [11]    chr19 [58868951, 58869015]      - |    250821
>>            [12]    chr19 [58869318, 58869652]      - |    250822
>>            [13]    chr19 [58869855, 58869951]      - |    250823
>>            [14]    chr19 [58870563, 58870689]      - |    250824
>>            [15]    chr19 [58874043, 58874214]      - |    250825
>>                   exon_name
>>                 <character>
>>             [1]        <NA>
>>             [2]        <NA>
>>             [3]        <NA>
>>             [4]        <NA>
>>             [5]        <NA>
>>             ...         ...
>>            [11]        <NA>
>>            [12]        <NA>
>>            [13]        <NA>
>>            [14]        <NA>
>>            [15]        <NA>
>>
>>         $10
>>         GRanges object with 2 ranges and 2 metadata columns:
>>                 seqnames               ranges strand | exon_id exon_name
>>             [1]     chr8 [18248755, 18248855]      + |  113603      <NA>
>>             [2]     chr8 [18257508, 18258723]      + |  113604      <NA>
>>
>>         ...
>>         <23457 more elements>
>>         -------
>>         seqinfo: 93 sequences (1 circular) from hg19 genome
>>          > testgr <- unlist(ebg2)
>>          > table(is.na(mcols(testgr)$exon_name))
>>
>>
>>            TRUE
>>          272776
>>          > sessionInfo()
>>         R version 3.2.2 RC (2015-08-09 r68965)
>>         Platform: x86_64-w64-mingw32/x64 (64-bit)
>>         Running under: Windows 7 x64 (build 7601) Service Pack 1
>>
>>         locale:
>>         [1] LC_COLLATE=English_United States.1252
>>         [2] LC_CTYPE=English_United States.1252
>>         [3] LC_MONETARY=English_United States.1252
>>         [4] LC_NUMERIC=C
>>         [5] LC_TIME=English_United States.1252
>>
>>         attached base packages:
>>         [1] stats4    parallel  stats     graphics  grDevices utils
>>         [7] datasets  methods   base
>>
>>         other attached packages:
>>         [1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.1
>>         [2] GenomicFeatures_1.21.29
>>         [3] AnnotationDbi_1.31.18
>>         [4] Biobase_2.29.1
>>         [5] GenomicRanges_1.21.28
>>         [6] GenomeInfoDb_1.5.16
>>         [7] IRanges_2.3.21
>>         [8] S4Vectors_0.7.18
>>         [9] BiocGenerics_0.15.6
>>
>>         loaded via a namespace (and not attached):
>>           [1] XVector_0.9.4              zlibbioc_1.15.0
>>           [3] GenomicAlignments_1.5.17   BiocParallel_1.3.52
>>           [5] tools_3.2.2                SummarizedExperiment_0.3.9
>>           [7] DBI_0.3.1                  lambda.r_1.1.7
>>           [9] futile.logger_1.4.1        rtracklayer_1.29.27
>>          [11] futile.options_1.0.0       bitops_1.0-6
>>          [13] RCurl_1.95-4.7             biomaRt_2.25.3
>>          [15] RSQLite_1.0.0              Biostrings_2.37.8
>>          [17] Rsamtools_1.21.17          XML_3.98-1.3
>>
>>
>>     --
>>     Hervé Pagès
>>
>>     Program in Computational Biology
>>     Division of Public Health Sciences
>>     Fred Hutchinson Cancer Research Center
>>     1100 Fairview Ave. N, M1-B514
>>     P.O. Box 19024
>>     Seattle, WA 98109-1024
>>
>>     E-mail: hpa...@fredhutch.org
>>     Phone:  (206) 667-5791
>>     Fax:    (206) 667-1319
>>
>>
>> _______________________________________________
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel