Hi Marc,

Thanks a lot for your advice.

As far as I know, the gff3 file is the only way to get the latest Gmax annotation build from Phytozome (http://www.phytozome.net/). It is now publicly available at
ftp://ftp.jgi-psf.org/pub/compgen/phytozome/v9.0/Gmax/annotation/Gmax_189_gene_exons.gff3.gz

The reason I didn't provide 'exonRankAttributeName' is that the gff3 file has no explicit number that gives the exon rank directly. Example lines:

Gm01 phytozome8_0 gene 27643 27977 . - . ID=Glyma01g00210;Name=Glyma01g00210
Gm01 phytozome8_0 mRNA 27643 27977 . - . ID=PAC:26325839;Name=Glyma01g00210.1;pacid=26325839;longest=1;Parent=Glyma01g00210
Gm01 phytozome8_0 exon 27913 27977 . - . ID=PAC:26325839.exon.1;Parent=PAC:26325839;pacid=26325839
Gm01 phytozome8_0 CDS 27913 27977 . - 0 ID=PAC:26325839.CDS.1;Parent=PAC:26325839;pacid=26325839
Gm01 phytozome8_0 exon 27643 27811 . - . ID=PAC:26325839.exon.2;Parent=PAC:26325839;pacid=26325839
Gm01 phytozome8_0 CDS 27643 27811 . - 1 ID=PAC:26325839.CDS.2;Parent=PAC:26325839;pacid=26325839

The ID attributes look like they carry the rank information (I see *.exon.1, *.exon.2), so I guess I can extract that information manually as an extra column and specify it in the call to 'makeTranscriptDbFromGFF'.

By the way, is this required? It looks like GenomicFeatures tries to infer the exon rank if I don't provide that information, so at first I thought 'exonRankAttributeName' was optional.

Thanks again

Tengfei

On Fri, Feb 8, 2013 at 6:08 PM, Marc Carlson <mcarl...@fhcrc.org> wrote:

> Hi Tengfei,
>
> Yes that looks like an oversight. Thanks for reporting that! I will
> extend makeTxDbPackage so that it's more accommodating of these newer
> transcriptDbs. If you want to help me out, you could call saveDb() on your
> gmax189 object and send me the .sqlite file that you save it to.
>
> Also, if you have any alternate options for importing your data (other
> than using GFF or GTF): I think you probably should consider it.
> The file specifications for these filetypes are missing key details and so
> you can very easily get a "legal" GFF or GTF file that is actually missing
> important details from its contents. For example, they can commonly lack
> information about the order of the exons for a given transcript, which can
> render them difficult (or impossible) to use for transcript work. But for
> these specifications, that information is "optional".
>
> Marc
>
> On 02/06/2013 09:46 PM, Tengfei Yin wrote:
>
>> Dear all,
>>
>> I am trying to build a txdb object from gff3 for soybean data and to
>> make it a package. The code I used is:
>>
>> gmax189 <- makeTranscriptDbFromGFF("~/Gmax_189_gene_exons.gff3",
>>                                    format = "gff3",
>>                                    species = "Glycine max",
>>                                    dataSource = "http://www.phytozome.org/")
>> makeTxDbPackage(txdb = gmax189,
>>                 version = "0.9.1",
>>                 maintainer = "Tengfei Yin",
>>                 author = "Tengfei Yin",
>>                 destDir = ".",
>>                 license = "Artistic-2.0")
>>
>> Error message:
>> Error in gsub("_", "", pkgName) :
>>   error in evaluating the argument 'x' in selecting a method for function
>>   'gsub': Error: object 'pkgName' not found
>>
>> Looks like my dataSource should be either BioMart or UCSC, otherwise no
>> pkgName will be produced in the function .makePackageName?
>>
>> Or should I build the annotation package in some other way?
>>
>> Thanks a lot
>>
>> Tengfei
>>
>> my sessionInfo
>>
>> sessionInfo()
>> R Under development (unstable) (2013-01-21 r61728)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>  [7] LC_PAPER=C                 LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] parallel stats graphics grDevices utils datasets methods
>> [8] base
>>
>> other attached packages:
>> [1] GenomicFeatures_1.11.8 AnnotationDbi_1.21.10 Biobase_2.19.2
>> [4] GenomicRanges_1.11.28  IRanges_1.17.31       BiocGenerics_0.5.6
>>
>> loaded via a namespace (and not attached):
>>  [1] biomaRt_2.15.0     Biostrings_2.27.10 bitops_1.0-5      BSgenome_1.27.1
>>  [5] DBI_0.2-5          RCurl_1.95-3       Rsamtools_1.11.15 RSQLite_0.11.2
>>  [9] rtracklayer_1.19.9 stats4_3.0.0       tools_3.0.0       XML_3.95-0.1
>> [13] zlibbioc_1.5.0
>>
>
> _______________________________________________
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

--
Tengfei Yin
MCDB PhD student
1620 Howe Hall, 2274,
Iowa State University
Ames, IA, 50011-2274
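
P.S. On the exon-rank question in the reply at the top of this message: a minimal base-R sketch of pulling the rank out of the ID attribute might look like the following. It assumes that the trailing number in IDs such as PAC:26325839.exon.1 really is the exon's rank within its transcript (the two minus-strand exons quoted above are consistent with that, but the file does not state it). The names extract_exon_rank, add_rank_attribute and exon_rank are hypothetical and are not part of GenomicFeatures.

```r
# Attribute strings copied from the two exon lines quoted in this message.
attrs <- c(
  "ID=PAC:26325839.exon.1;Parent=PAC:26325839;pacid=26325839",
  "ID=PAC:26325839.exon.2;Parent=PAC:26325839;pacid=26325839"
)

# Hypothetical helper: return the number that follows ".exon." in the ID
# attribute, or NA when the ID does not have that shape (e.g. gene lines).
extract_exon_rank <- function(attributes) {
  pattern <- "^ID=[^;]*\\.exon\\.([0-9]+)(;.*)?$"
  has_rank <- grepl(pattern, attributes)
  rank <- rep(NA_integer_, length(attributes))
  rank[has_rank] <- as.integer(sub(pattern, "\\1", attributes[has_rank]))
  rank
}

# Hypothetical helper: append the rank as an extra, explicitly named
# attribute so that it can be referred to by name later on.
# ASSUMPTION: "exon_rank" is an arbitrary name chosen for this sketch.
add_rank_attribute <- function(attributes, name = "exon_rank") {
  rank <- extract_exon_rank(attributes)
  ifelse(is.na(rank), attributes, paste0(attributes, ";", name, "=", rank))
}

ranks  <- extract_exon_rank(attrs)
tagged <- add_rank_attribute(attrs)
print(ranks)
print(tagged)
```

If the rewritten attribute column is written back into the gff3 file, the new attribute name is presumably what would be passed as 'exonRankAttributeName', though I have not verified how makeTranscriptDbFromGFF treats it.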