Thanks, Martin. I agree that using AnnotationHub to manage the db resources is a better option than how it is currently set up. A pull request would be much appreciated.
On Tue, Aug 4, 2015 at 3:14 PM Vincent Carey <st...@channing.harvard.edu> wrote:

> On Tue, Aug 4, 2015 at 3:00 PM, Martin Morgan <mtmor...@fredhutch.org> wrote:
>
>> On 08/04/2015 06:43 AM, Nathan Olson wrote:
>>
>>> We are starting to work on an infrastructure for annotation of 16S
>>> metagenomic sequencing datasets and would like your comments and/or
>>> contributions. Below are links to two github repositories:
>>> metagenomeFeatures and greengenes13.5MgDb. The metagenomeFeatures
>>> package contains two classes; mgDb, for 16S sequence databases, and
>>> metagenomeAnnotation, for annotating a sequence dataset with taxonomic
>>> information from a mgDb object. The greengenes13.5MgDb package, loads
>>> a mgDb object with the greengenes 13.5 database. greengenes 13.5 was
>>> used as an
>>
>> does it make sense to use AnnotationHub to manage these resources?
>
> I would think so. At this time, trying to install greengenes13.5MgDb
> package, the process "testing whether the package can be loaded" takes a
> very long time -- I suspect it is doing some silent downloading. IMHO such
> activities should be explicitly undertaken by the user.
>
>> Instead of downloading and managing the fasta and taxonomy files in
>> .onLoad and getGreenGenes13.5Db, .onLoad would be
>>
>>     hub = AnnotationHub()
>>     db_seq = hub[["AH12345"]]
>>     db_taxa_file = hub[["AH12346"]]
>
> With this setup the first installation of the package could involve a long
> download, silent by default. It's feasible but quite unusual.
>
>> with a 'recipe' describing how the corresponding annotation hub resources
>> are to be created. This would move download and management to
>> AnnotationHub, and potentially allow use of the annotation hub records by
>> people with other interests. If that sounds interesting we can work up a
>> pull request.
>>
>> Martin
>>
>>> example database, we plan on adding additional packages for other
>>> commonly used databases, e.g RDP and Silva.
>>>
>>> The metagenomeFeatures includes two vignettes to demonstrating the mgDb
>>> and metagenomeAnnotation class methods using the greengenes13.5MgDb as
>>> an example database.
>>>
>>> We are planning on adding additional methods for the mgDb and
>>> metagenomeAnnotation classes. For the mgDb class, assigning query
>>> sequences to database sequences using rRDP classifier, and/or sequence
>>> alignment methods that are part of the Biostrings package. For the
>>> metagenomeAnnotation class we plan to include the ability to create a
>>> phylogenetic tree from a metagenomeAnnotation object.
>>> We would appreciate comments on the package and suggestions for
>>> additional features.
>>>
>>> Links to package github repositories
>>>
>>> https://github.com/HCBravoLab/metagenomeFeatures
>>>
>>> https://github.com/HCBravoLab/greengenes13.5MgDb
>>>
>>> Thanks
>>>
>>> Nate Olson and Hector Corrada Bravo
>>
>> --
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>>
>> _______________________________________________
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel