Re: [Bioc-devel] Best practices to load data for vignette/tests

Shepherd, Lori Thu, 24 Jan 2019 05:44:40 -0800

The transcriptome datasets have switched to 2bit files.  We provide the updated 
TwoBitFiles in AnnotationHub (again, we have not yet added release 95, but we 
do have release 94).



> query(hub, c("ensembl", "elegans", "release-94", "2bit"))
AnnotationHub with 4 records
# snapshotDate(): 2019-01-14
# $dataprovider: Ensembl
# $species: Caenorhabditis elegans
# $rdataclass: TwoBitFile
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype
# retrieve records with, e.g., 'object[["AH65579"]]'

            title
  AH65579 | Caenorhabditis_elegans.WBcel235.cdna.all.2bit
  AH65580 | Caenorhabditis_elegans.WBcel235.dna_rm.toplevel.2bit
  AH65581 | Caenorhabditis_elegans.WBcel235.dna_sm.toplevel.2bit
  AH65582 | Caenorhabditis_elegans.WBcel235.ncrna.2bit
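
As a minimal sketch (not part of the original message), one of the records 
listed above could be fetched roughly as follows. It assumes the AnnotationHub 
package is installed and the hub is reachable over the network; the helper 
name fetch_transcriptome_2bit is hypothetical.

```r
# Hypothetical helper: fetch a 2bit record from AnnotationHub by its ID.
# Assumes AnnotationHub is installed and a network connection is available.
fetch_transcriptome_2bit <- function(id = "AH65579") {
    hub <- AnnotationHub::AnnotationHub()
    # `[[` downloads the resource (or reuses the local cache) and
    # returns it as a TwoBitFile object.
    hub[[id]]
}

# Usage (requires network access):
# tb <- fetch_transcriptome_2bit("AH65579")
# GenomeInfoDb::seqinfo(tb)   # sequence names and lengths in the 2bit file
```

For a vignette or test, this would keep the large transcriptome file out of 
the package itself, since it is downloaded once and then reused from the 
local cache.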





Also, to get the path of a downloaded AnnotationHub resource, please use the 
format


cache(ah["AH50789"])


instead of


ah[["AH50789"]]$path
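
A short sketch of the recommended form, wrapped in a hypothetical helper 
(hub_resource_path is not an AnnotationHub function). It assumes the 
AnnotationHub package is installed and that the first argument is an 
AnnotationHub object.

```r
# Hypothetical helper: return the local file path of a hub resource.
# Single-bracket subsetting keeps an AnnotationHub object, which is what
# cache() expects; cache() downloads the file if it is not cached yet.
hub_resource_path <- function(hub, id) {
    AnnotationHub::cache(hub[id])
}

# Usage (requires network access):
# ah <- AnnotationHub::AnnotationHub()
# hub_resource_path(ah, "AH50789")
```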



Cheers,



Lori Shepherd

Bioconductor Core Team

Roswell Park Cancer Institute

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263

________________________________
From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of Julien 
Wollbrett <julien.wollbr...@unil.ch>
Sent: Tuesday, January 22, 2019 8:57:23 AM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] Best practices to load data for vignette/tests

Hi everyone,

I am currently working on an R package called BgeeCall that automatically
generates present/absent expression calls from any RNA-Seq fastq files,
as long as the species is present in Bgee (https://bgee.org/).



The package is almost ready and I am currently writing the vignette and
some tests.

This package can be seen as a workflow taking as input one transcriptome
and at least one fastq file.

My question is: how can I import these two files to run the vignette/tests?
They are too big to be part of my package.
Can I directly download them from SRA and ensembl (or from my own
server)? Do I need to create a dataset that will be loaded by my package
for this kind of raw and publicly available data?
Do you know if I could reuse an existing dataset? I am
interested in any best-practice information.
Thank you for your answers.

Best Regards,

Julien

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

