On Fri, Sep 18, 2015 at 8:36 PM, Kasper Daniel Hansen <kasperdanielhan...@gmail.com> wrote:
> Interesting, thanks for the pointer.
>
> In light of the existing (and future) work on this, may I suggest an
> eSet-like class, but built using the technologies in
> SummarizedExperiment. I.e., a SummarizedExperiment without the
> rowRanges. I would very much like this for modern work using eSet-like
> containers. Not everything has ranges.
>
> Vince: I am not claiming that it is easy to work with; we have pains as
> well. But am I missing something or is the assay matrix only 2.3Gb?

yes it is only 2.3Gb. it isn't that hard to deal with once loaded.
false positive, i guess, but provoked some useful pointers ...

> Best,
> Kasper
>
> On Fri, Sep 18, 2015 at 6:28 PM, Peter Haverty <haverty.pe...@gene.com>
> wrote:
>
> > Yes, bigmemoryExtras::BigMatrix and genoset::RleDataFrame() are good
> > tricks for reducing the size of your eSets and SummarizedExperiments.
> > Both object types can go into assayData or assays. In fact, that's
> > what they were designed for.
> >
> > At Genentech, we use these for our 2.5e6 x 1e3 rectangular data from
> > Illumina SNP arrays. We typically have ~6 such rectangular objects in
> > one eSet. With a mix of BigMatrix objects for point estimates and
> > RleDataFrames for segmented data, readRDS times are quite reasonable.
> >
> > Pete
> >
> > ____________________
> > Peter M. Haverty, Ph.D.
> > Genentech, Inc.
> > phave...@gene.com
> >
> > On Fri, Sep 18, 2015 at 1:56 PM, Tim Triche, Jr. <tim.tri...@gmail.com>
> > wrote:
> >
> > > bigmemoryExtras (Peter Haverty's extensions to bigMemory/bigMatrix)
> > > can be handy for this, as it works well as a backend, especially if
> > > you go about splitting by chromosome as for CNV segmentation, DMR
> > > finding, etc. It's not as seamless as one might like, but it's the
> > > closest thing I've found.
> > >
> > > SciDB tries to implement a similar API, but for a distributed version
> > > of this where the data itself is in a columnar database and served on
> > > demand.
> > > I tried getting that up and running as a SummarizedExperiment
> > > backend, but did not succeed. I have previously shoveled all of the
> > > TCGA 450k data into one 7,000+ column bigMatrix which serializes to
> > > about 14GB on disk.
> > >
> > > If you have any replicates in your 700+ samples, it's a good idea to
> > > keep their SNP calls in metadata(yourSE), although if you change
> > > names, the change needs to propagate into the dependent metadata.
> > > This is why I started monkeying around with linkedExperiments where
> > > those mappings are enforced; it's becoming more of an issue with the
> > > TARGET pediatric AML study, where there are numerous
> > > diagnosis-remission-relapse trios whose identity I wish to verify
> > > periodically. The SNPs on the 450k array are great for this purpose,
> > > but minfi doesn't really have a slot for them per se, so they live in
> > > metadata().
> > >
> > > --t
> > >
> > > On Fri, Sep 18, 2015 at 1:29 PM, Vincent Carey
> > > <st...@channing.harvard.edu> wrote:
> > >
> > > > i am dealing with ~700 450k arrays
> > > >
> > > > they are derived from one study, so it makes sense to think of
> > > > them holistically.
> > > >
> > > > both the load time and the memory consumption are not satisfactory.
> > > >
> > > > has anyone worked on an object type that implements the rangedSE
> > > > API but has the assay data out of memory?
> > > >
> > > > > unix.time(load("wbmse.rda"))
> > > >    user  system elapsed
> > > >  30.131   2.396  61.036
> > > >
> > > > > object.size(wbmse)
> > > > 124031032 bytes
> > > >
> > > > > dim(wbmse)
> > > > [1] 485577    690
> > > >
> > > > > object.size(assays(wbmse))
> > > > 2680430992 bytes
> > > >
> > > >         [[alternative HTML version deleted]]
> > > >
> > > > _______________________________________________
> > > > Bioc-devel@r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/bioc-devel
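
As a quick sanity check on the sizes quoted in the thread, here is a minimal sketch in base R. It assumes the assay is a single dense matrix of doubles at 8 bytes per cell, which the thread does not state explicitly, and the variable names are illustrative only. It compares the size implied by the reported dimensions with the figure reported by object.size(assays(wbmse)).

```r
# Dimensions reported by dim(wbmse) in the original post.
n_probes  <- 485577
n_samples <- 690

# Assumption (not stated in the thread): one dense numeric (double)
# matrix, 8 bytes per cell.
bytes_per_double <- 8
payload_bytes <- n_probes * n_samples * bytes_per_double

# Size reported by object.size(assays(wbmse)) in the original post.
reported_bytes <- 2680430992

# Anything beyond the raw cells (attributes, container overhead).
overhead_bytes <- reported_bytes - payload_bytes

cat("dense payload:", format(payload_bytes, big.mark = ","), "bytes\n")
cat("remainder    :", format(overhead_bytes, big.mark = ","), "bytes\n")
```

If the assumption holds, the raw cells account for nearly all of the reported assay size, so load time and memory use are dominated by the dense matrix itself rather than by the container around it.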