Re: [Bioc-devel] SummarizedExperiment

Peter Haverty Tue, 25 Mar 2014 09:33:13 -0700

One benefit of having dimnames on assays would be that one could use
DataFrames as assays, like in eSet.  My genoset class is becoming more and
more like SummarizedExperiment. The dimname issues prevent me from
switching entirely from eSet to SummarizedExperiment.


I think that keeping only one copy of dimnames is a great feature, if a bit
dangerous.  A typical object of mine has about six assays, each a BigMatrix
or a DataFrame of Rle objects, so the rownames make up a considerable
portion of the object size (a typical dataset is 2.5M rows by 1k samples).
I've been moving towards keeping a single dimnames copy just to improve
RData load times.

I think that assays should be required to have dimnames when they are added
to a SummarizedExperiment. These dimnames should be checked for equality
with the dimnames of the SE in the setter function.
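
A minimal sketch of that check, using plain base R matrices rather than the
actual SummarizedExperiment setter. The function name check_assay_dimnames
and its arguments are hypothetical, made up for illustration:

```r
# Hypothetical helper, not part of the SummarizedExperiment API:
# validate an incoming assay against the container's dimnames
# before it is stored.
check_assay_dimnames <- function(assay, se_rownames, se_colnames) {
  dn <- dimnames(assay)
  # Require both rownames and colnames to be present.
  if (is.null(dn) || is.null(dn[[1]]) || is.null(dn[[2]])) {
    stop("assay must have both rownames and colnames")
  }
  # Require exact equality, including order.
  if (!identical(dn[[1]], se_rownames)) {
    stop("assay rownames differ from those of the container")
  }
  if (!identical(dn[[2]], se_colnames)) {
    stop("assay colnames differ from those of the container")
  }
  invisible(TRUE)
}

m <- matrix(1:6, nrow = 3,
            dimnames = list(c("r1", "r2", "r3"), c("s1", "s2")))
check_assay_dimnames(m, c("r1", "r2", "r3"), c("s1", "s2"))  # passes silently
```

Using identical() rather than setequal() is deliberate here: an assay whose
samples are all present but in a different order should still be rejected.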

Perhaps with the recent (R 3.1) improvements in shallow/lazy copying and
reference counting, adding dimnames to outgoing assays will be less of a
performance hit.

I also like the compromise I have seen elsewhere, where the colnames are
always retained on assays, but only one rownames copy is kept.  Colnames
are typically small, and getting them wrong often makes for silent but
catastrophic errors.
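
Roughly, that compromise could look like the following base R sketch. The
names store_assay and fetch_assay are hypothetical, and plain matrices stand
in for the real assay classes:

```r
# Hypothetical storage scheme: one shared rownames vector for the whole
# container, while colnames stay attached to every stored assay.
store_assay <- function(assay) {
  rownames(assay) <- NULL  # drop the large per-assay rownames copy
  assay                    # colnames are left in place
}

fetch_assay <- function(stored, shared_rownames, expected_colnames) {
  # Colnames travel with the assay, so a sample mix-up fails loudly here
  # instead of silently producing wrong results.
  stopifnot(identical(colnames(stored), expected_colnames))
  rownames(stored) <- shared_rownames  # reattach the single shared copy
  stored
}
```

The outgoing assay gets its rownames back only on access, so the stored
object carries just one copy of the large vector.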

Pete

____________________
Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com


