Re: [Bioc-devel] RangedSummarizedExperiment

Hervé Pagès Thu, 18 Jun 2015 12:20:13 -0700

Hi Tim,

On 06/18/2015 10:48 AM, Tim Triche, Jr. wrote:

Hey since the refactoring is already breaking stuff willy nilly, can I make a 
few suggestions?


1) please for the love of all that is holy have backwards compatible methods 
for RSEs.  It's excruciating to have RSE as the target class, supporting 
RELEASE users with SE, and have to do endless duck typing nonsense or else have 
a bunch of new generics.


That sounds apocalyptic ;-) Will add the "exptData" method for
RangedSummarizedExperiment objects today (it issues a deprecation
warning though). Hope that will help reduce the "endless duck typing
nonsense".

Just to clarify though: generally speaking, if you expect switching
between release and devel to be completely transparent then you're
setting your expectations too high.


2) please investigate some sort of "overlay" approach that would allow for accordion-style bundling/unbundling of transcripts, 
regions, compartments etc.  the reason for this will become ever more obvious, but now that I have students, I don't want to explain why 
everything seems to derive from a pre-ASE, pre-intronretention, pre-graphreference mindset of 20 years ago.  If you're going to break 
stuff, how about we break it real good and make things going forward flexible so as to eventually get it "right enough" (NOT 
"perfect" or "right" but "close enough for government work" and "close enough to work out of core")


You're going to have to give a lot more details about this. In
particular please explain what "accordion-style bundling/unbundling
of transcripts, regions, compartments" is, why you need it, and
how RSE would benefit from supporting that. FWIW note that you can
put whatever metadata columns on top of the rowRanges component of
an RSE so you can always do that to store whatever bundling/grouping
information you want. Alternatively you can extend the RSE class
to achieve the same goal. Keeping RSE as simple as possible and
agnostic about complex scenarios is actually a feature.

Also I'm not sure what the "pre-ASE, pre-intronretention,
pre-graphreference mindset" was 20 years ago but note that
SummarizedExperiment was designed and implemented less than
5 years ago and with the initial purpose of addressing the
needs of RNA-seq, ChIP-seq, and other NGS experiments.


3) I'll write patches ( you may not want to actually accept them, but I'll write 'em just 
the same ) when the urge moves me, but if some sort of a 30000' summary of the desired 
end product (along with deficiencies of the current SE) were readily available, it might 
help avoid a "second system effect".  The original SE was very good, fixing 
almost everything that sucked about the battle-tested ExpressionSet.  The new RSE has the 
great feature of automatically putting transcripts about where they belong if I point it 
at, say, a GEO GSE.  I assume that with things like Kallisto we can eventually do the 
same with arbitrary SRA experiments (there's a cute hack we are pushing in BaseSpace to 
make this happen already). But without a roadmap, it's tougher for people to see what 
needs to NOT be done, and that's really important.  What belongs in the base class, and 
what in a subclass?  This is not unimportant.


No detailed roadmap but the main goal of this refactoring is to
have a degraded version of the classic SummarizedExperiment (the
need for it was discussed on this list last year I think). We
thought this refactoring would also be a good time to migrate
SummarizedExperiment to its own package. But we had no intention
of modifying the existing functionality of the classic SE (and
except for the replacement of exptData with metadata, RSE passes
the unit tests of the classic SE).

As is often the case with software development, we know pretty
well where we want to go but we didn't know exactly how we wanted
to get there. Hence no detailed roadmap. We thought it was not a
big deal anyway because our plan is to fix what we break so the
developers don't really need to worry about the gory details of
the changes. As I said earlier, things should be fixed and the
build report back to "normal" before the end of the week.

Thanks for your patience,
H.


All of the above said, SE was great and RSE is already better in some respects. 
But with a clear roadmap and more input, I bet it (and a tight clean definition 
of what it is and isn't supposed to do) would be better-er.

(Steps off soapbox)

--t

On Jun 18, 2015, at 10:25 AM, Hervé Pagès <hpa...@fredhutch.org> wrote:

Hi Elena,

Sorry for the inconvenience caused by the refactoring of
SummarizedExperiment objects.

On 06/18/2015 03:41 AM, Elena Grassi wrote:
Hello,

I'm writing because I am struggling a bit to keep pace with
RangedSummarizedExperiment in my package roar, whose main class
contains RangedSummarizedExperiment to hold some of the data.
Sometimes the developers fix issues for me, and I would like to ease
their work as much as possible. For example, today I stumbled upon
this:
http://bioconductor.org/checkResults/devel/bioc-LATEST/roar/zin1-buildsrc.html

that is related to the fact that I build a RoarDataset object without
rowRanges, colData etc at the beginning of the analysis and I fill
them later.

My questions are:
- apart from looking around the svn logs and source code to understand
what's going on, have I missed some mail here or other information
about the roadmap for SummarizedExperiment?


A new class, SummarizedExperiment0, was introduced for representing
"degraded" RangedSummarizedExperiment objects, that is, objects with
no rowRanges component.

So RangedSummarizedExperiment now derives from SummarizedExperiment0,
which in turn derives from Vector. As a consequence of deriving from
Vector, these objects now have a length (length(x) = nrow(x)) and can
have names (names(x) = rownames(x)).

Another consequence of these changes is that the internal representation
of RangedSummarizedExperiment objects has changed so serialized
instances need to be updated and re-serialized. Also packages that
define classes that extend RangedSummarizedExperiment (like roar)
might need some tweaks. I'm in the process of re-serializing
RangedSummarizedExperiment objects and fixing the packages affected
by these changes (should be done before the end of the week).

Note that SummarizedExperiment0 might not be the definitive name for
this class but we can't use the "SummarizedExperiment" name for this
until the old SummarizedExperiment class defined in GenomicRanges is
gone (i.e. not before BioC 3.3).

- would it be better to avoid extending such a class and instead
simply have another slot, to avoid such initialization issues?


Not sure I understand what you're asking exactly. Can you provide
more details?

Thanks for your patience and sorry again for the inconvenience.

H.


Thanks,
E.


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

