While I tend to agree with you that PCA is too big an operation to be hidden within a plotting function (MDS is an edge case, I would say), I can't see how we can ever reach a point where there is only one generic plot function. In the case of PCA there are a number of different plot types that can all lay claim to being the plot function of a PCA class: for instance scoreplot, scatterplot matrix of all scores, biplot, screeplot, accumulated R^2 barplot, leverage vs. distance-to-model... (you get the idea). So even with some very well-thought-out classes for very common result types such as PCA, such a class would still need a lot of different plot methods such as plotScores, plotScree etc. (or plot(..., type = 'score'), but I don't find that very appealing). Expanding beyond PCA only muddies the waters even more - there are very few interesting data structures that have only one visual representation to-rule-them-all...
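To make that concrete, here is a rough sketch of what I mean. Everything in it is made up for illustration: the PCA class, performPCA, varExplained, plotScores and plotScree are hypothetical names, not taken from any existing package, and it assumes samples are in rows.

```r
library(methods)

# Hypothetical minimal PCA result class (illustrative only).
setClass("PCA", representation(scores = "matrix", sdev = "numeric"))

# Compute the principal components once and store the result.
# Assumes a numeric matrix with samples in rows.
performPCA <- function(x, ...) {
  fit <- prcomp(x, ...)
  new("PCA", scores = fit$x, sdev = fit$sdev)
}

# Proportion of variance explained by each component.
varExplained <- function(pca) pca@sdev^2 / sum(pca@sdev^2)

# View 1: scatter plot of two score columns.
plotScores <- function(pca, pcs = c(1, 2), ...) {
  plot(pca@scores[, pcs, drop = FALSE], ...)
}

# View 2: bar plot of the variance explained per component.
plotScree <- function(pca, ...) {
  barplot(varExplained(pca), names.arg = colnames(pca@scores),
          ylab = "Proportion of variance", ...)
}

# The generic "plot" can only be bound to one of the views.
setMethod("plot", signature(x = "PCA", y = "missing"),
          function(x, y, ...) plotScores(x, ...))

pca <- performPCA(matrix(rnorm(200), nrow = 20))
plot(pca, col = "blue")  # re-styled without recomputing the PCA
plotScree(pca)           # a second view needs its own name
```

Even in this toy version, plot(pca) has to pick one view as the default, and every other view still needs a name of its own.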
just my 2c

best
Thomas

> Date: Mon, 20 Oct 2014 18:50:48 -0400
> From: Kevin Coombes <kevin.r.coom...@gmail.com>
>
> Well. I have two responses to that.
>
> First, I think it would be a lot better/easier for users if (most)
> developers could make use of the same plot function for "basic" classes
> like PCA.
>
> Second, if you think the basic PCA plotting routine needs enhancements,
> you still have two options. On the one hand, you could (as you said)
> try to convince the maintainer of PCA to add what you want. If it's
> generally valuable, then he'd probably do it --- and other classes that
> use it would benefit. On the other hand, if it really is a special
> enhancement that only makes sense for your class, then you can derive a
> class from the basic PCA class
>     setClass("mySpecialPCA", contains=c("PCA"), *other stuff here*)
> and implement your own version of the "plot" generic for this class.
> And you could tweak the "as.PCA" function so it returns an object of the
> mySpecialPCA class. And the user could still just "plot" the result
> without having to care what's happening behind the scenes.
>
> On 10/20/2014 5:59 PM, Michael Love wrote:
>> Ah, I see now. Personally, I don't think Bioconductor developers
>> should have to agree on single plotting functions for basic classes
>> like 'PCA' (because this logic applies equally to the situation of all
>> Bioconductor developers agreeing on single MA-plot, a single
>> variance-mean plot, etc). I think letting developers define their
>> plotPCA makes contributions easier (I don't have to ask the owner of
>> plot.PCA to incorporate something), even though it means we have a
>> growing list of generics.
>>
>> Still you have a good point about splitting computation and plotting.
>> In practice, we subset the rows so PCA is not laborious.
>>
>> On Mon, Oct 20, 2014 at 5:38 PM, Kevin Coombes
>> <kevin.r.coom...@gmail.com> wrote:
>>
>> Hi,
>>
>> I don't see how it needs more functions (as long as you can get
>> developers to agree). Suppose that someone can define a reusable
>> PCA class. This will contain a single "plot" generic function,
>> defined once and reused by other classes. The existing "plotPCA"
>> interface can also be implemented just once, in this class, as
>>
>>     plotPCA <- function(object, ...) plot(as.PCA(object), ...)
>>
>> This can be exposed to users of your class through namespaces.
>> Then the only thing a developer needs to implement in his own
>> class is the single "as.PCA" function. And he/she would have
>> already been required to implement this as part of the old
>> "plotPCA" function. So it can be extracted from that, and the
>> developer doesn't have to reimplement the visualization code from
>> the PCA class.
>>
>> Best,
>> Kevin
>>
>> On 10/20/2014 5:15 PM, davide risso wrote:
>>> Hi Kevin,
>>>
>>> I see your points and I agree (especially for the specific case
>>> of plotPCA that involves some non trivial computations).
>>>
>>> On the other hand, having a wrapper function that starting from
>>> the "raw" data gives you a pretty picture (with virtually zero
>>> effort by the user) using a sensible choice of parameters that
>>> are more or less OK for RNA-seq data is useful for practitioners
>>> that just want to look for patterns in the data.
>>>
>>> I guess it would be the same to have a PCA method for each of the
>>> objects and then using the plot method on those new objects, but
>>> that would just create a lot more objects and functions than the
>>> current approach (like Mike was saying).
>>>
>>> Your "as.pca" or "performPCA" approach would be definitely better
>>> if all the different methods would create objects of the *same*
>>> PCA class, but since we are talking about different packages, I
>>> don't know how easy it would be to coordinate. But perhaps this
>>> is the way we should go.
>>>
>>> Best,
>>> davide
>>>
>>> On Mon, Oct 20, 2014 at 1:26 PM, Kevin Coombes
>>> <kevin.r.coom...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> It depends.
>>>
>>> The "traditional" R approach to these matters is that you (a)
>>> first perform some sort of an analysis and save the results
>>> as an object and then (b) show or plot what you got. It is
>>> part (b) that tends to be really generic, and (in my opinion)
>>> should have really generic names -- like "show" or "plot" or
>>> "hist" or "image".
>>>
>>> With PCA in particular, you usually have to perform a bunch
>>> of computations in order to get the principal components from
>>> some part of the data. As I understand it now, these
>>> computations are performed along the way as part of the
>>> various "plotPCA" functions. The "R way" to do this would be
>>> something like
>>>     pca <- performPCA(mySpecialObject) # or as.PCA(mySpecialObject)
>>>     plot(pca) # to get the scatter plot
>>> This approach has the user-friendly advantage that you can
>>> tweak the plot (in terms of colors, symbols, ranges, titles,
>>> etc) without having to recompute the principal components
>>> every time. (I often find myself re-plotting the same PCA
>>> several times, with different colors or symbols for different
>>> factors associated with the samples.) In addition, you could
>>> then also do something like
>>>     screeplot(pca)
>>> to get a plot of the percentages of variance explained.
>>>
>>> My own feeling is that if the object doesn't know what to do
>>> when you tell it to "plot" itself, then you haven't got the
>>> right abstraction.
>>>
>>> You may still end up needing generics for each kind of
>>> computation you want to perform (PCA, RLE, MA, etc), which is
>>> why I suggested an "as.PCA" function. After all, "as" is
>>> already pretty generic. In the long run, this would help
>>> BioConductor developers, since they wouldn't all have to
>>> reimplement the visualization code; they would just have to
>>> figure out how to convert their own object into a PCA or RLE
>>> or MA object.
>>>
>>> And I know that this "plotWhatever" approach is used
>>> elsewhere in BioConductor, and it has always bothered me. It
>>> just seemed that a post suggesting a new generic function
>>> provided a reasonable opportunity to point out that there
>>> might be a better way.
>>>
>>> Best,
>>> Kevin
>>>
>>> PS: My own "ClassDiscovery" package, which is available from
>>> RForge via
>>>     install.packages("ClassDiscovery", repos="http://R-Forge.R-project.org")
>>> includes a "SamplePCA" class that does something roughly
>>> similar to this for microarrays.
>>>
>>> PPS (off-topic): The worst offender in base R -- because it
>>> doesn't use this "typical" approach -- is the "heatmap"
>>> function. Having tried to teach this function in several
>>> different classes, I have come to the conclusion that it is
>>> basically unusable by mortals. And I think the problem is
>>> that it tries to combine too many steps -- clustering rows,
>>> clustering columns, scaling, visualization -- all in a single
>>> function.
>>>
>>> On 10/20/2014 3:47 PM, davide risso wrote:
>>>> Hi Kevin,
>>>>
>>>> I don't agree. In the case of EDASeq (as I suppose it is the
>>>> case for DESeq/DESeq2) plotting the principal components of
>>>> the count matrix is only one of possible exploratory plots
>>>> (RLE plots, MA plots, etc.).
>>>> So, in my opinion, it makes more sense from an object
>>>> oriented point of view to have multiple plotting methods for
>>>> a single "RNA-seq experiment" object.
>>>>
>>>> In addition, this is the same strategy adopted elsewhere in
>>>> Bioconductor, e.g., for the plotMA method.
>>>>
>>>> Just my two cents.
>>>>
>>>> Best,
>>>> davide
>>>>
>>>> On Mon, Oct 20, 2014 at 11:30 AM, Kevin Coombes
>>>> <kevin.r.coom...@gmail.com> wrote:
>>>>
>>>> I understand that breaking code is a problem, and that
>>>> is admittedly the main reason not to immediately adopt
>>>> my suggestion.
>>>>
>>>> But as a purely logical exercise, creating a "PCA"
>>>> object X or something similar and using either
>>>>     plot(X)
>>>> or
>>>>     plot(as.PCA(mySpecialObject))
>>>> is a much more sensible use of object-oriented
>>>> programming/design. This requires no new generics (to
>>>> write or to learn).
>>>>
>>>> And you could use it to transition away from the current
>>>> system by convincing the various package maintainers to
>>>> re-implement plotPCA as follows:
>>>>
>>>>     plotPCA <- function(object, ...) {
>>>>       plot(as.PCA(object), ...)
>>>>     }
>>>>
>>>> This would be relatively easy to eventually deprecate
>>>> and teach users to switch to the alternative.
>>>>
>>>> On 10/20/2014 1:07 PM, Michael Love wrote:
>>>>> hi Kevin,
>>>>>
>>>>> that would imply there is only one way to plot an
>>>>> object of a given class. Additionally, it would break a
>>>>> lot of code.
>>>>>
>>>>> best,
>>>>>
>>>>> Mike
>>>>>
>>>>> On Mon, Oct 20, 2014 at 12:50 PM, Kevin Coombes
>>>>> <kevin.r.coom...@gmail.com> wrote:
>>>>>
>>>>> But shouldn't they all really just be named "plot"
>>>>> for the appropriate objects? In which case, there
>>>>> would already be a perfectly good generic....
>>>>>
>>>>> On Oct 20, 2014 10:27 AM, "Michael Love"
>>>>> <michaelisaiahl...@gmail.com> wrote:
>>>>>
>>>>> I noticed that 'plotPCA' functions are defined
>>>>> in EDASeq, DESeq2, DESeq,
>>>>> affycoretools, Rcade, facopy, CopyNumber450k,
>>>>> netresponse, MAIT (maybe
>>>>> more).
>>>>>
>>>>> Sounds like a case for BiocGenerics.
>>>>>
>>>>> best,
>>>>>
>>>>> Mike
>>>>>
>>>>> _______________________________________________
>>>>> Bioc-devel@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>
>>>> --
>>>> Davide Risso, PhD
>>>> Post Doctoral Scholar
>>>> Division of Biostatistics
>>>> School of Public Health
>>>> University of California, Berkeley
>>>> 344 Li Ka Shing Center, #3370
>>>> Berkeley, CA 94720-3370
>>>> E-mail: davide.ri...@berkeley.edu
>
> End of Bioc-devel Digest, Vol 127, Issue 43
> *******************************************