Thanks Kasper, I think that's a good solution.

Best,
Davide

On Thu, Jun 18, 2015 at 11:51 AM Kasper Daniel Hansen
<kasperdanielhan...@gmail.com> wrote:

> you can just implement this by having reserved column names in the colData
> slot; that will work and will take appr. 23 seconds to implement. I agree
> it is not as clean from a design perspective, but you get 100% of the
> functionality and you can write a separate checker for the colData argument.
>
> On Thu, Jun 18, 2015 at 2:00 PM, davide risso <risso.dav...@gmail.com>
> wrote:
>
>> Thank you all for the responses.
>>
>> I didn't think about the nested DataFrame solution. It should work.
>> I agree that an extension might be cleaner, but I clearly need to give it
>> more thought.
>>
>> One of the reasons I wanted to have quality and metadata as separate
>> slots is that one could enforce that all the qualities are numeric, and
>> have a quality() method to extract just the quality scores (e.g., for
>> plotting / quality control). Having them in the same slot makes it harder
>> for the user to extract just the scores (if the column order and/or names
>> are not standardized).
>>
>> Best,
>> davide
>>
>>
>> On Thu, Jun 18, 2015 at 6:35 AM Vincent Carey <st...@channing.harvard.edu>
>> wrote:
>>
>>> yes, if a formal extension is warranted. the metadata slot could also be
>>> used.
>>>
>>> On Thu, Jun 18, 2015 at 2:59 PM, Kasper Daniel Hansen
>>> <kasperdanielhan...@gmail.com> wrote:
>>>
>>> > I think the more clean solution for Davide (if he insists on having
>>> > separate objects; I decided against it in minfi) is to extend the
>>> > class to allow this.
>>> >
>>> > Kasper
>>> >
>>> > On Thu, Jun 18, 2015 at 12:25 AM, Ryan <r...@thompsonclan.org> wrote:
>>> >
>>> > > Oh wow, I didn't know you could put a DataFrame into a single column
>>> > > of another DataFrame. That actually solves a problem for me too (I
>>> > > don't intend to expose nested DataFrames to the users though).
>>> > >
>>> > >
>>> > > On 6/17/15 7:23 PM, Martin Morgan wrote:
>>> > >
>>> > >> On 06/17/2015 11:41 AM, davide risso wrote:
>>> > >>
>>> > >>> Dear list,
>>> > >>>
>>> > >>> I'm creating an R package to store RNA-seq data of a somewhat large
>>> > >>> project in which I'm involved.
>>> > >>>
>>> > >>> One of the initial goals is to compare different pre-processing
>>> > >>> pipelines, hence I have multiple expression matrices corresponding
>>> > >>> to the same samples.
>>> > >>> The SummarizedExperiment class seems a good candidate, since I have
>>> > >>> multiple expression matrices with the same rowData and colData
>>> > >>> information.
>>> > >>>
>>> > >>> I have several sample-specific variables that I want to store with
>>> > >>> the object, namely, experimental information (e.g., batch, date,
>>> > >>> experimental condition, ...) and sample quality (e.g., proportion
>>> > >>> of aligned reads, total duplicate reads, etc...).
>>> > >>>
>>> > >>> Of course, I can always create one big data frame concatenating the
>>> > >>> two (experimental info + sample quality), but it seems that both
>>> > >>> conceptually and practically, it might be useful to have two
>>> > >>> separate data frames.
>>> > >>> Since this seems somewhat a reasonably standard type of information
>>> > >>> that one would want to carry on, I was wondering if it would be
>>> > >>> possible / useful to allow the user to have multiple data.frames in
>>> > >>> the colData slot
>>> > >>>
>>> > >>
>>> > >> Actually, colData() is a DataFrame, and a DataFrame column can
>>> > >> contain a DataFrame. So after
>>> > >>
>>> > >>   example(SummarizedExperiment)
>>> > >>
>>> > >> we could make some faux sample quality data
>>> > >>
>>> > >>   quality = DataFrame(x=1:6, y=6:1, row.names=colnames(se1))
>>> > >>
>>> > >> add this as a column in the colData()
>>> > >>
>>> > >>   colData(se1)$quality = quality
>>> > >>
>>> > >> (or create the SummarizedExperiment from a similar DataFrame
>>> > >> up-front) and manage our grouped data
>>> > >>
>>> > >>   > colData(se1)
>>> > >>   DataFrame with 6 rows and 2 columns
>>> > >>       Treatment     quality
>>> > >>     <character> <DataFrame>
>>> > >>   A        ChIP    ########
>>> > >>   B       Input    ########
>>> > >>   C        ChIP    ########
>>> > >>   D       Input    ########
>>> > >>   E        ChIP    ########
>>> > >>   F       Input    ########
>>> > >>   > colData(se1[,1:2])$quality
>>> > >>   DataFrame with 2 rows and 2 columns
>>> > >>             x         y
>>> > >>     <integer> <integer>
>>> > >>   A         1         6
>>> > >>   B         2         5
>>> > >>
>>> > >> I'm not sure that this is any less confusing to the end user than
>>> > >> having to manage a DataFrameList(), but it does not require any new
>>> > >> features.
>>> > >>
>>> > >> Martin
>>> > >>
>>> > >>> of SummarizedExperiment.
>>> > >>>
>>> > >>> Best,
>>> > >>> Davide

[[alternative HTML version deleted]]

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
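
For reference, a minimal sketch of the reserved-column-names approach Kasper describes, with the separate colData checker he mentions and an accessor along the lines of the quality() method Davide asks for. The "qc_" prefix and the helper names quality_columns(), check_quality() and quality_scores() are assumptions made up for illustration; nothing in the thread fixes a naming convention. The sketch also assumes the SummarizedExperiment package from a current Bioconductor release is installed.

```r
library(SummarizedExperiment)

# Hypothetical convention (not from the thread): quality metrics are the
# colData columns whose names start with this reserved prefix.
qc_prefix <- "qc_"

# Names of the reserved quality columns in a colData DataFrame.
quality_columns <- function(col_data, prefix = qc_prefix) {
  nms <- colnames(col_data)
  nms[startsWith(nms, prefix)]
}

# Separate checker: stop if any reserved quality column is not numeric.
check_quality <- function(col_data, prefix = qc_prefix) {
  qc_cols <- quality_columns(col_data, prefix)
  is_num <- vapply(qc_cols,
                   function(nm) is.numeric(col_data[[nm]]),
                   logical(1))
  if (!all(is_num)) {
    stop("non-numeric quality columns: ",
         paste(qc_cols[!is_num], collapse = ", "))
  }
  invisible(TRUE)
}

# Accessor: return only the quality scores, e.g. for plotting or QC.
quality_scores <- function(se, prefix = qc_prefix) {
  cd <- colData(se)
  check_quality(cd, prefix)
  cd[, quality_columns(cd, prefix), drop = FALSE]
}

# Toy data: experimental info and quality metrics share one colData.
counts <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("gene", 1:4), c("A", "B", "C")))
col_data <- DataFrame(
  batch = c("b1", "b1", "b2"),
  qc_aligned_fraction = c(0.91, 0.88, 0.95),
  qc_duplicate_reads = c(1200, 950, 1430),
  row.names = c("A", "B", "C")
)
se <- SummarizedExperiment(assays = list(counts = counts),
                           colData = col_data)

quality_scores(se)
```

Because the convention lives only in the column names, subsetting the object (for example se[, 1:2]) keeps the quality columns in sync with the samples without any new class or slot. The trade-off raised in the thread still applies: the checker only runs when it is called, so it does not give the construction-time validity guarantee that a formal class extension would.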