The philosophy motivating the check is that names make the relationship between samples and data explicit, rather than relying on fragile positional information. With this in mind, I wonder why your upstream workflow does not include dimnames on the matrix?

That said, the check was introduced in

------------------------------------------------------------------------
r68053 | [email protected] | 2012-07-27 03:35:55 -0400 (Fri, 27 Jul 2012) | 2 lines

SummarizedExperiment uses rowData=GRangesList() as defult

------------------------------------------------------------------------

To the observations you mention below one could also add that the rownames() can be NULL, so there is an uncomfortable asymmetry.

I could (1) remove the check (but use the DataFrame() constructor in an admittedly hackish way, not wanting to rely on the internal new() function). I could also (2) construct row / column names as seq_len(nrow()) / seq_len(ncol()). Or (3) the code could be tightened to more closely adhere to the philosophy above (for instance, I think duplication of columns implied by se[,2] = se[,1] is worth stop()ing over, and allowing colnames(se) = NULL only enables bad practice). Likely this would be disruptive.

For what it's worth, we have

    > library(Biobase)
    > eset = ExpressionSet(matrix(0, 1, 2))
    > dimnames(eset)
    [[1]]
    [1] "1"

    [[2]]
    [1] "1" "2"

    > colnames(eset) = NULL
    Error in `sampleNames<-`(`*tmp*`, value = NULL) :
      'value' length (0) must equal sample number in AssayData (2)

so dimnames are being imposed. (2) would be my current compromise preference.

Martin

________________________________________
From: Bioc-devel [[email protected]] on behalf of Aaron Lun [[email protected]]
Sent: Saturday, December 05, 2015 7:36 AM
To: bioc-devel
Subject: Re: [Bioc-devel] do SummarizedExperiments really need colnames?

Hello all,

At the start of the SummarizedExperiment constructor, there's a code block that throws an error if 'colData' is not specified and the assay matrices don't have column names. Is this really necessary? In many cases, I just want to get a matrix into the SE0 object without having to worry about column names.
It doesn't seem like there's a requirement for this in the SE0 class, either; it seems happy with 'colnames(se0) <- NULL', and setting 'colData' to a 'DataFrame' with 'NULL' row names doesn't break the constructor.

The requirement for column names causes issues for some manipulations - for example:

    library(SummarizedExperiment)
    out <- SummarizedExperiment(matrix(0, 10, 5), colData=DataFrame(row.names=1:5))
    out[,1] <- out[,2]
    ## Error in `rownames<-`(`*tmp*`, value = c("2", "2", "3", "4", "5")) :
    ##   duplicate rownames not allowed

While this is fair enough, it's a bit annoying if I didn't want or need the names in the first place.

The error mentioned above precedes the construction of the missing 'colData', so if column names are missing, then a more general way to construct the 'colData' would be to do 'new("DataFrame", nrows=ncol(assays))'.

Cheers,

Aaron

_______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
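
For illustration only, option (2) from the reply at the top of this thread (constructing row / column names as seq_len(nrow()) / seq_len(ncol())) might be sketched in base R roughly as follows. The helper name fill_dimnames() is hypothetical and is not part of SummarizedExperiment; the sketch operates on a plain matrix rather than on the constructor itself, and converting the indices to character labels is an assumption.

```r
# Hypothetical helper, not part of SummarizedExperiment: fill in missing
# dimnames with positional labels, in the spirit of option (2).
fill_dimnames <- function(m) {
    stopifnot(is.matrix(m))
    # Only touch a dimension whose names are absent.
    if (is.null(rownames(m)))
        rownames(m) <- as.character(seq_len(nrow(m)))
    if (is.null(colnames(m)))
        colnames(m) <- as.character(seq_len(ncol(m)))
    m
}

# A bare matrix gains labels "1".."10" for rows and "1".."5" for columns.
m <- fill_dimnames(matrix(0, 10, 5))
colnames(m)

# Existing names are left untouched; only the missing row names are filled.
named <- matrix(0, 2, 2, dimnames = list(NULL, c("a", "b")))
dimnames(fill_dimnames(named))
```

Labels generated this way are still positional, so they would presumably not avoid the duplicate-rownames error in the out[,1] <- out[,2] example, which already uses row.names=1:5.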
