Re: [Bioc-devel] requirement for named assays in SummarizedExperiment

Ryan Thu, 12 Mar 2015 10:05:00 -0700

Yes, a single-assay SummarizedExperiment would be the most common case for unnamed assays. But I think at the very least there should be a warning on unnamed assays.
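A rough illustration of the kind of warning suggested here, written in base R only. A plain list stands in for the assays of one object, and `warn_if_unnamed` is a hypothetical helper, not a function from any Bioconductor package.

```r
# Hypothetical check; a plain list stands in for the assays of one object.
warn_if_unnamed <- function(assays) {
  nms <- names(assays)
  # names() is NULL when nothing is named, and "" for individual unnamed elements.
  if (is.null(nms) || any(is.na(nms) | nms == "")) {
    warning("assays are unnamed; combining objects will rely on their order")
  }
  invisible(assays)
}

m <- matrix(0, nrow = 2, ncol = 2)
warn_if_unnamed(list(counts = m))                    # silent
suppressWarnings(warn_if_unnamed(list(m)))           # would warn if not suppressed
```

A warning rather than an error keeps the single-assay case usable while still flagging it.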


On 3/12/15 9:24 AM, Martin Morgan wrote:

On 03/12/2015 08:12 AM, Tim Triche, Jr. wrote:

What he said
This doesn't make any sense from an API perspective. When would a user ever expect to see unnamed assay matrices?


When there's a single assay?

--t

On Mar 12, 2015, at 7:46 AM, Kasper Daniel Hansen <kasperdanielhan...@gmail.com> wrote:


Allowing positional matching strikes me as being far too fragile.

Depending on the actual implementation, it may not even be clear there is an order of the assays.

On Wed, Mar 11, 2015 at 2:45 PM, Valerie Obenchain <voben...@fredhutch.org> wrote:

Hi,

After talking with others the vote was against enforcing names on assays() and for positional matching if all names are NULL. A mixture of names and NULL throws an error.

example(SummarizedExperiment)

## all named

se2 = se1
assays(cbind(se1, se2))

List of length 1
names(1): counts

## mixture of names and NULL -> error

names(assays(se1)) = NULL
assays(cbind(se1, se2))

Error in assays(cbind(se1, se2)) :
  error in evaluating the argument 'x' in selecting a method for function 'assays': Error in .bind.arrays(args, cbind, "assays") :
  elements in ‘assays’ must have the same names

## all NULL -> positional matching

names(assays(se2)) = NULL
assays(cbind(se1, se2))

List of length 1

If we find common use cases where positional matching is needed with a
mixture of names and NULL we can always relax this constraint.
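The three cases above can be sketched in base R. This is an illustration under stated assumptions, not the code that went into 1.19.46: plain lists of matrices stand in for assays(), and `bind_assays` is a hypothetical name.

```r
# Hypothetical helper; each argument is a plain list of matrices standing in
# for the assays of one object.
bind_assays <- function(...) {
  args <- list(...)
  has_names <- vapply(args, function(a) !is.null(names(a)), logical(1))

  if (all(!has_names)) {
    # All unnamed: match by position, so every input needs the same count.
    n <- unique(lengths(args))
    if (length(n) != 1L) stop("inputs must have the same number of assays")
    return(lapply(seq_len(n), function(i) {
      do.call(cbind, lapply(args, `[[`, i))
    }))
  }
  if (!all(has_names)) {
    # Mixture of named and unnamed inputs: refuse to guess.
    stop("assay names must be all NULL or all non-NULL")
  }
  # All named: match by name, regardless of order within each input.
  nms <- names(args[[1]])
  same <- vapply(args, function(a) setequal(names(a), nms), logical(1))
  if (!all(same)) stop("elements in 'assays' must have the same names")
  out <- lapply(nms, function(n) do.call(cbind, lapply(args, `[[`, n)))
  names(out) <- nms
  out
}

m <- matrix(1:4, nrow = 2)
named   <- bind_assays(list(counts = m), list(counts = m))  # matched by name
unnamed <- bind_assays(list(m), list(m))                    # matched by position
mixed   <- tryCatch(bind_assays(list(m), list(counts = m)),
                    error = function(e) conditionMessage(e))
```

Here `named` and `unnamed` each hold one 2 x 4 matrix, and `mixed` holds the error message from the rejected mixture.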

Changes are in 1.19.46.

Valerie

On 03/06/2015 08:20 AM, Valerie Obenchain wrote:

Hi Aaron,

Thanks for catching this.

I favor enforcing names in 'assays'. Combining by position alone is too dangerous. I'm thinking of the VCF class where the genome information is stored in 'assays' and the fields are rarely in the same order.

Looks like we also need a more informative error message when names
don't match.

assays(se1)

List of length 1
names(1): counts1

assays(se2)

List of length 1
names(1): counts2

cbind(se1, se2)

Error in sQuote(accessorName) :
   argument "accessorName" is missing, with no default


Valerie

On 03/05/2015 11:09 PM, Aaron Lun wrote:

Dear all,

I stumbled upon some unexpected behaviour with cbind'ing
SummarizedExperiment objects with unnamed assays:

require(GenomicRanges)

nrows <- 5; ncols <- 4
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
rowData <- GRanges("chr1", IRanges(1:nrows, 1:nrows))
colData <- DataFrame(Treatment=1:ncols, row.names=LETTERS[1:ncols])

sset <- SummarizedExperiment(counts, rowData=rowData, colData=colData)

sset

class: SummarizedExperiment
dim: 5 4
exptData(0):
assays(1): ''
rownames: NULL
rowData metadata column names(0):
colnames(4): A B C D
colData names(1): Treatment


cbind(sset, sset)

dim: 5 8
exptData(0):
assays(0):
rownames: NULL
rowData metadata column names(0):
colnames(8): A B ... C1 D1
colData names(1): Treatment

Upon cbind'ing, the assays in the SE object are lost. I think this is due to the fact that the cbind code matches up assays by their names. Thus, if there are no names, the code assumes that there are no assays.

I guess this could be prevented by enforcing naming of assays in the SummarizedExperiment constructor. Or, the binding code could be modified to work positionally when there are no assay names, e.g., by cbind'ing the first assays across all SE objects, then the second assays, etc.
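The suspected mechanism can be reproduced in base R. The sketch below is not the GenomicRanges source: `bind_by_name` is a hypothetical stand-in for a binder that loops over assay names, and plain lists of matrices stand in for the assays.

```r
# Plain matrices stand in for assays; no Bioconductor classes are used.
m <- matrix(1:6, nrow = 3)
unnamed <- list(m)           # names(unnamed) is NULL
named   <- list(counts = m)

# Hypothetical name-driven binder: loops over the names of the first input.
bind_by_name <- function(x, y) {
  nms <- names(x)
  out <- lapply(nms, function(n) cbind(x[[n]], y[[n]]))
  names(out) <- nms
  out
}

n_named   <- length(bind_by_name(named, named))      # one assay survives
n_unnamed <- length(bind_by_name(unnamed, unnamed))  # no names, so nothing to loop over
```

With no names, the loop runs zero times and returns an empty list without any error, which would match the `assays(0):` shown in the output above.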

Any thoughts?

Regards,

Aaron

sessionInfo()
R Under development (unstable) (2014-12-14 r67167)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] GenomicRanges_1.19.42 GenomeInfoDb_1.3.13 IRanges_2.1.41
[4] S4Vectors_0.5.21      BiocGenerics_0.13.6

loaded via a namespace (and not attached):
[1] XVector_0.7.4



_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel



--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, Seattle, WA 98109

Email: voben...@fredhutch.org
Phone: (206) 667-3158

