Thanks for your input, highly appreciated! I can see that the semantics of "[" are violated, so I agree that overloading the "subset" method is probably a better way to go. Essentially, the object stores several individual-specific count matrices from RNA-Seq experiments, potentially in an allele (read group)-specific manner. So the dimensions to subset on are the read groups, the rows and columns of the matrices, and the individuals themselves.
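To make the four dimensions concrete, here is a rough sketch of what a subset method along these lines might look like. It is only an illustration under stated assumptions: the class name "CountSet", its single "counts" slot (a list per individual, holding one matrix per read group) and the argument names are all made up for this example and are not the actual package code.

```r
library(methods)

# Hypothetical container (names are illustrative only): one list entry per
# individual, each holding one count matrix per read group.
setClass("CountSet", representation(counts = "list"))

# Turn base::subset into an S4 generic so that an S4 method can be attached.
setGeneric("subset")

setMethod("subset", "CountSet",
  function(x, individuals = NULL, readGroups = NULL,
           rows = NULL, cols = NULL, ...) {
    # NULL means "keep everything along this dimension".
    pick <- function(lst, idx) if (is.null(idx)) lst else lst[idx]

    counts <- lapply(pick(x@counts, individuals), function(perGroup) {
      lapply(pick(perGroup, readGroups), function(m) {
        r <- if (is.null(rows)) seq_len(nrow(m)) else rows
        k <- if (is.null(cols)) seq_len(ncol(m)) else cols
        m[r, k, drop = FALSE]  # drop = FALSE keeps the result a matrix
      })
    })
    initialize(x, counts = counts)
  })

# Toy data: 2 individuals x 2 read groups, each a 4 x 3 count matrix.
m <- matrix(1:12, nrow = 4,
            dimnames = list(paste0("gene", 1:4), paste0("s", 1:3)))
obj <- new("CountSet", counts = list(
  ind1 = list(maternal = m,      paternal = m + 100L),
  ind2 = list(maternal = m * 2L, paternal = m * 3L)))

# Index vectors of any length work because plain if/else is used here
# rather than ifelse(), which only returns as many elements as its test has.
res <- subset(obj, individuals = "ind1", readGroups = "maternal",
              rows = 1:2, cols = c("s1", "s3"))
```

Using named arguments with NULL defaults keeps each dimension optional and avoids having to parse extra indices out of "...".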
So I guess overloading the subset method with four arguments, each corresponding to one of the dimensions along which subsetting makes sense for this kind of object, is the way to go.

Thanks,
Christian

On 14.05.2015 15:57, Michael Lawrence wrote:
> I agree with Wolfgang that the semantics of [ are being violated here.
> It would though help if you could be a little less vague about your
> intent. What is this data structure going to store, how should it behave?
>
> On Thu, May 14, 2015 at 3:35 AM, Christian Arnold
> <christian.arn...@embl.de <mailto:christian.arn...@embl.de>> wrote:
>
>     Hi there,
>
>     I am about to develop a Bioconductor package that implements a
>     custom S4 object, and I am currently thinking about a few issues,
>     including the following:
>
>     Say we have an S4 object that stores a lot of information in
>     different slots. Assume that it does make sense to extract
>     information out of this object in four different "dimensions"
>     (conceptually similar to a four-dimensional object), so one would
>     like to use the subset "[" operator for this, but extending beyond
>     the "typical" one or two dimensions to 4:
>
>     setClass("A",
>         representation=representation(a="numeric",b="numeric",c="numeric",d="numeric"))
>     a = new("A", a=1:5,b=1:5,c=1:5,d=1:5)
>
>     Now it would be nice to do stuff like a[1,2,3:4,5], which should
>     simply return the selected elements in slots a, b, c, and d,
>     respectively. So a[1,2,3:4,5] would return:
>
>     An object of class "A"
>     Slot "a":
>     [1] 1
>
>     Slot "b":
>     [1] 2
>
>     Slot "c":
>     [1] 3 4
>
>     Slot "d":
>     [1] 5
>
>     This is how far I've come:
>
>     setMethod("[", c("A", "ANY", "ANY","ANY"),
>               function(x, i, j, ..., drop=TRUE)
>     {
>         dots <- list(...)
>         if (length(dots) > 2) {
>             stop("Too many arguments, must be four dimensional")
>         }
>
>         # Parse the extra two dimensions that we need from the ... argument
>         k = ifelse(length(dots) > 0 , dots[[1]], c(1:5))
>         l = ifelse(length(dots) == 2, dots[[2]], c(1:5))
>
>         initialize(x, a=x@a[i],b=x@b[j],c=x@c[k],d=x@d[l])
>     })
>
>     This works for stuff like a[1,2,3, 4], but fails with a general
>     error if one of the indices is a vector such as a[1:2,2,3, 4] or
>     a[1,2,3,4:5].
>
>
>     So, in summary, my questions are:
>     1. Is there a reasonable way of achieving the 4-dimensional
>     subsetting that works as a user would expect it to work?
>     2. Does it make more sense to write a custom function instead to
>     achieve this, such as subsetObject() without overloading "["
>     explicitly? What are the Bioconductor recommendations here?
>
>     I'd appreciate any help, suggestions, etc!
>
>     Thanks,
>     Christian
>
>     _______________________________________________
>     Bioc-devel@r-project.org <mailto:Bioc-devel@r-project.org> mailing
>     list
>     https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>

-- 
-------------------------
Christian Arnold, PhD
Staff Bioinformatician
SCB Unit - Computational Biology
Joint appointment Genome Biology
Joint appointment European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory (EMBL)
Meyerhofstrasse 1; 69117, Heidelberg, Germany
Email: christian.arn...@embl.de
Phone: +49(0)6221-387-8472
Web: http://www.embl.de/research/units/scb/zaugg/