Good tip. Thanks, Morgan. I agree that a different structure may well be necessary. I wanted to create a tree whose nodes were instances of different derived subclasses, possibly holding more data and behaving polymorphically. OO programming seemed ideal for this (lots of small things with specialized behavior), but this isn't R's strength.
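For concreteness, here is a rough sketch of how the "few large objects" idea quoted below might look for a small tree. It is only an illustration under my own assumptions: the names nodes, edges, children, describe and describe_node, the "type" column and the sample values are all hypothetical and are not taken from any package.

```r
# Hypothetical sample data: a small tree held in two data.frames
# rather than one object per node.
nodes <- data.frame(
  id    = 1:5,
  type  = c("root", "internal", "leaf", "leaf", "leaf"),  # stands in for the subclass
  value = c(NA, NA, 1.5, 2.5, 4.0),
  stringsAsFactors = FALSE
)

# Edge list: one row per parent -> child link.
edges <- data.frame(
  from = c(1L, 1L, 2L, 2L),
  to   = c(2L, 3L, 4L, 5L)
)

# Children of a node, found with a vectorized comparison.
children <- function(node_id) edges$to[edges$from == node_id]

# Type-specific behaviour through a list of functions keyed on `type`;
# an assumed stand-in for polymorphic methods on node subclasses.
describe <- list(
  root     = function(row) sprintf("root %d", row$id),
  internal = function(row) sprintf("internal %d with %d children",
                                   row$id, length(children(row$id))),
  leaf     = function(row) sprintf("leaf %d (value %.1f)", row$id, row$value)
)

describe_node <- function(node_id) {
  row <- nodes[nodes$id == node_id, ]
  describe[[row$type]](row)
}

describe_node(2L)
describe_node(4L)
```

The per-node data lives in columns, so adding 100,000 nodes grows a handful of vectors instead of creating 100,000 objects. The cost is that "subclasses" become a type tag plus a function table, which is less elegant than real dispatch.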

On May 2, 2013, at 4:57 PM, Martin Morgan wrote:

> On 05/01/2013 11:20 AM, David Kulp wrote:
>> I'm using refClass for a complex multi-directional tree structure with
>> possibly 100,000s of nodes. The refClass design is very impressive and I'd
>> love to use it, but I've found that the size of refClass instances are very
>> large and creation time is slow. For example, below is a RefClass and normal
>> S4 class. The RefClass requires about 4KB per instance vs 500B for the S4
>> class -- based on adding the Ncells and Vcells of used memory reported by
>> gc(). And instantiation is more than twice as slow for a RefClass. (R
>> 2.14.2)
>>
>> Anyone have thoughts on this and whether there's any hope for improving
>> resources on either front?
>
> Hi David -- not necessarily helpful but creating a few large objects is
> always better than creating many small in R, so perhaps re-conceptualize your
> data structure? As a rough analogy, instead of constructing a graph as a
> large number of 'Node' instances each pointing to one another, a graph could
> be represented as a data.frame containing columns of 'from' and 'to' indexes
> (neighbour-edge list, a few large objects) or as an adjacency matrix. One
> would also implement creation and update of the few large objects in an
> R-friendly (vectorized) way.
>
> Perhaps there are existing packages that already model the data you're
> interested in? If your multi-directional tree can be represented as a graph,
> then perhaps
>
> http://bioconductor.org/packages/release/bioc/html/graph.html
>
> including facilities in the Boost graph library (RBGL, on the Bioconductor
> web site, too) or the igraph package can be put to use.
>
> Martin
>
>>
>> I wonder what others are doing. I've been thinking about lightweight
>> alternative implementations, but nothing particularly elegant has come to
>> mind, yet!
>>
>> Thanks!
>>
>>
>> simple <- setRefClass('simple', fields = list(a = "character", b="numeric"))
>> gc()
>> system.time(simple.list <- lapply(1:100000, function(i) {
>>   simple$new(a='foo',b=i) }))
>> gc()
>>
>> setClass('simple2', representation(a="character",b="numeric"))
>> setMethod("initialize", "simple2", function(.Object, a, b) {
>>   .Object@a <- a
>>   .Object@b <- b
>>   .Object })
>>
>> gc()
>> system.time(simple2.list <- lapply(1:100000, function(i) {
>>   new('simple2',a='foo',b=i) }))
>> gc()
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
> --
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793