Good tip.  Thanks Morgan.
I agree that a different structure might (necessarily) be in order.  I wanted 
to create a tree whose nodes were instances of different derived subclasses -- 
possibly holding more data and behaving polymorphically.  OO programming seemed 
ideal for this: lots of small things with specialized behavior -- but creating 
many small objects isn't R's strength.
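
For the archive, here is a minimal sketch of the "few large objects" layout 
Martin describes below, assuming the tree fits in two data.frames.  The column 
names (id, type, value, from, to) and the helpers children_of() and 
describe_node() are hypothetical, made up for illustration.  Per-type behavior 
is only approximated with a lookup on a 'type' column; it is not a replacement 
for real subclasses.

```r
# Node table: one row per node, built in a single vectorized call
# rather than 100,000 separate object constructions.
n <- 100000L
ids <- seq_len(n)
nodes <- data.frame(
  id    = ids,
  # hypothetical node kinds: a node with at least one child is a "branch"
  type  = ifelse(2L * ids <= n, "branch", "leaf"),
  value = as.numeric(ids),
  stringsAsFactors = FALSE
)

# Edge table: node i (for i >= 2) hangs off node i %/% 2, giving a binary tree.
edges <- data.frame(
  from = seq.int(2L, n) %/% 2L,
  to   = seq.int(2L, n)
)

# Hypothetical helper: ids of the children of one node.
children_of <- function(edges, id) {
  edges$to[edges$from == id]
}

# Hypothetical helper: behavior chosen from the 'type' column, standing in
# for polymorphic methods on node subclasses.
describe_node <- function(nodes, id) {
  row <- nodes[nodes$id == id, ]
  switch(row$type,
    leaf   = sprintf("leaf %d holds %g", row$id, row$value),
    branch = sprintf("branch %d", row$id),
    stop("unknown node type: ", row$type)
  )
}

children_of(edges, 1L)      # 2 3
describe_node(nodes, 1L)

# Rough size of each table, for comparison with per-instance costs.
print(object.size(nodes), units = "Mb")
print(object.size(edges), units = "Mb")
```

The trade-off is that per-node fields must fit a common set of columns (or go 
into side tables keyed by id), so this suits trees whose node types differ 
more in behavior than in the data they hold.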

On May 2, 2013, at 4:57 PM, Martin Morgan wrote:

> On 05/01/2013 11:20 AM, David Kulp wrote:
>> I'm using refClass for a complex multi-directional tree structure with
>> possibly 100,000s of nodes.  The refClass design is very impressive and I'd
>> love to use it, but I've found that refClass instances are very
>> large and creation time is slow.  For example, below is a RefClass and normal
>> S4 class.  The RefClass requires about 4KB per instance vs 500B for the S4
>> class -- based on adding the Ncells and Vcells of used memory reported by
>> gc().  And instantiation is more than twice as slow for a RefClass.  (R
>> 2.14.2)
>> 
>> Anyone have thoughts on this and whether there's any hope for improving
>> resources on either front?
> 
> Hi David -- not necessarily helpful but creating a few large objects is 
> always better than creating many small ones in R, so perhaps re-conceptualize your 
> data structure? As a rough analogy, instead of constructing a graph as a 
> large number of 'Node' instances each pointing to one another, a graph could 
> be represented as a data.frame containing columns of 'from' and 'to' indexes 
> (neighbour-edge list, a few large objects) or as an adjacency matrix. One 
> would also implement creation and update of the few large objects in an 
> R-friendly (vectorized) way.
> 
> Perhaps there are existing packages that already model the data you're 
> interested in? If your multi-directional tree can be represented as a graph, 
> then perhaps
> 
>  http://bioconductor.org/packages/release/bioc/html/graph.html
> 
> including facilities in the Boost graph library (RBGL, on the Bioconductor 
> web site, too) or the igraph package can be put to use.
> 
> Martin
> 
>> 
>> I wonder what others are doing.  I've been thinking about lightweight
>> alternative implementations, but nothing particularly elegant has come to
>> mind, yet!
>> 
>> Thanks!
>> 
>> 
>> simple <- setRefClass('simple', fields = list(a = "character", b = "numeric"))
>> gc()
>> system.time(simple.list <- lapply(1:100000, function(i) {
>>   simple$new(a = 'foo', b = i)
>> }))
>> gc()
>> 
>> setClass('simple2', representation(a = "character", b = "numeric"))
>> setMethod("initialize", "simple2", function(.Object, a, b) {
>>   .Object@a <- a
>>   .Object@b <- b
>>   .Object
>> })
>> 
>> gc()
>> system.time(simple2.list <- lapply(1:100000, function(i) {
>>   new('simple2', a = 'foo', b = i)
>> }))
>> gc()
>> 
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
> 
> -- 
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
> 
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793
