You're not giving us much to go on, so the info I can give is correspondingly vague.
I take it you are using RF in "unsupervised" mode. What RF does in this
case is generate a synthetic second half of the data that has the same
marginal distributions as your data, but with the variables made
independent of one another. It then runs classification, treating your
data as one class and the generated data as the other class. The output
is the proximity matrix, which you can use as the similarity matrix for
clustering.

Given that, RF has to use roughly twice as much memory to store the
data. That's one place where it can take lots of memory. The second
place is the storage of the proximity matrix itself: if you have n rows
in your data, the proximity matrix is n by n. Even for moderate n, this
is going to be the part that takes up the most memory.

Just in case you haven't seen/heard: avoid the formula interface (i.e.,
randomForest(~ ., data=mydata, ...)), because that can really soak up
memory. Pass the data directly instead, e.g. randomForest(x=mydata, ...).

Yes, a 64-bit OS and 64-bit R can help, but only if you have the RAM to
take advantage of the platform.

Andy

> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Lindgren
> Sent: Tuesday, September 07, 2010 4:28 PM
> To: r-help@r-project.org
> Subject: [R] RandomForests Limitations? Work Arounds?
>
> Greetings,
>
> I want to inquire about the memory limitations of the randomForest
> package. I am attempting to perform clustering analysis using RF, but
> I keep getting the message that RF cannot allocate a vector of a given
> size. I am currently using the 32-bit version of R to run this
> analysis; are there fewer memory issues when using the 64-bit version
> of R? Mainly I want to be able to run RF on a very large dataset, but
> I keep having to take very small samples to do so. Any advice is more
> than appreciated.
>
> Best,
>
> Michael
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
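
To put rough numbers on the two memory costs described in the reply (the
doubled data and the n-by-n proximity matrix), here is a
back-of-the-envelope sketch in base R. It is only an estimate under
stated assumptions: every value is taken to be an 8-byte double, only
those two objects are counted, and any working copies the package makes
internally are ignored. The function name rf_memory_estimate and the
example sizes are hypothetical, chosen for illustration.

```r
# Rough memory estimate for running randomForest in unsupervised mode.
# Assumptions (not from the original message): values are stored as
# 8-byte doubles, and only the two largest objects are counted.
rf_memory_estimate <- function(n, p, bytes_per_value = 8) {
  gib <- 1024^3
  # Real data plus the synthetic second class: 2n rows by p columns.
  data_gib <- 2 * n * p * bytes_per_value / gib
  # Proximity matrix: one entry for every pair of rows, n by n.
  prox_gib <- n^2 * bytes_per_value / gib
  c(data_GiB = data_gib, proximity_GiB = prox_gib)
}

# Hypothetical data sizes with 20 variables each.
sizes <- c(1000, 10000, 50000)
estimates <- t(sapply(sizes, rf_memory_estimate, p = 20))
rownames(estimates) <- sizes
print(round(estimates, 3))
```

Under these assumptions the proximity matrix dominates quickly: about
0.75 GiB at 10,000 rows, but roughly 18.6 GiB at 50,000 rows, which is
far more than a 32-bit process can address. The doubled data stays small
by comparison, so it is the number of rows, not the number of variables,
that decides whether the run fits in memory.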