Hi Wanghong,

Unless you have a huge Linux box, you will need to sample your 300k
rows down to a few thousand.

In marketing apps, I often have data sets of comparable size.

I would suggest you start with just a few thousand rows to make sure
everything else is working as you wish. Also, study Andy's randomForest
docs carefully, including the R News article from a couple of years ago.
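
As a rough sketch of the sampling step, assuming the data sit in a data
frame with the class label in column 42: the synthetic Memebers object
below is only a stand-in for the real data, and the sizes (20000 and 3000)
are arbitrary choices for illustration.

```r
# Synthetic stand-in for the poster's data (assumption: 41 numeric
# predictors plus a class label in column 42).
set.seed(1)
n_full <- 20000
Memebers <- as.data.frame(matrix(rnorm(n_full * 41), ncol = 41))
Memebers$class <- factor(sample(c("a", "b", "c"), n_full, replace = TRUE))

# Draw a simple random subsample of a few thousand rows to prototype on.
n_small <- 3000
idx <- sample(nrow(Memebers), n_small)
Memebers.small <- Memebers[idx, ]

# Check the shape of the subsample before fitting anything.
dim(Memebers.small)
```

Once everything runs on the small sample, n_small can be raised step by
step while watching memory use.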

In particular,

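1) The formula interface is a memory hog. Andy suggests passing the
predictors and the response explicitly instead. In your case, something
like
      randomForest(x = Memebers[, -42], y = Memebers[, 42], ...)
(predictors first, then the class column, which should be a factor for
classification).
2) The proximity matrix is also memory- and time-intensive, since it is
n x n. I suggest proximity = FALSE until the other things are sorted out.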
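
The two points above might look like this in practice. This is a minimal
sketch, not a tuned model: x and y are small synthetic stand-ins for the
real predictors and class column, and ntree = 200 and mtry = 6 are simply
the values from the original post.

```r
library(randomForest)

# Rough size of an n x n proximity matrix of doubles (8 bytes each)
# for the full data set: on the order of several hundred GB.
n <- 331030
prox_gb <- n^2 * 8 / 1e9

# Synthetic stand-in data (assumption: 41 numeric predictors and a
# factor response).
set.seed(1)
x <- as.data.frame(matrix(rnorm(2000 * 41), ncol = 41))
y <- factor(sample(c("a", "b", "c"), 2000, replace = TRUE))

# Explicit x/y interface instead of a formula, with proximity switched off.
rf <- randomForest(x = x, y = y, ntree = 200, mtry = 6, proximity = FALSE)

print(rf)
```

With proximity = FALSE the fitted object carries no n x n matrix, so
memory use grows with the number of rows rather than with its square.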

HTH,
Jim Porzak
TGN.com
San Francisco, CA
http://www.linkedin.com/in/jimporzak
useR Group SF: http://ia.meetup.com/67/


2008/12/26 wanghong <wangh...@neusoft.edu.cn>

> hello,
> I want to use randomForest to classify a matrix which is 331030×42,the last
> column is class signal.I use:
> Memebers.rf<-randomForest(class~.,data=Memebers,proximity=TRUE,mtry=6,ntree=200)
> which told me" the error is matrix(0,n,n) set too elements"
> then I use:
> Memebers.rf<-randomForest(class~.,data=Memebers,importance=TRUE,proximity=TRUE)
> which told me"the error is na.fail.default(list(class = c(17L, 17L, 17L,
> 29L, 29L, 29L,  :
>  missing values in object
> "
>
> what's wrong with it .Thanks a lot
>
>
> wanghong
>  wangh...@neusoft.edu.cn
> 2008-12-26
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

