Peter,

You are certainly up to date. Can you try replicating this using only two nodes (since you only have two processors)? I'm not sure that specifying 5 workers really helps. Using 2 nodes on my Mac usually gets me about a 30-40% decrease in time.

Also, are the processes just hanging or is there an error? These models may take a while. Perhaps testing with pls, lm or some other fast model might help troubleshoot.

If you are not passing a sleigh object into the trainNWS call, you can do this by using

   trainNWSControl(start = makeSleighStarter(workerCount = 2))

The only other thing I can suggest is to send me the data (or an anonymized knock-off) so that I can test.

You certainly should be able to do this, but you may be limited by your machine.

Max

On Mon, Mar 10, 2008 at 1:18 PM, Tait, Peter <[EMAIL PROTECTED]> wrote:
> Hi Max,
> Thank you for the fast response.
>
> Here are the versions of the R packages I am using:
>
> caret 3.13
> caretNWS 0.16
> nws 1.62
>
> Here are the python versions
>
> Active Python 2.5.1.1
> nws server 1.5.2 for py2.5
> twisted 2.5.9 py2.5
>
> The computer I am using has 1 Xeon dual core cpu at 1.86 GHz with 4 GB of
> RAM. R is currently set up to use 2 GB of it (it starts with "C:\Program
> Files\R\R-2.6.2\bin\Rgui.exe" --max-mem-size=2047M). The OS is Windows Server
> 2003 R2 with SP2.
>
> I am running one R job/process (Rgui.exe) and almost nothing else on the
> computer while R is running (no databases, web servers, office apps etc..)
>
> I really appreciate your help.
> Cheers
> Peter
>
>
> >-----Original Message-----
> >From: Max Kuhn [mailto:[EMAIL PROTECTED]
> >Sent: Monday, March 10, 2008 12:41 PM
> >To: Tait, Peter
> >Cc: r-help@R-project.org
> >Subject: Re: [R] caretNWS and training data set sizes
> >
> >What version of caret and caretNWS are you using? Also, what version
> >of the nws server and twisted are you using? What kind of machine (#
> >processors, how much physical memory etc)?
> >
> >I haven't seen any real limitations with one exception: if you are
> >running P jobs on the same machine, you are replicating the memory
> >needs P times.
> >
> >I've been running jobs with 4K to 90K samples and 1200 predictors
> >without issues, so I'll need a lot more information to help you.
> >
> >Max
> >
> >
> >On Mon, Mar 10, 2008 at 12:04 PM, Tait, Peter <[EMAIL PROTECTED]> wrote:
> >> Hi,
> >>
> >> I am using the caretNWS package to train some supervised regression
> >> models (gbm, lasso, random forest and mars). The problem I have
> >> encountered started when my training data set increased in the number
> >> of predictors and the number of observations.
> >>
> >> The training data set has 347 numeric columns. The problem I have is
> >> when there are more than 2500 observations the 5 sleigh objects start
> >> but do not use any CPU resources and do not process any data.
> >>
> >> N=100                     cpu(%)   memory(K)
> >> Rgui.exe                  0        91737
> >> 5x sleighs (RTerm.exe)    15-25    ~27000
> >>
> >> N=2500
> >> Rgui.exe                  0        160000
> >> 5x sleighs (RTerm.exe)    15-25    ~74000
> >>
> >> N=5000
> >> Rgui.exe                  50       193000
> >> 5x sleighs (RTerm.exe)    0        ~19000
> >>
> >>
> >> A 10% sample of my overall data is ~22000 observations.
> >>
> >> Can someone give me an idea of the limitations of the nws and caretNWS
> >> packages in terms of the number of columns and rows of the training
> >> matrices and if there are other tuning/training functions that work
> >> faster on large datasets?
> >>
> >> Thanks for your help.
> >> Peter
> >>
> >>
> >> > version
> >>                _
> >> platform       i386-pc-mingw32
> >> arch           i386
> >> os             mingw32
> >> system         i386, mingw32
> >> status
> >> major          2
> >> minor          6.2
> >> year           2008
> >> month          02
> >> day            08
> >> svn rev        44383
> >> language       R
> >> version.string R version 2.6.2 (2008-02-08)
> >>
> >> > memory.limit()
> >> [1] 2047
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >> > > > > > > > >-- > > > >Max > -- Max ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.