Hi David,

I have no idea what "magic" you did, but running exactly the same code as you, I have the same problem as before: the results are identical in pairs, whereas I should get a different result for each value of cost1 (a vector of 10 values between 0.5 and 30).

This is the result I get:

0.2197162, 0.2197162, 0.1467448, 0.1467448, 0.2247955, 0.2247955, 0.1073280, 0.1073280, 0.2332475, 0.2332475

Anyway, thanks a lot for trying.

PS. Probably I should switch to Mac :)


David Winsemius wrote:
>
> I cannot really be sure what you are trying to do, but doing a bit of
> "surgery" on your code lets it run on a multicore Mac:
>
> library(e1071)
> library(snow)
> library(pls)
>
> data(gasoline)
>
> X=gasoline$NIR
> Y=gasoline$octane
>
> NR=10
> cost1=seq(0.5,30, length=NR)
>
> sv.lin<- function(c) {
>
> for (i in 1:NR) {
>
> ind=sample(1:60,50)
> gTest<- data.frame(Y=I(Y[-ind]),X=I(X[-ind,]))
> gTrain<- data.frame(Y=I(Y[ind]),X=I(X[ind,]))
>
> svm.lin <- svm(gTrain$X,gTrain$Y, kernel="linear",cost=c[i],
> cross=5)
> results.lin <- predict(svm.lin, gTest$X)
>
> e.test.lin <- sqrt(sum((results.lin-gTest$Y)^2)/length(gTest$Y))
>
> return(e.test.lin)
> }
> }
>
> cl<- makeCluster(2, type="SOCK" )
>
> clusterEvalQ(cl, library(e1071))
> cost1=seq(0.5,30, length=NR)
>
> clusterExport(cl,c("NR","Y","X", "cost1"))
> # Pretty sure you need a copy of cost1 on each node.
>
> RMSEP<-clusterApply(cl, cost1, sv.lin)
> # I thought the second argument was the matrix or vector over which to iterate.
>
> stopCluster(cl)
>
> # Since I don't know what the model meant, I cannot determine whether
> # this result is interpretable
>
> RMSEP
> [[1]]
> [1] 0.1921887
>
> [[2]]
> [1] 0.1924917
>
> [[3]]
> [1] 0.1885066
>
> [[4]]
> [1] 0.1871466
>
> [[5]]
> [1] 0.3550932
>
> [[6]]
> [1] 0.1226460
>
> [[7]]
> [1] 0.2426345
>
> [[8]]
> [1] 0.2126299
>
> [[9]]
> [1] 0.2276286
>
> [[10]]
> [1] 0.2064534
>
> --
> David Winsemius, MD
>
> On Nov 18, 2009, at 7:09 AM, raluca wrote:
>
>>
>> Hi Charlie,
>>
>> Yes, you are perfectly right, when I make the clusters I should put 2, not
>> 10 (it remained 10 from previous trials with 10 slaves).
>>
>> cl<- makeCluster(2, type="SOCK" )
>>
>> To tell the truth I do not understand very well what the 2nd parameter for
>> clusterApplyLB() has to be.
>>
>> If the function sv.lin has just 1 parameter, sv.lin(c), where c is the cost,
>> how should I call clusterApplyLB?
>>
>> ? clusterApplyLB(cl, ?,sv.lin, c=cost1) ?
>>
>> Below, I am providing a working example, using the gasoline data that comes
>> in the pls package.
>>
>> Thank you for your time!
>>
>> library(e1071)
>> library(snow)
>> library(pls)
>>
>> data(gasoline)
>>
>> X=gasoline$NIR
>> Y=gasoline$octane
>>
>> NR=10
>> cost1=seq(0.5,30, length=NR)
>>
>> sv.lin<- function(c) {
>>
>> for (i in 1:NR) {
>>
>> ind=sample(1:60,50)
>> gTest<- data.frame(Y=I(Y[-ind]),X=I(X[-ind,]))
>> gTrain<- data.frame(Y=I(Y[ind]),X=I(X[ind,]))
>>
>> svm.lin <- svm(gTrain$X,gTrain$Y, kernel="linear",cost=c[i],
>> cross=5)
>> results.lin <- predict(svm.lin, gTest$X)
>>
>> e.test.lin <- sqrt(sum((results.lin-gTest$Y)^2)/length(gTest$Y))
>>
>> return(e.test.lin)
>> }
>> }
>>
>> cl<- makeCluster(2, type="SOCK" )
>>
>> clusterEvalQ(cl,library(e1071))
>>
>> clusterExport(cl,c("NR","Y","X"))
>>
>> RMSEP<-clusterApplyLB(cl,?,sv.lin,c=cost1)
>>
>> stopCluster(cl)
>>
>>
>> cls59 wrote:
>>>
>>> raluca wrote:
>>>>
>>>> Hello,
>>>>
>>>> This is the first time I am using the SNOW package and I am trying to tune
>>>> the cost parameter for a linear SVM, where the cost (variable cost1) takes
>>>> 10 values between 0.5 and 30.
>>>>
>>>> I have a large dataset and a PC which is not very powerful, so I need to
>>>> tune the parameters using both CPUs of the PC.
>>>>
>>>> Somehow I cannot manage to do it. It seems that both CPUs are fitting the
>>>> model for the same values of cost1, I guess the first 5, but not for the
>>>> last 5.
>>>>
>>>> Please, can anyone help me!
>>>>
>>>> Here is the code:
>>>>
>>>> data <- data.frame(Y=I(Y),X=I(X))
>>>> data.X<-data$X
>>>> data.Y<-data$Y
>>>>
>>>
>>> Helping you will be difficult as we're only three lines into your example
>>> and already I have no idea what the data you are using looks like.
>>> Example code needs to be fully reproducible-- that means a small slice of
>>> representative data needs to be provided or faked using an appropriate
>>> random number generator.
>>>
>>> Some things did jump out at me about your approach and I've made some
>>> notes below.
>>>
>>>
>>> raluca wrote:
>>>>
>>>> NR=10
>>>> cost1=seq(0.5,30, length=NR)
>>>>
>>>> sv.lin<- function(cl,c) {
>>>>
>>>> for (i in 1:NR) {
>>>>
>>>> ind=sample(1:414,276)
>>>>
>>>> hogTest<- data.frame(Y=I(data.Y[-ind]),X=I(data.X[-ind,]))
>>>> hogTrain<- data.frame(Y=I(data.Y[ind]),X=I(data.X[ind,]))
>>>>
>>>> svm.lin <- svm(hogTrain$X,hogTrain$Y, kernel="linear",cost=c[i],
>>>> cross=5)
>>>> results.lin <- predict(svm.lin, hogTest$X)
>>>>
>>>> e.test.lin <- sqrt(sum((results.lin-hogTest$Y)^2)/length(hogTest$Y))
>>>>
>>>> return(e.test.lin)
>>>> }
>>>> }
>>>>
>>>> cl<- makeCluster(10, type="SOCK" )
>>>>
>>>
>>> If your machine has two cores, why are you setting up a cluster with 10
>>> nodes? Usually the number of nodes should equal the number of cores on
>>> your machine in order to keep things efficient.
>>>
>>>
>>> raluca wrote:
>>>>
>>>> clusterEvalQ(cl,library(e1071))
>>>>
>>>> clusterExport(cl,c("data.X","data.Y","NR","cost1"))
>>>>
>>>> RMSEP<-clusterApplyLB(cl,cost1,sv.lin)
>>>>
>>>
>>> Are you sure this evaluation even produces results? sv.lin() is a function
>>> you defined above that takes two parameters-- "cl" and "c".
>>> clusterApplyLB() will feed values of cost1 into sv.lin() for the argument
>>> "cl", but it has nothing to give for "c".
>>> At the very least, it seems like you would need something like:
>>>
>>> RMSEP <- clusterApplyLB( cl, cost1, sv.lin, c = someVector )
>>>
>>>
>>> raluca wrote:
>>>>
>>>> stopCluster(cl)
>>>>
>>>
>>> Sorry I can't be very helpful, but with no data and no apparent way to
>>> legally call sv.lin() the way you have it set up, I can't investigate the
>>> problem to see if I get the same results you described. If you could
>>> provide a complete working example, then there's a better chance that
>>> someone on this list will be able to help you.
>>>
>>> Good luck!
>>>
>>> -Charlie
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/SVM-Param-Tuning-with-using-SNOW-package-tp26399401p26406709.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>

--
View this message in context: http://old.nabble.com/SVM-Param-Tuning-with-using-SNOW-package-tp26399401p26415997.html
Sent from the R help mailing list archive at Nabble.com.
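
A possible explanation for the pairwise-identical results, offered as a guess rather than a confirmed diagnosis: if both workers start from the same random-number state, then sample() draws the same train/test split on each of them, and a linear SVM that is insensitive to cost over this range would then give the same error for two consecutive cost values. The sketch below only illustrates that mechanism. It assumes base R's parallel package instead of snow (snow provides clusterSetupRNG() for the same purpose), and split_for_cost() is a hypothetical stand-in for sv.lin() that returns the indices it drew instead of fitting a model. It also shows that clusterApply() calls the function once per element of its second argument, so the function receives a single cost value and needs neither the for loop nor c[i].

```r
library(parallel)  # assumption: base R's parallel package, not snow as in the thread

cost1 <- seq(0.5, 30, length.out = 10)  # same cost grid as in the thread

# Hypothetical stand-in for sv.lin(): it receives ONE cost value per call
# and returns that value together with the training indices it drew.
split_for_cost <- function(cost) {
  list(cost = cost, ind = sample(1:60, 50))
}

cl <- makeCluster(2)

# Mimic the suspected problem: both workers start from the same RNG state.
invisible(clusterEvalQ(cl, set.seed(1)))
same_seed <- clusterApply(cl, cost1, split_for_cost)

# Give each worker its own random-number stream, then repeat.
clusterSetRNGStream(cl, iseed = 2009)
own_stream <- clusterApply(cl, cost1, split_for_cost)

stopCluster(cl)

# Jobs 1 and 2 ran on different workers; compare the splits they drew.
identical(same_seed[[1]]$ind, same_seed[[2]]$ind)
identical(own_stream[[1]]$ind, own_stream[[2]]$ind)
```

If the first comparison is TRUE and the second FALSE on your machine, seeding the workers separately before calling clusterApply() in the real code would be worth a try.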