Re: [R] SVM Param Tuning with using SNOW package

David Winsemius Wed, 18 Nov 2009 06:45:21 -0800

I cannot really be sure what you are trying to do, but doing a bit of "surgery" on your code lets it run on a multicore Mac:


library(e1071)
library(snow)
library(pls)


data(gasoline)

X=gasoline$NIR
Y=gasoline$octane

NR=10
cost1=seq(0.5,30, length=NR)

sv.lin<- function(c) {

  for (i in 1:NR) {

    ind=sample(1:60,50)
    gTest<-  data.frame(Y=I(Y[-ind]),X=I(X[-ind,]))
    gTrain<- data.frame(Y=I(Y[ind]),X=I(X[ind,]))

    svm.lin <- svm(gTrain$X,gTrain$Y, kernel="linear",cost=c[i],cross=5)

    results.lin   <- predict(svm.lin, gTest$X)

    e.test.lin     <- sqrt(sum((results.lin-gTest$Y)^2)/length(gTest$Y))

    return(e.test.lin)
  }
}

cl<- makeCluster(2, type="SOCK" )

clusterEvalQ(cl, library(e1071))
cost1=seq(0.5,30, length=NR)

clusterExport(cl,c("NR","Y","X",  "cost1"))
# Pretty sure you need a copy of cost1 on each node.


RMSEP<-clusterApply(cl, cost1, sv.lin)

# I thought the second argument was the matrix or vector over which to iterate.


stopCluster(cl)
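
The two comments in the script above rest on how the cluster functions move data around: a global variable used by the worker function has to be copied to every node, and the second argument of clusterApply() is the vector whose elements are handed to the function one per call. A minimal sketch of both points follows. It assumes base R's parallel package, which offers the same makeCluster()/clusterExport()/clusterApply() interface as snow, so nothing extra needs installing; `offset` and `add_offset` are made-up names for illustration.

```r
library(parallel)  # ships with R; with snow the cluster would be makeCluster(2, type = "SOCK")

offset <- 100                           # hypothetical global the worker function depends on
add_offset <- function(x) x + offset    # called once per element; x is a single value

cl <- makeCluster(2)

# Copy `offset` to every node. envir = environment() exports from wherever
# this code runs (the global environment when run at top level).
clusterExport(cl, "offset", envir = environment())

# The second argument is the vector to iterate over: node calls are
# add_offset(1), add_offset(2), add_offset(3).
res <- clusterApply(cl, c(1, 2, 3), add_offset)

stopCluster(cl)

unlist(res)
```

The result comes back as a list with one element per input value, in input order, which is why RMSEP in this thread is a list of length 10.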

# Since I don't know what the model meant, I cannot determine whether this result is interpretable:

> RMSEP
[[1]]
[1] 0.1921887

[[2]]
[1] 0.1924917

[[3]]
[1] 0.1885066

[[4]]
[1] 0.1871466

[[5]]
[1] 0.3550932

[[6]]
[1] 0.1226460

[[7]]
[1] 0.2426345

[[8]]
[1] 0.2126299

[[9]]
[1] 0.2276286

[[10]]
[1] 0.2064534
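
A note on reading the list above: in sv.lin() the return() sits inside the for loop, so every call leaves the function during its first pass. Because clusterApply() passes a single cost value as c, c[i] with i = 1 is just that value, and each element of RMSEP is therefore the test error of one random 50/10 split at one cost. The base R sketch below shows that early-exit behaviour next to one possible alternative that collects every repetition before summarising. The function names and the numbers in `errs` are invented for illustration; rmse() uses the same formula as e.test.lin.

```r
# Same formula as e.test.lin in sv.lin(); the name rmse() is illustrative.
rmse <- function(pred, obs) sqrt(sum((pred - obs)^2) / length(obs))

# Pattern used in the thread: return() inside the loop ends the function
# during the first pass, so only one repetition is ever evaluated.
first_pass_only <- function(errors) {
  for (i in seq_along(errors)) {
    return(errors[i])
  }
}

# One possible alternative: store each repetition, then summarise afterwards.
mean_over_passes <- function(errors) {
  out <- numeric(length(errors))
  for (i in seq_along(errors)) {
    out[i] <- errors[i]
  }
  mean(out)
}

errs <- c(0.2, 0.4, 0.6)   # made-up per-split errors, for illustration only

a <- first_pass_only(errs)       # value from the first pass only
b <- mean_over_passes(errs)      # average over all three passes
r <- rmse(c(1, 2, 3), c(1, 2, 5))
```

Whether averaging over repeated splits is what the original model comparison intended is an assumption; the sketch only shows how the loop would have to change for the repetitions to count.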

--
David Winsemius, MD

On Nov 18, 2009, at 7:09 AM, raluca wrote:

Hi Charlie,
Yes, you are perfectly right, when I make the clusters I should put 2, not
10 (it remained 10 from previous trials with 10 slaves).

cl<- makeCluster(2, type="SOCK" )
To tell the truth I do not understand very well what the 2nd parameter for
clusterApplyLB() has to be.
If the function sv.lin has just 1 parameter, sv.lin(c), where c is the cost,
how should I call clusterApplyLB?


? clusterApplyLB(cl, ?, sv.lin, c=cost1)  ?
Below, I am providing a working example, using the gasoline data that comes
in the pls package.

Thank you for your time!


library(e1071)
library(snow)
library(pls)

data(gasoline)

X=gasoline$NIR
Y=gasoline$octane

NR=10
cost1=seq(0.5,30, length=NR)


sv.lin<- function(c) {

for (i in 1:NR) {

ind=sample(1:60,50)
gTest<-  data.frame(Y=I(Y[-ind]),X=I(X[-ind,]))
gTrain<- data.frame(Y=I(Y[ind]),X=I(X[ind,]))
svm.lin <- svm(gTrain$X,gTrain$Y, kernel="linear",cost=c[i],cross=5)
results.lin   <- predict(svm.lin, gTest$X)

e.test.lin     <- sqrt(sum((results.lin-gTest$Y)^2)/length(gTest$Y))

return(e.test.lin)
}
}


cl<- makeCluster(2, type="SOCK" )


clusterEvalQ(cl,library(e1071))


clusterExport(cl,c("NR","Y","X"))


RMSEP<-clusterApplyLB(cl,?,sv.lin,c=cost1)

stopCluster(cl)





cls59 wrote:
raluca wrote:
Hello,
It is the first time I am using the SNOW package and I am trying to tune the
cost parameter for a linear SVM, where the cost (variable cost1) takes 10
values between 0.5 and 30.
I have a large dataset and a PC which is not very powerful, so I need to
tune the parameters using both CPUs of the PC.
Somehow I cannot manage to do it. It seems that both CPUs are fitting the
model for the same values of cost1, I guess the first 5, but not for the
last 5.

Please, can anyone help me!

Here is the code:

data <- data.frame(Y=I(Y),X=I(X))
data.X<-data$X
data.Y<-data$Y
Helping you will be difficult as we're only three lines into your example
and already I have no idea what the data you are using looks like.
Example code needs to be fully reproducible-- that means a small slice of
representative data needs to be provided or faked using an appropriate
random number generator.

Some things did jump out at me about your approach and I've made some
notes below.



raluca wrote:
NR=10
cost1=seq(0.5,30, length=NR)

sv.lin<- function(cl,c) {

for (i in 1:NR) {

ind=sample(1:414,276)

hogTest<-  data.frame(Y=I(data.Y[-ind]),X=I(data.X[-ind,]))
hogTrain<- data.frame(Y=I(data.Y[ind]),X=I(data.X[ind,]))
svm.lin <- svm(hogTrain$X,hogTrain$Y,kernel="linear",cost=c[i],
cross=5)
results.lin   <- predict(svm.lin, hogTest$X)
e.test.lin <- sqrt(sum((results.lin-hogTest$Y)^2)/length(hogTest$Y))
return(e.test.lin)
}
}

cl<- makeCluster(10, type="SOCK" )
If your machine has two cores, why are you setting up a cluster with 10
nodes? Usually the number of nodes should equal the number of cores on
your machine in order to keep things efficient.
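
The node count can be taken from the machine instead of being typed in by hand. This is a small sketch that assumes base R's parallel package for detectCores(); the fallback to a single node is an assumption made here, not something from the thread.

```r
library(parallel)  # base R; provides detectCores()

n_cores <- detectCores()              # counts logical cores by default; may be NA
if (is.na(n_cores)) n_cores <- 1L     # conservative fallback (an assumption)
n_nodes <- max(1L, n_cores)

# With snow loaded, the cluster from the thread would then be created as:
# cl <- makeCluster(n_nodes, type = "SOCK")
n_nodes
```

On a two-core PC like the one described, this would be expected to give 2 rather than 10, although machines with hyper-threading can report more logical cores than physical ones.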



raluca wrote:
clusterEvalQ(cl,library(e1071))

clusterExport(cl,c("data.X","data.Y","NR","cost1"))

RMSEP<-clusterApplyLB(cl,cost1,sv.lin)
Are you sure this evaluation even produces results? sv.lin() is a function
you defined above that takes two parameters-- "cl" and "c".
clusterApplyLB() will feed values of cost1 into sv.lin() for the argument
"cl", but it has nothing to give for "c". At the very least, it seems
like you would need something like:

 RMSEP <- clusterApplyLB( cl, cost1, sv.lin, c = someVector )
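
The argument matching described above can be checked locally without a cluster. In this sketch `two_arg` is a hypothetical stand-in with the same (cl, c) signature as the quoted sv.lin() and a trivial body; the call two_arg(0.5) mirrors what a node would run for the first element of cost1.

```r
# Same two-argument signature as the quoted sv.lin(), with a trivial body.
two_arg <- function(cl, c) {
  cl + c
}

# clusterApplyLB(cl, cost1, two_arg) has each node call two_arg(<one cost value>).
# Locally that is the call below: the value lands in `cl` and `c` is missing,
# so evaluating the body fails. tryCatch() captures the error message.
msg <- tryCatch(two_arg(0.5), error = function(e) conditionMessage(e))

# Supplying the extra argument by name, as in clusterApplyLB(..., c = someVector),
# gives the function everything it needs.
ok <- two_arg(0.5, c = 2)

msg
ok
```

Extra named arguments given to clusterApplyLB() are passed unchanged to every call, so a function that only needs the cost is simpler to write with a single parameter.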



raluca wrote:
stopCluster(cl)
Sorry I can't be very helpful, but with no data and no apparent way to
legally call sv.lin() the way you have it set up, I can't investigate the
problem to see if I get the same results you described.  If you could
provide a complete working example, then there's a better chance that
someone on this list will be able to help you.

Good luck!

-Charlie
--
View this message in context: 
http://old.nabble.com/SVM-Param-Tuning-with-using-SNOW-package-tp26399401p26406709.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

