Hi Why are you doing duplicate queries to the database (two As and Cs in your names vector)?
Why do 5 simultaneously connection to the database server? Woukld you do 500 connections? Why not do one query and let the database server do the job for you? Try this: > options(useFancyQuotes = FALSE) > > query0 <- "select id from table where name in (%s)" > > names <- paste(sQuote(LETTERS[1:5]), collapse = ",") > names [1] "'A','B','C','D','E'" > > query <- sprintf(query0, names) > query [1] "select id from table where name in ('A','B','C','D','E')" > > dataFromDB <- sqlQuery(dbConn, query) This should work for MS SQL and MySQL servers. Yours sincerely / Med venlig hilsen Frede Aakmann Tøgersen Specialist, M.Sc., Ph.D. Plant Performance & Modeling Technology & Service Solutions T +45 9730 5135 M +45 2547 6050 fr...@vestas.com http://www.vestas.com Company reg. name: Vestas Wind Systems A/S This e-mail is subject to our e-mail disclaimer statement. Please refer to www.vestas.com/legal/notice If you have received this e-mail in error please contact the sender. > -----Original Message----- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] > On Behalf Of Tevlin, Dylan > Sent: 2. juli 2014 23:49 > To: r-help@r-project.org > Subject: [R] parLapply on sqlQuery (from package RODBC) > > R Version : 2.14.1 x64 > Running on Windows 7 > > Connecting to a database on a remote Microsoft SQL Server 2012 > > The short form of my problem is the following. > > I have an unordered vectors of names, say: > > names<-c("A", "B", "A", "C","C") > > each of which have an id in a table in my db. I need to convert the names to > their corresponding ids. > > I currently have the following code to do it. > ### > names<-c("A", "B", "A", "C","C") > dbConn<-odbcDriverConnect(connection="connection string") #successfully > connects > > nameToID<-function(name, dbConn){ > #dbConn : active db connection formed via odbcDriverConnect > #name : a char string > > sqlQuery(dbConn, paste("select id from table where name='", > name, > "'", sep="")) > } > sapply(names, nameToID, dbConn=dbConn) > ### > > Barring better ways to do this, which could involve loading the table into R > then working with the problem there (which is possible), I understand why > the following doesn't work, but I cannot seem to find a solution. Attempting > to use parallelization via the package 'parallel' : > > ### > names<-c("A", "B", "A", "C","C") > dbConn<-odbcDriverConnect(connection="connection string") #successfully > connects > > nameToID<-function(name, dbConn){ > #dbConn : active db connection formed via odbcDriverConnect > #name : a char string > > sqlQuery(dbConn, paste("select id from table where name='", > name, > "'", sep="")) > } > > mc<-detectCores() > cl<-makeCluster(mc) > clusterExport(cl, c("sqlQuery", "dbConn")) > parSapply(cl, names, nameToID, dbConn=dbConn) #incorrect passing of > nameToID's second argument > ### > > As in the comment, this is not the correct way to assign the second argument > to nameToID. > > I have also tried the following: > > parSapply(cl, names, function(x) nameToID(x, dbConn)) > > in place of the previous parSapply call, but that also does not work, with the > error being thrown saying "the first parameter is not an open RODBC > connection", presumably referring to the first parameter of the sqlQuery() > > The following code does work with parallization. > > ### > names<-c("A", "B", "A", "C","C") > dbConn<-odbcDriverConnect(connection="connection string") #successfully > connects > nameToID<-function(name){ > #name : a char string > dbConn<-odbcDriverConnect(connection="string") > result<-sqlQuery(dbConn, paste("select id from table where > name='", name, "'", sep="")) > odbcClose(dbConn) > result > } > > mc<-detectCores() > cl<-makeCluster(mc) > clusterExport(cl, c("sqlQuery", "odbcDriverConnect", "odbcClose", "dbConn", > "nameToID")) #throwing everything in > parSapply(cl, names, nameToID) > ### > > But the constant opening and closing a ton of the gains from parallelization, > and seems just a bit silly. > > So the overall question would be how to pass the second parameter (the > open db connection) to the function within parSapply, in much the same way > as it is done in the regular apply? In general, how does one pass a second, > third, nth parameter to a function within a parallel routine? > > Thanks and if you need any more information let me know. > > -DT > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.