Thank you for your answers. My code is very slow compared with yours ;-)

# my code
system.time(r0<-f0(iBig,jBig))
   user  system elapsed
 82.489  15.060  97.544

# Holtman's code
system.time(r1<-f1(iBig,jBig))
   user  system elapsed
  0.100   0.012   0.113

# Dunlap's code
system.time(r2<-f2(iBig,jBig))
   user  system elapsed
  0.084   0.004   0.088

2010/3/5 William Dunlap <wdun...@tibco.com>

> > -----Original Message-----
> > From: r-help-boun...@r-project.org
> > [mailto:r-help-boun...@r-project.org] On Behalf Of Carlos Petti
> > Sent: Friday, March 05, 2010 9:43 AM
> > To: r-help@r-project.org
> > Subject: [R] How to match vector with a list ?
> >
> > Dear list,
> >
> > I have a vector of characters and a list of two named elements :
> >
> > i <- c("a","a","b","b","b","c","c","d")
> >
> > j <- list(j1 = c("a","c"), j2 = c("b","d"))
> >
> > I'm looking for a fast way to obtain a vector with names, as follows :
> >
> > [1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"
>
> A request with such a nice copy-and-pastable
> example in it deserves an answer.
>
> It looks to me like you want to map the item names
> in i to the group names that are the names of the list j,
> which maps group names to the items in each group.
> When there are lots of groups it can be faster to
> first invert the list j into a mapping vector pair,
> as in:
>
>    f2 <- function (i, j) {
>        groupNames <- rep(names(j), sapply(j, length)) # map to groupName
>        itemNames <- unlist(j, use.names = FALSE) # map from itemName
>        groupNames[match(i, itemNames, nomatch = NA)]
>    }
>
> I put your original code into a function, as this makes
> testing and development easier:
>
>    f0 <- function (i, j) {
>        match <- lapply(j, function(x) {
>            which(i %in% x)
>        })
>        k <- vector()
>        for (y in 1:length(match)) {
>            k[match[[y]]] <- names(match[y])
>        }
>        k
>    }
>
> With your original data these give identical results:
>
>    > identical(f0(i,j), f2(i,j))
>    [1] TRUE
>
> I made a list describing 1000 groups, each containing
> an average of 10 members:
>
>    jBig <- split(paste("N",1:10000,sep=""),
>        sample(paste("G",1:1000,sep=""),size=10000,replace=TRUE))
>
> and a vector of a million items sampled from those
> member names:
>
>    iBig <- sample(paste("N",1:10000,sep=""), replace=TRUE, size=1e6)
>
> Then I compared the times it took f0 and f2 to compute
> the result and verified that their outputs were identical:
>
>    > system.time(r0<-f0(iBig,jBig))
>       user  system elapsed
>     100.89   10.20  111.27
>    > system.time(r2<-f2(iBig,jBig))
>       user  system elapsed
>       0.14    0.00    0.14
>    > identical(r0,r2)
>    [1] TRUE
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
> >
> > I used :
> >
> > match <- lapply(j, function (x) {which(i %in% x)})
> > k <- vector()
> > for (y in 1:length(match)) {
> > k[match[[y]]] <- names(match[y])}
> > k
> > [1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"
> >
> > But, I think a better way exists ...
> >
> > Thanks in advance,
> > Carlos
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
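
For anyone reading this thread later, the "invert the list first" idea from Bill Dunlap's reply can be tried on the small example from the original question. The following is only a minimal sketch, not code from the thread: the helper name invert_groups and the variable lookup are made up for illustration, and lengths() assumes R 3.2.0 or later (sapply(j, length), as in f2, does the same job on older versions).

```r
# Small example from the original question.
i <- c("a", "a", "b", "b", "b", "c", "c", "d")
j <- list(j1 = c("a", "c"), j2 = c("b", "d"))

# Hypothetical helper (not from the thread): invert the list j into a
# named character vector whose names are items and whose values are groups.
invert_groups <- function(j) {
  groupNames <- rep(names(j), lengths(j))     # one group name per item
  itemNames  <- unlist(j, use.names = FALSE)  # items, in the same order
  setNames(groupNames, itemNames)
}

lookup <- invert_groups(j)

# Index the lookup by item name; unname() drops the item names again.
k <- unname(lookup[i])
print(k)

# The same mapping via match(), which is the approach f2 takes.
k_match <- unname(lookup)[match(i, names(lookup))]
stopifnot(identical(k, k_match))
```

On the example data this should give the same vector of group names that the original question asks for. Items that appear in no group come back as NA in both variants. If an item belongs to more than one group, only the first group is returned, so the sketch assumes each item has a single group.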