Thank you for your answers.

 My code is very slow compared with yours ;-)


#my code
system.time(r0<-f0(iBig,jBig))
   user  system elapsed
 82.489  15.060  97.544

#Holtman's code
system.time(r1<-f1(iBig,jBig))
   user  system elapsed
  0.100   0.012   0.113

#Dunlap's code
system.time(r2<-f2(iBig,jBig))
   user  system elapsed
  0.084   0.004   0.088
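
To see where the speed-up comes from, here is a minimal sketch that walks through the steps of the inverted lookup (the f2 approach quoted below) on the small example from my original question. The intermediate name pos is only for illustration and does not appear in the quoted code.

```r
# Small example from the original question
i <- c("a", "a", "b", "b", "b", "c", "c", "d")
j <- list(j1 = c("a", "c"), j2 = c("b", "d"))

# Step 1: repeat each group name once per member, in the order of unlist(j)
groupNames <- rep(names(j), sapply(j, length))   # "j1" "j1" "j2" "j2"

# Step 2: flatten the members themselves, dropping the list names
itemNames <- unlist(j, use.names = FALSE)        # "a" "c" "b" "d"

# Step 3: a single vectorised match() replaces the loop over groups
pos <- match(i, itemNames)                       # 1 1 3 3 3 2 2 4

# Step 4: index the group names by the matched positions
result <- groupNames[pos]
print(result)
```

The list is inverted once, so the cost no longer grows with one full scan of i per group, which is presumably why the gap becomes so large with 1000 groups.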


2010/3/5 William Dunlap <wdun...@tibco.com>

> > -----Original Message-----
> > From: r-help-boun...@r-project.org
> > [mailto:r-help-boun...@r-project.org] On Behalf Of Carlos Petti
> > Sent: Friday, March 05, 2010 9:43 AM
> > To: r-help@r-project.org
> > Subject: [R] How to match vector with a list ?
> >
> > Dear list,
> >
> > I have a vector of characters and a list of two named elements :
> >
> > i <- c("a","a","b","b","b","c","c","d")
> >
> > j <- list(j1 = c("a","c"), j2 = c("b","d"))
> >
> > I'm looking for a fast way to obtain a vector with names, as follows :
> >
> > [1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"
>
> A request with such a nice copy-and-pastable
> example in it deserves an answer.
>
> It looks to me like you want to map the item names
> in i to the group names that are the names of the list j,
> which maps group names to the items in each group.
> When there are lots of groups it can be faster to
> first invert the list j into a mapping vector pair,
> as in:
>
> f2 <- function (i, j) {
>    groupNames <- rep(names(j), sapply(j, length)) # map to groupName
>    itemNames <- unlist(j, use.names = FALSE) # map from itemName
>    groupNames[match(i, itemNames, nomatch = NA)]
> }
>
> I put your original code into a function, as this makes
> testing and development easier:
>
> f0 <- function (i, j) {
>     match <- lapply(j, function(x) {
>         which(i %in% x)
>     })
>     k <- vector()
>     for (y in 1:length(match)) {
>         k[match[[y]]] <- names(match[y])
>     }
>     k
> }
>
> With your original data these give identical results:
>
> > identical(f0(i,j), f2(i,j))
> [1] TRUE
>
> I made a list describing 1000 groups, each containing
> an average of 10 members:
>
> jBig <- split(paste("N",1:10000,sep=""),
> sample(paste("G",1:1000,sep=""),size=10000,replace=TRUE))
>
> and a vector of a million items sampled from those
> member names:
>
> iBig <- sample(paste("N",1:10000,sep=""), replace=TRUE, size=1e6)
>
> Then I compared the times it took f0 and f2 to compute
> the result and verified that their outputs were identical:
>
> > system.time(r0<-f0(iBig,jBig))
>   user  system elapsed
>  100.89   10.20  111.27
> > system.time(r2<-f2(iBig,jBig))
>   user  system elapsed
>   0.14    0.00    0.14
> > identical(r0,r2)
> [1] TRUE
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
> >
> > I used :
> >
> > match <- lapply(j, function (x) {which(i %in% x)})
> > k <- vector()
> > for (y  in 1:length(match)) {
> > k[match[[y]]] <- names(match[y])}
> > k
> > [1] "j1" "j1" "j2" "j2" "j2" "j1" "j1" "j2"
> >
> > But, I think a better way exists ...
> >
> > Thanks in advance,
> > Carlos
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
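
One caveat when swapping one function for the other: the two agree on the data in this thread, but as far as I can tell they differ on inputs the thread does not cover. The sketch below restates f0 and f2 as quoted above (f0 uses seq_along() instead of 1:length(), which does not change the result here) and tries two made-up inputs: an item that belongs to no group, and an item listed in two groups.

```r
# f0 and f2 restated from the quoted message
f0 <- function(i, j) {
    match <- lapply(j, function(x) which(i %in% x))
    k <- vector()
    for (y in seq_along(match)) {
        k[match[[y]]] <- names(match[y])
    }
    k
}

f2 <- function(i, j) {
    groupNames <- rep(names(j), sapply(j, length))  # map to groupName
    itemNames <- unlist(j, use.names = FALSE)       # map from itemName
    groupNames[match(i, itemNames, nomatch = NA)]
}

# Hypothetical case 1: the last item "z" is in no group.
# f0 never assigns that position, so its result is shorter than i;
# f2 returns NA there and keeps the full length.
unmatched0 <- f0(c("a", "z"), list(j1 = "a"))
unmatched2 <- f2(c("a", "z"), list(j1 = "a"))
print(length(unmatched0))
print(unmatched2)

# Hypothetical case 2: "a" is listed in two groups.
# f0 keeps the last group that contains it, f2 the first.
dup0 <- f0("a", list(j1 = "a", j2 = "a"))
dup2 <- f2("a", list(j1 = "a", j2 = "a"))
print(c(dup0, dup2))
```

If every item belongs to exactly one group, as in jBig, neither case arises and the results are identical.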
