Hi Murali.

I haven't compared the two, but this is what I would do:

bestMatch <- function(searchVector, matchMat) {
  # sorted positions of the search variables; if you're sure, you could drop unique
  searchRow <- unique(sort(match(searchVector, colnames(matchMat))))
  cat("Original row indices:")
  print(searchRow)
  matchMat <- matchMat[, -searchRow, drop=FALSE] # avoid duplicates altogether
  cat("Corrected Matrix:\n")
  print(matchMat)
  correctedRows <- searchRow - seq_along(searchRow) + 1 # works because of the sort above
  cat("Corrected row indices:")
  print(correctedRows)
  sapply(seq_along(searchRow), function(i) {
    # only columns were dropped, so the row keeps its original index;
    # the corrected index gives the number of remaining columns to the left
    lookWhere <- matchMat[searchRow[i], seq_len(correctedRows[i] - 1)]
    cat("Will now look into:\n")
    print(lookWhere)
    cc <- which.max(lookWhere)
    cat("Max at position", cc, "\n")
    colnames(matchMat)[cc]
  })
}

I don't think there's that much difference. Depending on the specific sizes, it may be more or less costly to first shrink the search matrix like I do. Similarly, it may be better still to also remove the rows that you're not interested in (some more, but similar, index trickery is required then).
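For checking results against the example in the original message, here is a minimal sketch without the diagnostic printing. It does not shrink the matrix first; it masks the eligible columns with a logical vector instead, so it is a different route to the same answer rather than the function above. The name bestMatchSketch is hypothetical, and the sketch assumes the row names and column names of the matrix are identical and in the same order.

```r
# Example data copied from the original message
a <- structure(c(1, 0.41, 0.58, 0.75,
                 0.41, 1, 0.6, 0.86,
                 0.58, 0.6, 1, 0.83,
                 0.75, 0.86, 0.83, 1),
               .Dim = c(4L, 4L),
               .Dimnames = list(c("A", "X", "L", "O"),
                                c("A", "X", "L", "O")))
v <- c("X", "O")

# bestMatchSketch is a hypothetical name, not something from the thread.
# Assumption: rownames(matchMat) and colnames(matchMat) agree.
bestMatchSketch <- function(searchVector, matchMat) {
  vars <- colnames(matchMat)
  pos <- match(searchVector, vars)       # position of each search variable
  allowed <- !(vars %in% searchVector)   # candidates must not be in the search vector
  vapply(pos, function(p) {
    # eligible columns: allowed and strictly to the left of position p
    cand <- which(allowed & seq_along(vars) < p)
    if (length(cand) == 0L) return(NA_character_)  # nothing eligible to the left
    vars[cand[which.max(matchMat[p, cand])]]
  }, character(1))
}

bestMatchSketch(v, a)
```

On the example data this returns "A" for "X" and "L" for "O", the result asked for in the original message. Results come back in the order of the search vector, and NA is returned when no eligible variable lies to the left.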
HTH,

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of murali.me...@avivainvestors.com
Sent: Thursday 31 March 2011 16:46
To: r-help@r-project.org
Subject: [R] choosing best 'match' for given factor

Folks,

I have a 'matching' matrix between variables A, X, L, O:

> a <- structure(c(1, 0.41, 0.58, 0.75, 0.41, 1, 0.6, 0.86, 0.58, 0.6,
    1, 0.83, 0.75, 0.86, 0.83, 1), .Dim = c(4L, 4L), .Dimnames = list(
    c("A", "X", "L", "O"), c("A", "X", "L", "O")))
> a
     A    X    L    O
A 1.00 0.41 0.58 0.75
X 0.41 1.00 0.60 0.86
L 0.58 0.60 1.00 0.83
O 0.75 0.86 0.83 1.00

And I have a search vector of variables

> v <- c("X", "O")

I want to write a function bestMatch(searchvector, matchMat) such that for each variable in searchvector, I get the variable that it has the highest match to - but searching only among variables to the left of it in the 'matching' matrix, and not matching with any variable in searchvector itself.

So in the above example, although "X" has the highest match (0.86) with "O", I can't choose "O" as it's to the right of X (and also because "O" is in the searchvector v already); I'll have to choose "A".

For "O", I will choose "L", the variable it's best matched with - as it can't match "X", which is already in the search vector.

My function bestMatch(v, a) will then return c("A", "L").

My matrix a is quite large, and I have a long list of search vectors v, so I need an efficient method.

I wrote this:

bestMatch <- function(searchvector, matchMat) {
  sapply(searchvector, function(cc) {
    # rows not in the search vector and positioned before cc
    y <- matchMat[!(rownames(matchMat) %in% searchvector) &
                  (seq_along(rownames(matchMat)) < match(cc, rownames(matchMat))),
                  cc, drop = FALSE]
    rownames(y)[which.max(y)]
  })
}

Any advice?
Thanks,
Murali

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.