Hi Murali.
I haven't compared, but this is what I would do:

bestMatch<-function(searchVector, matchMat)
{
        #if you're sure, you could drop unique
        searchRow<-unique(sort(match(searchVector, colnames(matchMat))))
        cat("Original row indices:")
        print(searchRow)
        matchMat<-matchMat[, -searchRow, drop=FALSE] #avoid duplicates altogether
        cat("Corrected Matrix:\n")
        print(matchMat)
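        #works because of the sort above
        correctedRows<-searchRow - seq_along(searchRow) + 1
        cat("Corrected row indices:")
        print(correctedRows)
        #only columns were dropped, so rows keep their original positions (sr);
        #the corrected position (cr) bounds the columns to the left
        mapply(function(sr, cr){
                        lookWhere<-matchMat[sr, seq_len(cr-1)]
                        cat("Will now look into:\n")
                        print(lookWhere)
                        #nothing to the left that is eligible
                        if(length(lookWhere) == 0) return(NA_character_)
                        cc<-which.max(lookWhere)
                        cat("Max at position", cc, "\n")
                        colnames(matchMat)[cc]
                }, searchRow, correctedRows)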
}
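
The index correction can be checked in isolation. This is only a small sketch on the column names from your example; nLeft and check are names I made up for the illustration. After sorting, the i-th search position has i search columns at or before it, so subtracting seq_along() leaves the number of remaining columns to its left:

```r
cols <- c("A", "X", "L", "O")                 # column names from the example
searchRow <- sort(match(c("X", "O"), cols))   # positions of the search variables
remaining <- cols[-searchRow]                 # columns left after dropping them

# number of remaining columns to the left of each search variable
nLeft <- searchRow - seq_along(searchRow)

# cross-check by counting the non-search columns directly
check <- sapply(searchRow, function(s) sum(!(seq_len(s) %in% searchRow)))
stopifnot(all(nLeft == check))
print(remaining[seq_len(nLeft[1])])           # candidates for the first search variable
```
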
I don't think there's that much difference. Depending on the specific sizes, it
may be more or less costly to first shrink the search matrix like I do. And,
again depending on the sizes, it may be better still to also remove the rows
that you're not interested in (some more, but similar, index trickery is
required then).
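
A rough, untimed sketch of that last variant, in case it helps. bestMatchReduced is a name I made up, and the code assumes that row and column names are identical and that every search variable occurs in them. It keeps only the search rows and the non-search columns, then blanks out everything that is not strictly to the left:

```r
# example data from the original question
a <- structure(c(1, 0.41, 0.58, 0.75, 0.41, 1, 0.6, 0.86, 0.58,
0.6, 1, 0.83, 0.75, 0.86, 0.83, 1), .Dim = c(4L, 4L), .Dimnames = list(
    c("A", "X", "L", "O"), c("A", "X", "L", "O")))
v <- c("X", "O")

# hypothetical helper, not part of the thread's code
bestMatchReduced <- function(searchVector, matchMat) {
  pos  <- match(searchVector, colnames(matchMat))     # column position of each search variable
  keep <- setdiff(seq_len(ncol(matchMat)), pos)       # columns outside the search vector
  sub  <- matchMat[searchVector, keep, drop = FALSE]  # search rows x candidate columns
  # a candidate is only allowed if it sits strictly to the left
  sub[outer(pos, keep, "<=")] <- NA
  apply(sub, 1, function(r) {
    if (all(is.na(r))) NA_character_ else names(which.max(r))
  })
}

bestMatchReduced(v, a)   # "A" for X and "L" for O on this example
```

Unlike the function above, this returns the results in the order of the search vector rather than in sorted column order.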

HTH,


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove





-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of murali.me...@avivainvestors.com
Sent: Thursday 31 March 2011 16:46
To: r-help@r-project.org
Subject: [R] choosing best 'match' for given factor

Folks,

I have a 'matching' matrix between variables A, X, L, O:

> a <- structure(c(1, 0.41, 0.58, 0.75, 0.41, 1, 0.6, 0.86, 0.58, 
0.6, 1, 0.83, 0.75, 0.86, 0.83, 1), .Dim = c(4L, 4L), .Dimnames = list(
    c("A", "X", "L", "O"), c("A", "X", "L", "O")))

> a
      A     X     L     O
A  1.00  0.41  0.58  0.75
X  0.41  1.00  0.60  0.86
L  0.58  0.60  1.00  0.83
O  0.75  0.86  0.83  1.00

And I have a search vector of variables

> v <- c("X", "O")

I want to write a function bestMatch(searchvector, matchMat) such that for
each variable in searchvector, I get the variable that it has the highest
match to - but searching only among variables to the left of it in the
'matching' matrix, and not matching with any variable in searchvector
itself.

So in the above example, although "X" has the highest match (0.86) with "O",
I can't choose "O" as it's to the right of X (and also because "O" is in the
searchvector v already); I'll have to choose "A".

For "O", I will choose "L", the variable it's best matched with - as it
can't match "X" already in the search vector.

My function bestMatch(v, a) will then return c("A", "L")

My matrix a is quite large, and I have a long list of search vectors v, so I
need an efficient method.

I wrote this:

bestMatch <- function(searchvector,  matchMat) {
        sapply(searchvector, function(cc) {
                             # candidate rows: not in searchvector and above cc's own row
                             keep <- !(rownames(matchMat) %in% searchvector) &
                                     (seq_len(nrow(matchMat)) < match(cc, rownames(matchMat)))
                             y <- matchMat[keep, cc, drop = FALSE]
                             rownames(y)[which.max(y)]
        })
}

Any advice?

Thanks,

Murali

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
