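Try this. apseq() sorts its input and appends a sequence number (0, 1, ...) to successive occurrences of each value. Because of the sort, the positions returned refer to the sorted vectors rather than the original ones. Applying apseq() to both vectors turns this into a problem that ordinary match() can handle: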

> lookupTable <- c("a", "a","b","c","d","e","f")
> matchSample <- c("a", "a","a","b","d")
>
> # sort and append sequence no
> apseq <- function(x) {
+   x <- sort(x)
+   s <- cumsum(!duplicated(x))
+   paste(x, seq(s) - match(s, s))
+ }
>
> match(apseq(matchSample), apseq(lookupTable))
[1]  1  2 NA  3  5

On Sun, Jun 22, 2008 at 10:57 PM, <[EMAIL PROTECTED]> wrote:
> Hi folks,
>
> Can anyone suggest an efficient way to do "matching without
> replacement", or "one-to-one matching"? pmatch() doesn't quite provide
> what I need...
>
> For example,
>
> lookupTable <- c("a","b","c","d","e","f")
> matchSample <- c("a","a","b","d")
> ##Normal match() behaviour:
> match(matchSample,lookupTable)
> [1] 1 1 2 4
>
> My problem here is that both "a"s in matchSample are matched to the same
> "a" in the lookup table. I need the elements of the lookup table to be
> excluded from the table as they are matched, so that no match can be
> found for the second "a".
>
> Function pmatch() comes close to what I need:
>
> pmatch(matchSample,lookupTable)
> [1]  1 NA  2  4
>
> Yep! However, pmatch() incorporates partial matching, which I
> definitely don't want:
>
> lookupTable <- c("a","b","c","d","e","aaaaaaaaf")
> matchSample <- c("a","a","b","d")
> pmatch(matchSample,lookupTable)
> [1] 1 6 2 4
> ## i.e. the second "a", matches "aaaaaaaaf" - I don't want this.
>
> Of course, when identical items ARE duplicated in both sample and lookup
> table, I need the matching to reflect this:
>
> lookupTable <- c("a","a","c","d","e","f")
> matchSample <- c("a","a","c","d")
> ##Normal match() behaviour
> match(matchSample,lookupTable)
> [1] 1 1 3 4
>
> No good - pmatch() is better:
>
> lookupTable <- c("a","a","c","d","e","f")
> matchSample <- c("a","a","c","d")
> pmatch(matchSample,lookupTable)
> [1] 1 2 3 4
>
> ...but we still have the partial matching issue...
>
> ##And of course, as per the usual behaviour of match(), sample elements
> missing from the lookup table should return NA:
>
> matchSample <- c("a","frog","e","d") ; print(matchSample)
> match(matchSample,lookupTable)
>
> Is there a nifty way to get what I'm after without resorting to a for
> loop? (my code's already got too blasted many of those...)
>
> Thanks,
>
> Alec Zwart
> CMIS CSIRO
> [EMAIL PROTECTED]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
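
A possible follow-up to the apseq() idea at the top of this message, for inputs that are not already sorted: the sketch below numbers repeated values in their original order instead of sorting first, so the positions returned refer to the original lookup table. The names occ_key() and match_once() are made up for illustration and are not from the thread, and the sketch assumes non-empty character vectors with no NA values.

```r
# Hypothetical helper: tag each element with how many times the same
# value has already appeared (0 for the first occurrence, 1 for the
# second, ...), keeping the original order of x.
occ_key <- function(x) {
  # ave() runs seq_along() within each group of identical values
  n <- ave(seq_along(x), x, FUN = seq_along)
  paste(x, n - 1)
}

# Hypothetical wrapper: exact, one-to-one matching via ordinary match().
# The k-th occurrence of a value in `sample` can only match the k-th
# occurrence of that value in `table`.
match_once <- function(sample, table) {
  match(occ_key(sample), occ_key(table))
}

# First example from the quoted question: the second "a" finds no partner.
match_once(c("a", "a", "b", "d"), c("a", "b", "c", "d", "e", "f"))
# [1]  1 NA  2  4

# Unsorted input: positions refer to the table as given.
match_once(c("a", "b", "a", "a"), c("b", "a", "a"))
# [1]  2  1  3 NA
```

Since the keys are compared with match(), there is no partial matching: "a" will not pair with "aaaaaaaaf", and values missing from the table come back as NA as usual.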