Sorry, I thought it worked, but I am definitely doing something wrong.
The problem might be that there are NAs and also duplicated values. My fault, I can't figure out what is going wrong. I'll be more thorough and modify the two data frames so they mirror my real data more closely.

df1 is:

  Name       Position  location
  francesca  A         75
  maria      A         75
  cristina   B         36

And df2 is:

  location  Country
  75        UK
  75        Italy
  56        France
  56        Austria

So I thought I had to first eliminate the duplicates like this:

  df1_unique <- subset(df1, !duplicated(location))
  df2_unique <- subset(df2, !duplicated(location))

After doing this I get df1_unique:

  Name       Position  location
  francesca  A         75
  cristina   B         36

And df2_unique:

  location  Country
  75        UK
  56        France

I would like to match on "location" and have the output tell me which records are in df1 but not in df2, which are in both, and which are in df2 but not in df1:

  Name       Position  location  Match
  francesca  A         75        1
  cristina   B         36        0

As William suggested,

  df12 <- merge(df1, cbind(df2, fromDF2=TRUE), all.x=TRUE, by="location")
  df12$Match <- !is.na(df12$fromDF2)
  new_common <- df12[which(df12$Match==TRUE),]

would give me the records that are matching, which should be correct, but I am not getting the correct result for the non-shared elements (the variants that are in df2 but not in df1):

  df2_only <- subset(df1_unique, !(location %in% df2_unique))
  df2_only <- df2_unique[-which(df2_unique$location %in% df1_unique$location),]

Neither of these works; both give me the wrong records.

My questions are:

1. How do I get the records from df2 which are NOT in df1?
2. Do I need to eliminate the duplicates (or is there a way to record where they came from)?

Any help is much appreciated. THANK YOU very much!

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
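For reference, here is a minimal sketch of one way the two questions above might be approached in base R, using the example df1 and df2 from the post. It assumes the first attempt went wrong because it subsets df1_unique and compares against the whole data frame df2_unique rather than its location column. The names df1_only, both, inDF1 and inDF2 are made up for illustration. Under these assumptions the duplicates would not need to be removed first.

```r
# Example data mirroring the tables in the post
df1 <- data.frame(
  Name     = c("francesca", "maria", "cristina"),
  Position = c("A", "A", "B"),
  location = c(75, 75, 36)
)
df2 <- data.frame(
  location = c(75, 75, 56, 56),
  Country  = c("UK", "Italy", "France", "Austria")
)

# Question 1: rows of df2 whose location does not occur in df1.
# Logical indexing avoids a pitfall of -which(...): when nothing matches,
# which() returns integer(0) and x[-integer(0), ] selects zero rows.
df2_only <- df2[!(df2$location %in% df1$location), ]

# The reverse: rows of df1 whose location does not occur in df2
df1_only <- df1[!(df1$location %in% df2$location), ]

# Question 2: keep the duplicates and record where each row came from.
# A full outer merge with one indicator column per source (hypothetical
# names inDF1 / inDF2) leaves NA in the indicator of the missing side.
both <- merge(
  cbind(df1, inDF1 = TRUE),
  cbind(df2, inDF2 = TRUE),
  by = "location", all = TRUE
)
both$inDF1 <- !is.na(both$inDF1)
both$inDF2 <- !is.na(both$inDF2)

# Match flag on the original df1 rows, without merging at all.
# %in% never returns NA; note that an NA location in df1 would count
# as a match if df2$location also contained an NA.
df1$Match <- as.integer(df1$location %in% df2$location)
```

Here df2_only holds the two rows with location 56, df1_only holds the cristina row, and in both the pair inDF1/inDF2 distinguishes "only in df1", "only in df2" and "in both". Because location 75 is duplicated on both sides, the merge produces one row per combination of the matching df1 and df2 rows.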