Dear Jason,

On Fri, May 29, 2009 at 2:48 PM, Jason Rupert <jasonkrup...@yahoo.com> wrote:
>
> I think I am using the improved version of setdiff(...) that handles
> data.frames, so I think some odd behavior was expected but this one is
> escaping me.
>
> It appears that the addition of duplicate entries is not caught by the
> setdiff(...).  Is this expected behavior?
[snip]
> Thanks in advance for any feedback.
>
> Test1_DF<-data.frame(HouseSize=c(1:100))
> Test2_DF<-rbind(Test1_DF, Test1_DF)
> setdiff(Test1_DF, Test2_DF)
> integer(0)
> setdiff(Test2_DF, Test1_DF)
> integer(0)
>
> However,
> Test3_DF<-data.frame(HouseSize=c(1:25))
> setdiff(Test1_DF, Test3_DF)
>  [1]  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40  41
> [17]  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57
> [33]  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72  73
> [49]  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89
> [65]  90  91  92  93  94  95  96  97  98  99 100
>
> setdiff(Test3_DF, Test1_DF)
> integer(0)

You didn't explicitly say which "improved version" of setdiff() you
are using, so I can only presume that you are using the
setdiff.data.frame in the prob package.

The behaviour you are observing is expected and matches the
base:::setdiff behaviour in the case of vectors; cf.

    x1 <- c(1:100)
    x2 <- c(x1,x1)
    setdiff(x1, x2)   # integer(0)
    setdiff(x2, x1)   # integer(0)
    x3 <- c(1:25)
    setdiff(x1, x3)   # 26:100
    setdiff(x3, x1)   # integer(0)

>
> If so, is there another method or approach that should be used to identify
> duplicate row entries between two different data frames?
>

The R-help archives are chock full of every possible variant of
questions (and answers) about this, and you haven't said _exactly_
what you are looking for.  In the absence of an already posted
solution, please specify exactly what you want and I'll wager an R
Ninja could dispatch it in moments.

Regards,
Jay

***************************************************
G. Jay Kerns, Ph.D.
Associate Professor
Department of Mathematics & Statistics
Youngstown State University
Youngstown, OH 44555-0002 USA
Office: 1035 Cushwa Hall
Phone: (330) 941-3310 Office (voice mail)
-3302 Department
-3170 FAX
E-mail: gke...@ysu.edu
http://www.cc.ysu.edu/~gjkerns/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
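
P.S. Since the question of what "duplicate row entries" means was left open in the thread, here is a minimal base-R sketch under two guesses at the intent: (a) rows repeated within one data frame, and (b) rows of one data frame that also occur in another. The helper row_key() and the result names are hypothetical and only for illustration; this is not the method used by setdiff.data.frame in the prob package.

```r
# Toy data mirroring the example in the thread.
Test1_DF <- data.frame(HouseSize = 1:100)
Test2_DF <- rbind(Test1_DF, Test1_DF)
Test3_DF <- data.frame(HouseSize = 1:25)

# (a) Rows repeated within a single data frame.
# duplicated() flags the second and later occurrences of each row.
dup_within <- Test2_DF[duplicated(Test2_DF), , drop = FALSE]
nrow(dup_within)   # 100

# (b) Rows shared between two data frames.
# row_key() is a hypothetical helper: it pastes the columns of each row
# into one string so that rows can be compared with %in%.
row_key <- function(df) do.call(paste, c(unname(as.list(df)), sep = "\r"))

in_both <- Test1_DF[row_key(Test1_DF) %in% row_key(Test3_DF), , drop = FALSE]
only_in_1 <- Test1_DF[!(row_key(Test1_DF) %in% row_key(Test3_DF)), , drop = FALSE]
nrow(in_both)      # 25
nrow(only_in_1)    # 75

# (c) How many times each distinct row occurs, which set operations ignore.
row_counts <- table(row_key(Test2_DF))
all(row_counts == 2)   # TRUE
```

The string-key comparison assumes that no value contains the "\r" separator and that columns are compared by their printed form, so numeric columns that differ only by rounding error may need extra care.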