Thank you very much for your setdiffDF(). It does the job perfectly.

Arnaud Gaboury
A2CT2 Ltd.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Petr Savicky
Sent: Monday, 27 February 2012 20:41
To: r-help@r-project.org
Subject: Re: [R] compare two data frames of different dimensions and only keep unique rows

On Mon, Feb 27, 2012 at 07:10:57PM +0100, Arnaud Gaboury wrote:
> No, but I tried your way too.
>
> In fact, the only three unique rows are these ones:
>
> Product Price Nbr.Lots
> Cocoa    2440        5
> Cocoa    2450        1
> Cocoa    2440        6
>
> Here is a dirty working trick I found:
>
> > df<-merge(exportfile,reported,all.y=T)
> > df1<-merge(exportfile,reported)
> > dff1<-do.call(paste,df)
> > dff<-do.call(paste,df)
> > dff1<-do.call(paste,df1)
> > df[!dff %in% dff1,]
>   Product Price Nbr.Lots
> 3   Cocoa  2440        5
> 4   Cocoa  2450        1
>
> My two problems are: I do not think it is very clean code, and I won't know
> in advance which of my two df will have the greatest dimension (I can add some
> lines to deal with it, but again, that seems very heavy).

Hi.

Try the following.

  setdiffDF <- function(A, B)
  {
      A[!duplicated(rbind(B, A))[nrow(B) + 1:nrow(A)], ]
  }

  df1 <- setdiffDF(reported, exportfile)
  df2 <- setdiffDF(exportfile, reported)
  rbind(df1, df2)

I obtained

     Product Price Nbr.Lots
  3    Cocoa  2440        5
  4    Cocoa  2450        1
  31   Cocoa  2440        6

Is this correct? I see the row

  Cocoa 2440.00 6

only in exportfile and not in reported.

The trick with paste() is not a bad idea. A variant of it is also used
in the base function duplicated.matrix(), since it contains

  apply(x, MARGIN, function(x) paste(x, collapse = "\r"))

If speed is critical, then the paste() trick written for whole columns,
for example

  paste(df[[1]], df[[2]], df[[3]], sep="\r")

followed by setdiff(), may be better.

Hope this helps.

Petr Savicky.

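For the archive, here is a minimal sketch of the whole-column paste() variant suggested above. The helper names rowKey() and setdiffKey() are hypothetical, and the two small data frames are invented toy data, not the real exportfile and reported. It uses %in% on the keys instead of setdiff(), so that the matching rows of the data frame can be selected directly.

```r
# Toy data, invented for illustration only.
reported <- data.frame(Product  = c("Cocoa", "Cocoa", "Coffee"),
                       Price    = c(2440, 2450, 180),
                       Nbr.Lots = c(5, 1, 2),
                       stringsAsFactors = FALSE)
exportfile <- data.frame(Product  = c("Coffee", "Cocoa"),
                         Price    = c(180, 2440),
                         Nbr.Lots = c(2, 6),
                         stringsAsFactors = FALSE)

# Hypothetical helper: one string key per row, pasting whole columns.
# unname() keeps column names from being taken as paste() arguments.
rowKey <- function(df) {
    do.call(paste, c(unname(as.list(df)), sep = "\r"))
}

# Hypothetical helper: rows of A whose key does not occur in B.
setdiffKey <- function(A, B) {
    A[!(rowKey(A) %in% rowKey(B)), , drop = FALSE]
}

# Rows present in only one of the two data frames, in either direction.
onlyOneSide <- rbind(setdiffKey(reported, exportfile),
                     setdiffKey(exportfile, reported))
print(onlyOneSide)
```

Because the function is called once in each direction, it does not matter which of the two data frames has more rows. Unlike setdiffDF(), this sketch keeps repeated rows of A that are absent from B; whether that is wanted depends on the data.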
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.