on 01/13/2009 01:17 PM Doran, Harold wrote:
> Suppose I have a dataframe as follows:
>
> dat <- data.frame(id = c(1,1,2,2,2), var1 = c(10,10,20,20,25), var2 =
> c('foo', 'foo', 'foo', 'foobar', 'foo'))
>
> Now, if I were to subset by id, such as:
>
>> subset(dat, id==1)
>   id var1 var2
> 1  1   10  foo
> 2  1   10  foo
>
> I can see that the elements in var1 are exactly the same and the
> elements in var2 are exactly the same. However,
>
>> subset(dat, id==2)
>   id var1   var2
> 3  2   20    foo
> 4  2   20 foobar
> 5  2   25    foo
>
> Shows the elements are not the same for either variable in this
> instance. So, what I am looking to create is a data frame that would be
> like this
>
> id freq var1  var2
> 1  2    TRUE  TRUE
> 2  3    FALSE FALSE
>
> Where freq is the number of times the ID is repeated in the dataframe. A
> TRUE appears in the cell if all elements in the column are the same for
> the ID and FALSE otherwise. It is insignificant which values differ for
> my problem.
>
> The way I am thinking about tackling this is to loop through the ID
> variable and compare the values in the various columns of the dataframe.
> The problem I am encountering is that I don't think all.equal or
> identical are the right functions in this case.
>
> So, say I was wanting to compare the elements of var1 for id ==1. I
> would have
>
> x <- c(10,10)
>
> Of course, the following works
>
>> all.equal(x[1], x[2])
> [1] TRUE
>
> As would a similar call to identical. However, what if I only have a
> vector of values (or if the column consists of names) that I want to
> assess for equality when I am trying to automate a process over
> thousands of cases? As in the example above, the vector may contain only
> two values or it may contain many more. The number of values in the
> vector differ by id.
>
> Any thoughts?
>
> Harold

Harold,

If we are not talking about testing floats for equivalence:

> merge(table(id = dat$id),
        aggregate(dat[-1], list(id = dat$id),
                  function(x) length(unique(x)) == 1),
        by = "id")
  id Freq  var1  var2
1  1    2  TRUE  TRUE
2  2    3 FALSE FALSE

HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
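
The reply above sets aside the case of testing floats for equivalence. A minimal sketch of how that case might be handled follows, assuming a fixed absolute tolerance is acceptable for the numeric columns. The helper name all_same, its tol default, and the intermediate names same, freq and res are hypothetical and are not part of the original reply. The sketch also counts rows with aggregate() instead of table(), so that id keeps the same type in both pieces before merging.

```r
# Example data from the original question.
dat <- data.frame(id   = c(1, 1, 2, 2, 2),
                  var1 = c(10, 10, 20, 20, 25),
                  var2 = c("foo", "foo", "foo", "foobar", "foo"))

# Hypothetical helper (not from the thread): TRUE when all elements of x
# agree. Numeric vectors are compared to their first element within an
# absolute tolerance `tol` (an assumed default); anything else is
# compared exactly via unique().
all_same <- function(x, tol = sqrt(.Machine$double.eps)) {
  if (is.numeric(x)) {
    all(abs(x - x[1]) <= tol)
  } else {
    length(unique(x)) == 1
  }
}

# One row per id: are the values within each column all the same?
same <- aggregate(dat[-1], list(id = dat$id), all_same)

# One row per id: how many rows does that id have?
freq <- aggregate(data.frame(freq = dat$id), list(id = dat$id), length)

res <- merge(freq, same, by = "id")
print(res)

# Why the tolerance matters: these two values differ only by rounding error.
x_float  <- c(0.3, 0.1 + 0.2)
exact    <- length(unique(x_float)) == 1  # exact comparison
tolerant <- all_same(x_float)             # tolerance-based comparison
```

For the example data this should reproduce the TRUE/FALSE pattern shown in the reply, since none of the values there are affected by rounding. The tolerance is absolute, so for columns with very large or very small magnitudes a relative comparison (for instance one built on all.equal) may be a better fit.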