Re: [R] help with duplicates

2009-06-05 Thread Peter Dalgaard
Chris Anderson wrote:
I have a large dataset that contains duplicate records. How do I identify and remove duplicate records?

Here's one way:

> aq <- airquality[sample(NROW(airquality), replace=TRUE),]
> any(duplicated(aq))
[1] TRUE
> which(duplicated(aq))
[1] 2 15 34 44 45 47 49 5
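The snippet above identifies the duplicated rows; the removal step could look roughly like the sketch below. The `set.seed()` call and the names `dup` and `aq_unique` are assumptions added for illustration and are not part of the original post.

```r
# Resample rows with replacement so that duplicates are likely,
# as in the post above. The seed is an assumption, added only so
# that the sketch is reproducible.
set.seed(1)
aq <- airquality[sample(NROW(airquality), replace = TRUE), ]

# duplicated() returns TRUE for every row that repeats an earlier row
dup <- duplicated(aq)

# Negating the index keeps only the first occurrence of each row
aq_unique <- aq[!dup, ]

# Number of rows that were dropped
sum(dup)
```

For a data frame, `aq[!duplicated(aq), ]` keeps the same rows as `unique(aq)`, so which one to use is largely a matter of whether the logical index is needed for anything else.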

Re: [R] help with duplicates

2009-06-05 Thread Jim Porzak
Chris,

How large is large? How many columns? Are the records "duplicate" across all columns or just some?

Henrique gave you a simple R answer. Perhaps doing it in SQL would be more efficient? E.g.:

SELECT DISTINCT <columns> FROM <table>;

HTH,
Jim Porzak
TGN.com
San Francisco, CA
www.linkedin.com/in/jimporzak
use R! Group SF
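If "duplicate" means matching on only some columns, one possible base R approach is to call `duplicated()` on just those columns. This is a minimal sketch; the data frame `d` and the column names `id`, `grp` and `value` are hypothetical and not taken from the thread.

```r
# Hypothetical data: rows 1 and 2 share the same key (id, grp)
# but differ in value
d <- data.frame(id    = c(1, 1, 2, 3),
                grp   = c("a", "a", "b", "b"),
                value = c(10, 11, 9, 8))

# Columns that define what counts as a duplicate (an assumption)
key_cols <- c("id", "grp")

# Keep the first row for each distinct combination of the key columns
d_first <- d[!duplicated(d[key_cols]), ]

# unique(d) compares all columns, so it removes nothing here
d_all <- unique(d)
```

Here `d_first` has three rows while `d_all` still has four, which is why the question of "all columns or just some" matters before picking a method.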

Re: [R] help with duplicates

2009-06-05 Thread Henrique Dallazuanna
Try this:

d <- data.frame(a = c(1, 1, 2, 3), b = c(10, 10, 9, 8))
unique(d)

On Fri, Jun 5, 2009 at 1:38 PM, Chris Anderson wrote:
> I have a large dataset that contains duplicate records. How do I identify
> and remove duplicate records?
>
> Chris Anderson
> 707.315.8486
> www.sassydeals4u.co
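`unique(d)` removes the repeats but does not show which records were involved. To inspect every member of a duplicate group before dropping anything, a sketch along these lines may help; the names `later_copies`, `all_copies` and `d_dups` are illustrative assumptions.

```r
d <- data.frame(a = c(1, 1, 2, 3), b = c(10, 10, 9, 8))

# duplicated() flags only the second and later copies of a row
later_copies <- duplicated(d)

# Combining a forward and a backward pass flags every copy,
# including the first occurrence
all_copies <- duplicated(d) | duplicated(d, fromLast = TRUE)

# Rows that belong to some duplicate group, for inspection
d_dups <- d[all_copies, ]

# Deduplicated result, same rows as Henrique's unique(d)
d_unique <- d[!later_copies, ]
```

With this data, `later_copies` is `FALSE TRUE FALSE FALSE` and `all_copies` is `TRUE TRUE FALSE FALSE`, so `d_dups` holds both copies of the `(1, 10)` record.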

[R] help with duplicates

2009-06-05 Thread Chris Anderson
I have a large dataset that contains duplicate records. How do I identify and remove duplicate records?

Chris Anderson
707.315.8486
www.sassydeals4u.com