[R] Duplicates and duplicated

christiaan pauw Wed, 13 May 2009 23:18:54 -0700

Hi everybody.
I want to identify not only the duplicate number but also the original number
that has been duplicated.
Example:
x=c(1,2,3,4,4,5,6,7,8,9)
y=duplicated(x)
rbind(x,y)


gives:
    [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
x    1    2    3    4    4    5    6    7    8     9
y    0    0    0    0    1    0    0    0    0     0

i.e. the second 4 [,5] is a duplicate.

What I want is for both the first and the second 4, i.e. [,4] and [,5], to be TRUE (1):

    [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
x    1    2    3    4    4    5    6    7    8     9
y    0    0    0    1    1    0    0    0    0     0

I assume it can be done by sorting the vector and then checking whether the
next or the previous entry matches, using identical(). I am just unsure how
to write such a loop, the logic of which (I think) is as follows:

sort x
for every value of x, check if the next value is identical and return TRUE
(or 1) if it is and FALSE (or 0) if it is not
AND
check if the previous value is identical and return TRUE (or 1) if it is and
FALSE (or 0) if it is not
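
The logic above could be sketched roughly as follows. This is only a minimal sketch, assuming x is an atomic vector with no NA values; the function name mark_all_duplicates is made up for illustration. The two neighbour checks are combined with OR, so a value is flagged if either its next or its previous sorted neighbour matches, and the flags are mapped back to the original positions so x does not have to stay sorted.

```r
# Hypothetical helper (the name is an assumption, not an existing R function):
# flags every element of x that has an equal neighbour once x is sorted.
# Assumes x contains no NA values.
mark_all_duplicates <- function(x) {
  n <- length(x)
  if (n < 2) return(rep(FALSE, n))   # nothing to compare against

  ord <- order(x)                    # positions of x in sorted order
  s <- x[ord]                        # sorted copy of x

  same_as_next <- c(s[-n] == s[-1], FALSE)  # last value has no next value
  same_as_prev <- c(FALSE, s[-1] == s[-n])  # first value has no previous value
  flagged <- same_as_next | same_as_prev

  out <- logical(n)
  out[ord] <- flagged                # map flags back to the original order
  out
}

x <- c(1, 2, 3, 4, 4, 5, 6, 7, 8, 9)
y <- mark_all_duplicates(x)
rbind(x, y)
```

For the example vector this flags positions 4 and 5. The same result should also be obtainable without sorting, by combining the built-in check in both directions: duplicated(x) | duplicated(x, fromLast = TRUE).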

Am I thinking correctly, and can someone help me write such a function?

regards
Christiaan

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
