Hi,

The trick used by duplicated.data.frame() is to transform the supplied data.frame into a character vector by pasting its columns together, using "\r" as the separator. But no precautions are taken to deal with "\r" already occurring in the supplied data.frame. As a consequence, two different rows can be pasted into the same string, and it's easy to construct situations where duplicated.data.frame() returns an incorrect answer:
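The collision can be seen without calling duplicated() at all. The sketch below assumes the key-building step is equivalent to do.call(paste, c(df, sep = "\r")); the names keys and rows are only illustrative, and stringsAsFactors = FALSE is set just to keep the columns as plain character vectors.

```r
# Same data as in the report, kept as character columns.
df <- data.frame(a = c("AA", "AA\r"), b = c("\rBBB", "BBB"),
                 stringsAsFactors = FALSE)

# Assumed equivalent of the pasting step: one string per row,
# with "\r" inserted between the columns.
keys <- do.call(paste, c(df, sep = "\r"))

# Both rows collapse to "AA\r\rBBB", so the keys compare equal.
keys[1] == keys[2]                   # TRUE

# Compared field by field, the two rows are different.
rows <- lapply(seq_len(nrow(df)),
               function(i) unlist(df[i, ], use.names = FALSE))
identical(rows[[1]], rows[[2]])      # FALSE
```

Any fixed separator has the same weakness as long as it can also appear inside the data. The session showing duplicated() itself follows.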
  > df <- data.frame(a=c("AA", "AA\r"), b=c("\rBBB", "BBB"))
  > df
       a     b
  1   AA \rBBB
  2 AA\r   BBB
  > duplicated(df)
  [1] FALSE  TRUE

Cheers,
H.

  > sessionInfo()
  R version 3.0.1 (2013-05-16)
  Platform: x86_64-unknown-linux-gnu (64-bit)

  locale:
   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
   [7] LC_PAPER=C                 LC_NAME=C
   [9] LC_ADDRESS=C               LC_TELEPHONE=C
  [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

  attached base packages:
  [1] stats     graphics  grDevices utils     datasets  methods   base

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel