R version 2.8.1 (2008-12-22) / Windows XP

There are several bugs in unique for data frames and matrices. Please find minimal reproducible examples below.
-s

-----A-----

unique of a vector uses numerical comparison:

> nn <- ((1+2^-52)^(5:22))
> unique(nn)
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

whereas unique of a data frame compares the 15-digit string representation, so all 18 distinct values collapse into a single row:

> unique(data.frame(a=nn))
  a
1 1

Similarly for a matrix:

> unique(matrix(nn,ncol=1))
     [,1]
[1,]    1

-----B-----

Two different rows are treated as duplicates when the strings contain "\r":

> df <- data.frame(a=c("\r",""),b=c("","\r"))
> unique(df)
   a b
1 \r
> unique(as.matrix(df))
     a    b
[1,] "\r" ""

Though "\r" is no doubt rare in strings, it is perfectly legal.

-----C-----

For vectors and data frames, unique preserves the POSIXct class:

> dd <- as.POSIXct('1999-1-1')
> unique(dd)
[1] "1999-01-01 EST"
> unique(data.frame(a=dd))
           a
1 1999-01-01

But for matrices, it converts to the underlying number:

> unique(matrix(dd))
          [,1]
[1,] 915166800

-----workaround-----

The first two bugs can be worked around by converting the matrix to a list of row vectors, calling unique on the list, then converting back:

library(plyr)
# bug A
laply(unique(alply(matrix(nn,ncol=1),1,identity)),identity,.drop=FALSE)
# bug B; mm is presumably a matrix such as as.matrix(df) from B
laply(unique(alply(mm,1,identity)),identity,.drop=FALSE)

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
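A possible explanation for A and B, as a minimal sketch rather than a reading of the actual implementation: it assumes that each row is reduced to a single string key, with numbers rendered to 15 significant digits and columns joined by "\r". The names as_text and keys are illustrative only, and sprintf("%.15g", ...) is a stand-in for whatever conversion unique uses internally.

```r
# A: the values differ as doubles, but their 15-significant-digit strings do not.
nn <- (1 + 2^-52)^(5:22)
as_text <- sprintf("%.15g", nn)   # assumed stand-in for the internal conversion
length(unique(nn))                # more than one distinct double
length(unique(as_text))           # a single distinct string

# B: if row keys are built by pasting the columns with "\r" as separator
# (an assumption), the two different rows get the same key.
df <- data.frame(a = c("\r", ""), b = c("", "\r"), stringsAsFactors = FALSE)
keys <- do.call(paste, c(df, sep = "\r"))
identical(keys[1], keys[2])       # both keys are "\r\r"
```

If that is indeed the mechanism, any separator that can legally occur in a string has the same problem, which is why the workaround compares whole row vectors instead of pasted keys.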