I am having trouble converting my dataset from long to wide format. Previous 
attempts using the reshape package and the aggregate function were unsuccessful 
because they took too long. My own simplified solution, however, turns out to 
be just as slow. 
 
My complete code is given below. With sample.size = 10000, execution takes 
about 20 seconds, but with sample.size = 100000 it seems to take an eternity. 
My actual sample size is 15000000, i.e. 15 million. 
 
 
 
sample.size <- 10000

m <- data.frame(Name = sample(1:100000, sample.size, TRUE),
                Type = sample(1:1000, sample.size, TRUE),
                Predictor = sample(LETTERS[1:10], sample.size, TRUE))
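res <- function(m) {
    # One output row per unique (Name, Type) pair, sorted by Name then Type
    m.12.unique <- unique(m[,1:2])
    m.12.unique <- m.12.unique[order(m.12.unique[,1], m.12.unique[,2]),]
    v1 <- paste(m.12.unique[,1], m.12.unique[,2], sep=".")
    # One output column per Predictor value; as.character() keeps the labels
    # (rather than integer codes) when Predictor is stored as a factor
    v2 <- sort(unique(as.character(m[,3])))
    res <- matrix(0, nrow=length(v1), ncol=length(v2), dimnames=list(v1, v2))
    m.ids <- paste(m[,1], m[,2], sep=".")
    # Count each observation in its (Name.Type, Predictor) cell, row by row
    for(i in seq_len(nrow(m))) {
      x <- m.ids[i]
      y <- as.character(m[i,3])
      res[x, y] <- res[x, y] + 1
    }
    res <- data.frame(m.12.unique[,1], m.12.unique[,2], res, row.names=NULL)
    colnames(res) <- c("Name", "Type", v2)
    return(res)
}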
 
res(m)
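
For comparison, here is a rough sketch of how the same counts might be obtained 
without the row-by-row loop, by turning each observation into a single cell 
index and counting with tabulate(). The name res.fast is made up for this 
illustration, and the sketch assumes the same three-column layout as m above 
(Name, Type, Predictor) and that the full count matrix fits in memory. It has 
not been timed on the 15 million row data.

```r
# Hypothetical vectorized variant; res.fast is an illustrative name only.
res.fast <- function(m) {
    # One output row per unique (Name, Type) pair, sorted by Name then Type
    keys <- unique(m[, 1:2])
    keys <- keys[order(keys[, 1], keys[, 2]), ]
    key.ids <- paste(keys[, 1], keys[, 2], sep = ".")

    # One output column per Predictor value
    preds <- sort(unique(as.character(m[, 3])))

    # Row and column position of every observation in the output matrix
    row.idx <- match(paste(m[, 1], m[, 2], sep = "."), key.ids)
    col.idx <- match(as.character(m[, 3]), preds)

    # Column-major cell index, then count all cells in a single pass
    n.rows <- length(key.ids)
    cell <- (col.idx - 1) * n.rows + row.idx
    counts <- matrix(tabulate(cell, nbins = n.rows * length(preds)),
                     nrow = n.rows, ncol = length(preds))

    out <- data.frame(keys[, 1], keys[, 2], counts, row.names = NULL)
    colnames(out) <- c("Name", "Type", preds)
    out
}
```

Calling res.fast(m) should give the same layout as res(m): one row per 
Name/Type pair and one count column per Predictor value.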
 
> sessionInfo()
R version 2.8.0 (2008-10-20) 
i386-pc-mingw32 
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;
LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
