Hello, Dirk,

maybe I'm missing something, but to avoid your for-loop-approach doesn't

  M <- M/Matrix::rowSums(M)

do what you want?

 Hth  --  Gerrit

---------------------------------------------------------------------
Dr. Gerrit Eichner                   Mathematical Institute, Room 212
gerrit.eich...@math.uni-giessen.de   Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104          Arndtstr. 2, 35392 Giessen, Germany
Fax: +49-(0)641-99-32109        http://www.uni-giessen.de/eichner
---------------------------------------------------------------------
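The one-liner above relies on R's recycling rule: a vector whose length equals the number of rows is recycled down each column, so entry (i, j) is divided by the i-th row sum. The following is only a small sketch of that idea on a hypothetical 3 x 4 toy matrix (not the matrix from the question). As an alternative that is not part of the suggestion above, it also shows left-multiplication by a diagonal matrix of reciprocal row sums, which should give the same result. It assumes every row has a non-zero sum; rows that sum to zero would need separate handling.

```r
library(Matrix)

## Hypothetical toy matrix standing in for the large M from the question
M <- sparseMatrix(i = c(1, 1, 2, 3, 3, 3),
                  j = c(1, 3, 2, 1, 2, 4),
                  x = c(2, 6, 5, 1, 1, 2),
                  dims = c(3, 4))

## Row sums (assumed to be non-zero for every row)
rs <- Matrix::rowSums(M)

## Variant 1: the suggested one-liner; rs is recycled down the columns,
## so M[i, j] is divided by rs[i]
M1 <- M / rs

## Variant 2 (an alternative, not from the thread): left-multiply by a
## diagonal matrix holding the reciprocal row sums
M2 <- Diagonal(x = 1 / rs) %*% M

## After scaling, each row should sum to 1 in both variants
print(Matrix::rowSums(M1))
print(Matrix::rowSums(M2))
```

Neither variant calls as.matrix(), so the 5.8 Gb dense copy mentioned in the question is not created explicitly. Whether the result remains a sparse object may depend on the installed version of Matrix, so it is worth checking class(M1) on the real data.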
Hello R-Users,

I'm looking for a way to scale the rows of a sparse matrix M with about 57,000 rows, 14,000 columns, and 238,000 non-zero matrix elements; see example code below.

Usually I'd use the base::scale() function (see sample code), but it freezes my computer. The same happens when I try to run a for loop over the matrix rows. The conversion with as.matrix() yields a 5.8 Gb large object, which appears too large for scale().

So my question is: How can the rows of a large sparse matrix be efficiently scaled?

Thanks and regards,
Dirk

### Hardware/Session Info
Intel Core i7 w/ 12 Gb RAM
R version 3.2.1 (2015-06-18)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Ubuntu 14.04.3 LTS

### Example Code
library(Matrix)
set.seed(42)

## These are exemplary values for my real "problem matrix"
N_ROW <- 56743
N_COL <- 13648
SIZE  <- 238283
PROB  <- c(0.050, 0.050, 0.099, 0.149, 0.198, 0.178, 0.119,
           0.079, 0.0297, 0.0198, 0.001, 0.001, 0.001)

## get some random values to populate the sparse matrix
x <- do.call(
  what = rbind,
  args = lapply(X = 1:N_ROW,
                FUN = function(i)
                  expand.grid(i,
                              sample(x = 1:N_COL,
                                     size = sample(1:15, 1),
                                     replace = TRUE)
                  )
  )
)
x[,3] <- sample(x = 1:13, size = nrow(x), replace = TRUE, prob = PROB)

## build the sparse matrix
M <- Matrix::sparseMatrix(
  dims = c(N_ROW, N_COL),
  i = x[,1],
  j = x[,2],
  x = x[,3]
)
print(format(object.size(M), units = "auto"))

## *******************************************
## Scaling the rows of M

## scale() lets my computer freeze
# M <- scale(t(M), center = FALSE, scale(Matrix::rowSums(M)))

## this appears to be not elegant at all and takes forever
# rwsms <- Matrix::rowSums(M)
# for (i in 1:nrow(M)) M[i,] <- M[i,]/rwsms[[i]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.