Hi, I have a dataset (the Netflix dataset) with roughly 18k columns and a variable number of rows; let's assume 25 thousand for now. The dataset is very sparse. I was wondering how to do k-means, nearest neighbors, or kernel density estimation on it.
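To make the setup concrete, here is a minimal sketch of a matrix with that shape, together with the memory a dense copy of it would need. The data are random stand-ins, not the Netflix ratings; the number of non-zero entries (`nnz`) and the variable names are assumptions made for illustration, and the sketch uses sparseMatrix() from the "Matrix" package.

```r
library(Matrix)

set.seed(1)
n_rows <- 25000L   # assumed row count
n_cols <- 18000L   # roughly 18k columns
nnz    <- 250000L  # hypothetical number of non-zero entries

# Random (row, column, value) triplets standing in for the real ratings.
# Duplicate (i, j) pairs are summed by sparseMatrix().
i <- sample.int(n_rows, nnz, replace = TRUE)
j <- sample.int(n_cols, nnz, replace = TRUE)
x <- as.numeric(sample(1:5, nnz, replace = TRUE))

ratings <- sparseMatrix(i = i, j = j, x = x, dims = c(n_rows, n_cols))

# Memory actually used by the sparse representation
print(format(object.size(ratings), units = "MB"))

# Memory a dense double matrix of the same shape would need, in GiB
dense_gib <- as.numeric(n_rows) * n_cols * 8 / 2^30
print(dense_gib)
```

With 25,000 rows the dense estimate comes to about 3.35 GiB, so any function that converts its input to an ordinary matrix will ask for an allocation of that order.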
I tried using the spMatrix function in the "Matrix" package. I think I'm able to create the matrix, but as soon as I pass it to the kmeans function in the "stats" package, it says it cannot allocate 3.3 Gb, which is basically 18k * 25k * 8 bytes, i.e. the size of the full dense matrix of doubles. There is a sparse k-means solver by Tibshirani, but that expects a regular dense-format matrix, so the issue is the same. A simple "no, this is not possible" answer will suffice, as long as you are right! Thanks much.
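One direction that might avoid the dense allocation is to run Lloyd's k-means iterations directly against the sparse matrix, so that only the k x p matrix of centers and the n x k matrix of cross products are ever dense. The following is a rough sketch under that assumption, not a tested solution for the full data: `sparse_kmeans` is a hypothetical helper written for this post, not a function from "stats", "Matrix", or any other package, and it uses a fixed iteration cap and random initial centers.

```r
library(Matrix)

# Lloyd's algorithm with X kept sparse throughout (hypothetical helper).
sparse_kmeans <- function(X, k, iters = 10L) {
  X <- as(X, "CsparseMatrix")
  n <- nrow(X)
  # Start from k randomly chosen rows; centers is a dense k x p matrix
  centers <- as.matrix(X[sample.int(n, k), , drop = FALSE])
  cluster <- integer(n)
  for (it in seq_len(iters)) {
    # ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2; the ||x||^2 term is the same
    # for every center, so it can be dropped when picking the nearest one.
    cross <- as.matrix(X %*% t(centers))                # dense n x k
    d <- sweep(-2 * cross, 2, rowSums(centers^2), "+")
    new_cluster <- max.col(-d, ties.method = "first")
    if (identical(new_cluster, cluster)) break
    cluster <- new_cluster
    # Sparse n x k membership matrix; t(M) %*% X gives per-cluster sums
    M <- sparseMatrix(i = seq_len(n), j = cluster, x = rep(1, n),
                      dims = c(n, k))
    sizes <- colSums(M)
    sums <- as.matrix(crossprod(M, X))                  # dense k x p
    keep <- sizes > 0                                   # skip empty clusters
    centers[keep, ] <- sums[keep, , drop = FALSE] / sizes[keep]
  }
  list(cluster = cluster, centers = centers)
}

# Small random example (made-up data, not the Netflix ratings)
set.seed(42)
X <- sparseMatrix(i = sample.int(200, 1000, replace = TRUE),
                  j = sample.int(50, 1000, replace = TRUE),
                  x = runif(1000), dims = c(200, 50))
fit <- sparse_kmeans(X, k = 4)
print(table(fit$cluster))
```

For 25k rows, 18k columns, and a modest k, the dense pieces here are on the order of a few megabytes rather than gigabytes. Whether the clusters it finds on very sparse rating data are meaningful is a separate question, and the sketch does not address nearest neighbors or kernel density estimation.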