Sure, but in the end I would like to call clusters of genes, not of samples. The experiment is actually a time-lapse experiment, so the samples (columns) are fixed anyway.
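To make the gene/sample distinction concrete, here is a small sketch using the built-in USArrests data as a stand-in for the expression matrix (its rows play the role of genes, its columns the role of samples). The variable names are only illustrative:

```r
# USArrests stands in for the expression matrix:
# 50 rows ("genes") x 4 columns ("samples").
m <- as.matrix(USArrests)

# dist() computes distances between the ROWS of its argument,
# so this clusters the rows (genes).
hc_rows <- hclust(dist(m), method = "average")

# Transposing first clusters the columns (samples) instead.
hc_cols <- hclust(dist(t(m)), method = "average")

# cutree() returns one cluster label per clustered item.
row_clusters <- cutree(hc_rows, k = 3)  # one label per row
col_clusters <- cutree(hc_cols, k = 3)  # one label per column

length(row_clusters)  # 50
length(col_clusters)  # 4
```

So cutree() on a tree built from dist(t(m)) can only ever return labels for the columns. As far as I understand gplots, passing Rowv = as.dendrogram(hc_rows) together with Colv = FALSE to heatmap.2 should keep the columns in their original order while still showing the row tree.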
I guess my misunderstanding is that I get clustering of rows in the latter
case (with dist(t(matrix))) because it is actually the heatmap function
itself that does the clustering on rows, right?

But my question stays the same: how can I cluster 25000 genes for 20
samples on a "normal" (i7) processor without running into several hours of
clustering, or the process presumably "freezing" altogether?

Best
Maxim

2011/3/2 <rex.dw...@syngenta.com>

> Don't you expect it to be a lot faster if you cluster 20 items instead of
> 25000?
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Maxim
> Sent: Wednesday, March 02, 2011 4:08 PM
> To: r-help@r-project.org
> Subject: [R] clustering problem
>
> Hi,
>
> I have a gene expression experiment with 20 samples and 25000 genes each.
> I'd like to perform clustering on these. It turned out to become much
> faster when I transform the underlying matrix with t(matrix).
> Unfortunately I'm then no longer able to use cutree to access individual
> clusters. In general I do something like this:
>
> hc <- hclust(dist(USArrests), "ave")
>
> library(RColorBrewer)
> library(gplots)
> clrno=3
> cols<-rainbow(clrno, alpha = 1)
> clstrs <- cutree(hc, k=clrno)
> ccols <- cols[as.vector(clstrs)]
> heatcol<-colorRampPalette(c(3,1,2), bias = 1.0)(32)
> heatmap.2(as.matrix(USArrests), Rowv=as.dendrogram(hc),col=heatcol,
> trace="none",RowSideColors=ccols)
>
> Nice, I can access 3 main clusters with cutree. But what about a situation
> when I perform hclust like
>
> hc <- hclust(dist(t(USArrests)), "ave")
>
> which I have to do in order to speed up the clustering process. This I can
> plot with:
>
> heatmap.2(as.matrix(USArrests), Colv=as.dendrogram(hc),col=heatcol,
> trace="none")
>
> But where do I find information about the clustering that was applied to
> the rows?
> cutree(hc, k=clrno) delivers the clustering on the columns, so what can I
> do to access the levels for the rows?
> I guess the solution is easy, but after hours of playing around I thought
> it might be a good time to contact the mailing list!
>
> Maxim
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
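
On the speed question raised in this thread, a rough sketch of two common workarounds. This is not something proposed in the thread itself. The data are simulated, the matrix is deliberately smaller than the real one so the example runs quickly, and n_keep and the choice of k = 3 are hypothetical values picked only for illustration:

```r
set.seed(1)

# Simulated stand-in for the real data: genes in rows, samples in columns.
# (2000 x 20 here so the sketch runs quickly; the real matrix is 25000 x 20.)
n_genes <- 2000
n_samples <- 20
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("t", seq_len(n_samples))))

# Why the full problem is heavy: dist() on 25000 rows stores
# n * (n - 1) / 2 pairwise distances as 8-byte doubles.
n_full <- 25000
full_dist_gb <- n_full * (n_full - 1) / 2 * 8 / 1e9  # roughly 2.5 GB

# Option 1: keep only the most variable genes, then cluster those rows.
n_keep <- 500  # hypothetical cutoff, to be tuned for the real data
gene_var <- apply(expr, 1, var)
top <- order(gene_var, decreasing = TRUE)[seq_len(n_keep)]
expr_top <- expr[top, ]

hc <- hclust(dist(expr_top), method = "average")
gene_clusters <- cutree(hc, k = 3)  # one label per retained gene
table(gene_clusters)

# Option 2: k-means on all rows never builds the pairwise distance matrix.
km <- kmeans(expr, centers = 3, iter.max = 50, nstart = 5)
table(km$cluster)
```

Either way the labels refer to genes (rows), so they can be turned into RowSideColors as in the quoted code. The fastcluster package on CRAN also provides a faster replacement for hclust, which may be worth trying if the full hierarchical tree over all genes is really needed.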