Hi Hadley,

Thanks for replying.

The glitches are the cases where a bundle of lines belonging to a specific cluster has gaps in it, because the slot of one of the lines stays reserved for a line that has in the meantime moved to another cluster.
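A toy illustration of the glitch, as a rough sketch only: the six-observation membership vector, the two centers and the name `slot` are invented for this example, and the jitter is assumed to be a constant step per observation index (the "seq"-based approach).

```r
# Toy illustration of the glitch; all values below are made up for this sketch.
clusters.vec <- c(1, 2, 1, 2, 1, 2)   # cluster membership of six observations
the.centers  <- c(0, 1)               # one (mean) center per cluster
line.width   <- 0.004                 # vertical room given to each line

# Constant jitter: observation i always sits in slot i, whatever its cluster.
slot <- seq_along(clusters.vec)
y    <- the.centers[clusters.vec] + line.width * slot

# Vertical spacing between neighbouring lines inside each cluster's bundle.
gaps <- tapply(y, clusters.vec, function(v) diff(sort(v)))
print(gaps)
```

In this toy case every gap inside a bundle is two line widths instead of one, because the slots in between belong to observations that sit in the other cluster.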

I just came up with a solution for how to resolve this (showering tends to help my thinking...). It is attached at the bottom of this e-mail.
I will clean up the code a bit later and publish it.

Best,
Tal

#----------------------------------------

set.seed(100)
Data <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
              matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(Data) <- c("x", "y")   # was colnames(x), but no object "x" exists yet

# noise <- runif(100,0,.05)
line.width <- rep(.004, dim(Data)[1])   # vertical room given to each line
Y <- NULL
X <- NULL
k.range <- 2:10

plot(0, 0, col = "white", xlim = c(1,10), ylim = c(-.5,1.6),
     xlab = "Number of clusters", ylab = "Clusters means",
     main = "(Basic) Clustergram")
axis(side = 1, at = k.range)
abline(v = k.range, col = "grey")

centers.points <- list()

for(k in k.range)
{
    cl <- kmeans(Data, k)

    clusters.vec <- cl$cluster
    the.centers <- apply(cl$centers, 1, mean)

    # Stack the lines within each cluster (cumsum per cluster), then put the
    # offsets back into the original order of the observations.
    noise <- unlist(tapply(line.width, clusters.vec, cumsum))[order(seq_along(clusters.vec)[order(clusters.vec)])]
    # Center the offsets around zero.
    noise <- noise - mean(range(noise))
    y <- the.centers[clusters.vec] + noise
    Y <- cbind(Y, y)
    x <- rep(k, length(y))
    X <- cbind(X, x)

    # Index by name so the list has no empty first element (k starts at 2).
    centers.points[[as.character(k)]] <- data.frame(y = the.centers, x = rep(k, k))
    # points(the.centers ~ rep(k , k), pch = 19, col = "red", cex = 1.5)
}

require(colorspace)
COL <- rainbow_hcl(100)
matlines(t(X), t(Y), pch = 19, col = COL, lty = 1, lwd = 1.5)

# add points
invisible(lapply(centers.points, function(xx) {with(xx, points(y ~ x, pch = 19, col = "red", cex = 1.3))}))

----------------Contact Details:-------------------------------------------------------
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
----------------------------------------------------------------------------------------------

On Tue, Jun 15, 2010 at 3:45 PM, Hadley Wickham <had...@rice.edu> wrote:

> > My current solution is to use a constant jitter (based on "seq") on all the
> > k number of clusters, but that causes glitches in the produced image (run my
> > code to see).
>
> What are the glitches?  It looks pretty good to me.  (I'm not sure if
> the colour does anything apart from make it pretty though).
>
> Hadley
>
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.