Hi Hadley,
Thanks for replying.

The glitches are cases where a bundle of lines belonging to a specific
cluster has gaps in it, because the slot of one of the lines was reserved
for another line that has in the meantime moved to another cluster.
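
To make the gap problem concrete, here is a minimal sketch on made-up data
(toy.clusters is a hypothetical vector, not part of my real example). With a
constant, seq-based jitter every line keeps the same slot no matter which
cluster it lands in, so the slots used inside one bundle are not contiguous:

```r
# Hypothetical toy data: six lines and the cluster each one belongs to.
toy.clusters <- c(1, 2, 1, 2, 2, 1)

# Constant jitter: every line keeps the same slot, whatever its cluster.
slot <- seq_along(toy.clusters)

# Slots occupied inside each cluster's bundle.
slots.by.cluster <- split(slot, toy.clusters)
print(slots.by.cluster)

# A difference larger than 1 between neighbouring slots is an empty
# slot (a gap) inside the bundle.
gaps <- lapply(slots.by.cluster, diff)
print(gaps)
```

Any difference above 1 in gaps corresponds to a visible space inside the
bundle for that cluster.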

I just came up with a way to resolve this (showering tends to help my
thinking...); it is attached at the bottom of this e-mail.

I will clean up the code a bit later and publish it.

Best,
Tal






#----------------------------------------


set.seed(100)
Data <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
              matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(Data) <- c("x", "y")

# noise <- runif(100,0,.05)
line.width <- rep(.004, dim(Data)[1])
Y <- NULL
X <- NULL
k.range <- 2:10

plot(0, 0, col = "white", xlim = c(1, 10), ylim = c(-.5, 1.6),
     xlab = "Number of clusters", ylab = "Clusters means",
     main = "(Basic) Clustergram")
axis(side =1, at = k.range)
abline(v = k.range, col = "grey")

centers.points <- list()

for(k in k.range)
{
  cl <- kmeans(Data, k)
  clusters.vec <- cl$cluster
  # one value per cluster: the mean of that cluster centre's coordinates
  the.centers <- apply(cl$centers, 1, mean)

  # stack the line widths within each cluster (cumsum), then put the
  # offsets back into the original row order
  noise <- unlist(tapply(line.width, clusters.vec,
                         cumsum))[order(seq_along(clusters.vec)[order(clusters.vec)])]
  # shift the offsets so that their range is centred on zero
  noise <- noise - mean(range(noise))
  y <- the.centers[clusters.vec] + noise
  Y <- cbind(Y, y)
  x <- rep(k, length(y))
  X <- cbind(X, x)

  centers.points[[k]] <- data.frame(y = the.centers, x = rep(k, k))
  # points(the.centers ~ rep(k , k), pch = 19, col = "red", cex = 1.5)
}

library(colorspace)  # provides rainbow_hcl()
COL <- rainbow_hcl(100)
matlines(t(X), t(Y), pch = 19, col = COL, lty = 1, lwd = 1.5)

# add points
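# centers.points[[1]] is NULL because k starts at 2, so drop empty slots first
invisible(lapply(Filter(Negate(is.null), centers.points), function(xx) {
  with(xx, points(y ~ x, pch = 19, col = "red", cex = 1.3))
}))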
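
As a rough illustration of the jitter step inside the loop, here is a minimal
sketch on a toy vector. The names toy.clusters and toy.width are made up for
this example. The ave() call at the end is only a suggested shortcut that
should give the same offsets; it is not what the code above uses.

```r
# Hypothetical toy data: six lines, their clusters, and a fixed line width.
toy.clusters <- c(1, 2, 1, 2, 2, 1)
toy.width <- rep(0.004, length(toy.clusters))

# Running total of widths within each cluster; unlist() returns the values
# grouped by cluster, i.e. in cluster-sorted order.
stacked <- unlist(tapply(toy.width, toy.clusters, cumsum))

# Permutation that maps the cluster-sorted order back to the original order.
back <- order(order(toy.clusters))
offset <- unname(stacked[back])
print(offset)

# Possible shortcut: ave() applies cumsum per group and keeps the
# original order.
offset.ave <- ave(toy.width, toy.clusters, FUN = cumsum)
print(all.equal(offset, offset.ave))
```

Because each line's offset now depends only on its rank inside its own
cluster, a bundle has no reserved-but-empty slots.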







----------------Contact Details:----------------
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
------------------------------------------------




On Tue, Jun 15, 2010 at 3:45 PM, Hadley Wickham <had...@rice.edu> wrote:

> > My current solution is to use a constant jitter (based on "seq") on all
> > the k number of clusters, but that causes glitches in the produced image
> > (run my code to see).
>
> What are the glitches?  It looks pretty good to me.  (I'm not sure if
> the colour does anything apart from make it pretty though).
>
> Hadley
>
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/
>


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
