[R] question regarding using weights in the hierarchical/ kmeans clustering process

eugen pircalabelu Thu, 28 Feb 2008 12:11:41 -0800

Hi R users!

I have a bit of a problem with using a hierarchical clustering algorithm:


 a <- 1:15
 b <- rep(1:3, 5)
 c <- rnorm(15, 0, 1)                    # intended weighting variable
 d <- sample(1:100, 15, replace = TRUE)
 e <- sample(1:100, 15, replace = TRUE)
 f <- sample(1:100, 15, replace = TRUE)
 data <- data.frame(a, b, c, d, e, f)
 q <- data.frame(data$d, data$e, data$f)
 q <- scale(q)                           # standardized columns used for clustering


What I want to do is run a hierarchical cluster analysis on q, using data$c
as a weighting variable. Can this be done? Or is there a package that would
let me use my weights in a hierarchical clustering process?
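
A minimal sketch of one way this might be approached in base R, assuming the
weights can be read as frequency-style weights (each row standing for w[i]
identical observations). The vector w below is hypothetical: data$c as
generated above comes from rnorm and can be negative, so it would not be
usable as it stands. The members argument of hclust gives the number of
observations each row represents, and the "average" method uses those sizes
when merging clusters.

```r
set.seed(42)                        # only to make the sketch reproducible
d <- sample(1:100, 15, replace = TRUE)
e <- sample(1:100, 15, replace = TRUE)
f <- sample(1:100, 15, replace = TRUE)
q <- scale(data.frame(d, e, f))

# Hypothetical positive weights (an assumption, not the poster's data$c)
w <- sample(1:5, 15, replace = TRUE)

# Treat row i as a cluster that already contains w[i] observations
hc <- hclust(dist(q), method = "average", members = w)
groups <- cutree(hc, k = 3)         # cluster label for each of the 15 rows
table(groups)
```

Whether frequency-style weights are the right reading of the weighting
variable depends on how the weights were produced.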

Another question:
say I wanted to run a t.test on data$d and data$e, again with data$c as
weights. How could it be done?
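
One possible sketch in base R, under two assumptions: that d and e are paired
(measured on the same rows), and that the weights are positive precision-style
weights. It fits an intercept-only weighted regression to the differences, so
the intercept is the weighted mean difference and its t value tests whether
that mean is zero. The vector w is hypothetical, and this is not a
design-based survey test.

```r
set.seed(42)                          # only to make the sketch reproducible
d <- sample(1:100, 15, replace = TRUE)
e <- sample(1:100, 15, replace = TRUE)
w <- runif(15, min = 0.5, max = 2)    # hypothetical positive weights (assumption)

diffs <- d - e
fit <- lm(diffs ~ 1, weights = w)     # intercept = weighted mean of the differences
summary(fit)$coefficients             # estimate, std. error, t value, p value
weighted.mean(diffs, w)               # matches the intercept estimate
```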

And the last 2 questions:
1. How can I weight a whole data frame so that the weights are kept for a
specific analysis, such as clustering, t.test, or any other analysis that does
not offer a "weight" option? I am looking for something like SPSS, where I can
weight a whole data frame and use it in a subsequent analysis, or something
like the survey package in R, but flexible enough to use with any analysis
that I want (I saw that the survey package offers limited connectivity to such
analyses).
2. Why does a kmeans cluster analysis give a multitude of different results?
I tried both of these several times:
>cclust(scale(q), 3, verbose=TRUE)
>kmeans(scale(q), 3)
and they both seem very unstable with respect to the cluster sizes, even with
this small data.frame, and I don't know why. Does it always behave like this?
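
A likely contributor, offered as an assumption rather than a diagnosis: kmeans
picks its starting centres at random, so separate runs can converge to
different local solutions. A minimal sketch of making runs repeatable with
set.seed and of trying several random starts through the nstart argument of
kmeans (the value 25 is arbitrary):

```r
set.seed(42)                          # only to make the example data reproducible
d <- sample(1:100, 15, replace = TRUE)
e <- sample(1:100, 15, replace = TRUE)
f <- sample(1:100, 15, replace = TRUE)
q <- scale(data.frame(d, e, f))

# Same seed before each call, so both runs use the same random starts
set.seed(1)
km1 <- kmeans(q, centers = 3, nstart = 25)
set.seed(1)
km2 <- kmeans(q, centers = 3, nstart = 25)

identical(km1$cluster, km2$cluster)   # same seed, same partition
km1$size                              # cluster sizes of the best of the 25 starts
```

With nstart greater than 1, kmeans keeps the start with the lowest total
within-cluster sum of squares, which tends to reduce run-to-run variation
without removing it entirely.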

Thank you and have a great day!!

       

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
