Dear Miha,

a general way to do this is as follows:
Define a distance measure by aggregating the Euclidean distance on the (X,Y)-space and the trivial 0-1 distance (0 if the categories are the same, 1 otherwise) on the categorical variable. Then perform a cluster analysis (whichever distance-based method you want) on the resulting distance matrix.
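
As a rough illustration of the recipe above, here is a minimal sketch in Python (chosen only for illustration; the same steps carry over to R). The function names, the `cat_weight` parameter, the toy data and the choice of average linkage are all assumptions made for this example, not part of the original advice.

```python
import math

def combined_distance_matrix(xs, ys, cats, cat_weight=1.0):
    """Euclidean distance on (X, Y) plus cat_weight times the 0-1 category distance.

    cat_weight is a hypothetical tuning parameter; its value has to be chosen.
    """
    n = len(xs)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            euclid = math.hypot(xs[i] - xs[j], ys[i] - ys[j])
            mismatch = 0.0 if cats[i] == cats[j] else 1.0
            d[i][j] = d[j][i] = euclid + cat_weight * mismatch
    return d

def average_linkage(d, k):
    """Merge the two closest clusters (average linkage) until k clusters remain."""
    clusters = [[i] for i in range(len(d))]

    def link(a, b):
        # Mean of all pairwise distances between members of a and b.
        return sum(d[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        pairs = [(link(clusters[p], clusters[q]), p, q)
                 for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        _, p, q = min(pairs)
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters

# Toy data (made up): two spatial groups, category "b" straddles both.
xs = [0.0, 0.1, 0.2, 5.0, 5.1]
ys = [0.0, 0.1, 0.0, 5.0, 5.2]
cats = ["a", "a", "b", "b", "b"]

d = combined_distance_matrix(xs, ys, cats, cat_weight=1.0)
clusters = average_linkage(d, k=2)
```

Any other clustering method that accepts a distance matrix could replace the average-linkage step; only the construction of `d` is specific to the mixed-variable problem.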

Note that there is more than one way to do this. The 0-1 distance could be incorporated into the definition of the Euclidean distance (taking the place of the squared difference (x_i-y_i)^2 for the categorical variable, where x and y denote two observations and i indexes the variables), or a weighted average of the distances in X-, Y- and categorical space could be computed. The weights of the variables (possibly including rescaling) have to be decided. How to do this precisely should depend on the subject matter and on prior information about variable importance etc. In the absence of such information, you may standardise the variables so that their variablewise sums of squared pairwise distances are equal.
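
One possible reading of the default standardisation, sketched in Python purely for illustration: for each variable, compute the squared pairwise distances, divide them by their sum over all pairs so that every variable contributes the same total, and then combine them in Euclidean fashion. The helper names and the toy data are assumptions made for this example; other weightings are equally legitimate.

```python
import math

def pairwise_squared(values, sq_dist):
    """Matrix of squared distances between all observations for one variable."""
    n = len(values)
    return [[sq_dist(values[i], values[j]) for j in range(n)] for i in range(n)]

def standardise(sq):
    """Rescale so the sum of squared distances over all pairs i < j equals 1."""
    n = len(sq)
    total = sum(sq[i][j] for i in range(n) for j in range(i + 1, n))
    if total == 0:
        raise ValueError("variable is constant; it cannot be standardised")
    return [[v / total for v in row] for row in sq]

def combined_distance(sq_matrices):
    """Euclidean-style aggregation: square root of the summed squared parts."""
    n = len(sq_matrices[0])
    return [[math.sqrt(sum(m[i][j] for m in sq_matrices)) for j in range(n)]
            for i in range(n)]

def numeric_sq(a, b):
    return (a - b) ** 2

def mismatch_sq(a, b):
    # The 0-1 distance equals its own square.
    return 0.0 if a == b else 1.0

# Toy data (made up for the example).
xs = [0.0, 0.1, 0.2, 5.0, 5.1]
ys = [0.0, 0.1, 0.0, 5.0, 5.2]
cats = ["a", "a", "b", "b", "b"]

parts = [standardise(pairwise_squared(v, f))
         for v, f in [(xs, numeric_sq), (ys, numeric_sq), (cats, mismatch_sq)]]
dist = combined_distance(parts)
```

With this scaling X, Y and the category each carry the same overall weight; multiplying one of the standardised matrices by a constant before combining is a simple way to express prior information about variable importance.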

Hope this helps (and you can figure out the relevant R code yourself).

Christian

On Tue, 3 Jun 2008, Miha Staut wrote:

Dear all,

I would like to perform a clustering analysis on a data frame with two coordinate 
variables (X and Y) and a categorical variable where only a != b can be established.  As 
far as I understood classification analyses, they are not an option as they partition the 
training set only in k classes of the test set.  By searching through the book 
"Modern Applied Statistics with S" I did not find a satisfactory solution.

I will be grateful for any suggestions.

Best regards
Miha




______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
[EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche
