Hi there,

Whether clara is an appropriate clustering method depends strongly on what your data are and, in particular, on how you want to interpret or use the clustering. You may do better with a hierarchical method after defining a suitable distance (although this is more a matter for statistical consultation than for R help).
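As a rough illustration of the hierarchical route, here is a minimal sketch on simulated data. The correlation-based dissimilarity, the average linkage and the choice of three groups are assumptions made only for the example; which distance is "proper" depends on the data and on what the clusters are meant to represent.

```r
set.seed(2)

# Simulated stand-in for the real matrix: 68 observations in rows,
# 200 variables in columns (purely illustrative, not NMR data).
x <- matrix(rnorm(68 * 200), nrow = 68)

# Assumed dissimilarity: 1 - Pearson correlation between observations.
# cor() works on columns, so the matrix is transposed first.
d <- as.dist(1 - cor(t(x)))

# Agglomerative clustering with average linkage (an assumption).
hc <- hclust(d, method = "average")

# Cut the tree into three groups (the number is arbitrary here).
groups <- cutree(hc, k = 3)
table(groups)
```

Calling plot(hc) shows the dendrogram, which can help in judging how many groups are plausible.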

Assuming that you use some reasonable dimension reduction and clustering
method, you can get a good visualization of your clustering using the functions plotcluster and discrproj in package fpc.
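A minimal sketch of that workflow, on simulated data rather than the real spectra, could look like the following. The data, the shift that separates the two groups, k = 2 and the use of 13 components (the number mentioned in the question) are all assumptions for illustration. The fpc call is guarded because fpc is not part of a standard R installation.

```r
library(cluster)  # provides clara()

set.seed(1)

# Simulated stand-in: 68 observations, 500 variables, with the first
# 34 observations shifted on 50 variables so that two groups exist.
n <- 68
p <- 500
x <- matrix(rnorm(n * p), nrow = n)
x[1:34, 1:50] <- x[1:34, 1:50] + 2

# Dimension reduction; prcomp() works with more variables than units.
pc <- prcomp(x)
scores <- pc$x[, 1:13]

# Share of the total variance retained by the first 13 components.
cumvar <- cumsum(pc$sdev^2) / sum(pc$sdev^2)
cumvar[13]

# Cluster the component scores (k = 2 is an assumption for this toy data).
cl <- clara(scores, k = 2)

# Project onto the directions that best separate the clusters
# (default method of plotcluster: discriminant coordinates),
# instead of onto the first two principal components.
if (requireNamespace("fpc", quietly = TRUE)) {
  fpc::plotcluster(scores, cl$clustering)
}
```

Unlike clusplot(), which shows the first two principal components, this plot uses a projection chosen to separate the given clusters, so it depends on the clustering and not only on the variance of the data.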

Best,
Christian

On Thu, 6 Mar 2008, Dani Valverde wrote:

Hello,
I have a large data matrix (68x13112), each row corresponding to one
observation (patients) and each column corresponding to the variables
(points within an NMR spectrum). I would like to carry out some kind of
clustering on these data to see how many clusters there are. I have
tried the function clara() from the package cluster. If I use the matrix
as is, I can perform the clara analysis but when I call clusplot() I get
this error:

Error in princomp.default(x, scores = TRUE, cor = ncol(x) != 2) :
'princomp' can only be used with more units than variables

Then I reduced the dimensionality using the function prcomp(), took the
first 13 principal components (more than 80% of the variability) and
carried out the clara() analysis again. When I call clusplot() again,
voilà, it works. The problem is that clusplot() only
represents the first two components of my prcomp() analysis, which
account for only 15% of the variability.
So, my questions are: 1) Is clara() a proper way to analyze such a large
data set? and 2) Is there an appropriate method for plotting
my data that takes into account the whole variability of my data, not
just two principal components?
Many thanks.
Best,

Dani

--
Daniel Valverde Saubí

Grup de Biologia Molecular de Llevats
Facultat de Veterinària de la Universitat Autònoma de Barcelona
Edifici V, Campus UAB
08193 Cerdanyola del Vallès- SPAIN

Centro de Investigación Biomédica en Red
en Bioingeniería, Biomateriales y
Nanomedicina (CIBER-BBN)

Grup d'Aplicacions Biomèdiques de la RMN
Facultat de Biociències
Universitat Autònoma de Barcelona
Edifici Cs, Campus UAB
08193 Cerdanyola del Vallès- SPAIN
+34 93 5814126

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
[EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche