Thank you very much for all the information and support. I have now managed to make it work on a small subset of the original data set. I think that the first error message I got (Error in as.dist(dmat[clustering == i, clustering == i]) : (subscript) logical subscript too long) is generated when the two objects required by cluster.stats (the dissimilarity object and the clustering vector) do not cover the same number of observations.
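A minimal sketch of that length check, using a made-up data set in place of the image. The names dat, idx, d_sub and cl_sub are illustrative and do not come from the thread, and the cluster.stats call only runs if fpc is installed:

```r
# Hypothetical stand-in for the image data: 1000 values in one column.
set.seed(1)
dat <- data.frame(value = runif(1000))

# Cluster the full data, but compute distances on a small subset only.
kl    <- kmeans(dat, centers = 5)
idx   <- sample(nrow(dat), 200)
d_sub <- dist(dat[idx, , drop = FALSE])

# A dist object records how many observations it covers in attr(, "Size").
n_dist      <- attr(d_sub, "Size")
n_cluster   <- length(kl$cluster)
sizes_match <- n_dist == n_cluster      # FALSE here: 200 vs 1000

# Subsetting the clustering vector with the same index makes the two agree.
cl_sub          <- unname(kl$cluster[idx])
sizes_match_sub <- attr(d_sub, "Size") == length(cl_sub)

# Only call cluster.stats once both inputs describe the same observations.
if (sizes_match_sub && requireNamespace("fpc", quietly = TRUE)) {
  stats <- fpc::cluster.stats(d_sub, cl_sub)
  print(stats$avg.silwidth)
}
```

Comparing attr(d, "Size") with length(clustering) before calling cluster.stats is a cheap way to catch this kind of mismatch early.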
Thanks!
Laura

------- Original message -------
From: Christian Hennig <[EMAIL PROTECTED]>
Sent: 14.6.'08, 20:46

> Dear Laura,
>
> I have R 2.6.0. I tried dist on a vector of length 200,000 and it told me
> that it is too long. Theoretically, if you have 260,000 observations, the
> length of the dist object should be 260,000*259,999/2, which is too large
> for our computers, I guess. Which means that unfortunately cluster.stats
> won't work for such a large data set, because it needs the full casewise
> dissimilarity information.
>
> I don't understand how you managed to produce a dist object of length
> only 130,000 out of your data, but it certainly doesn't give all
> pairwise distance information for 260,000 points and therefore cannot be
> used in cluster.stats with a clustering vector of length 260,000 or so.
>
> Sorry,
> Christian
>
> On Sat, 14 Jun 2008, Laura Poggio wrote:
>
> > Thanks. See below.
> >
> > Laura
> >
> > 2008/6/14 Christian Hennig <[EMAIL PROTECTED]>:
> >
> >> What does str(ddata) give?
> >
> > Class 'dist'  atomic [1:130816] 69.2 117.1 145.6 179.9 195.6 ...
> >
> >> dcent doesn't make sense as input for cluster.stats, because you need a
> >> dissimilarity matrix between all objects.
> >
> > Yes, I know ... I simply tried to see whether anything changed with a
> > different structure of data.
> >
> >> Christian
> >>
> >> On Sat, 14 Jun 2008, Laura Poggio wrote:
> >>
> >>> I am sorry I did not provide enough information.
> >>> I am not using img later, but data, which is a data.frame.
> >>> I wrote that img is an "image" just to explain what kind of data it comes
> >>> from, but the object I am using is data and it is a data.frame (checked
> >>> many times).
> >>>
> >>> I am not using as.dist, but dist in order to calculate the distance matrix
> >>> among the data I have.
> >>> Then the whole code I am using is:
> >>>
> >>> data <- as(img, "data.frame")[1:1]  # img is an image, 256x256 px
> >>> kl <- kmeans(data, 5)
> >>> library(fpc)
> >>> ddata <- dist(data)
> >>> dcent <- dist(kl$centers)
> >>>
> >>> cluster.stats(ddata, kl$cluster)
> >>> cluster.stats(dcent, kl$cluster)
> >>>
> >>> In both cases I got the same error:
> >>> Error in as.dist(dmat[clustering == i, clustering == i]) : (subscript)
> >>> logical subscript too long
> >>>
> >>> The structure of the different objects is detailed below:
> >>> data is "'data.frame': 262144 obs. of 1 variable"
> >>> kl$centers is "num [1:5, 1]"
> >>> kl$cluster is "Named int [1:262144]"
> >>>
> >>> I hope this is more informative. I am sorry, but I did not find any
> >>> explanation for the error message I am getting.
> >>>
> >>> Thank you very much in advance
> >>>
> >>> Laura
> >>>
> >>> 2008/6/14 Christian Hennig <[EMAIL PROTECTED]>:
> >>>
> >>>> The given information is not enough to tell you what's going on. as.dist
> >>>> doesn't appear in the given code and it's not clear to me what kind of
> >>>> object img is ("a small image" doesn't tell me what R makes of it).
> >>>> Also, try to read the help pages first and find out whether img is of the
> >>>> format that is required by the functions. And check (using str for
> >>>> example) whether "data" is what you expect it to be.
> >>>>
> >>>> Christian
> >>>>
> >>>> On Sat, 14 Jun 2008, Laura Poggio wrote:
> >>>>
> >>>>> Thank you very much for your answer.
> >>>>> I tried to run the function on my data and now I am getting this error
> >>>>> message:
> >>>>> Error in as.dist(dmat[clustering == i, clustering == i]) : (subscript)
> >>>>> logical subscript too long
> >>>>>
> >>>>> Below is the code I am using (version 2.7.0 of R with all packages
> >>>>> updated):
> >>>>>
> >>>>> data <- as(img, "data.frame")[1:1]  # img is a small image, 256 px x 256 px
> >>>>> kl <- kmeans(data, 5)
> >>>>> library(fpc)
> >>>>> cluster.stats(data, kl$cluster)
> >>>>>
> >>>>> Thank you for any hints on the reasons and meaning of the error!
> >>>>>
> >>>>> Laura
> >>>>>
> >>>>> 2008/6/13 Christian Hennig <[EMAIL PROTECTED]>:
> >>>>>
> >>>>>> Dear Laura,
> >>>>>>
> >>>>>>> Dear list,
> >>>>>>>
> >>>>>>> I just tried to use the function cluster.stats in the package fpc.
> >>>>>>> I just have a couple of questions about the syntax:
> >>>>>>>
> >>>>>>> cluster.stats(d,clustering,alt.clustering=NULL,
> >>>>>>> silhouette=TRUE,G2=FALSE,G3=FALSE)
> >>>>>>>
> >>>>>>> 1) the distance object (d) is an object obtained by the function
> >>>>>>> dist() on my own original matrix?
> >>>>>>
> >>>>>> d is allowed to be an object of class dist or a dissimilarity matrix.
> >>>>>> The answer to your question depends on what your "original matrix" is.
> >>>>>> If it is something on which you can compute a distance by dist(), you're
> >>>>>> right, at least if dist() delivers the distance you are interested in.
> >>>>>>
> >>>>>>> 2) clustering is the clusters vector as result of one of the many
> >>>>>>> clustering methods?
> >>>>>>
> >>>>>> The help page tells you what clustering can be. So it could be the
> >>>>>> clustering/partition vector of a clustering method or it could be
> >>>>>> something else. Note that cluster.stats doesn't depend on any particular
> >>>>>> clustering method.
> >>>>>> It computes the statistics regardless of where the clustering
> >>>>>> vector comes from.
> >>>>>>
> >>>>>> Best regards,
> >>>>>> Christian
> >>>>>>
> >>>>>>> Thank you very much in advance, and sorry for such a basic question,
> >>>>>>> but I could not work it out myself.
> >>>>>>>
> >>>>>>> Laura
> >>>>>>>
> >>>>>>> ______________________________________________
> >>>>>>> R-help@r-project.org mailing list
> >>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>>>> PLEASE do read the posting guide
> >>>>>>> http://www.R-project.org/posting-guide.html
> >>>>>>> and provide commented, minimal, self-contained, reproducible code.
> >>>>>>
> >>>>>> *** --- ***
> >>>>>> Christian Hennig
> >>>>>> University College London, Department of Statistical Science
> >>>>>> Gower St., London WC1E 6BT, phone +44 207 679 1698
> >>>>>> [EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche
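
The size argument in Christian's reply can be checked with a few lines of base R. This is only a sketch: the helper names dist_length and obs_from_length are made up for illustration, and the figure of 8 bytes per distance assumes the distances are stored as double-precision numbers, as dist() does.

```r
# Number of pairwise distances a dist object stores for n observations.
dist_length <- function(n) n * (n - 1) / 2

# Inverse: how many observations a dist object of a given length covers.
obs_from_length <- function(len) (1 + sqrt(1 + 8 * len)) / 2

# The data frame in the thread has 262144 rows.
n_obs     <- 262144
n_pairs   <- dist_length(n_obs)         # 34359607296 pairwise distances
size_gib  <- n_pairs * 8 / 2^30         # just under 256 GiB as doubles

# The dist object reported by str(ddata) has length 130816.
n_covered <- obs_from_length(130816)    # 512 observations

cat("pairs needed:", format(n_pairs, big.mark = ","), "\n")
cat("approx. size:", round(size_gib), "GiB\n")
cat("observations covered by a dist of length 130816:", n_covered, "\n")
```

Under these assumptions a full dist object for all 262144 observations would need on the order of 256 GiB, while a dist object of length 130816 holds the pairwise distances for only 512 observations.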