Re: [R] Clustering Functions used by Reverse-Dependencies

2024-02-29 Thread Leo Mada via R-help
f code. On the other hand, the help page for codetools::checkUsage is quite cryptic. But it's good to know at least where to look. Sincerely, Leonard From: Ivan Krylov Sent: Wednesday, February 28, 2024 10:36 AM To: Leo Mada via R-help Cc: Leo Mada Su

Re: [R] Clustering Functions used by Reverse-Dependencies

2024-02-28 Thread Ivan Krylov via R-help
On Sat, 24 Feb 2024 03:08:26 + Leo Mada via R-help wrote: > Are there any tools to extract the function names called by > reverse-dependencies? For well-behaved packages that declare their dependencies correctly, parsing the NAMESPACE for importFrom() and import() calls should give you the ex
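A minimal sketch of the NAMESPACE idea mentioned in this reply, assuming the directives are available as plain text. The NAMESPACE content and the name my_fun are made up for illustration; for an installed package, base R's parseNamespaceFile() is the usual entry point.

```r
# Hypothetical NAMESPACE content of a reverse dependency (illustration only)
ns_text <- c(
  "importFrom(stats, kmeans, dist)",
  "importFrom(cluster, daisy)",
  "import(methods)",
  "export(my_fun)"
)

# NAMESPACE directives are syntactically R calls, so parse() can read them
directives <- parse(text = ns_text, keep.source = FALSE)

# Collect one row per imported name; import() rows have no individual names
imports <- do.call(rbind, lapply(directives, function(e) {
  fn   <- as.character(e[[1]])
  args <- vapply(as.list(e)[-1], as.character, character(1))
  if (fn == "importFrom") {
    data.frame(package = args[1], name = args[-1], stringsAsFactors = FALSE)
  } else if (fn == "import") {
    data.frame(package = args, name = NA_character_, stringsAsFactors = FALSE)
  } else {
    NULL  # export() and other directives are ignored here
  }
}))
print(imports)
```

Only explicit importFrom() entries name individual functions; a whole-package import() or a pkg::fun call in the code would not show up this way.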

Re: [R] Clustering of datasets

2022-09-05 Thread Rui Barradas
Hello, I am not at all sure that the following answers the question. The code below tries to find the optimal number of clusters. One of the changes I have made to your call to kmeans is to subset DMs not dropping the dim attribute. library(cluster) max_clust <- 10 wss <- numeric(max_clust)
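The code in this reply is cut off, so the following is only a sketch of the general approach it describes, not the poster's actual code. The data frame DMs is filled with made-up values here; the point is the drop = FALSE subsetting and the within-cluster sum of squares collected per k.

```r
set.seed(42)
# Hypothetical data: one value per observation (assumption, not the original DMs)
DMs <- data.frame(value = c(rnorm(30, 0), rnorm(30, 5), rnorm(30, 10)))

max_clust <- 10
wss <- numeric(max_clust)
for (k in seq_len(max_clust)) {
  # drop = FALSE keeps the single column two-dimensional instead of a bare vector
  fit <- kmeans(DMs[, 1, drop = FALSE], centers = k, nstart = 20)
  wss[k] <- fit$tot.withinss
}

# An "elbow" in this curve is one informal guide to the number of clusters
plot(seq_len(max_clust), wss, type = "b",
     xlab = "Number of clusters",
     ylab = "Total within-cluster sum of squares")
```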

Re: [R] Clustering of datasets

2022-09-05 Thread Jim Lemon
Hi Subhamitra, I think the fact that you are passing a vector of values rather than a matrix is part of the problem. As you have only one value for each country, the points plotted will be the index on the x-axis and the value for each country on the y-axis. Passing a value for ylim= means that you

Re: [R] Clustering methods for data that has bimodal distribution

2016-12-05 Thread Ranjan Maitra
Hello Adrian, It all depends on what the structure of the dataset is. For instance, you said that all your values are between -1 and 1. Do the data rows sum-squared up to 1? How about the means? Are they zero? I guess all this has to depend on the application and how the data were processed or

Re: [R] clustering with hclust

2014-07-25 Thread Christian Hennig
Dear Marianna, the function agnes in library cluster can compute Ward's method from a raw data matrix (at least this is what the help page suggests). Also, you may not be using the most recent version of hclust. The most recent version has a note in its help page that states: "Two different

Re: [R] clustering of binary data

2012-12-06 Thread David L Carlson
Do not use html in r-help emails. Look below at what happens to your data. The error message is telling you that t(data) is not numeric. > str(data) That will tell you what kind of data you have. -- David L Carlson Associate Professor of Anthropology

Re: [R] Clustering analysis with ordination plots

2012-05-02 Thread Gavin Simpson
Please read the posting guide for future questions. I presume you mean using the vegan package? If so, then see this blog post of mine which shows how to do something similar: http://wp.me/pZRQ9-73 If you post more details and an example I will help further if the blog post is not sufficient for

Re: [R] Clustering analysis with ordination plots

2012-05-01 Thread Uwe Ligges
On 30.04.2012 18:44, borinot wrote: Hello to all, I'm new to R so I have a lot of problems with it, but I'll only ask the main one. I have clustered an environmental matrix We do not know what that is. Where is the example data? See the posting guide. with 2 different methods, Which

Re: [R] Clustering Large Applications..sort of

2011-08-10 Thread Christian Hennig
PS to my previous posting: Also have a look at kmeansruns in fpc. This runs kmeans for several numbers of clusters and decides the number of clusters by either Calinski&Harabasz or Average Silhouette Width. Christian On Wed, 10 Aug 2011, Ken Hutchison wrote: Hello all, I am using the clust

Re: [R] Clustering Large Applications..sort of

2011-08-10 Thread Christian Hennig
There is a number of methods in the literature to decide the number of clusters for k-means. Probably the most popular one is the Calinski and Harabasz index, implemented as calinhara in package fpc. A distance based version (and several other indexes to do this) is in function cluster.stats in
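As a rough illustration of the Calinski and Harabasz index mentioned here, the sketch below computes it by hand from kmeans() output on simulated data. The data and the range of k are assumptions; calinhara and kmeansruns in fpc are the packaged routes named in the thread.

```r
set.seed(1)
# Simulated data with three well-separated groups (illustration only)
x <- rbind(matrix(rnorm(100, 0), ncol = 2),
           matrix(rnorm(100, 4), ncol = 2),
           matrix(rnorm(100, 8), ncol = 2))
n  <- nrow(x)
ks <- 2:8

# Calinski-Harabasz: (between SS / (k - 1)) / (within SS / (n - k))
ch <- sapply(ks, function(k) {
  fit <- kmeans(x, centers = k, nstart = 25)
  (fit$betweenss / (k - 1)) / (fit$tot.withinss / (n - k))
})

# The k with the largest index is the suggested number of clusters
best_k <- ks[which.max(ch)]
print(data.frame(k = ks, CH = ch))
```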

Re: [R] Clustering Large Applications..sort of

2011-08-10 Thread Peter Langfelder
On Wed, Aug 10, 2011 at 12:07 PM, Ken Hutchison wrote: > Hello all, >   I am using the clustering functions in R in order to work with large > masses of binary time series data, however the clustering functions do not > seem able to fit this size of practical problem. Library 'hclust' is good > (t

Re: [R] Clustering Large Applications..sort of

2011-08-10 Thread Thomas Lumley
Try the flow cytometry clustering functions in Bioconductor. -thomas On Thu, Aug 11, 2011 at 7:07 AM, Ken Hutchison wrote: > Hello all, >   I am using the clustering functions in R in order to work with large > masses of binary time series data, however the clustering functions do not > see

Re: [R] clustering based on most significant pvalues does not separate the groups!

2011-07-06 Thread pguilha
Yes absolutely, your explanation makes sense. Thanks very much. rgds Paul

Re: [R] clustering based on most significant pvalues does not separate the groups!

2011-07-06 Thread S Ellison
t-tests and the like test for a difference in mean value, not for non-overlapping populations or data sets. The fact that the mean of one data set differs significantly from the mean of the other does not mean that the ranges of the individual points in each data set are disjoint. set.seed(10
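The example in this reply is truncated, so here is an independent sketch of the same point with simulated data: two samples whose means differ significantly while their ranges still overlap. Sample sizes and means are arbitrary choices.

```r
set.seed(1)
a <- rnorm(1000, mean = 0)
b <- rnorm(1000, mean = 0.5)

# The t-test compares means only
p <- t.test(a, b)$p.value

# The ranges overlap if each sample reaches into the other's range
overlap <- max(a) > min(b) && max(b) > min(a)

cat("p-value:", format.pval(p), "\n")
cat("ranges overlap:", overlap, "\n")
```

A clustering of the individual points would therefore not be expected to recover the two groups, even though the mean difference is clear.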

Re: [R] clustering problem

2011-03-02 Thread Maxim
Sure, but in the end I like to call clusters of genes and not of samples. Actually the experiment is a time-lapse experiment, therefore the samples (columns) are fixed anyway. I guess my misunderstanding is that I get clustering of rows in the latter case (with dist(t(matrix))) because it's actua

Re: [R] clustering problem

2011-03-02 Thread rex.dwyer
Don't you expect it to be a lot faster if you cluster 20 items instead of 25000? -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Maxim Sent: Wednesday, March 02, 2011 4:08 PM To: r-help@r-project.org Subject: [R] clustering problem

Re: [R] clustering fuzzy

2011-02-05 Thread pete
After ordering the table of membership degrees, I must get the difference between the first and second columns, between the first and second largest membership degree of object i. This for K=2, K=3, to K.max=6. This difference is multiplied by the crisp silhouette index vector (si). Too it d

Re: [R] clustering fuzzy

2011-02-02 Thread pete
After ordering the table of membership degrees, I must get the difference between the first and second columns, between the first and second largest membership degree of object i. This for K=2, K=3, to K.max=6. This difference is multiplied by the crisp silhouette index vector (si). Too it d

Re: [R] clustering with finite mixture model

2011-02-02 Thread Matt Shotwell
There are quite a few packages that work with finite mixtures, as evidenced by the descriptions here: http://cran.r-project.org/web/packages/index.html These might be useful: http://cran.r-project.org/web/packages/flexmix/index.html http://cran.r-project.org/web/packages/mclust/index.html -Ma

Re: [R] clustering fuzzy

2011-01-22 Thread pete
I must get an index (fuzzy silhouette), a weighted average. It averages the crisp silhouette s for every row (i), and the weight of each term is determined by the difference between the membership degrees of the corresponding object to its first and second best matching fuzzy clusters. I need the differe

Re: [R] clustering fuzzy

2011-01-21 Thread pete
thank you, you have been very kind

Re: [R] clustering fuzzy

2011-01-21 Thread jim holtman
use 'apply': > head(x.m) V2 V3 V4 V5 [1,] 0.66 0.04 0.01 0.30 [2,] 0.02 0.89 0.09 0.00 [3,] 0.06 0.92 0.01 0.01 [4,] 0.07 0.71 0.21 0.01 [5,] 0.10 0.85 0.04 0.01 [6,] 0.91 0.04 0.02 0.02 > x.m.sort <- apply(x.m, 1, sort, decreasing = TRUE) > head(t(x.m.sort)) [,1] [,2] [,3] [,4]
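The apply() idea in this reply can be sketched as follows, using only the six rows of x.m shown in the message. The final subtraction (largest minus second-largest membership degree per row) is an assumption about what the original poster wanted.

```r
# The six rows printed by head(x.m) in the message
x.m <- matrix(c(0.66, 0.04, 0.01, 0.30,
                0.02, 0.89, 0.09, 0.00,
                0.06, 0.92, 0.01, 0.01,
                0.07, 0.71, 0.21, 0.01,
                0.10, 0.85, 0.04, 0.01,
                0.91, 0.04, 0.02, 0.02),
              ncol = 4, byrow = TRUE,
              dimnames = list(NULL, c("V2", "V3", "V4", "V5")))

# apply() over rows returns the sorted rows as columns, hence the t();
# unname() drops column labels that no longer match after sorting
x.m.sort <- unname(t(apply(x.m, 1, sort, decreasing = TRUE)))

# Difference between the largest and second-largest membership degree per row
top_gap <- x.m.sort[, 1] - x.m.sort[, 2]
print(top_gap)
```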

Re: [R] clustering association rules

2010-11-11 Thread Michael Hahsler
Jüri, How did you create the output? An example to cluster transactions with arules can be found in: Michael Hahsler and Kurt Hornik. Building on the arules infrastructure for analyzing transaction data with R. In R. Decker and H.-J. Lenz, editors, /Advances in Data Analysis, Proceedings of t

Re: [R] Clustering

2010-10-30 Thread David Winsemius
On Oct 30, 2010, at 7:49 AM, dpender wrote: David Winsemius wrote: On Oct 29, 2010, at 12:08 PM, David Winsemius wrote: On Oct 29, 2010, at 11:37 AM, dpender wrote: Apologies for being vague, The structure of the output is as follows: Still no code? I am using the Clusters functio

Re: [R] Clustering

2010-10-30 Thread dpender
David Winsemius wrote: > > > On Oct 29, 2010, at 12:08 PM, David Winsemius wrote: > >> >> On Oct 29, 2010, at 11:37 AM, dpender wrote: >> >>> Apologies for being vague, >>> >>> The structure of the output is as follows: >> >> Still no code? >> > > I am using the Clusters function from the evd

Re: [R] Clustering

2010-10-29 Thread David Winsemius
On Oct 29, 2010, at 12:08 PM, David Winsemius wrote: On Oct 29, 2010, at 11:37 AM, dpender wrote: Apologies for being vague, The structure of the output is as follows: Still no code? $ cluster1 : Named num [1:131] 3.05 2.71 3.26 2.91 2.88 3.11 3.21 -1 2.97 3.39 ... ..- attr(*, "nam

Re: [R] Clustering

2010-10-29 Thread David Winsemius
On Oct 29, 2010, at 11:37 AM, dpender wrote: Apologies for being vague, The structure of the output is as follows: Still no code? $ cluster1 : Named num [1:131] 3.05 2.71 3.26 2.91 2.88 3.11 3.21 -1 2.97 3.39 ... ..- attr(*, "names")= chr [1:131] "6667" "6668" "6669" "6670" ... Wi

Re: [R] Clustering

2010-10-29 Thread dpender
Apologies for being vague, The structure of the output is as follows: $ cluster1 : Named num [1:131] 3.05 2.71 3.26 2.91 2.88 3.11 3.21 -1 2.97 3.39 ... ..- attr(*, "names")= chr [1:131] "6667" "6668" "6669" "6670" ... With 613 clusters. What I require is abstracting the first and last va

Re: [R] Clustering

2010-10-29 Thread David Winsemius
On Oct 29, 2010, at 5:14 AM, dpender wrote: That's helpful but the reason I'm using clusters in evd is that I need to specify a time condition to ensure independence. I believe this is the first we heard about any particular function or package. I therefore have an output We woul

Re: [R] Clustering

2010-10-29 Thread dpender
That's helpful but the reason I'm using clusters in evd is that I need to specify a time condition to ensure independence. I therefore have an output in the form Cluster[[i]][j-k] where i is the cluster number and j-k is the range of values above the threshold taking account of the time condi

Re: [R] clustering on scaled dataset or not?

2010-10-28 Thread Claudia Beleites
John, Hi, just a general question: when we do hierarchical clustering, should we compute the dissimilarity matrix based on scaled dataset or non-scaled dataset? daisy() in cluster package allow standardizing the variables before calculating dissimilarity matrix; I'd say that should depend
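To see how much the scaling decision can matter, the sketch below clusters the built-in USArrests data with and without standardizing the variables. The choice of average linkage and three clusters is arbitrary; daisy(..., stand = TRUE) in the cluster package is the option referred to in the question.

```r
# USArrests ships with R; its variables are on quite different scales
d_raw    <- dist(USArrests)
d_scaled <- dist(scale(USArrests))  # each column centered and divided by its sd

hc_raw    <- hclust(d_raw, method = "average")
hc_scaled <- hclust(d_scaled, method = "average")

# Cross-tabulate the two 3-cluster solutions to see where they disagree
agreement <- table(raw = cutree(hc_raw, k = 3), scaled = cutree(hc_scaled, k = 3))
print(agreement)
```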

Re: [R] Clustering

2010-10-28 Thread David Winsemius
On Oct 28, 2010, at 8:00 AM, dpender wrote: I am looking to use R in order to determine the number of extreme events for a high frequency (20 minutes) dataset of wave heights that spans 25 years (657,432) data points. I require the number, spacing and duration of the extreme events as an

Re: [R] Clustering

2010-10-28 Thread Albyn Jones
I have worked with seismic data measured at 100hz, and had no trouble locating events in "long" records (several times the size of your dataset). 20 minutes is high frequency? what kind of waves are these? what is the wavelength? some details would help. albyn On Thu, Oct 28, 2010 at 05:00:10A

Re: [R] Clustering with ordinal data

2010-10-19 Thread Michael Bedward
Hello Steve, > I've been asked to help evaluate a vegetation data set, specifically to > examine it for community similarity. The initial problem I see is that the > data is ordinal.   At best this only captures a relative ranking of > abundance and ordinal ranks are assigned after data collection

Re: [R] Clustering with ordinal data

2010-10-19 Thread Steve_Friedman

Re: [R] Clustering with ordinal data

2010-10-19 Thread Phil Spector
Steve - Take a look at daisy() in the cluster package. - Phil Spector Statistical Computing Facility Department of Statistics UC Be

Re: [R] Clustering

2010-06-23 Thread Tal Galili
Hi Ralph, In case of hclust, the dendrogram does show the "steps" (they are the heights presented in the graph). You can present them also in a matrix using "cutree", for example: dat <- (USArrests) n <- (dim(dat)[1]) hc <- hclust(dist(USArrests)) cutree(hc, k=1:n) You might then visualize the
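A runnable version of the cutree() idea in this reply, laid out as a sketch. It uses the same built-in USArrests data; the variable name memb is an assumption.

```r
hc <- hclust(dist(USArrests))
n  <- nrow(USArrests)

# The merge heights are the "steps" drawn in the dendrogram
head(hc$height)

# One column per number of clusters k = 1..n; each row is a state,
# each entry the cluster that state belongs to at that k
memb <- cutree(hc, k = 1:n)
memb[1:5, 1:6]
```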

Re: [R] Clustering algorithms don't find obvious clusters

2010-06-14 Thread Henrik Aldberg
Thank you Etienne, this seems to work like a charm. Also thanks to the rest of you for your help. Henrik On 11 June 2010 13:51, Cuvelier Etienne wrote: > > > On 11/06/2010 12:45, Henrik Aldberg wrote: > > I have a directed graph which is represented as a matrix of the form >> >> >> 0 4 0 1

Re: [R] Clustering algorithms don't find obvious clusters

2010-06-13 Thread Joris Meys
Henrik, the methods you use are NOT applicable to directed graphs, quite the contrary. They will split up what you want to put together. In your data, an author never cites himself. Hence, A and B are far more different than B and D according to the techniques you use. Please check out Etienne's

Re: [R] Clustering algorithms don't find obvious clusters

2010-06-12 Thread Dave Roberts
Henrik, Given your initial matrix, that should tell you which authors are similar/dissimilar to which other authors in terms of which authors they cite. In this case authors 1 and 3 are most similar because they both cite authors 2 and 4. Authors 2 and 3 are most different because they

Re: [R] Clustering algorithms don't find obvious clusters

2010-06-12 Thread Henrik Aldberg
Dave, I used daisy with the default settings (daisy(M) where M is the matrix). Henrik On 11 June 2010 21:57, Dave Roberts wrote: > Henrik, > >The clustering algorithms you refer to (and almost all others) expect > the matrix to be symmetric. They do not seek a graph-theoretic solution, >

Re: [R] Clustering algorithms don't find obvious clusters

2010-06-11 Thread Dave Roberts
Henrik, The clustering algorithms you refer to (and almost all others) expect the matrix to be symmetric. They do not seek a graph-theoretic solution, but rather proximity in geometric or topological space. How did you convert your matrix to a dissimilarity? Dave Roberts Henrik Al
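One possible way to act on the symmetry requirement, sketched with the 4 x 4 citation matrix quoted in this thread. The symmetrization (adding the transpose) and the conversion to a dissimilarity (maximum minus similarity) are assumptions made for illustration, not the solution proposed by anyone in the thread.

```r
# Citation counts: row author cites column author (matrix quoted in the thread)
M <- matrix(c(0, 4, 0, 1,
              6, 0, 0, 0,
              0, 1, 0, 5,
              0, 0, 4, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(LETTERS[1:4], LETTERS[1:4]))

isSymmetric(M)  # FALSE: citations are directed

# Assumption: treat mutual citation volume as similarity
S <- M + t(M)

# Assumption: turn similarity into dissimilarity by subtracting from the maximum
d <- as.dist(max(S) - S)

hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)
print(groups)
```

With these choices the authors who cite each other most end up in the same group; other symmetrizations or graph-based methods may be more appropriate for real data.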

Re: [R] Clustering algorithms don't find obvious clusters

2010-06-11 Thread Cuvelier Etienne
On 11/06/2010 12:45, Henrik Aldberg wrote: I have a directed graph which is represented as a matrix of the form 0 4 0 1 6 0 0 0 0 1 0 5 0 0 4 0 Each row corresponds to an author (A, B, C, D) and the values say how many times this author has cited the other authors. Hence the first ro

Re: [R] clustering in R

2010-05-28 Thread Joris Meys
Ah OK, I didn't get your question then. a dist-object is actually a vector of numbers with a couple of attributes. You can't just cut out values like that. The hclust function needs a perfect distance matrix to use the calculations. shortcut is easy : just do f <- f/2*max(f), and all values are b

Re: [R] clustering in R

2010-05-28 Thread Joris Meys
I can't run your code. Please, just give me whatever comes on your screen when you run: dput(q) On Fri, May 28, 2010 at 10:57 PM, Ayesha Khan wrote: > I assume my matrix should look something like this?.. > > >round(distance, 4) >P00A P00B M02A M02B P04A P04B M06A M06B P0

Re: [R] clustering in R

2010-05-28 Thread Ayesha Khan
I assume my matrix should look something like this?.. >round(distance, 4) P00A P00B M02A M02B P04A P04B M06A M06B P08A P08B M10A P00B 0.9678 M02A 1.0054 1.0349 M02B 1.0258 1.0052 1.2106 P04A 1.0247 0.9928 1.0145 0.9260 P04B 0.9898 0.9769 0.9875 0.9855 0.6075 M06A 1.0159 0.

Re: [R] clustering in R

2010-05-28 Thread Ayesha Khan
v <- dput(x,"sampledata.txt") dim(v) q <- v[1:10,1:10] f =as.matrix(dist(t(q))) distB=NULL for(k in 1:(nrow(f)-1)) for( m in (k+1):ncol(f)) { if(f[k,m] <2) distB=rbind(distB,c(k,m,f[k,m])) } #now distB looks like this > distB [,1] [,2] [,3] [1,] 1 2 1.6275568 [2,] 1 3 0

Re: [R] clustering in R

2010-05-28 Thread Ayesha Khan
Yes Joris. I did try that and it does produce the results. I am now wondering why I wanted a matrix like structure in the first place. However, I do want 'f' to contain values less than 2 only. But when I try to get rid of values greater than 2 by doing N <- (f[f<2], the structure of f is disrupted and hclust

Re: [R] clustering in R

2010-05-28 Thread Joris Meys
errr, forget about the output of dput(q), but keep it in mind for next time. f = dist(t(q)) hclust(f,method="single") it's as simple as that. Cheers Joris On Fri, May 28, 2010 at 10:39 PM, Ayesha Khan wrote: > v <- dput(x,"sampledata.txt") > dim(v) > q <- v[1:10,1:10] > f =as.matrix(dist(t(q)))
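A sketch of the two-line approach in this reply, with random data standing in for q. Cutting the tree at height 2 instead of deleting distances of 2 or more from f is an assumption about what the original poster was after; it keeps the dist object intact.

```r
set.seed(3)
# Hypothetical stand-in for q: 10 rows, 10 samples in columns
q <- matrix(rnorm(100), nrow = 10,
            dimnames = list(NULL, paste0("S", 1:10)))

# dist() works on rows, so transpose to get distances between columns
f <- dist(t(q))

# hclust() takes the dist object directly; no as.matrix() needed
hc <- hclust(f, method = "single")

# With single linkage, cutting at h = 2 groups samples that are linked
# by a chain of pairwise distances not exceeding 2
groups <- cutree(hc, h = 2)
print(groups)
```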

Re: [R] clustering in R

2010-05-28 Thread Tal Galili
Hi Ayesha, I wish to help you, but without a simple self contained example that shows your issue, I will not be able to help. Try using the ?dput command to create some simple data, and let us see what you are doing. Best, Tal

Re: [R] clustering in R

2010-05-28 Thread Ayesha Khan
Thanks Tal & Joris! I created my distance matrix distA by using the dist() function in R manipulating my output in order to get a matrix. distA =as.matrix(dist(t(x2))) # x2 being my original dataset as according to the documentation on dist() For the default method, a "dist" object, or a matrix (of

Re: [R] clustering in R

2010-05-28 Thread Joris Meys
As Tal said. Next to that, I read that column1 (and column2?) are supposed to be seen as factors, not as numerical variables. Did you take that into account somehow? It's easy to reproduce the error code : > n <- NULL > if(n<2)print("This is OK") Error in if (n < 2) print("This is OK") : argument
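The error reproduced in this reply comes from using a zero-length value as an if() condition. A small sketch of catching it and of one way to guard against it; the guard shown is an assumption, not something proposed in the thread.

```r
n <- NULL

# NULL < 2 is logical(0), which if() cannot use as a condition
msg <- tryCatch({
  if (n < 2) print("This is OK")
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Guard: only compare when n holds exactly one non-missing value
is_small <- length(n) == 1 && !is.na(n) && n < 2
print(is_small)
```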

Re: [R] clustering in R

2010-05-27 Thread Tal Galili
Hi Ayesha, hclust is a way to go (much better than trying to reinvent the wheel here). Please add what you used to create: distA And create a sample data set to show us what you did, using dput Best, Tal

Re: [R] Clustering with clara

2010-01-14 Thread Christian Hennig
Dear Paco, as far as I know, there is no such problem with clara, but I may be wrong. However, in order to help you (though I'm not sure whether I'll be able to do that), we'd need to understand precisely what you were doing in R and what your data look like (code and data; you can show us a r

Re: [R] Clustering for Ordinal data

2009-10-15 Thread Dylan Beaudette
On Wednesday 14 October 2009, Paul Evans wrote: > Hi, > > I just wanted to check whether there is a clustering package available for > ordinal data. My data looks something like: #1 #2 #3 #4. > A B C D... > D B C A... > D C A A... > where each column represents a sample, and each row some ordin

Re: [R] clustering, don't understand this error

2009-04-16 Thread Christian Hennig
Hi there, I'm travelling right now so I can't really check this but it seems that the problem is that cluster.stats needs a partition as input. hclust doesn't give you a partition but you can generate one from it using cutree. BTW, rather use "<-" than "=". Best wishes, Christian On Wed, 1

Re: [R] Clustering with Mahalanobis Distance

2008-12-10 Thread Wayne F
I don't have any experience with your particular problem, but the thing I notice about mahalanobis is that by default you specify a covariance matrix, and it uses solve to calculate its inverse. If you could supply the inverse covariance matrix (and specify inverted=TRUE to mahalanobis), that mi
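The suggestion here can be sketched with simulated data: invert the covariance matrix once and pass it with inverted = TRUE, so that repeated calls do not each call solve(). The data dimensions are arbitrary.

```r
set.seed(7)
x      <- matrix(rnorm(500), ncol = 5)
center <- colMeans(x)
S      <- cov(x)

# Default: mahalanobis() inverts the covariance matrix internally
d_default <- mahalanobis(x, center, S)

# Alternative: invert once up front and reuse the inverse
S_inv      <- solve(S)
d_inverted <- mahalanobis(x, center, S_inv, inverted = TRUE)

# Both routes give the same squared distances
same <- isTRUE(all.equal(d_default, d_inverted))
print(same)
```

Note that mahalanobis() returns squared distances, so a square root is needed before using them as ordinary distances.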

Re: [R] Clustering and functions

2008-11-08 Thread Sarah Goslee
It would help a lot if you told us what the error message was, and provided some data to work with. As it is, we can't even run the function to find out what goes wrong. And also, OS, version of R - all that stuff that the posting guide requests. Sarah On Sat, Nov 8, 2008 at 10:31 AM, Bryan Rich

Re: [R] clustering and data-mining...

2008-08-24 Thread losemind
Here is some recent update: Any thoughts? I have collected a list of experiment result data. I put them into a table. There are N rows corresponding to N data points. For i-th row, it contains data of the form y_i = f(a_i, b_i, c_i, d_i, e_i, f_i), where f is a possibly stochastic function, a,

Re: [R] Clustering large data matrix

2008-03-06 Thread Christian Hennig
Hi there, whether clara is a proper way of clustering depends strongly on what your data are and particularly what interpretation or use you want for your clustering. You may do better with a hierarchical method after having defined a proper distance (however this would rather go into statisti

Re: [R] Clustering large data matrix

2008-03-06 Thread Andris Jankevics
Hi Dani, If you are working with NMR data, which data pretreatment methods are you using? 13112 variables for NMR data sounds like too many; you should apply some data binning or peak picking methods for data reduction. Also you must consider multicollinearity problems related to spectroscopic data, the

Re: [R] clustering problem

2008-02-25 Thread Uwe Ligges
Karin Lagesen wrote: > First I just want to say thanks for all the help I've had from the > list so far..) > > I now have what I think is a clustering problem. I have lots of > objects which I have measured a dissimilarity between. Now, this list > only has one entry per pair, so it is not symme

Re: [R] Clustering with ordinal data

2008-02-15 Thread Gavin Simpson
On Fri, 2008-02-15 at 10:45 -0800, Tim Smith wrote: > Hi, > > Is there any clustering package in R that can cluster with ordinal data? > > thanks! daisy() in recommended package 'cluster' can generate dissimilarities for ordinal data using Gower's general (dis)similarity coefficient for mixed da
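A small sketch of the daisy() route described in this reply, assuming the cluster package is installed. The data frame is invented for illustration; ordered factors are what tell daisy() to treat a column as ordinal.

```r
library(cluster)

# Hypothetical ordinal data: four samples scored on three ordered scales
lev <- c("A", "B", "C", "D")
dat <- data.frame(
  v1 = factor(c("A", "D", "D", "B"), levels = lev, ordered = TRUE),
  v2 = factor(c("B", "B", "C", "A"), levels = lev, ordered = TRUE),
  v3 = factor(c("C", "C", "A", "D"), levels = lev, ordered = TRUE)
)

# Gower's coefficient; ordered factors are handled as ordinal variables
d <- daisy(dat, metric = "gower")
print(d)

# The result can be passed on to hierarchical clustering
hc <- hclust(d, method = "average")
```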

Re: [R] Clustering

2007-11-29 Thread Eleni Christodoulou
Thank you very much! I had misunderstood it's true... On Nov 28, 2007 6:28 PM, Birgit Lemcke <[EMAIL PROTECTED]> wrote: > Hello Eleni, > > as far as I understood and used agnes() the method argument > determines only the clustering method. > If you use diss=TRUE the distances should be taken from

Re: [R] Clustering

2007-11-28 Thread Dave Roberts
Eleni, The method= argument is in reference to how clusters are constructed, not how the dissimilarity or distance is calculated. If you pass agnes diss=TRUE then it will use the distances you have calculated by whatever means. method="complete" means that clusters are evaluated by the
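To illustrate the distinction drawn here, the sketch below computes a distance separately and hands it to agnes() with diss = TRUE, assuming the cluster package is installed. The Manhattan distance on scaled USArrests and the cut into four groups are arbitrary choices.

```r
library(cluster)

# The dissimilarity is computed by whatever means you choose ...
d <- dist(scale(USArrests), method = "manhattan")

# ... and method= only controls how clusters are merged (complete linkage here)
ag <- agnes(d, diss = TRUE, method = "complete")

# Convert to an hclust object to cut the tree into groups
groups <- cutree(as.hclust(ag), k = 4)
table(groups)
```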

Re: [R] Clustering

2007-11-28 Thread Birgit Lemcke
Hello Eleni, as far as I understood and used agnes() the method argument determines only the clustering method. If you use diss=TRUE the distances should be taken from the distance matrix. Birgit On 28.11.2007 at 12:18, Eleni Christodoulou wrote: > Hello all! > > I am performing some cluste

Re: [R] Clustering techniques using R

2007-10-10 Thread ngottlieb
Sent: Monday, October 01, 2007 1:37 PM To: Maura E Monville Cc: [EMAIL PROTECTED] Subject: Re: [R] Clustering techniques using R On Mon, 1 Oct 2007, Maura E Monville wrote: > Now that I've loaded a file into an R data.frame and played with > linear regression until I got a good model

Re: [R] Clustering techniques using R

2007-10-01 Thread Prof Brian Ripley
On Mon, 1 Oct 2007, Maura E Monville wrote: > Now that I've loaded a file into an R data.frame and played with > linear regression until I got a good model, my next step is clustering > using the coefficients of the regression model (I have many files) > Thanks to some R experts' guidelines I co