On 27.11.2010 08:39, James Davidson wrote:
> For completeness - this result was with the Hierarchical
> Clustering(DistMatrix) node set with 'Tanimoto' similarity and 'Complete
> Linkage' for cluster comparison.  Changing the comparison to 'Single
> Linkage' did not reduce the time.
That is expected. The "linkage" only controls which of the pairwise
distances between two clusters is used for the comparison (the maximum,
minimum or average), but you need to look at all distances in any case.
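
To illustrate, here is a minimal sketch in plain Python. It is not the code
of the KNIME node; the names naive_agglomerative and LINKAGE are made up for
this example, and the input is assumed to be a full symmetric distance
matrix given as a list of lists. The linkage is just the function that
reduces the pairwise distances between two clusters to one number, so
switching it does not change how many distances are visited.

```python
from itertools import combinations

# The linkage only decides how the pairwise distances between two
# clusters are reduced to a single value.
LINKAGE = {
    "single": min,                            # smallest pairwise distance
    "complete": max,                          # largest pairwise distance
    "average": lambda ds: sum(ds) / len(ds),  # mean pairwise distance
}


def naive_agglomerative(dist, n_clusters, linkage="complete"):
    """Merge clusters bottom-up until n_clusters remain.

    dist is a full symmetric distance matrix (list of lists).
    Returns a list of clusters, each a list of object indices.
    """
    reduce_fn = LINKAGE[linkage]
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        # Every merge step looks at all distances between all cluster
        # pairs, whichever linkage is chosen.
        for a, b in combinations(range(len(clusters)), 2):
            ds = [dist[i][j] for i in clusters[a] for j in clusters[b]]
            d = reduce_fn(ds)
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        # b > a always holds, so deleting b does not shift a.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

Each merge step in this naive version touches on the order of n^2
distances and there are up to n - 1 merges, which is where the roughly
cubic running time comes from.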


> Interestingly, the documentation for the 'standard' Hierarchical
> Clustering' (ie non-distance matrix) node states that it operates with
> "n-squared complexity".  
Oops. That is certainly wrong. It is the same algorithm, so n^3 would be
right.

> I guess other clustering algorithms available
> in knime must scale better than cubicly as well (k-means, fuzzy
> c-means?) - but as far as I can see they don't currently operate on
> distance matrices (or directly on bit vectors).
There is a k-medoids node that should work on distance matrices. The problem
for k-means (and fuzzy c-means) is that you need the full coordinates in
order to set the prototypes in each iteration. That doesn't work if you
only have pairwise distances.
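
A minimal sketch of why k-medoids gets away with pairwise distances only
(plain Python, not the KNIME implementation; the function name k_medoids,
the helper assign and the naive "first k objects" initialisation are
assumptions made for this example). Each prototype is always one of the
input objects, so both the assignment and the update step are lookups in
the distance matrix and no coordinates are needed.

```python
def k_medoids(dist, k, max_iter=100):
    """Simple alternating k-medoids on a full distance matrix.

    dist is a symmetric list-of-lists distance matrix.
    Returns (medoids, labels): medoid object indices and, for each
    object, the index of the cluster it belongs to.
    """
    n = len(dist)
    medoids = list(range(k))  # assumption: take the first k objects as start

    def assign(meds):
        # Assignment step: each object goes to its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Empty cluster (e.g. duplicate objects): keep the old medoid.
                new_medoids.append(medoids[c])
                continue
            # Update step: the new prototype is the member with the smallest
            # summed distance to the other members. A k-means centroid would
            # need coordinates here; a medoid only needs matrix lookups.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)
```

Like k-means, this only finds a local optimum and the result depends on
the starting medoids, so a real implementation would want a better
initialisation than the one assumed here.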


> If they could, then
> this may be a solution; or implementing the Murtagh algorithm (I am
> guessing the scaling is below cubic from my recollection of the speeds
> observed in rdkit).
Greg sent me a link to the publication. If I find some time, I will have
a look at it.

Cheers,

Thorsten

-- 
Dr.-Ing. Thorsten Meinl               room: Z815
Nycomed Chair for Bioinformatics      fax: +49 (0)7531 88-5132
and Information Mining                phone: +49 (0)7531 88-5016
Box 712, 78457 Konstanz, Germany

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
