Dear James,

On Wed, Nov 24, 2010 at 4:35 PM, James Davidson <[email protected]> wrote:
>
> Great job on the Knime nodes!  I have been giving these a go and am
> impressed (and excited about the future development!).  A couple of
> observations / comments / questions:

Thanks!

>
> 1.  I have observed that sometimes the FP node seems to generate blank
> fingerprints (it doesn't appear to be just the rendering, e.g. they are
> still blank if I swap to the 'Bit Scratch' render as well).  I have mainly
> been trying the default Morgan FPs, and find that if I reset the node and
> re-run, the FP is still blank.  If, however, I swap the node to e.g.
> atompair, run, then swap back to Morgan, it seems to work...  I am running
> KNIME 2.2.2 on 32-bit Windows.

That's odd. I haven't seen anything like this, but I haven't spent
much time using the Windows version. I'll try to see if I can
reproduce it.

> 2.  The next point is probably down to cheminformatics / knime naivety, but
> I must confess I am struggling a little to cluster compounds based on the
> FP...   I have used the 'Distance Matrix Calculate' node (with Tanimoto
> similarity) to get a matrix that can be used by the 'Hierarchical Clustering
> (DistMatrix)' or 'k-Medoids' nodes.  However, both of these appear to
> perform VERY slowly for a set of ~ 4000 compounds.  I also attempted to
> cluster on the fingerprints directly, using the Neighborgrams nodes - but
> must confess I am some way off understanding what I am doing!

Hierarchical Clustering (DistMatrix) does, indeed, scale poorly.
According to the docs it scales cubically in the number of rows...
that's going to hurt when N=4000. The implementation the RDKit uses
(adapted from some code by Murtagh) is pretty heavily optimized and
behaves well for large datasets.
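
To put rough numbers on that, here is a minimal pure-Python sketch (not
RDKit or KNIME code). It assumes fingerprints are represented as sets of
on-bit indices, and the names tanimoto_distance, condensed_distances and
n_pairs are made up for illustration. It shows the Tanimoto distance, the
number of pairs in the distance matrix, and what "cubic" means at N=4000.

```python
from itertools import combinations


def tanimoto_distance(a, b):
    """Return 1 - Tanimoto similarity for two sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        # Assumed convention: two empty fingerprints count as identical.
        return 0.0
    return 1.0 - len(a & b) / union


def condensed_distances(fps):
    """Pairwise distances for every unordered pair, in combinations() order."""
    return [tanimoto_distance(a, b) for a, b in combinations(fps, 2)]


def n_pairs(n):
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Toy fingerprints (hypothetical data, three "compounds").
fps = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
dists = condensed_distances(fps)
print(dists)           # [0.5, 1.0, 1.0]

# Size of the problem for the ~4000 compounds mentioned above.
print(n_pairs(4000))   # 7998000 pairwise distances to store
print(4000 ** 3)       # 64000000000, the order of work for a cubic algorithm
```

The distance matrix itself is only about 8 million values, so the cost is
dominated by the clustering step rather than by computing the distances.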

> My limited
> experience of using the RDKit functionality to cluster compounds and eg
> select a representative set (based on the FP Tanimoto distances and the
> Murtagh clustering) was that it performed rather rapidly.  Is there the
> intention to expose this functionality in knime (or is the functionality
> already there and I just don't know how?)

It's not there yet, but it sure would be useful if the KNIME
implementation were faster. I don't think it makes sense to use the
RDKit implementation directly, but it may be possible to port the
Murtagh algorithm to Java.  Thorsten? What do you think?

>
> 3.  Any plans for Windows 64-bit support?

I haven't had a 64-bit Windows machine set up for development work, so
I've never even tested the RDKit under 64-bit Windows. I just got a new
machine, which does have Windows installed. I will see about getting a
development environment on there and trying to build the RDKit, but
I'm not going to make any promises there.

> 4.  I would be interested to know what the team views as the next priorities
> - property calcs, 3D conformations, pharmacophores, rendering?  So much
> great stuff to choose from!  :-)

We're open to suggestions. In addition to what's already there, the
initial release will contain at least an AddCoordinates node which can
add either 2D coordinates (optionally aligned to a template) or a 3D
conformation. If you have things that you'd really like to see, please
pipe up.

Best Regards,
-greg
