On 02/01/2011 05:37 AM, Daniel Zaharevitz wrote:
> We will be
> looking to find a set of fingerprints that
> 1) never (or as close to never as we can get) return a value of 1.0
> for different structures.

I'm not sure that it's implemented in OpenBabel, but if it's a 2D 
structural descriptor you want, you could give LINGO (Vidal, Thormann, 
and Pons, JCIM 2005; DOI: 10.1021/ci0496797) a shot. I've written a 
LINGO implementation that's primarily targeted at GPUs but has a 
reasonably fast CPU version (https://simtk.org/home/siml). The CPU code 
is not as quick as the fastest DFA-based methods, but it'll handle your 
9000^2 similarities in a matter of seconds.
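
For readers unfamiliar with LINGO, here is a minimal Python sketch of the idea: split each SMILES string into overlapping substrings of length q (q = 4 is the usual choice) and compare the resulting multisets. It assumes the multiset Tanimoto form (shared count over union count), which is one common formulation and may differ in detail from the scoring in the papers or in any particular implementation. It also skips SMILES preprocessing such as normalizing ring-closure digits. The function names are made up for illustration and are not taken from any library.

```python
from collections import Counter


def lingos(smiles: str, q: int = 4) -> Counter:
    """Count the overlapping length-q substrings (LINGOs) of a SMILES string."""
    # Strings shorter than q yield an empty multiset.
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def lingo_tanimoto(a: str, b: str, q: int = 4) -> float:
    """Multiset Tanimoto between the LINGO multisets of two SMILES strings.

    Assumption: similarity = |A intersect B| / |A union B| on multisets.
    """
    ca, cb = lingos(a, q), lingos(b, q)
    shared = sum((ca & cb).values())  # element-wise minimum of counts
    total = sum((ca | cb).values())   # element-wise maximum of counts
    return shared / total if total else 0.0


if __name__ == "__main__":
    print(lingo_tanimoto("CCCCO", "CCCCN"))
```

Because the comparison is done on the strings themselves, the inputs should be canonical SMILES from a single toolkit; otherwise the same structure written two ways can score below 1.0.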

(PS, it's BSD-licensed, in case anyone would like to integrate it into OB!)

> 2) has a well behaved (or maybe just well documented) relation between
> structure similarity and NCI-60 correlation. I'm not sure what we will
> get here, but I would like to be able to say something like a
> similarity score of >0.9 gives an 80% chance of a NCI-60 correlation
> of >0.6.

The Vidal, Thormann, and Pons paper above, as well as a later one 
(DOI: 10.1021/ci6002152), shows decent correlation with activity; in my 
experience, about as good as any other 2D similarity measure. I'm sure 
you know this, but "tuning" fingerprints to any given small dataset is 
a dangerous art; it's very easy to overfit.
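
A statement of the form "similarity > 0.9 gives an 80% chance of correlation > 0.6" can be estimated empirically as a conditional fraction over compound pairs. The sketch below is only an illustration: the function name, the input layout (a list of (similarity, correlation) tuples), and the default cutoffs are assumptions, not anything taken from the NCI-60 tooling. To guard against the overfitting mentioned above, the fraction is best computed on pairs that were not used to choose the fingerprint or the cutoffs.

```python
def conditional_hit_rate(pairs, sim_cutoff=0.9, corr_cutoff=0.6):
    """Estimate P(correlation > corr_cutoff | similarity > sim_cutoff).

    pairs: iterable of (similarity, correlation) tuples, one per compound pair.
    Returns (fraction, n_selected); fraction is None if no pair passes
    the similarity cutoff.
    """
    selected = [corr for sim, corr in pairs if sim > sim_cutoff]
    if not selected:
        return None, 0
    hits = sum(1 for corr in selected if corr > corr_cutoff)
    return hits / len(selected), len(selected)
```

Reporting the number of selected pairs alongside the fraction matters: with only a handful of pairs above the similarity cutoff, the estimated percentage is very noisy.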



OpenBabel-discuss mailing list
