On 02/01/2011 05:37 AM, Daniel Zaharevitz wrote:
> We will be looking to find a set of fingerprints that
> 1) never (or as close to never as we can get) return a value of 1.0
> for different structures.

I'm not sure whether it's implemented in OpenBabel, but if it's a 2D structural descriptor you want, you could give LINGO (Vidal, Thormann, and Pons, JCIM 2005; DOI: 10.1021/ci0496797) a shot. I've written a LINGO implementation that's primarily targeted at GPUs but has a reasonably fast CPU version (https://simtk.org/home/siml). The CPU code is not as quick as the fastest DFA-based methods, but it'll handle your 9000^2 similarities in a matter of seconds. (PS, it's BSD-licensed, in case anyone would like to integrate it into OB!)

> 2) has a well behaved (or maybe just well documented) relation between
> structure similarity and NCI-60 correlation. I'm not sure what we will
> get here, but I would like to be able to say something like a
> similarity score of >0.9 gives a 80% chance of a NCI-60 correlation of
> >0.6.

Both the VTP paper above and a later one (DOI: 10.1021/ci6002152) show decent correlation with activity; in my experience, the correlation is similar to that of any other 2D similarity measure. I'm sure you know this, but "tuning" fingerprints to any given small dataset is a dangerous art; it's very easy to overfit.

Cheers,
Imran

_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
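
For readers unfamiliar with LINGO, the basic idea can be sketched in a few lines of Python. This is a minimal illustration, not the SIML code: the function names `lingos` and `lingo_tanimoto` are made up for this example, the substring length q=4 follows the Vidal et al. paper, the score is assumed to be a Tanimoto coefficient over multisets of substrings, and any SMILES normalization is omitted. Inputs should be canonical SMILES, since two different SMILES strings for the same molecule will generally not score 1.0.

```python
from collections import Counter


def lingos(smiles: str, q: int = 4) -> Counter:
    """Return the multiset of overlapping q-character substrings of a SMILES string."""
    # Strings shorter than q yield an empty multiset.
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def lingo_tanimoto(a: str, b: str, q: int = 4) -> float:
    """Multiset Tanimoto similarity between the LINGOs of two SMILES strings."""
    la, lb = lingos(a, q), lingos(b, q)
    shared = sum((la & lb).values())  # multiset intersection: min of counts
    total = sum((la | lb).values())   # multiset union: max of counts
    return shared / total if total else 0.0


if __name__ == "__main__":
    # 1-butanol vs. 1-butylamine, written as plain SMILES strings
    print(lingo_tanimoto("CCCCO", "CCCCN"))
```

Because the comparison works directly on strings, no molecular graph has to be built, which is a large part of why the method is fast enough for all-pairs comparisons on a set of a few thousand compounds.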