Just came across this paper in a journal I don't normally read.
http://www.almob.org/content/5/1/9
Results
In this paper, we present a method which efficiently finds all
fingerprints in a database with Tanimoto coefficient to the query
fingerprint above a user-defined threshold. The method is based on two
novel data structures for rapid screening of large databases: the kD
grid and the Multibit tree. The kD grid is based on splitting the
fingerprints into k shorter bitstrings and utilising these to compute
bounds on the similarity of the complete bitstrings. The Multibit tree
uses hierarchical clustering and similarity within each cluster to
compute similar bounds. We have implemented our method and tested it
on a large real-world data set. Our experiments show that our method
yields approximately a three-fold speed-up over previous methods.
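
For anyone wondering how splitting a fingerprint into k pieces can give a bound, here is a rough sketch of the idea as I understand it from the abstract. It is not the authors' code. If you know the number of set bits in each piece of two fingerprints, the intersection can be at most the sum of the per-piece minima and the union must be at least the sum of the per-piece maxima, so their ratio is an upper bound on the Tanimoto coefficient. Fingerprints are plain Python ints here, and all function names (popcount, split_counts, upper_bound, search) are made up for illustration. I use >= for the threshold test, which is an assumption.

```python
def popcount(x):
    """Number of set bits in a non-negative int."""
    return bin(x).count("1")


def tanimoto(a, b):
    """Tanimoto coefficient |a & b| / |a | b| of two bitstrings stored as ints."""
    union = popcount(a | b)
    if union == 0:
        return 0.0  # assumption: two empty fingerprints score 0
    return popcount(a & b) / union


def split_counts(fp, n_bits, k):
    """Set-bit count of each of k contiguous chunks of an n_bits-wide fingerprint."""
    chunk = -(-n_bits // k)  # ceiling division
    mask = (1 << chunk) - 1
    return [popcount((fp >> (i * chunk)) & mask) for i in range(k)]


def upper_bound(counts_a, counts_b):
    """Upper bound on the Tanimoto coefficient from per-chunk bit counts only."""
    num = sum(min(x, y) for x, y in zip(counts_a, counts_b))
    den = sum(max(x, y) for x, y in zip(counts_a, counts_b))
    return num / den if den else 0.0


def search(query, database, n_bits, k, threshold):
    """Return (fingerprint, score) pairs with score >= threshold.

    Candidates whose bound is below the threshold are skipped without
    computing the exact coefficient.
    """
    query_counts = split_counts(query, n_bits, k)
    hits = []
    for fp in database:
        if upper_bound(query_counts, split_counts(fp, n_bits, k)) < threshold:
            continue  # cannot reach the threshold, prune
        score = tanimoto(query, fp)
        if score >= threshold:
            hits.append((fp, score))
    return hits


if __name__ == "__main__":
    db = [0b11110000, 0b11000000, 0b00001111]
    print(search(0b11110000, db, n_bits=8, k=2, threshold=0.5))
```

This still scans every fingerprint and recomputes the counts for each query. Presumably the real data structure precomputes the counts and bins the database by them, so that whole groups of fingerprints can be skipped at once. The sketch only shows why the bound is safe to prune on.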
_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss