Francois, I apologize upfront if I am not using the correct terminology (compounds, molecules, ...). I am a software developer writing a new web application for the project staff. The old web application (developed in .NET) stored 370,000+ compounds and related information generated from oBabel 2.4 in our MongoDB database. I did find an error there: the C#/MongoDB interface did not understand unsigned integers and stored everything as signed. That doesn't appear to have been a big problem, because when the data was retrieved it was converted back to unsigned before being passed to Tanimoto.
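To make the signed/unsigned point concrete, here is a minimal sketch in Python. It is only an illustration, not the C# code from the old application and not Open Babel's own implementation. The function names (`to_unsigned32`, `popcount`, `tanimoto`) are hypothetical, and returning 0.0 for two all-zero fingerprints is an assumption of this sketch. It treats each fingerprint as a list of 32-bit words and computes Tanimoto as the number of bits set in both fingerprints divided by the number of bits set in either.

```python
from typing import Sequence

MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits of an integer


def to_unsigned32(word: int) -> int:
    """Reinterpret a signed 32-bit value as unsigned (two's complement)."""
    return word & MASK32


def popcount(word: int) -> int:
    """Count the set bits in a non-negative integer."""
    return bin(word).count("1")


def tanimoto(a: Sequence[int], b: Sequence[int]) -> float:
    """Tanimoto coefficient of two fingerprints stored as 32-bit words.

    Words may be signed or unsigned; each is masked to 32 bits first.
    """
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same number of words")
    common = 0  # bits set in both fingerprints
    union = 0   # bits set in at least one fingerprint
    for x, y in zip(a, b):
        x, y = to_unsigned32(x), to_unsigned32(y)
        common += popcount(x & y)
        union += popcount(x | y)
    # Assumption of this sketch: two empty fingerprints score 0.0.
    return common / union if union else 0.0
```

Because the conversion only reinterprets the same 32 bits, a fingerprint stored as signed values scores the same as its unsigned original: `tanimoto([-1, 0], [0xFFFFFFFF, 0])` gives 1.0.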
In the new website I am using oBabel 3.1.1 and have regenerated the information for the 370k+ compounds, plus added another 30k+ compounds from a new library we have begun to use in the labs. The fingerprints are generated using FP2 (arrays of 32-bit unsigned integers) via OBFingerprint. I then use Tanimoto for similarity analysis. It takes about 5 seconds to compare a single compound to the 400k+ pre-generated fingerprints.

With your questions I will attempt to educate myself a little bit more on molecular fingerprints. Any comments, references, prayers would be appreciated.

--------------------------------------------------------------------
>> Fingerprints being lossy encodings of molecules:
>> it is possible that different molecules end-up
>> with the same fingerprint.
>>
>> If you use an unfolded-counted fingerprint (instead of folded-uncounted,
>> usually),
>> this "funny" event should occur less frequently.
>>
>> Another possibility might be to use a fingerprints with more bits.
>> Which fingerprint are you using by the way?
>>
>> Regards,
>> F.

_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss