Francois,

   I apologize up front if I am not using the correct terminology (compounds, 
molecules, ...).  I am a software developer writing a new web application for 
the project staff.  The old web application (developed in .NET) stored 370,000+ 
compounds and related information generated with Open Babel 2.4 in our MongoDB 
database.  I did find an error: the C#/MongoDB interface did not understand 
unsigned integers and stored everything as signed.  It does not look like this 
was a big problem, because when the data was retrieved it was converted back 
to unsigned before being passed to Tanimoto.
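
A minimal sketch of the signed/unsigned round trip described above, assuming
the stored values are 32-bit two's-complement words. The helper names and the
sample values are hypothetical and are not taken from the old application.

```python
MASK32 = 0xFFFFFFFF  # low 32 bits

def to_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit word as unsigned (same bit pattern)."""
    return value & MASK32

def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit word as signed (same bit pattern)."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

# Hypothetical fingerprint words as they might come back from the database.
stored = [-1, -2147483648, 5]
restored = [to_unsigned32(w) for w in stored]
```

Because both helpers preserve the bit pattern, converting back and forth does
not change which fingerprint bits are set.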

In the new website I am using Open Babel 3.1.1.  I regenerated the information 
for the 370k+ compounds and added another 30k+ compounds from a new library we 
have begun to use in the labs.  The fingerprints are generated with FP2 
(arrays of 32-bit unsigned integers) via OBFingerprint.  I then use Tanimoto 
for similarity analysis.  It takes about 5 seconds to compare a single compound 
to the 400k+ pre-generated fingerprints.
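
For reference, a minimal pure-Python sketch of a Tanimoto coefficient over two
fingerprints held as equal-length arrays of 32-bit words: bits set in both,
divided by bits set in either. This is not Open Babel's own code. The function
names are hypothetical, and returning 0.0 when neither fingerprint has any bit
set is an assumption made here for convenience.

```python
from typing import Sequence

def popcount32(word: int) -> int:
    """Count the set bits in the low 32 bits of word."""
    return bin(word & 0xFFFFFFFF).count("1")

def tanimoto(a: Sequence[int], b: Sequence[int]) -> float:
    """Tanimoto coefficient of two fingerprints stored as 32-bit words."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same number of words")
    both = sum(popcount32(x & y) for x, y in zip(a, b))
    either = sum(popcount32(x | y) for x, y in zip(a, b))
    # Assumption: treat two all-zero fingerprints as having similarity 0.0.
    return both / either if either else 0.0

# Toy two-word fingerprints: 2 bits in common, 4 bits set overall.
score = tanimoto([0b1011, 0b0], [0b0011, 0b1])  # 0.5
```

Since only bit patterns are compared, this sketch gives the same score whether
the words are passed in as signed or as unsigned 32-bit values.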

With your questions I will attempt to educate myself a little bit more on 
molecular fingerprints.
Any comments, references, prayers would be appreciated.

--------------------------------------------------------------------
>> Fingerprints being lossy encodings of molecules,
>> it is possible that different molecules end up
>> with the same fingerprint.
>>
>> If you use an unfolded, counted fingerprint (instead of the usual
>> folded, uncounted one),
>> this "funny" event should occur less frequently.
>>
>> Another possibility might be to use a fingerprint with more bits.
>> Which fingerprint are you using, by the way?
>>
>> Regards,
>> F.
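
A toy sketch of the point quoted above: when feature hashes are folded into a
fixed number of bits, different feature sets can produce the same fingerprint,
and a wider fingerprint makes that less likely. This is not the FP2 hashing
scheme. The fold function and the hash values are hypothetical and chosen only
to force a collision.

```python
def fold(feature_hashes, nbits=1024):
    """Set one bit per feature hash, folded into nbits positions."""
    bits = 0
    for h in feature_hashes:
        bits |= 1 << (h % nbits)
    return bits

mol_a = {1, 5}        # hypothetical feature hashes of one molecule
mol_b = {1025, 1029}  # different hashes that fold onto the same two bits

collide_1024 = fold(mol_a) == fold(mol_b)              # True: lossy folding
collide_4096 = fold(mol_a, 4096) == fold(mol_b, 4096)  # False: more bits
```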



_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
