Francois,
Guess I have some light reading during the holidays. Thanks again for the
education and reference.
>> Here is a venerable reference on molecular fingerprints:
>> https://www.daylight.com/dayhtml/doc/theory/theory.finger.html
On 06/12/2021 22:59, Wolcott, Chris (NIH/NCI) [C] wrote:
Francois,
I apologize upfront if I am not using the correct verbiage
(compounds, molecules, ...). I am a software developer writing a new
web application for the project staff. The old web application
(developed in .NET) stored 370,00
Andrew,
Wow, thank you for the detailed reply.
I am happy with the current processing time of 5 seconds to compare 400,000+
fingerprints, but I will look at the Stack Overflow discussion. I am pretty
well versed in MongoDB and hadn't thought about calculating it fully in MongoDB.
I will
Hi Chris,
The FP2 fingerprint works along these lines:
1) Choose a fingerprint size 'n', which is a power of 2.
2) Allocate a vector of w = n/32 words to store the bitstring
3) For each linear subpath up to length 7 (these correspond
to n-grams for words):
a) use a hash based on the atom
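
The steps listed so far can be sketched roughly as follows. This is only a toy illustration, not Open Babel's FP2 code: the molecule is a hypothetical list of atom symbols plus a bond list, "length 7" is assumed to mean 7 atoms, bond orders are ignored, and `zlib.crc32` is a stand-in hash chosen for the example (the hash FP2 actually uses is not given in the text above).

```python
import zlib


def linear_paths(atoms, bonds, max_atoms=7):
    """Yield every simple (non-repeating) path of 1..max_atoms atoms as a tuple of atom indices."""
    neighbors = {i: set() for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    def extend(path):
        yield tuple(path)
        if len(path) == max_atoms:
            return
        for nxt in neighbors[path[-1]]:
            if nxt not in path:
                path.append(nxt)
                yield from extend(path)
                path.pop()

    for start in range(len(atoms)):
        yield from extend([start])


def path_fingerprint(atoms, bonds, n=1024, max_atoms=7):
    """Return the fingerprint as a list of n/32 integers, each used as a 32-bit word."""
    # Step 1: n must be a power of 2 (and at least one 32-bit word).
    assert n >= 32 and (n & (n - 1)) == 0, "n must be a power of 2"
    # Step 2: allocate w = n/32 words for the bitstring.
    words = [0] * (n // 32)
    # Step 3: hash each linear subpath and set one bit for it.
    for path in linear_paths(atoms, bonds, max_atoms):
        labels = [atoms[i] for i in path]
        # A path and its reverse are the same fragment, so pick one canonical order.
        key = "-".join(min(labels, labels[::-1]))
        bit = zlib.crc32(key.encode()) % n  # assumed hash, not Open Babel's
        words[bit // 32] |= 1 << (bit % 32)
    return words


# Ethanol heavy atoms (C-C-O), written with two different atom orderings.
fp1 = path_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
fp2 = path_fingerprint(["O", "C", "C"], [(0, 1), (1, 2)])
```

Because each path is reduced to a canonical label string before hashing, `fp1` and `fp2` are identical even though the atoms are numbered differently.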
Francois,
I apologize upfront if I am not using the correct verbiage (compounds,
molecules, ...). I am a software developer writing a new web application for
the project staff. The old web application (developed in .NET) stored 370,000+
compounds and related information generated from oBab
Dear Chris,
Fingerprints are lossy encodings of molecules, so it is possible for
different molecules to end up with the same fingerprint.
If you use an unfolded, counted fingerprint (instead of the usual folded,
uncounted kind), this "funny" event should occur less frequently.
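
A small sketch of why this happens, using made-up integer feature ids in place of real hashed fragments (the ids, the 8-bit size, and both helper functions are hypothetical and chosen only for illustration):

```python
from collections import Counter


def folded_uncounted(features, n_bits):
    """Fold feature ids modulo n_bits and drop multiplicity (a plain bit set)."""
    return frozenset(f % n_bits for f in features)


def unfolded_counted(features):
    """Keep the full feature id and how many times each one occurs."""
    return Counter(features)


mol_a = [3, 3, 10]   # hypothetical feature ids for three different molecules
mol_b = [3, 10, 10]  # same features as mol_a, different counts
mol_c = [3, 18]      # 18 and 10 land on the same bit once folded to 8 bits

same_folded = (
    folded_uncounted(mol_a, 8)
    == folded_uncounted(mol_b, 8)
    == folded_uncounted(mol_c, 8)
)
distinct_counted = len(
    {frozenset(unfolded_counted(m).items()) for m in (mol_a, mol_b, mol_c)}
)
```

Here `same_folded` is `True` (all three molecules share one folded bit set), while `distinct_counted` is 3: dropping counts merges `mol_a` with `mol_b`, and folding merges `mol_c` with both.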
Another possibility might
Is it expected, or is there an easy explanation for why three different SMILES
produce the same fingerprint? The compounds come from the same synthetic
library.
1st Compound
Canonical SMILES:
O=C1N[C@H]2C[C@H](N(C2)Cc2ccncc2)C(=O)N2CCO[C@@H](C2)CN(C[C@H]2O[C@@H](C1)[C@H](O)[C@@H]2O)C(=