There have been a couple of good answers on this already.
Here's some additional technical information:

Converting an atom environment into a bit position is a two-step
process:
  1) A hash is generated for the atom environment.
  2) When necessary (i.e. for bit vectors and explicit count vectors), the
     hash is "fit" into the space of the fingerprint using the modulo
     operator.
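
As a rough illustration of those two steps, here is a minimal Python
sketch. It does not use the RDKit: toy_environment_hash() is a
hypothetical stand-in built on zlib.crc32, not the RDKit's actual hash
function, and the environment tuples are made-up values. Only the modulo
folding in fold_to_bit() mirrors what step 2 describes.

```python
import zlib


def toy_environment_hash(invariants):
    """Step 1 (stand-in): map a tuple of ints to a 32-bit unsigned hash.

    Assumption: crc32 is used only for illustration; the RDKit uses its
    own boost::hash-derived code.
    """
    data = ",".join(str(v) for v in invariants).encode("ascii")
    return zlib.crc32(data) & 0xFFFFFFFF


def fold_to_bit(hash_value, n_bits=2048):
    """Step 2: fit the hash into the fingerprint using the modulo operator."""
    return hash_value % n_bits


def toy_fingerprint(environments, n_bits=2048):
    """Return the set of bit positions set by a list of environments."""
    on_bits = set()
    for env in environments:
        on_bits.add(fold_to_bit(toy_environment_hash(env), n_bits))
    return on_bits


# Two different hashes can land on the same bit after folding; this is
# where the information loss comes from.
print(fold_to_bit(5), fold_to_bit(2048 + 5))
print(sorted(toy_fingerprint([(6, 1), (8, 2), (6, 1)])))
```

Identical environments always set the same bit, while distinct
environments may collide once their hashes are reduced modulo the
fingerprint length.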

The code that generates the hash for step 1 isn't easy to point to in one
place. Since the hashes for larger radii build on the hashes for earlier
radii, construction of the hash is spread throughout the definition of
calcFingerprint():
https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/Fingerprints/MorganFingerprints.cpp#L182
The actual hashing code that's used is a version of boost::hash
(http://www.boost.org/doc/libs/1_60_0/doc/html/hash.html) that was forked
and integrated into the RDKit a while ago. It's been tweaked to generate
the same results on both 32-bit and 64-bit machines. If you really want to
get into details, the implementation is here:
https://github.com/rdkit/rdkit/tree/master/Code/RDGeneral/hash
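
To make "hashes for larger radii build on the hashes for earlier radii"
a bit more concrete, here is a generic Morgan-style sketch in plain
Python. It is not a transcription of calcFingerprint(): toy_hash(), the
choice of what goes into each hash, and the three-atom example graph are
all assumptions made for illustration.

```python
import zlib


def toy_hash(values):
    """Hypothetical stand-in hash (crc32), not the RDKit's hash code."""
    data = ",".join(str(v) for v in values).encode("ascii")
    return zlib.crc32(data) & 0xFFFFFFFF


def iterate_identifiers(initial_ids, neighbors, radius):
    """Return one list of per-atom identifiers for each radius 0..radius.

    The identifier of an atom at a given layer is a hash of the layer
    number, the atom's identifier from the previous layer, and the sorted
    previous-layer identifiers of its neighbors.
    """
    layers = [list(initial_ids)]
    for layer in range(1, radius + 1):
        prev = layers[-1]
        current = []
        for atom, nbrs in enumerate(neighbors):
            nbr_ids = sorted(prev[n] for n in nbrs)
            current.append(toy_hash([layer, prev[atom]] + nbr_ids))
        layers.append(current)
    return layers


# Toy chain 0-1-2; the initial invariants are made-up numbers.
neighbors = [[1], [0, 2], [1]]
layers = iterate_identifiers([6, 8, 6], neighbors, radius=2)
for r, ids in enumerate(layers):
    print(r, ids)
```

In this sketch the two end atoms get the same identifier at every radius
because their environments are equivalent, which is the behavior one
would want from an environment hash.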

Step 2 is done here for bit vectors:
https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/Fingerprints/MorganFingerprints.cpp#L173



On Thu, Oct 6, 2016 at 11:23 AM, Jacob Gora <[email protected]>
wrote:

> Hi,
>
> is there any information on how RDkit creates bitvectors from circular
> fingerprints?
> As the theoretic featurespace is too big for storage and the default
> feature space used in RDkit, when converting is only 2048, there must be
> some kind of
> information loss (and compression?).
>
> Can anyone explain how this is handled in detail?
> What features are used for the BV in the end, how is it decided on.
>
> Regards
> Jacob
>
>
>
>
>
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>