Hi Abhik,

On Sat, Jul 12, 2014 at 9:38 PM, Abhik Seal <[email protected]> wrote:

>
> I was using the postgres cartridge and found there are several
> implementations for chemical features. Some of them that I tried, like maccs
> and morganbv_fp, generate hexadecimal values. When I convert the hexadecimal
> to binary, I find maccs has 168 values and morganbv_fp has 512 binary values.
>
>
The size of the MACCS fingerprint comes from its definition: it is a fixed
set of predefined features that the code searches for, so its length is not
configurable.
The Morgan fingerprints, on the other hand, have a variable size selectable
by the user. The default value in the cartridge is, as you have discovered,
512 bits.


> I may be wrong in my understanding, so I just want to make sure whether I am
> correct. If I am, how can I generate 1024 binary values, or is it
> restricted to 512? I found the binary values are different using two
> different radii, which is what I expect. Can these binary values be extended
> to 1024 bits and so on? If so, doesn't that cause errors in the
> similarity calculation?
>

You can change the size of the fingerprints using configuration variables:

contrib_regression=# select morganbv_fp('c1ccccc1C');
                                                            morganbv_fp
------------------------------------------------------------------------------------------------------------------------------------
 \x00000080020000000100000000000000000000000080000400004000000000000000008000000000000002001000000021000000000000000000000000000000
(1 row)

contrib_regression=# set rdkit.morgan_fp_size=1024;
SET
contrib_regression=# select morganbv_fp('c1ccccc1C');

                                                morganbv_fp


--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
\x0000008002000000010000000000000000000000008000000000000000000000000000000000000000000000100000002000000000000000000000000000000000000000000000000000000000000000000000000000000400004000000000000000008000000000000002000000000001000000000000000000000000000000
(1 row)
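
If you want to check the length of a fingerprint from the hex text that psql
prints, something like the following Python sketch should do it. The function
name is hypothetical (it is not part of the RDKit or the cartridge), and the
bit order used within each byte is an assumption; it does not affect the
length or the number of bits set.

```python
def bytea_hex_to_bits(hex_text: str) -> str:
    """Turn psql's hex output for a fingerprint (e.g. '\\x0080')
    into a string of '0' and '1' characters.

    Assumption: each byte is written most-significant bit first.
    """
    digits = hex_text.strip()
    if digits.startswith("\\x"):
        digits = digits[2:]          # drop the leading \x marker
    raw = bytes.fromhex(digits)      # two hex digits per byte
    return "".join(format(byte, "08b") for byte in raw)


# The first 8 bytes of the morganbv_fp output shown above.
sample = r"\x0000008002000000"
bits = bytea_hex_to_bits(sample)
print(len(bits))        # 64
print(bits.count("1"))  # 2
```

Every hex digit carries 4 bits, so pasting in the full value printed by psql
gives the total fingerprint size directly.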

The options available are:
rdkit.dice_threshold           rdkit.layered_fp_size
rdkit.do_chiral_sss            rdkit.morgan_fp_size
rdkit.featmorgan_fp_size       rdkit.rdkit_fp_size
rdkit.hashed_atompair_fp_size  rdkit.ss_fp_size
rdkit.hashed_torsion_fp_size   rdkit.tanimoto_threshold


Note that a change to a configuration variable, as done here, only affects
the current session. If you want to make it the default for the database as
a whole, you need to change the database configuration:

contrib_regression=# alter database contrib_regression set rdkit.morgan_fp_size=1024;
ALTER DATABASE

Then disconnect (close psql) and reconnect to pick up the new setting.

I hope this helps,
-greg
------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
