Hi Abhik,
On Sat, Jul 12, 2014 at 9:38 PM, Abhik Seal <[email protected]> wrote:
>
> I was using the postgres cartridge and found that there are several
> implementations of chemical features. Some of those I tried, like maccs and
> morganbv_fp, generate hexadecimal values. When I convert the hexadecimal to
> binary, I find that maccs has 168 values and morganbv_fp has 512 binary values.
>
>
The size of the MACCS fingerprint comes from its definition: the code
searches for a fixed set of 166 defined features (keys), which the RDKit
stores in 167 bits. Rounded up to whole bytes, that is presumably the 168
binary values you see after converting the hex output.
The Morgan fingerprints, on the other hand, have a variable size selectable
by the user. The default value in the cartridge is, as you have discovered,
512 bits.

> I may be misunderstanding, so I just want to check whether I am correct.
> If I am, how can I generate 1024 binary values, or is it restricted to 512?
> I found that the binary values differ between two different radii, which is
> what I expect. Can these binary values be extended to 1024 bits and so on?
> If so, doesn't that cause errors in the similarity calculation?
>
You can change the size of the fingerprints using configuration variables:
contrib_regression=# select morganbv_fp('c1ccccc1C');
morganbv_fp
------------------------------------------------------------------------------------------------------------------------------------
\x00000080020000000100000000000000000000000080000400004000000000000000008000000000000002001000000021000000000000000000000000000000
(1 row)
contrib_regression=# set rdkit.morgan_fp_size=1024;
SET
contrib_regression=# select morganbv_fp('c1ccccc1C');
morganbv_fp
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
\x0000008002000000010000000000000000000000008000000000000000000000000000000000000000000000100000002000000000000000000000000000000000000000000000000000000000000000000000000000000400004000000000000000008000000000000002000000000001000000000000000000000000000000
(1 row)
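If you want to check the sizes yourself, here is a minimal sketch in plain Python (not part of the cartridge) that expands hex text as printed by psql into individual bits. The function name `bfp_hex_to_bits` and the sample value are made up for this illustration, and the bit order within each byte is an assumption; it does not affect the total length or the number of set bits.

```python
def bfp_hex_to_bits(hex_text):
    """Expand hex text as printed by psql for a bytea value into a list of 0/1 values."""
    digits = hex_text.strip()
    # psql prefixes bytea hex output with a backslash and an 'x'; drop it if present.
    if digits.startswith("\\x"):
        digits = digits[2:]
    raw = bytes.fromhex(digits)
    # Assumption: least significant bit of each byte first. The order changes
    # which position a bit lands in, not how many bits there are.
    return [(byte >> i) & 1 for byte in raw for i in range(8)]


# Short illustrative value, not real cartridge output.
sample = r"\x0000008002"
bits = bfp_hex_to_bits(sample)
print(len(bits), "bits,", sum(bits), "set")
```

Each hex digit carries 4 bits, so 128 hex digits correspond to 512 bits and 256 hex digits to 1024 bits.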
The options available are:
  rdkit.dice_threshold            rdkit.layered_fp_size
  rdkit.do_chiral_sss             rdkit.morgan_fp_size
  rdkit.featmorgan_fp_size        rdkit.rdkit_fp_size
  rdkit.hashed_atompair_fp_size   rdkit.ss_fp_size
  rdkit.hashed_torsion_fp_size    rdkit.tanimoto_threshold
Note that a change to a configuration variable as done here only affects
the current session. If you want to make it the default for the database as
a whole you need to change the database configuration:
contrib_regression=# alter database contrib_regression set rdkit.morgan_fp_size=1024;
ALTER DATABASE
Then disconnect (close psql) and reconnect to pick up the new setting.
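On the similarity question: Tanimoto similarity compares two fingerprints position by position, so it is only meaningful when both were generated with the same size and settings. The following is a minimal sketch in plain Python, not the cartridge's own code; the function name `tanimoto` and the handling of two empty fingerprints are choices made for this illustration.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two equal-length sequences of 0/1 values."""
    if len(a) != len(b):
        # Fingerprints of different sizes do not share bit positions,
        # so this sketch refuses to compare them.
        raise ValueError("fingerprints must have the same number of bits")
    both = sum(x & y for x, y in zip(a, b))    # bits set in both
    either = sum(x | y for x, y in zip(a, b))  # bits set in at least one
    # Convention chosen here: two all-zero fingerprints score 0.0.
    return both / either if either else 0.0


fp1 = [1, 1, 0, 0]
fp2 = [1, 0, 1, 0]
print(tanimoto(fp1, fp2))  # one shared bit out of three set overall
```

In practice this means picking one fingerprint size and using it for every molecule you intend to compare.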
I hope this helps,
-greg