Hi Steve,
On Tue, Jun 3, 2014 at 2:08 PM, Stephen O'hagan <[email protected]>
wrote:
> I have a fragment of code generating fingerprints for a long list of
> molecules (length ~ 1000)
>
> for index in range(0,len(smi)):
>     smiles=smi[index]
>     mol=Chem.MolFromSmiles(smiles)
>     AllChem.EmbedMolecule(mol)
>     AllChem.UFFOptimizeMolecule(mol)
>     dm = Chem.Get3DDistanceMatrix(mol)
>     fp = Generate.Gen2DFingerprint(mol,factory, dMat=dm)
>     fp = fp.ToBitString()
>     bs[index]=fp
>
> The length of each generated bit vector is 39972, and the list has a lot
> of redundant ‘1’s and ‘0’s.
>
> Is there an easy method to filter out these redundant bits?
>
What do you mean by redundant bits?
The length of the bit vectors is determined by the parameters you provide
for building the pharmacophore fingerprints (number of points, number of
features, and number of distance bins). The length of the strings that you
get from fp.ToBitString() should be equal to this number of bits.
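
If "redundant" means bit positions that have the same value in every
molecule of your list (that is only a guess at what you are after), those
columns can be dropped after the fingerprints are generated, without
touching the factory parameters. Here is a minimal sketch in plain Python
that works on the '0'/'1' strings returned by fp.ToBitString(). The helper
name drop_constant_bits and the toy bs list are made up for illustration;
they are not part of the RDKit.

```python
def drop_constant_bits(bitstrings):
    """Remove positions whose value is identical across all bit strings.

    Hypothetical helper, not an RDKit function. Returns a tuple
    (kept_positions, filtered_strings).
    """
    if not bitstrings:
        return [], []
    n_bits = len(bitstrings[0])
    if any(len(s) != n_bits for s in bitstrings):
        raise ValueError("all bit strings must have the same length")
    # A position is kept only if both '0' and '1' occur there.
    kept = [i for i in range(n_bits)
            if len({s[i] for s in bitstrings}) > 1]
    filtered = ["".join(s[i] for i in kept) for s in bitstrings]
    return kept, filtered


# Toy stand-in for the `bs` list built in the loop above.
bs = ["10010", "10110", "11010"]
kept, reduced = drop_constant_bits(bs)
print(kept)     # positions that vary across the list
print(reduced)  # shortened bit strings
```

Keep the list of kept positions around: you need it to map the shortened
strings back to the original bit ids, and to apply the same filter to any
new molecules.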
-greg
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss