> > I'm wondering about the total number of accessible descriptors in RDKit:
> >
> > This is my code:
> > "
> > import sys
> > from rdkit import Chem
> > from rdkit.Chem import Descriptors
> > from rdkit.ML.Descriptors import MoleculeDescriptors
> >
> > file_in = sys.argv[1]
> > file_out = file_in+".descr.sdf"
> > ms = [x for x in Chem.SDMolSupplier(file_in) if x is not None]
> > ms_wr = Chem.SDWriter(file_out)
> >
> > nms=[x[0] for x in Descriptors._descList]
> > #nms.remove('MolecularFormula')
> > print len(Descriptors._descList)
> >
> >
> > calc = MoleculeDescriptors.MolecularDescriptorCalculator(nms)
> >
> > for i in range(len(ms)):
> >     descrs = calc.CalcDescriptors(ms[i])
> >     for x in range(len(descrs)):
> >         ms[i].SetProp(str(nms[x]),str(descrs[x]))
> >     ms_wr.write(ms[i])
> > "
> >
> > This gives me 93 descriptors in total.
> >
> > A quick count in the Python API docs
> > http://www.rdkit.org/docs/api/rdkit.Chem.Descriptors-module.html
> > turns up more than 170 descriptors.
> >
> > Another brief look (no time to dig deeper) reveals that the fr_*
> > descriptors apparently have not been calculated.
> >
> > What did I do wrong?
>
> I don't see anything obvious, but you are definitely getting
> incorrect results.
> Here's what I see:
>
> In [16]: from rdkit import Chem
> In [17]: from rdkit.ML.Descriptors import MoleculeDescriptors
> In [18]: from rdkit.Chem import Descriptors
> In [19]: len(Descriptors._descList)
> Out[19]: 177
> In [20]: calc = MoleculeDescriptors.MolecularDescriptorCalculator([x[0] for x in Descriptors._descList])
> In [21]: len(calc.GetDescriptorNames())
> Out[21]: 177
> In [22]: m = Chem.MolFromSmiles('c1ccccc1OC')
> In [23]: ds = calc.CalcDescriptors(m)
> In [24]: len(ds)
> Out[24]: 177
>
> Just to eliminate some uncertainty, can you please try the above
> commands and, if you don't see 177, add this:
>
> In [25]: from rdkit import rdBase
> In [26]: rdBase.rdkitVersion
> Out[26]: '2012.12.1pre'
>
> Thanks.
> -greg
Hi Greg,
here we go:
In [4]: from rdkit import Chem
from rdkit import rdBase
from rdkit.ML.Descriptors import MoleculeDescriptors
from rdkit.Chem import Descriptors
len(Descriptors._descList)
Out[4]: 93
In [6]: calc = MoleculeDescriptors.MolecularDescriptorCalculator([x[0] for x in Descriptors._descList])
len(calc.GetDescriptorNames())
Out[6]: 93
In [7]: m = Chem.MolFromSmiles('c1ccccc1OC')
ds = calc.CalcDescriptors(m)
len(ds)
Out[7]: 93
In [8]: print rdBase.rdkitVersion
2012.09.1beta
As you can see, we are still using the Q3 beta :)
Is upgrading to the stable release the only solution, or is there a
workaround that works with the Q3 beta?
Cheers & Thanks,
Paul
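
For anyone comparing two installations, here is a minimal sketch for checking how many of the registered descriptors are fr_* fragment counts. The helper name summarize_descriptor_names is hypothetical (it is not part of RDKit). The RDKit part uses only Descriptors._descList and rdBase.rdkitVersion, as in the sessions above, and is skipped when RDKit cannot be imported.

```python
def summarize_descriptor_names(names):
    """Count descriptor names, separating the fr_* fragment descriptors."""
    fragment = [n for n in names if n.startswith("fr_")]
    return {
        "total": len(names),
        "fr": len(fragment),
        "other": len(names) - len(fragment),
    }


# Hypothetical guard: lets the helper be used even without RDKit installed.
try:
    from rdkit import rdBase
    from rdkit.Chem import Descriptors
except ImportError:
    rdBase = Descriptors = None

if Descriptors is not None:
    # _descList holds (name, function) pairs, as used in the thread above.
    names = [name for name, _func in Descriptors._descList]
    print(rdBase.rdkitVersion, summarize_descriptor_names(names))
```

Running this on both the Q3 beta and a newer build should show directly whether the difference in the totals is accounted for by the fr_* entries.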
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss