Hi there RDKitters,
I was wondering whether there is a reason why the feature factory detects
NegIonizable (or PosIonizable) as a feature, but not actual formal charges,
i.e. anions (or cations).
If you are doing feature extraction to build pharmacophore models, this
electrostatic information is important. The SMARTS patterns, i.e. [+] and
[-], and the corresponding fdef definitions are trivial, so why aren't
these used?
What am I missing?
# some code, because I am rambling
import os

import rdkit
from rdkit import RDConfig
from rdkit import Chem
from rdkit.Chem import ChemicalFeatures
from rdkit.Chem import AllChem

# Build a feature factory from the feature definitions shipped with RDKit
fdefName = os.path.join(RDConfig.RDDataDir, 'BaseFeatures.fdef')
factory = ChemicalFeatures.BuildFeatureFactory(fdefName)

def testMol(molTxt):
    # Parse the SMILES and print the family of every feature found
    m = Chem.MolFromSmiles(molTxt)
    feats = factory.GetFeaturesForMol(m)
    print([x.GetFamily() for x in feats])

testMol('C(=O)O')
testMol('[C-]')
testMol('[Br-]')
testMol('[Na+]')
Output (this ipy notebook is awesome):
['Donor', 'Acceptor', 'Acceptor', 'NegIonizable']
[]
[]
[]
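For concreteness, here is a minimal sketch of the kind of definition I have in mind. The feature names (ChargedNeg, ChargedPos) and family names (Anion, Cation) are my own invention and are not part of BaseFeatures.fdef; I am assuming that a two-entry fdef string passed to BuildFeatureFactoryFromString is enough to pick up atoms with a formal charge of -1 or +1.

```python
from rdkit import Chem
from rdkit.Chem import ChemicalFeatures

# Hypothetical feature definitions, NOT part of BaseFeatures.fdef:
# [-] matches any atom with a formal charge of -1,
# [+] matches any atom with a formal charge of +1.
CHARGE_FDEF = """
DefineFeature ChargedNeg [-]
  Family Anion
  Weights 1.0
EndFeature

DefineFeature ChargedPos [+]
  Family Cation
  Weights 1.0
EndFeature
"""

# Build a separate factory that only knows about the two charge features
charge_factory = ChemicalFeatures.BuildFeatureFactoryFromString(CHARGE_FDEF)

def charge_families(smiles):
    """Return the feature families found by the charge-only factory."""
    mol = Chem.MolFromSmiles(smiles)
    return [f.GetFamily() for f in charge_factory.GetFeaturesForMol(mol)]

for smi in ('C(=O)O', '[Br-]', '[Na+]'):
    print(smi, charge_families(smi))
```

This keeps the charge features in their own factory, so the results can be compared side by side with what the BaseFeatures.fdef factory reports for the same molecules.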
Many thanks,
-
Jean-Paul Ebejer
Early Stage Researcher
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss