As a follow-up to my earlier question (quoted below), some further questions/problems :)
- Why are AmideN and SulfonamideN defined in BaseFeatures.fdef?
(I cannot see how or where these two definitions are used.)
- One of the H-bond donor definitions is AtomType NDonor
[$([Nv3](-C)(-C)-C)], but if an Nv3 is connected to three C atoms, then
there are no hydrogens. How is this a donor? According to Daylight, v3
means "atom with bond orders totaling 3 (includes implicit H's)".
- I had a look around and thought the
Contrib/M_Kossner/BaseFeatures_DIP2_NoMicrospecies.fdef file looked more
complete in terms of definitions. Unfortunately this file does not load
with BuildFeatureFactory (it raises a ValueError). Does anyone know the
history of that file, or why it came into being?
FACTORY = ChemicalFeatures.BuildFeatureFactory("/opt/RDKit_2012_12_1/Contrib/M_Kossner/BaseFeatures_DIP2_NoMicrospecies.fdef")
ValueError: pattern->getNumAtoms() != len(feature weight vector)
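
On the NDonor point above, a quick way to check the observation is to run the quoted pattern as a plain SMARTS query and look at the hydrogen count on whatever it matches. This is only a sketch: the helper name matched_h_counts is made up for illustration, and it tests the pattern in isolation rather than how the feature factory combines the NDonor definitions.

```python
from rdkit import Chem

# The NDonor AtomType pattern quoted above, used here as a plain SMARTS query
patt = Chem.MolFromSmarts('[$([Nv3](-C)(-C)-C)]')

def matched_h_counts(smiles):
    """Return (atom index, total H count) for each atom matching the pattern.

    Hypothetical helper, for illustration only.
    """
    mol = Chem.MolFromSmiles(smiles)
    return [(idx, mol.GetAtomWithIdx(idx).GetTotalNumHs())
            for (idx,) in mol.GetSubstructMatches(patt)]

print(matched_h_counts('CN(C)C'))  # trimethylamine
print(matched_h_counts('CNC'))     # dimethylamine
```

For trimethylamine the nitrogen matches and reports zero hydrogens, while dimethylamine (which does carry an N-H) gives no match at all, which is what prompted the question.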
Many thanks and sorry for the repeated emails,
JP
-
Jean-Paul Ebejer
Early Stage Researcher
On 15 April 2013 17:02, JP <[email protected]> wrote:
> Hi there RDKitters,
>
> I was wondering if there is any reason why the feature factory detects
> NegIonizable (or PosIonizable) as a feature - but not the actual charges
> i.e. Anion (or cation).
>
> If you are doing feature extraction, to build pharmacophoric models, this
> electrostatics data is important. The SMARTS patterns i.e. [+] and [-] and
> subsequent fdef definition are trivial, so why aren't these used?
>
> What am I missing?
>
> # some code, because I am rambling
>
> import os
>
> import rdkit
>
> from rdkit import RDConfig
> from rdkit import Chem
> from rdkit.Chem import ChemicalFeatures
> from rdkit.Chem import AllChem
>
> # Load the default feature definitions shipped with RDKit
> fdefName = os.path.join(RDConfig.RDDataDir, 'BaseFeatures.fdef')
> factory = ChemicalFeatures.BuildFeatureFactory(fdefName)
>
> def testMol(molTxt):
>     # Print the feature families detected for a SMILES string
>     m = Chem.MolFromSmiles(molTxt)
>     feats = factory.GetFeaturesForMol(m)
>     print([x.GetFamily() for x in feats])
>
> testMol('C(=O)O')
> testMol('[C-]')
> testMol('[Br-]')
> testMol('[Na+]')
>
> Output (this ipy notebook is awesome):
>
> ['Donor', 'Acceptor', 'Acceptor', 'NegIonizable']
> []
> []
> []
>
>
>
> Many thanks,
>
>
> -
> Jean-Paul Ebejer
> Early Stage Researcher
>
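
For reference, the "trivial" charge definitions mentioned in the quoted message could be sketched as below. The feature names (ChargedAnion, ChargedCation), the family names and the helper charge_families are hypothetical and are not part of BaseFeatures.fdef; the definitions are loaded from a string so that the shipped file is left untouched. Note that the SMARTS [-] and [+] match only formal charges of exactly -1 and +1.

```python
from rdkit import Chem
from rdkit.Chem import ChemicalFeatures

# Hypothetical feature definitions (not part of BaseFeatures.fdef):
# one family per formal-charge sign, each pattern matching a single atom,
# hence a single weight per feature.
CHARGE_FDEF = """
DefineFeature ChargedAnion [-]
  Family Anion
  Weights 1.0
EndFeature

DefineFeature ChargedCation [+]
  Family Cation
  Weights 1.0
EndFeature
"""

charge_factory = ChemicalFeatures.BuildFeatureFactoryFromString(CHARGE_FDEF)

def charge_families(smiles):
    """Return the feature families found by the charge-only factory."""
    mol = Chem.MolFromSmiles(smiles)
    return [f.GetFamily() for f in charge_factory.GetFeaturesForMol(mol)]

for smi in ('C(=O)O', '[Br-]', '[Na+]'):
    print(smi, charge_families(smi))
```

With these definitions the bromide and sodium ions should each yield one feature, while neutral formic acid yields none, since the ionizable-group patterns are not included in this string.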
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss