On Mon, Jul 4, 2011 at 5:49 PM, JP <[email protected]> wrote:
>
> Is there any obvious reason why the following 8 molecules in the attached
> sdf file give a bunch of "kekulization" errors?
> Is it something to do with these lines -
> M CHG 2 4 1 13 -1
> M STY 1 1 DAT
> M SAL 1 1 7
> M SDT 1 MRV_IMPLICIT_H
> M SDD 1 0.0000 0.0000 DR ALL 0 0
> M SED 1 IMPL_H1
> (using the now old 2010_12_1)

Nope. Those molecules all contain bonds with the bond order set to aromatic, and nitrogen-containing aromatic heterocycles where one (normally arbitrary) N needs an explicit H to make the ring aromatic.

There are two problems here:
1) Aromatic bond orders really shouldn't be used in SD files that aren't for query molecules.
2) The RDKit cannot figure out which N should have the H attached and, as is typical of the RDKit, doesn't try to guess.

A method for randomly picking a tautomer that can be kekulized is described in these threads:
http://www.mail-archive.com/[email protected]/msg01162.html
http://www.mail-archive.com/[email protected]/msg01185.html

Either fixing the SDF or running the "random" tautomer generator should work.

-greg

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
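
To illustrate why the choice of N is "normally arbitrary", here is a toy, toolkit-independent sketch. It is not RDKit code and it is not the method from the linked threads. It assumes a simplified model in which every aromatic atom must take part in exactly one double bond unless it carries the extra H, so kekulization reduces to finding a perfect matching on the aromatic bonds. The names `can_kekulize` and `h_positions` and the atom numbering are hypothetical, made up for this example.

```python
from typing import Iterable, List, Set, Tuple

Bond = Tuple[int, int]


def can_kekulize(atoms: Set[int], bonds: List[Bond]) -> bool:
    """Toy check: can every atom in `atoms` be paired with exactly one
    bonded neighbour (i.e. is there a perfect matching of double bonds)?"""
    if not atoms:
        return True
    first = min(atoms)
    for a, b in bonds:
        if first not in (a, b):
            continue
        other = b if a == first else a
        # Place a double bond first=other, then match whatever is left.
        if other in atoms and can_kekulize(atoms - {first, other}, bonds):
            return True
    return False


def h_positions(ring_atoms: Iterable[int], bonds: List[Bond],
                nitrogens: Iterable[int]) -> List[int]:
    """Return the N atoms that make the system kekulizable when given one H.
    An N carrying the H is assumed to need no double bond, so it is
    dropped from the matching problem."""
    all_atoms = set(ring_atoms)
    return [n for n in nitrogens if can_kekulize(all_atoms - {n}, bonds)]


# Imidazole-like five-membered ring written without its N-H.
# Assumed numbering: C0, C1, N2, C3, N4, all bonds flagged aromatic.
ring = [0, 1, 2, 3, 4]
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
nitrogens = [2, 4]

print(can_kekulize(set(ring), bonds))       # no H anywhere
print(h_positions(ring, bonds, nitrogens))  # which N could take the H
```

With no H the five-atom ring cannot be matched, so the first call prints `False`. The second call prints `[2, 4]`: either N works, which is why a toolkit that refuses to guess reports an error instead of picking one.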

