On Wed, May 29, 2013 at 8:06 PM, Syeda Sabrina <[email protected]> wrote:

> Hi Greg,
>
> I was trying the following reaction in rdkit and it returns a product's
> smiles that is invalid and can't be converted back into a valid molecule.
> Any thoughts what's missing in my code?
>
> >>rxn =
> AllChem.ReactionFromSmarts('[#6:5][CH1:6]([#6:4])[NH1:2][C:7]([#6:8])=[O:9]>>[#6:8][C:7]#[N:2].[#6:4][C:6]=[C:5].[#8:9]')
>
> >> reactants =  [Chem.MolFromSmiles('CC(NC(=O)C(N)F)C(C)=O')]
>
> >> ps = rxn.RunReactants(tuple(reactants))
>
> >> for p in ps:
>     ...:     for m in p:
>     ...:         print Chem.MolToSmiles(m, isomericSmiles = True)
>     ...:
>
> N#CC(N)F
> C=CC(C)=O
> O
> N#CC(N)F
> CC=[C](C)=O
> O
>
> As you can see, CC=[C](C)=O is an invalid SMILES (carbon has explicit
> valence greater than permitted) which you can't convert back into a valid
> molecule.
>
>
It is, technically, a perfectly valid SMILES. It just corresponds to an
unstable molecule (a five-valent carbon) that the RDKit won't sanitize.
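
To make that distinction concrete, here is a minimal sketch (not part of the
original session) that parses the same SMILES with and without sanitization.
It assumes only that the rdkit package is importable; the variable names are
illustrative.

```python
from rdkit import Chem

smi = 'CC=[C](C)=O'

# Default parsing includes sanitization, so the five-valent carbon is
# rejected and MolFromSmiles returns None (RDKit also logs a valence error).
sanitized = Chem.MolFromSmiles(smi)

# With sanitize=False the string is parsed as written, with no chemistry checks.
raw = Chem.MolFromSmiles(smi, sanitize=False)

print(sanitized is None)
print(raw is not None and raw.GetNumAtoms())
```

The second call succeeding is what "valid SMILES" means here: the syntax is
fine, and only the valence check rejects the structure.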

The molecules that come back from RunReactants have not been sanitized. If
you want to check whether they can be, you can do something like this:


In [16]: rxn = AllChem.ReactionFromSmarts('[#6:5][CH1:6]([#6:4])[NH1:2][C:7]([#6:8])=[O:9]>>[#6:8][C:7]#[N:2].[#6:4][C:6]=[C:5].[#8:9]')

In [17]: reactants = (Chem.MolFromSmiles('CC(NC(=O)C(N)F)C(C)=O'),)

In [18]: ps = rxn.RunReactants(reactants)

In [19]: passedps=[]

In [20]: for pset in ps:
   ....:     ok = True
   ....:     for p in pset:
   ....:         try:
   ....:             Chem.SanitizeMol(p)
   ....:         except Exception:
   ....:             ok=False
   ....:             break
   ....:     if ok:
   ....:         passedps.append(pset)
   ....:
[05:48:26] Explicit valence for atom # 2 C, 5, is greater than permitted

In [21]: len(passedps)
Out[21]: 1
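
The same filter can also be written as a small reusable function. This is
only a sketch: the helper name sanitizable_product_sets is made up for
illustration, and it relies on the catchErrors argument of Chem.SanitizeMol,
which makes the call return the failing operation instead of raising.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def sanitizable_product_sets(rxn, reactants):
    """Return only the product sets in which every product sanitizes.

    Hypothetical helper, not an RDKit function.
    """
    kept = []
    for pset in rxn.RunReactants(reactants):
        # With catchErrors=True, SanitizeMol returns the operation that
        # failed; SANITIZE_NONE means every sanitization step succeeded.
        if all(Chem.SanitizeMol(p, catchErrors=True) == Chem.SanitizeFlags.SANITIZE_NONE
               for p in pset):
            kept.append(pset)
    return kept


rxn = AllChem.ReactionFromSmarts(
    '[#6:5][CH1:6]([#6:4])[NH1:2][C:7]([#6:8])=[O:9]'
    '>>[#6:8][C:7]#[N:2].[#6:4][C:6]=[C:5].[#8:9]')
reactants = (Chem.MolFromSmiles('CC(NC(=O)C(N)F)C(C)=O'),)

passed = sanitizable_product_sets(rxn, reactants)
print(len(passed))
```

As in the session above, products that pass are sanitized in place, so they
can be used directly afterwards.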


A more specific solution for this particular reaction is to be a little bit
more restrictive in what can match so that you don't encounter the problem
in the first place. Something like this may work:

In [22]: rxn = AllChem.ReactionFromSmarts('[#6&!H0:5][CH1:6]([#6:4])[NH1:2][C:7]([#6:8])=[O:9]>>[#6:8][C:7]#[N:2].[#6:4][C:6]=[C:5].[#8:9]')

In [23]: reactants = (Chem.MolFromSmiles('CC(NC(=O)C(N)F)C(C)=O'),)

In [24]: ps = rxn.RunReactants(reactants)

In [25]: for pset in ps:
   ....:     for p in pset:
   ....:         print(Chem.MolToSmiles(p))
   ....:     print('------------')
   ....:
N#CC(N)F
C=CC(C)=O
O
------------

Adding that "!H0" query to C:5 ensures that there is an H to be removed
when you form the double bond to C:6.
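
To see which carbons in the reactant that query accepts, here is a quick
sketch (again not from the original session; variable names are
illustrative) that matches the atom query on its own:

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('CC(NC(=O)C(N)F)C(C)=O')

# Carbons carrying at least one hydrogen: eligible to match C:5.
with_h = Chem.MolFromSmarts('[#6&!H0]')
# Carbons with no hydrogens: what the "!H0" restriction excludes.
without_h = Chem.MolFromSmarts('[#6&H0]')

carbons_with_h = sorted(idx for (idx,) in mol.GetSubstructMatches(with_h))
carbons_without_h = sorted(idx for (idx,) in mol.GetSubstructMatches(without_h))

print(carbons_with_h)     # [0, 1, 5, 9]
print(carbons_without_h)  # [3, 8]
```

Atom 8 is the ketone carbon. It has no H to give up, so letting it match C:5
is what produced the five-valent carbon in the unrestricted reaction.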

Hope this helps,
-greg
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
