On Thu, Dec 14, 2017 at 6:35 AM, Francois BERENGER <[email protected]> wrote:
> On 12/14/2017 02:10 PM, Greg Landrum wrote:
> >
> > On Thu, Dec 14, 2017 at 4:22 AM, Francois BERENGER <[email protected]> wrote:
> >
> > On 12/14/2017 05:15 AM, Sundar wrote:
> > > Hi RDkit users,
> > >
> > > I encountered the following sanitization issue while trying to load an SD
> > > file using
> > > Chem.SDMolSupplier('lig.sdf')
> > >
> > > Explicit valence for atom # 16 N, 4, is greater than permitted
> > > ERROR: Could not sanitize molecule ending on line 3145
> >
> > I also encounter this exact error sometimes.
> >
> > Is there a way to tell rdkit to automatically correct this atom type?
> >
> >
> > The code currently only automatically corrects cases where it's really,
> > really obvious what the correction should be, like C-N(=O)=O ->
> > C-[N+](=O)[O-].
>
> Where is this in the code?
> I might have a look one day.
>
It's here:
https://github.com/rdkit/rdkit/blob/master/Code/GraphMol/MolOps.cpp#L194
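
To see that cleanup from the Python side, here is a minimal sketch. It assumes the default sanitization (which includes the cleanup step) is in effect; the nitromethane SMILES is just an illustrative input, not something from the original report.

```python
from rdkit import Chem

# Nitromethane written with a neutral, five-valent nitrogen.
m = Chem.MolFromSmiles('CN(=O)=O')

# Default sanitization runs the cleanup step before the valence check,
# so the nitro group should come back in its charge-separated form.
charges = [(a.GetSymbol(), a.GetFormalCharge()) for a in m.GetAtoms()]
print(Chem.MolToSmiles(m))
print(charges)
```

The nitrogen should carry a +1 charge and one of the oxygens a -1 charge, so the molecule stays neutral overall.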
>
> > The philosophy taken in the RDKit is that it's better to have a bad
> > structure be rejected than it is to try and learn from it.
> > If you disagree with this, it is pretty easy to switch off the
> > sanitization checks and keep the bad molecules.
>
> I understand. I also guess unsanitized molecules would make some things
> crash, just later.
That depends. You can turn off the strict property checking:
In [2]: m = Chem.MolFromSmiles('C1CCN1(C)C')
[08:09:23] Explicit valence for atom # 3 N, 4, is greater than permitted
In [3]: m = Chem.MolFromSmiles('C1CCN1(C)C',sanitize=False)
In [6]: m.UpdatePropertyCache(strict=False)
In [7]: Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_ALL^Chem.SANITIZE_PROPERTIES)
Out[7]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
In [8]: Chem.MolToSmiles(m)
Out[8]: 'CN1(C)CCC1'
or if you want to be more aggressive you can also turn off the cleanup that
"fixes" those odd structures:
In [9]: m = Chem.MolFromSmiles('CCCN(=O)=O',sanitize=False)
In [10]: m.UpdatePropertyCache(strict=False)
In [11]: Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_ALL^Chem.SANITIZE_PROPERTIES^Chem.SANITIZE_CLEANUP)
Out[11]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
In [12]: Chem.MolToSmiles(m)
Out[12]: 'CCCN(=O)=O'
In either case, many standard molecular operations should still work;
you'll just be operating on molecules with atoms in unreasonable valence
states.
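
Since the original question was about an SD file, the same steps can be wrapped up and applied to each record. This is only a sketch: the helper names `lenient_sanitize` and `load_sdf_lenient` and the `keep_cleanup` parameter are made up for illustration, and it assumes the supplier is opened with `sanitize=False` so that the bad records are not dropped on read.

```python
from rdkit import Chem


def lenient_sanitize(mol, keep_cleanup=True):
    """Sanitize mol in place, skipping the strict valence (property) check.

    With keep_cleanup=False the cleanup step that rewrites groups such as
    C-N(=O)=O is skipped as well.
    """
    ops = Chem.SANITIZE_ALL ^ Chem.SANITIZE_PROPERTIES
    if not keep_cleanup:
        ops ^= Chem.SANITIZE_CLEANUP
    # Compute valence information without raising on odd valences.
    mol.UpdatePropertyCache(strict=False)
    Chem.SanitizeMol(mol, sanitizeOps=ops)
    return mol


def load_sdf_lenient(path, keep_cleanup=True):
    """Read every parseable record from an SD file, keeping odd valences."""
    mols = []
    for mol in Chem.SDMolSupplier(path, sanitize=False):
        if mol is None:  # record could not be parsed at all
            continue
        mols.append(lenient_sanitize(mol, keep_cleanup))
    return mols
```

Something like `mols = load_sdf_lenient('lig.sdf')` would then keep the molecule with the four-valent nitrogen instead of rejecting it. Records that fail one of the remaining sanitization steps (kekulization, for example) would still raise, so wrapping the call in a try/except may be worthwhile.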
-greg
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss