On Jun 16, 2015, at 10:20 PM, Peter Shenkin wrote:
> [N-]=[N+]=NC(=O)N1C(=O)N([N+]([O-])=O)C2(C13C4=C56)C4=C5C2=C36
> [N-]=[N+]=NC(=O)N(C(=O)N1[N+]([O-])=O)C(c23)(c4c56)C16c3c5c24
> 
> rdkit canonicalizes the two to the following, respectively:
> 
> [N-]=[N+]=NC(=O)N1C(=O)N([N+](=O)[O-])C23c4c5c2c2c-5c4C213
> [N-]=[N+]=NC(=O)N1C(=O)N([N+](=O)[O-])C23c4c5c6c(c2c4=6)C513


> I believe these represent the same structure, with the following caveat:
> 
> It is not impossible that the two SMILES actually code for different
> structures in some subtle way. I've tried visualizing them in several
> packages, however, and I've not been able to find a difference.

I've found SMARTSviewer to be an excellent way to help resolve these problems,
because it doesn't try to do any aromaticity perception. Oddly, though, it
fails on the second of the four SMILES, reporting "SMARTS syntax is not
correct". I can't figure out why.

BTW, to help it out, you can ask RDKit to write every bond explicitly.
Otherwise it leaves some bonds implicit, and an implicit bond means
"single-or-aromatic".

>>> from rdkit import Chem

>>> mol1 = Chem.MolFromSmiles("[N-]=[N+]=NC(=O)N1C(=O)N([N+]([O-])=O)C2(C13C4=C56)C4=C5C2=C36")
>>> Chem.MolToSmiles(mol1, allBondsExplicit=True)
'[N-]=[N+]=N-C(=O)-N1-C(=O)-N(-[N+](=O)-[O-])-C23-c4:c5:c-2:c2:c-5:c:4-C-2-1-3'

>>> mol2 = Chem.MolFromSmiles("[N-]=[N+]=NC(=O)N(C(=O)N1[N+]([O-])=O)C(c23)(c4c56)C16c3c5c24")
>>> Chem.MolToSmiles(mol2, allBondsExplicit=True)
'[N-]=[N+]=N-C(=O)-N1-C(=O)-N(-[N+](=O)-[O-])-C23-c4:c5:c6:c(:c-2:c:4=6)-C-5-1-3'



I don't know how it is that RDKit adds a double bond to the second cubane
(the "=6" ring closure in the output for mol2), given only aromatic carbons
and single-or-aromatic bonds in the original SMILES.
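
For a quick sanity check that doesn't depend on any toolkit's aromaticity
model, here is a rough sketch that counts the atoms and ring closures in each
of the four SMILES strings. The summarize() helper and its regular expression
are my own illustration, not part of RDKit, and they only cover the SMILES
features used in this thread (bracket atoms, organic-subset atoms, and ring
closure labels). Matching counts are necessary for two SMILES to describe the
same structure, but they are nowhere near sufficient.

```python
import re
from collections import Counter

# Hypothetical helper, not an RDKit function. The tokenizer is deliberately
# rough: bracket atoms, organic-subset atoms, and ring-closure labels only.
TOKEN = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]|%\d\d|\d")

def summarize(smiles):
    """Return (element counts, number of ring closures) for a SMILES string."""
    elements = Counter()
    ring_labels = 0
    for tok in TOKEN.findall(smiles):
        if tok[0] == "[":
            # Bracket atom: skip an optional isotope, then read the symbol.
            symbol = re.match(r"\[\d*([A-Za-z][a-z]?)", tok).group(1)
            elements[symbol.capitalize()] += 1
        elif tok[0].isalpha():
            # Organic-subset atom; aromatic "c" is counted as element "C".
            elements[tok.capitalize()] += 1
        else:
            # Ring-closure label; each closure contributes two labels.
            ring_labels += 1
    return elements, ring_labels // 2

thread_smiles = [
    # the two input SMILES
    "[N-]=[N+]=NC(=O)N1C(=O)N([N+]([O-])=O)C2(C13C4=C56)C4=C5C2=C36",
    "[N-]=[N+]=NC(=O)N(C(=O)N1[N+]([O-])=O)C(c23)(c4c56)C16c3c5c24",
    # the two canonical SMILES reported from RDKit
    "[N-]=[N+]=NC(=O)N1C(=O)N([N+](=O)[O-])C23c4c5c2c2c-5c4C213",
    "[N-]=[N+]=NC(=O)N1C(=O)N([N+](=O)[O-])C23c4c5c6c(c2c4=6)C513",
]

for smi in thread_smiles:
    elements, closures = summarize(smi)
    print(sorted(elements.items()), closures)
```

All four strings come out with 10 C, 6 N, 4 O and six ring closures, so any
difference between them has to be in the connectivity or the bond orders, not
in the composition.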

                                Andrew
                                [email protected]



------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
