Hi, Greg,

Within the SMILES framework, it seems to me that if you allow the atoms to be aromatic, then these are two Kekulé structures of the same aromatic system, and however you do the canonicalization, they ought to canonicalize to the same structure, which the two examples did not. I don't think you addressed this.
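To make that claim concrete, here is a minimal sketch in plain Python (deliberately not RDKit) that checks whether the two input SMILES from your message share one atom connectivity and differ only in where the double bonds sit. The helper names `parse_simple_smiles` and `isomorphic` are made up for illustration, and the parser only handles the tiny subset needed here (uppercase `C`, `-`/`=` bonds, single-digit ring closures, no branches), so treat it as a sanity check rather than a real SMILES reader.

```python
from itertools import permutations


def parse_simple_smiles(smiles):
    """Parse a tiny SMILES subset into (atom_count, {bond: order}).

    Supported: 'C' atoms, '-' and '=' bond symbols, single-digit ring closures.
    """
    bonds = {}        # frozenset({i, j}) -> bond order
    open_rings = {}   # ring digit -> (atom index, bond order given at opening)
    prev = None       # index of the most recent atom
    pending = None    # explicit bond order waiting for its second atom
    n_atoms = 0
    for ch in smiles:
        if ch == "C":
            idx = n_atoms
            n_atoms += 1
            if prev is not None:
                bonds[frozenset((prev, idx))] = pending or 1
            prev, pending = idx, None
        elif ch in "-=":
            pending = 1 if ch == "-" else 2
        elif ch.isdigit():
            if ch in open_rings:
                start, order = open_rings.pop(ch)
                bonds[frozenset((start, prev))] = pending or order or 1
            else:
                open_rings[ch] = (prev, pending)
            pending = None
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
    return n_atoms, bonds


def isomorphic(a, b, use_orders):
    """Brute-force graph isomorphism; fine for a handful of atoms."""
    (n_a, bonds_a), (n_b, bonds_b) = a, b
    if n_a != n_b or len(bonds_a) != len(bonds_b):
        return False
    for perm in permutations(range(n_a)):
        mapped = {
            frozenset(perm[i] for i in bond): order
            for bond, order in bonds_a.items()
        }
        if use_orders:
            if mapped == bonds_b:
                return True
        elif mapped.keys() == bonds_b.keys():
            return True
    return False


first = parse_simple_smiles("C1=CC2=CC=C12")
second = parse_simple_smiles("C1=CC2=C1C=C2")

same_skeleton = isomorphic(first, second, use_orders=False)
same_kekule_form = isomorphic(first, second, use_orders=True)
print(same_skeleton, same_kekule_form)
```

If the sketch is right, the two inputs have the same skeleton but are different Kekulé forms (one has the double bond on the ring-fusion bond, the other does not), which is why I would expect a single canonical form for both.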
I think now that there is no issue with having a double bond between two aromatic atoms beyond our preconceptions. If that is a problem, you could Kekulize it per your first picture (though perhaps that is inconvenient in the context of the implementation). I actually didn't realize why aromaticity (particularly the double bond) made sense when I originally wrote, so the above is with the benefit of hindsight, and your comments.

I think the molecule is entertaining in several ways. In the cubane geometry, the molecule cannot be conventionally aromatic. Might it actually be antiaromatic? Could there be two forms? Dunno....

-P.

On Wed, Jun 17, 2015 at 1:25 AM, Greg Landrum <[email protected]> wrote:
>
> The problematic part of your two molecules can be reduced to:
> [image: Inline image 3]
> and
> [image: Inline image 4]
> That second one shows the kekulized form that the RDKit ends up using.
>
> These produce the following canonical SMILES:
>
> In [31]: Chem.CanonSmiles('C1=CC2=CC=C12')
> Out[31]: 'c1cc2ccc1-2'
>
> In [32]: Chem.CanonSmiles('C1=CC2=C1C=C2')
> Out[32]: 'c1cc2ccc1=2'
>

