Hmm, well, probably not. You mention the ever-present exception in
chemistry, Peter (sulfoxides are in a similar situation: stereochemistry
arising from a lone pair). But in general I still think it is more dangerous
to keep, or even to perceive from 3D, stereochemistry on three-coordinate N:
you will do more harm than good with it.
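
For anyone who wants this behaviour on their own side today, here is a
minimal sketch of what "always remove it" could look like. The helper name
clear_three_coordinate_n_stereo is made up for illustration, and I am
assuming that "three-coordinate" means a total degree of three with
hydrogens counted, so a protonated [NH+] carrying three heavy neighbours is
left untouched. It makes no exception for bridgeheads, and it is not what
the RDKit does internally.

```python
from rdkit import Chem


def clear_three_coordinate_n_stereo(mol):
    """Return a copy of mol with chiral tags cleared on three-coordinate N.

    Hypothetical helper: "three-coordinate" is taken to mean a total degree
    of 3, hydrogens included (an assumption, not an RDKit definition).
    """
    out = Chem.Mol(mol)  # work on a copy, leave the input untouched
    for atom in out.GetAtoms():
        if atom.GetAtomicNum() == 7 and atom.GetTotalDegree() == 3:
            atom.SetChiralTag(Chem.ChiralType.CHI_UNSPECIFIED)
    return out


mol = Chem.MolFromSmiles('C[N@@]1CC[C@@H](C)CC1')
cleaned = clear_three_coordinate_n_stereo(mol)
print(Chem.MolToSmiles(cleaned))
```

Whether the tag on the ring carbon still means anything once the nitrogen
tag is gone is exactly the ring-stereochemistry question discussed below,
so treat the printed SMILES with care.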



On Thu, Aug 20, 2015 at 6:40 PM, Peter Shenkin <[email protected]> wrote:
> "My initial answer, and I would love input on this, is that three-coordinate
> N should always have stereochemistry removed."
>
> Umm... even if it's a bridgehead?
>
> -P.
>
> On Thu, Aug 20, 2015 at 10:30 AM, Greg Landrum <[email protected]>
> wrote:
>>
>> This isn't a simple one, so it may take a bit to get to an answer that's
>> comprehensible.
>>
>> There are two things going on here in the RDKit:
>> 1) Ring stereochemistry
>> 2) stereochemistry about nitrogen centers
>>
>> Let's start with the second, because it's easier: RDKit does not generally
>> "believe in" stereochemistry around three coordinate nitrogens. Here's a
>> very simple example:
>> In [45]: m3 = Chem.MolFromSmiles('Br[N@](F)Cl')
>>
>> In [46]: Chem.MolToSmiles(m3,isomericSmiles=True)
>> Out[46]: 'FN(Cl)Br'
>>
>>
>> The 3D equivalent of that:
>> In [41]: m = Chem.MolFromSmiles('BrN(F)Cl')
>>
>> In [42]: AllChem.EmbedMolecule(m)
>> Out[42]: 0
>>
>> In [43]: Chem.AssignAtomChiralTagsFromStructure(m)
>>
>> In [44]: Chem.MolToSmiles(m,isomericSmiles=True)
>> Out[44]: 'FN(Cl)Br'
>>
>> Contrast this with what you get for a carbon:
>>
>> In [34]: m2 = Chem.MolFromSmiles('FC(Br)(Cl)I')
>>
>> In [35]: AllChem.EmbedMolecule(m2)
>> Out[35]: 0
>>
>> In [36]: Chem.AssignAtomChiralTagsFromStructure(m2)
>>
>> In [37]: Chem.MolToSmiles(m2,isomericSmiles=True)
>> Out[37]: 'F[C@](Cl)(Br)I'
>>
>>
>> Back to the first: ring stereochemistry. By this I mean things like
>> C[C@H]1CC[C@@H](C)CC1 - molecules where the stereochemistry information is
>> really about whether the substituents of the ring are cis or trans relative
>> to the ring plane.
>>
>> The way the RDKit handles this is something of a hack: it doesn't identify
>> those atoms as chiral centers, but it does preserve the chiral tags when
>> generating a canonical SMILES:
>>
>> In [47]: m = Chem.MolFromSmiles('C[C@H]1CC[C@@H](C)CC1')
>>
>> In [48]: Chem.FindMolChiralCenters(m)
>> Out[48]: []
>>
>> In [49]: Chem.MolToSmiles(m,isomericSmiles=True)
>> Out[49]: 'C[C@H]1CC[C@@H](C)CC1'
>>
>> Curiously, to me at least, it does the same thing with nitrogens:
>>
>> In [52]: m2 = Chem.MolFromSmiles('C[N@@]1CC[C@@H](C)CC1')
>>
>> In [53]: Chem.MolToSmiles(m2,isomericSmiles=True)
>> Out[53]: 'C[C@H]1CC[N@](C)CC1'
>>
>> Lest anyone think that this might make sense because being a ring makes
>> inversion more difficult, that's not what is going on here. If I make the
>> ring truly chiral, then the stereochemistry of the N is removed:
>>
>> In [54]: m3 = Chem.MolFromSmiles('C[N@@]1CO[C@@H](C)CC1')
>>
>> In [55]: Chem.MolToSmiles(m3,isomericSmiles=True)
>> Out[55]: 'C[C@H]1CCN(C)CO1'
>>
>> I believe that this inconsistent behavior is a bug: either N should always
>> have the input stereochemistry preserved (and that should be perceived from
>> the 3D coordinates) or it should never have the input stereochemistry
>> preserved. My initial answer, and I would love input on this, is that
>> three-coordinate N should always have stereochemistry removed.
>>
>> -greg
>>
>>
>>
>> On Thu, Aug 20, 2015 at 2:22 PM, Rob Smith <[email protected]> wrote:
>>>
>>> Hi Greg,
>>>
>>> I've attached the SDF that Corina generates. I'm not convinced it is a
>>> problem, more an observation that I'm trying to understand.
>>>
>>> Looking at the results again today - it seems that from the Corina output
>>> Indigo is interpreting the conformer (including whether the ethyl
>>> substituent on the piperidine nitrogen is equatorial or axial) - and
>>> outputting a canonical smiles string that has the conformer "encoded" in it
>>> (using the chiral flags). Whereas RDKit is reading in the Corina output,
>>> "discounting" whether the nitrogen is axial or equatorial (which due to
>>> inversion I can understand) and interpreting it as having only two chiral
>>> centers (which is correct).
>>>
>>> What is confusing me, is that when I supply RDKit with the canonical
>>> smiles string from Indigo (which has the conformer "encoded" in it), and
>>> then ask for the isomeric canonical smiles, it supplies the canonical smiles
>>> with the conformer still "encoded" within it.
>>>
>>> For example, I read in the following canonical smiles string into RDKit:
>>> CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1 (which was generated by reading
>>> in one of the mols in the SD File into RDKit and output the isomeric
>>> canonical smiles), running the FindMolChiralCenters on this molecule,
>>> correctly reports the number of chiral centres to be 2 (6S, 9R), and then
>>> asking it to output the canonical smiles string (with isomericSmiles=True)
>>> gives CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1 (1).
>>>
>>> If I take the same mol file, read it into Indigo, and ask it to output
>>> the canonical smiles string, I get:
>>> CC(C)[C@H]1CC[N@H+]1[C@@H]1CC[N@@](CC1)CC
>>> If I read this smiles string into RDKit and run FindMolChiralCenters on
>>> it, I get (3R, 6S), which is fine. If I then output the canonical smiles
>>> (again with isomericSmiles=True) I get
>>> CC[N@]1CC[C@@H]([N@@H+]2CC[C@@H]2C(C)C)CC1. I expected this isomeric
>>> canonical smiles to be the same as (1); however, RDKit appears to conserve
>>> the conformer representation given to it from an isomeric smiles string, but
>>> when reading a Mol file doesn't keep all conformer information (axial or
>>> equatorial substituents on a nitrogen).
>>>
>>> Thanks to all for your quick (and quick witted) responses
>>>
>>> Rob
>>>
>>>
>>> On Thu, Aug 20, 2015 at 3:46 AM, Greg Landrum <[email protected]>
>>> wrote:
>>>>
>>>> Hi Rob,
>>>>
>>>> The results below are quite strange. As John has already pointed out:
>>>> there really shouldn't be chirality present on either the N+ or the C that
>>>> has two methyls attached.
>>>>
>>>> I tried to reproduce the problem by running corina myself using the same
>>>> command-line options you provided (from SMILES instead of SDF, but I don't
>>>> think that should make a difference), but I get sensible results:
>>>>
>>>> In [5]: s = Chem.SDMolSupplier('sample.sdf')
>>>>
>>>> In [6]: for m in s:
>>>>    ...:     Chem.AssignAtomChiralTagsFromStructure(m)
>>>>    ...:     Chem.AssignStereochemistry(m,cleanIt=True,force=True)
>>>>    ...:     print(Chem.MolToSmiles(m,True))
>>>>    ...:
>>>> CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1
>>>> CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1
>>>> CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1
>>>> CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1
>>>> CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1
>>>> CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1
>>>> CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1
>>>> CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1
>>>>
>>>> In [7]: s = Chem.SDMolSupplier('sample.sdf')
>>>>
>>>> In [8]: for m in s:
>>>>    ...:     Chem.AssignAtomChiralTagsFromStructure(m)
>>>>    ...:     print(Chem.MolToSmiles(m,True))
>>>>    ...:
>>>> CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1
>>>> CCN1CCC([N@@H+]2CC[C@@H]2C(C)C)CC1
>>>> CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1
>>>> CCN1CCC([N@H+]2CC[C@@H]2C(C)C)CC1
>>>> CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1
>>>> CCN1CCC([N@@H+]2CC[C@H]2C(C)C)CC1
>>>> CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1
>>>> CCN1CCC([N@H+]2CC[C@H]2C(C)C)CC1
>>>>
>>>>
>>>> Could you please send the SDF that corina generates so I can try to
>>>> reproduce the problem (or at least try to understand what's going on) from
>>>> that?
>>>>
>>>> Thanks,
>>>> -greg
>>>>
>>>> On Wed, Aug 19, 2015 at 3:00 PM, Rob Smith <[email protected]> wrote:
>>>>>
>>>>> Dear RDKit community,
>>>>>
>>>>> I'm trying to use RDKit to read in Corina generated stereoisomers (from
>>>>> a Mol file), assign chiral tags and stereochemistry to the structure and
>>>>> output the canonical smiles string for each isomer of a given molecule (in
>>>>> Python). When I do this, half the canonical smiles strings are not unique.
>>>>>
>>>>> When I read in the output from Corina into an Indigo instance, then use
>>>>> the canonical smiles from Indigo to create an RDKit molecule, canonical
>>>>> smiles strings generated from the molecule objects are all unique.
>>>>>
>>>>> I may be missing an option to enable RDKit to 'visualise' the chiral
>>>>> centre adjacent to the protonated nitrogen, so if someone can spot where
>>>>> I've made a mistake, I'd really appreciate it. I've included the output and
>>>>> Python script below. If you require any further information, please let me
>>>>> know.
>>>>>
>>>>> Many thanks,
>>>>> Rob
>>>>>
>>>>> Output:
>>>>>
>>>>> RDKit Read in of Molecule
>>>>> RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1
>>>>> RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1
>>>>> RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1
>>>>> RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1
>>>>> RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1
>>>>> RDKit Output -  CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1
>>>>> RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1
>>>>> RDKit Output -  CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1
>>>>>
>>>>> INDIGO Read in of Molecule
>>>>> RDKit Output -  CC[N@]1CC[C@@H]([N@@H+]2CC[C@@H]2C(C)C)CC1
>>>>> RDKit Output -  CC[N@]1CC[C@H]([N@@H+]2CC[C@@H]2C(C)C)CC1
>>>>> RDKit Output -  CC[N@]1CC[C@@H]([N@H+]2CC[C@@H]2C(C)C)CC1
>>>>> RDKit Output -  CC[N@]1CC[C@H]([N@H+]2CC[C@@H]2C(C)C)CC1
>>>>> RDKit Output -  CC[N@]1CC[C@@H]([N@@H+]2CC[C@H]2C(C)C)CC1
>>>>> RDKit Output -  CC[N@]1CC[C@H]([N@@H+]2CC[C@H]2C(C)C)CC1
>>>>> RDKit Output -  CC[N@]1CC[C@@H]([N@H+]2CC[C@H]2C(C)C)CC1
>>>>> RDKit Output -  CC[N@]1CC[C@H]([N@H+]2CC[C@H]2C(C)C)CC1
>>>>>
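The "half the strings are not unique" observation is easy to check
mechanically. Below is a small sketch using only the standard library; the
list is the "RDKit Read in of Molecule" output copied from above, and
find_duplicates is just an illustrative name.

```python
from collections import Counter

# SMILES copied from the "RDKit Read in of Molecule" output above
rdkit_route = [
    'CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1',
    'CCN1CC[C@@H]([N@@H+]2CC[C@@H]2[C@H](C)C)CC1',
    'CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1',
    'CCN1CC[C@@H]([N@H+]2CC[C@@H]2[C@H](C)C)CC1',
    'CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1',
    'CCN1CC[C@@H]([N@@H+]2CC[C@H]2[C@H](C)C)CC1',
    'CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1',
    'CCN1CC[C@@H]([N@H+]2CC[C@H]2[C@H](C)C)CC1',
]


def find_duplicates(smiles_list):
    """Map each SMILES that occurs more than once to its number of occurrences."""
    counts = Counter(smiles_list)
    return {smi: n for smi, n in counts.items() if n > 1}


dups = find_duplicates(rdkit_route)
print(len(set(rdkit_route)), "unique strings out of", len(rdkit_route))
for smi, n in dups.items():
    print(n, smi)
```

For this list it reports four distinct strings, each seen twice, which is
the pairing described in the message.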
>>>>> Python script :
>>>>>
>>>>> from rdkit import Chem
>>>>> import subprocess # Used to run Corina
>>>>> from indigo import *
>>>>>
>>>>> def runCorinaTest(inputMol):
>>>>>     indigo = Indigo()
>>>>>
>>>>>     molFile = Chem.MolToMolBlock(inputMol)
>>>>>
>>>>>     corinaCommand = "echo \'" + molFile + "\' | "
>>>>>     # Then Corina - generate stereoisomers...
>>>>>     corinaCommand = corinaCommand + ("/apps/corina/corina -t n -d "
>>>>>         "canon,stergen,preserve,names,wh,flapn,msc=7,msi=128 -i t=sdf")
>>>>>     # Gives the stereoisomer species as an SDF string
>>>>>     corinaResult = subprocess.check_output([corinaCommand], shell=True)
>>>>>
>>>>>     allMoleculeObjects = []
>>>>>     # Separate Corina output into individual molecules
>>>>>     allMolecules = corinaResult.split("$$$$\n")
>>>>>     allMolecules = allMolecules[0:len(allMolecules)-1]
>>>>>
>>>>>     print("RDKit Read in of Molecule")
>>>>>
>>>>>     for eachMolecule in allMolecules:
>>>>>         eachMolecule = eachMolecule + "$$$$\n"
>>>>>         mol = Chem.MolFromMolBlock(eachMolecule, sanitize=True,
>>>>>                                    removeHs=True, strictParsing=False)
>>>>>         Chem.rdmolops.AssignAtomChiralTagsFromStructure(
>>>>>             mol, replaceExistingTags=True)
>>>>>         Chem.rdmolops.AssignStereochemistry(mol)
>>>>>         print("RDKit Output -  " + Chem.MolToSmiles(mol, isomericSmiles=True))
>>>>>
>>>>>     print("INDIGO Read in of Molecule")
>>>>>     for eachMolecule in allMolecules:
>>>>>         eachMolecule = eachMolecule + "$$$$\n"
>>>>>         mol = indigo.loadMolecule(eachMolecule)
>>>>>         # print("Indigo Output - " + mol.canonicalSmiles())
>>>>>         # Use Indigo Canonical Smiles to create RDKit molecule
>>>>>         mol = Chem.MolFromSmiles(mol.canonicalSmiles())
>>>>>         if mol is not None:
>>>>>             print("RDKit Output -  " + Chem.MolToSmiles(mol, isomericSmiles=True))
>>>>>
>>>>>     return 0
>>>>>
>>>>> mol = Chem.MolFromSmiles("CC(C)C1[NH+](C2CCN(CC)CC2)CC1")
>>>>> z = runCorinaTest(mol)
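
A side note on the record splitting in the script above: splitting on
"$$$$\n" and re-appending the delimiter works, but it depends on the exact
line ending, and under Python 3 subprocess.check_output returns bytes, so
the result would need a .decode() first. A line-based splitter is a bit more
forgiving. This is only a sketch; split_sdf_records is a hypothetical helper
and the two-record string is a stand-in, not real Corina output.

```python
def split_sdf_records(sdf_text):
    """Split SD-file text into records, each ending with its '$$$$' line.

    Anything after the last '$$$$' (usually just a trailing newline) is dropped.
    """
    records = []
    current = []
    for line in sdf_text.splitlines():
        current.append(line)
        if line.strip() == "$$$$":
            records.append("\n".join(current) + "\n")
            current = []
    return records


# Stand-in text with two records (not a valid mol block)
sample = "first\nM  END\n$$$$\nsecond\nM  END\n$$$$\n"
records = split_sdf_records(sample)
print(len(records), "records")
```

Each returned record could then be handed to Chem.MolFromMolBlock in the same
way the script does.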
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>>
>>>>> _______________________________________________
>>>>> Rdkit-discuss mailing list
>>>>> [email protected]
>>>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>>>
>>>>
>>>
>>
>>
>>
>>
>
>
>

