Yep, this is what I think I need. Many thanks Greg!
And thanks to Toby for the tips too.
Ling
>________________________________
> From: Greg Landrum <[email protected]>
>To: S.L. Chan <[email protected]>
>Cc: "[email protected]"
><[email protected]>
>Sent: Wednesday, October 30, 2013 6:08 AM
>Subject: Re: [Rdkit-discuss] atom equivalence for substructure matching
>
>
>
>Hi Ling,
>
>
>
>On Wed, Oct 30, 2013 at 2:12 AM, S.L. Chan <[email protected]> wrote:
>
>>Good evening,
>>
>>
>>I would like to get an exhaustive substructure matching of a molecule onto
>>itself. Generally I could use the GetSubstructMatches function with the
>>"uniquify=False" option. However, if there is a carboxylate or a guanidinium
>>head around, this would give only "one side" of the match since the two
>>oxygens / nitrogens are not considered equivalent:
>>
>>
>>
>>>>> from rdkit import Chem
>>>>> mol = Chem.MolFromSmiles('CC(=O)[O-]')
>>>>> patt = Chem.MolFromSmarts('CC(=O)[O-]')
>>>>> print(mol.GetSubstructMatches(patt,uniquify=False))
>>((0,1,2,3),)
>>
>>
>>Now, I suppose I could use an ugly hack to achieve my purpose (ugly because
>>the pattern could in principle also match two single bonds):
>>
>>>>> mol = Chem.MolFromSmiles('CC(=O)[O-]')
>>>>> patt = Chem.MolFromSmarts('CC(~O)~O')
>>>>> print(mol.GetSubstructMatches(patt,uniquify=False))
>>((0,1,2,3), (0,1,3,2))
>>
>>
>>However, this would mean that I would need to manually edit the SMARTS string
>>for all molecules. I just wonder if there is something similar to the
>>"Kekulize" command that would make the two oxygens equivalent? Or are there
>>other ways around this?
>
>
>This is an interesting question.
>
>
>There's no super-easy way that I can think of to get what you want, but there
>is an approach that will probably work.
>
>
>What you can do is edit the molecule to replace the substructures in question
>with something that gives the appropriate matching behavior.
>Here's one way of doing that which preserves atom types:
>
>
>In [17]: repl = Chem.MolFromSmiles('C(O)O')
>In [18]: repl.GetBondWithIdx(0).SetBondType(Chem.BondType.ONEANDAHALF)
>
>In [19]: repl.GetBondWithIdx(1).SetBondType(Chem.BondType.ONEANDAHALF)
>
>In [20]: m = Chem.MolFromSmiles('CC(C(=O)O)C(C(=O)O)C')
>
>In [21]: m.GetSubstructMatches(m,uniquify=False)
>
>Out[21]: ((0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (9, 5, 6, 7, 8, 1, 2, 3, 4, 0))
>In [25]: nm = Chem.ReplaceSubstructs(m,Chem.MolFromSmarts('C(=O)[OH,O-]'),repl,replaceAll=True)
>In [28]: nm[0].GetSubstructMatches(nm[0],uniquify=False)
>Out[28]:
>((0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
> (0, 1, 2, 3, 4, 5, 6, 7, 9, 8),
> (0, 1, 2, 3, 4, 6, 5, 7, 8, 9),
> (0, 1, 2, 3, 4, 6, 5, 7, 9, 8),
> (3, 2, 1, 0, 7, 8, 9, 4, 5, 6),
> (3, 2, 1, 0, 7, 8, 9, 4, 6, 5),
> (3, 2, 1, 0, 7, 9, 8, 4, 5, 6),
> (3, 2, 1, 0, 7, 9, 8, 4, 6, 5))
>
>
>
>
>Note that the problem with this is that it changes the atom numbering. If you
>want to preserve atom numbering, it's a bit more complex:
>
>
>
>
>In [45]: q = Chem.MolFromSmarts('C(=O)[OH,O-]')
>In [46]: m = Chem.MolFromSmiles('CC(C(=O)O)C(C(=O)O)C')
>
>In [48]: qmatch = m.GetSubstructMatches(q)
>
>In [50]: for match in qmatch:
>   ....:     b = m.GetBondBetweenAtoms(match[0],match[1])
>   ....:     b.SetBondType(Chem.BondType.ONEANDAHALF)
>   ....:     b = m.GetBondBetweenAtoms(match[0],match[2])
>   ....:     b.SetBondType(Chem.BondType.ONEANDAHALF)
>   ....:     m.GetAtomWithIdx(match[2]).SetFormalCharge(0)
>   ....:     m.GetAtomWithIdx(match[2]).SetNoImplicit(False)
>   ....:     m.GetAtomWithIdx(match[2]).SetNumExplicitHs(0)
>
>
>In [52]: m.GetSubstructMatches(m,uniquify=False)
>Out[52]:
>((0, 1, 2, 3, 4, 5, 6, 7, 8, 9),
> (0, 1, 2, 3, 4, 5, 6, 8, 7, 9),
> (0, 1, 2, 4, 3, 5, 6, 7, 8, 9),
> (0, 1, 2, 4, 3, 5, 6, 8, 7, 9),
> (9, 5, 6, 7, 8, 1, 2, 3, 4, 0),
> (9, 5, 6, 7, 8, 1, 2, 4, 3, 0),
> (9, 5, 6, 8, 7, 1, 2, 3, 4, 0),
> (9, 5, 6, 8, 7, 1, 2, 4, 3, 0))
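
The numbering-preserving recipe above can be wrapped in a small helper. This is only a sketch: the name `equalize_carboxyls` is made up for illustration (it is not an RDKit function), and it edits a copy so that the input molecule is left untouched. The resulting molecule is meant for symmetry matching only, not for further chemistry.

```python
from rdkit import Chem

# Same carboxyl pattern as in the session above: (C, =O, OH or O-)
CARBOXYL = Chem.MolFromSmarts('C(=O)[OH,O-]')


def equalize_carboxyls(mol):
    """Hypothetical helper: return a copy of mol in which both C-O bonds
    of every carboxyl group are ONEANDAHALF and the charge is removed."""
    m = Chem.Mol(mol)  # copy; atom indices are the same as in mol
    # The matches are collected before any bond is modified.
    for c_idx, o1_idx, o2_idx in m.GetSubstructMatches(CARBOXYL):
        for o_idx in (o1_idx, o2_idx):
            bond = m.GetBondBetweenAtoms(c_idx, o_idx)
            bond.SetBondType(Chem.BondType.ONEANDAHALF)
        o2 = m.GetAtomWithIdx(o2_idx)
        o2.SetFormalCharge(0)
        o2.SetNoImplicit(False)
        o2.SetNumExplicitHs(0)
    return m


acetate = Chem.MolFromSmiles('CC(=O)[O-]')
sym = equalize_carboxyls(acetate)
matches = sym.GetSubstructMatches(sym, uniquify=False)
print(matches)
```

Because the indices in `matches` refer to the original atom numbering, they can be used directly against the unmodified molecule.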
>
>
>In all the above I'm showing how to solve the problem for carboxyls. Handling
>other groups is left as an exercise to the reader. ;-)
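
One possible way to approach that exercise is to drive the same bond-editing trick from a table of patterns. The sketch below is an assumption-laden illustration, not a vetted solution: the guanidinium SMARTS is a guess at a reasonable pattern, the names `GROUPS` and `equalize_groups` are hypothetical, and the hydrogen bookkeeping from the carboxyl recipe is left out on the assumption that the edited molecule is only used for self-matching.

```python
from rdkit import Chem

# (SMARTS, position of the central atom within the match).
# Illustrative patterns only; extend or tighten them as needed.
GROUPS = [
    ('C(=O)[OH,O-]', 0),                # carboxylic acid / carboxylate
    ('[CX3](=[NX3+])([NX3])[NX3]', 0),  # guanidinium (assumed pattern)
]


def equalize_groups(mol, groups=GROUPS):
    """Hypothetical helper: return a copy of mol in which every bond from
    the central atom of a matched group is ONEANDAHALF and the formal
    charges on the attached atoms are zeroed. For matching only."""
    m = Chem.Mol(mol)  # copy; atom numbering is preserved
    for smarts, center in groups:
        patt = Chem.MolFromSmarts(smarts)
        for match in m.GetSubstructMatches(patt):
            c_idx = match[center]
            for idx in match:
                if idx == c_idx:
                    continue
                bond = m.GetBondBetweenAtoms(c_idx, idx)
                bond.SetBondType(Chem.BondType.ONEANDAHALF)
                m.GetAtomWithIdx(idx).SetFormalCharge(0)
    return m


guan = equalize_groups(Chem.MolFromSmiles('NC(=[NH2+])N'))
guan_matches = guan.GetSubstructMatches(guan, uniquify=False)
print(len(guan_matches), guan_matches)
```

In a substituted guanidinium such as the arginine side chain, the substituted nitrogen still differs from the two terminal ones through its neighbours, so only the terminal pair should come out as interchangeable.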
>
>
>
>Is that doing what you're looking for?
>-greg
>
>
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss