Dear Michal,

On Mon, Nov 18, 2013 at 10:55 AM, Michal Krompiec <[email protected]> wrote:

> Hello,
> Substructure matching with SMARTS behaves strangely sometimes - see code
> below.
> The pattern with [H] matches, but the pattern with [H,F] does not
> (both should match).
>
> from rdkit import Chem
> mol=Chem.MolFromSmiles('Clc2sccc2[H]')
> mol=Chem.AddHs(mol)
> p1=Chem.MolFromSmarts('c2sccc2[H]')
> p2=Chem.MolFromSmarts('c2sccc2[H,F]')
> print(mol.HasSubstructMatch(p1))
> print(mol.HasSubstructMatch(p2))
>

The problem is that an "H" inside a SMARTS bracket atom usually isn't an
element symbol: it is a query on the number of hydrogens attached to the atom.

You can see what's going on using MolToSmarts:
>>> print(Chem.MolToSmarts(p1))
c1:,-s:,-c:,-c:,-c:,-1-,:[#1]
>>> print(Chem.MolToSmarts(p2))
c1:,-s:,-c:,-c:,-c:,-1-,:[H1,F]

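In the second case, the H has been converted into a query for an atom that
has exactly one H attached, so [H,F] means "an atom with one H attached, or a
fluorine". The hydrogen atom on the ring is neither of those, so the pattern
does not match. The only time that the symbol "H" in a query is interpreted
as "an atom with atomic number 1" is when it shows up as "[H]", as in your
first example.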
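To make the difference concrete, here is a small sketch that lists which
elements each single-atom query hits on your molecule. The helper name
matched_symbols is made up for illustration, and the snippet assumes (as in
your example) that the Hs have been made explicit with AddHs. On this molecule
I would expect the first query to hit the three hydrogen atoms and the second
to hit the three CH carbons of the ring.

```python
from rdkit import Chem

# Same molecule as in the question, with hydrogens made explicit.
mol = Chem.AddHs(Chem.MolFromSmiles('Clc1sccc1'))


def matched_symbols(smarts):
    """Element symbols of the atoms hit by a single-atom SMARTS query.

    Illustrative helper, not part of RDKit.
    """
    query = Chem.MolFromSmarts(smarts)
    return sorted(mol.GetAtomWithIdx(match[0]).GetSymbol()
                  for match in mol.GetSubstructMatches(query))


hydrogen_atoms = matched_symbols('[#1]')  # atoms with atomic number 1
one_h_attached = matched_symbols('[H1]')  # atoms carrying exactly one H

print(hydrogen_atoms)
print(one_h_attached)
```

Writing the count explicitly as [H1] avoids relying on how the parser reads a
bare H inside a bracket.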

The safest way to deal with H in SMARTS is to use [#1]:
>>> p1=Chem.MolFromSmarts('c2sccc2[#1]')
>>> p2=Chem.MolFromSmarts('c2sccc2[#1,F]')
>>> print(mol.HasSubstructMatch(p1))
True
>>> print(mol.HasSubstructMatch(p2))
True

This is confusing, but it corresponds (at least I think it does) to the
"spec" from Daylight:
http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html

-greg
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
