Hi Michal,
The first test I do when things like this happen is to try the RDKit's
default behavior and check whether I get the results I expect. In this
case I do: if I read the molecule from the mol block with sanitization,
then I get a match.
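
A minimal sketch of that check. To keep it short I use benzhydrol as a
stand-in for your molecule (an assumption of mine, it is not CHEMBL265667)
and generate its mol block with MolToMolBlock, which writes single and
double bonds just like your file:

```python
from rdkit import Chem

# Stand-in molecule (benzhydrol), NOT the CHEMBL265667 molfile from the thread.
# MolToMolBlock kekulizes by default, so the block holds single/double bonds.
mol_block = Chem.MolToMolBlock(Chem.MolFromSmiles("OC(c1ccccc1)c1ccccc1"))
patt = Chem.MolFromSmarts("[OH1]-C(-c1ccccc1)c2ccccc2")

# Default behavior: sanitize=True, which includes aromaticity perception.
sanitized = Chem.MolFromMolBlock(mol_block)

# The approach from the original question: no sanitization at all.
unsanitized = Chem.MolFromMolBlock(mol_block, sanitize=False)
unsanitized.UpdatePropertyCache(strict=False)

match_sanitized = sanitized.HasSubstructMatch(patt)
match_unsanitized = unsanitized.HasSubstructMatch(patt)
print(match_sanitized, match_unsanitized)
```

Only the sanitized molecule should match the aromatic SMARTS.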
Now the question is why it's not happening with the partial sanitization
you are doing. The answer is that the SMARTS includes aromatic atoms and
bonds, but the molecule - as you are constructing it - does not. CTABs do
not (should not) contain info about aromaticity, so the molecule you
construct has single and double bonds, not aromatic bonds, and no
aromaticity perception is run on it.
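
If you do want to stay with partial sanitization, one option is to make
sure aromaticity perception is among the operations you run. The sketch
below is only an illustration: the molecule is again a benzhydrol stand-in
rather than your molfile, and the particular set of sanitization flags is
my choice, not something you need to copy exactly:

```python
from rdkit import Chem

# Stand-in molecule (benzhydrol), NOT the CHEMBL265667 molfile from the thread.
mol_block = Chem.MolToMolBlock(Chem.MolFromSmiles("OC(c1ccccc1)c1ccccc1"))
patt = Chem.MolFromSmarts("[OH1]-C(-c1ccccc1)c2ccccc2")

mol = Chem.MolFromMolBlock(mol_block, sanitize=False)
mol.UpdatePropertyCache(strict=False)

# Straight from the CTAB: only single and double bonds, nothing aromatic.
aromatic_before = sum(b.GetIsAromatic() for b in mol.GetBonds())

# Partial sanitization that includes ring finding and aromaticity perception.
flags = (Chem.SanitizeFlags.SANITIZE_SYMMRINGS
         | Chem.SanitizeFlags.SANITIZE_KEKULIZE
         | Chem.SanitizeFlags.SANITIZE_SETAROMATICITY
         | Chem.SanitizeFlags.SANITIZE_SETCONJUGATION
         | Chem.SanitizeFlags.SANITIZE_SETHYBRIDIZATION)
Chem.SanitizeMol(mol, sanitizeOps=flags)

aromatic_after = sum(b.GetIsAromatic() for b in mol.GetBonds())
match = mol.HasSubstructMatch(patt)
print(aromatic_before, aromatic_after, match)
```

Once the ring bonds are flagged aromatic, the aromatic atoms in the SMARTS
have something to match against.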
If you are reading in mol blocks, I really do recommend that you go ahead
and use the sanitization: the MDL chemistry model that one gets out of mol
blocks is quite different from the one the rest of the RDKit uses.
I hope this helps,
-greg
On Tue, Nov 1, 2016 at 6:20 PM, Michał Nowotka <[email protected]> wrote:
> Hi,
>
> I have this molfile (CHEMBL265667):
>
>
> 11280714432D 1 1.00000 0.00000 0
>
> 25 27 0 0 0 999 V2000
> 3.8042 -1.6000 0.0000 C 0 0 0 0 0 0 0 0 0
> 4.3167 -1.9000 0.0000 N 0 0 3 0 0 0 0 0 0
> 3.8042 -1.0000 0.0000 N 0 0 0 0 0 0 0 0 0
> 4.8417 -1.6000 0.0000 N 0 0 0 0 0 0 0 0 0
> 4.3167 -2.5000 0.0000 C 0 0 0 0 0 0 0 0 0
> 4.3167 -3.6917 0.0000 C 0 0 0 0 0 0 0 0 0
> 4.8417 -1.0000 0.0000 C 0 0 0 0 0 0 0 0 0
> 4.3167 -0.7000 0.0000 C 0 0 0 0 0 0 0 0 0
> 3.7917 -3.3917 0.0000 C 0 0 0 0 0 0 0 0 0
> 4.8375 -3.3917 0.0000 C 0 0 0 0 0 0 0 0 0
> 3.8000 -2.7917 0.0000 C 0 0 0 0 0 0 0 0 0
> 4.8375 -2.7917 0.0000 C 0 0 0 0 0 0 0 0 0
> 4.3167 -4.2917 0.0000 C 0 0 3 0 0 0 0 0 0
> 3.2875 -1.8917 0.0000 O 0 0 0 0 0 0 0 0 0
> 4.8375 -4.5917 0.0000 C 0 0 0 0 0 0 0 0 0
> 4.3167 -0.0917 0.0000 O 0 0 0 0 0 0 0 0 0
> 4.8292 -5.1917 0.0000 C 0 0 0 0 0 0 0 0 0
> 5.3500 -4.2917 0.0000 C 0 0 0 0 0 0 0 0 0
> 5.8667 -5.1917 0.0000 C 0 0 0 0 0 0 0 0 0
> 3.7917 -4.5917 0.0000 O 0 0 0 0 0 0 0 0 0
> 5.8667 -4.5917 0.0000 C 0 0 0 0 0 0 0 0 0
> 5.3542 -5.4917 0.0000 C 0 0 0 0 0 0 0 0 0
> 6.3917 -5.4917 0.0000 Cl 0 0 0 0 0 0 0 0 0
> 3.2750 -3.6917 0.0000 C 0 0 0 0 0 0 0 0 0
> 5.3542 -3.6917 0.0000 C 0 0 0 0 0 0 0 0 0
> 2 1 1 0 0 0
> 3 1 1 0 0 0
> 4 2 1 0 0 0
> 5 2 1 0 0 0
> 6 10 1 0 0 0
> 7 8 1 0 0 0
> 8 3 1 0 0 0
> 9 11 1 0 0 0
> 10 12 2 0 0 0
> 11 5 2 0 0 0
> 12 5 1 0 0 0
> 13 6 1 0 0 0
> 14 1 2 0 0 0
> 15 13 1 0 0 0
> 16 8 2 0 0 0
> 17 15 2 0 0 0
> 18 15 1 0 0 0
> 19 21 1 0 0 0
> 20 13 1 0 0 0
> 21 18 2 0 0 0
> 22 17 1 0 0 0
> 23 19 1 0 0 0
> 24 9 1 0 0 0
> 25 10 1 0 0 0
> 4 7 2 0 0 0
> 9 6 2 0 0 0
> 22 19 2 0 0 0
> M END
>
> and this smarts: [OH1]-C(-c1ccccc1)c2ccccc2
>
> I'm using this code to find a substructure:
>
> from rdkit import Chem
>
> mol = Chem.MolFromMolBlock(str(molstring), sanitize=False)
> mol.UpdatePropertyCache(strict=False)
> patt = Chem.MolFromSmarts(str(smarts))
> Chem.GetSSSR(patt)
> Chem.GetSSSR(mol)
> match = mol.HasSubstructMatch(patt)
>
> and the `match` is empty.
>
> But with indigo code:
>
> mol = indigoObj.loadMolecule(str(molstring))
> patt = indigoObj.loadSmarts(str(smarts))
> match = indigoObj.substructureMatcher(mol).match(patt)
>
> match is valid and I can render this to an image.
>
> Am I missing some flag or doing something wrong?
>
> --
>
> Michal
>
> ------------------------------------------------------------
> ------------------
> Developer Access Program for Intel Xeon Phi Processors
> Access to Intel Xeon Phi processor-based developer platforms.
> With one year of Intel Parallel Studio XE.
> Training and support from Colfax.
> Order your platform today. http://sdm.link/xeonphi
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>