Hi Larissa,
A '*' atom in a SMILES is translated into an atom with atomic number zero.
In a normal substructure match this will only match other atoms of atomic
number zero.
If you want to turn it into a query feature that matches any atom, the
easiest way is with the function Chem.AdjustQueryProperties(). Here's an
example:
In [1]: from rdkit import Chem
In [2]: m = Chem.MolFromSmiles('CCO')
In [3]: q = Chem.MolFromSmiles('CC*')
In [4]: m.HasSubstructMatch(q)
Out[4]: False
In [7]: aps = Chem.AdjustQueryParameters()
In [8]: aps.adjustDegree=False
In [9]: aps.adjustRingCount=False
In [10]: nq = Chem.AdjustQueryProperties(q,aps)
In [11]: m.HasSubstructMatch(nq)
Out[11]: True
There's a bit more information on the options available in this RDKit blog
post:
http://rdkit.blogspot.com/2016/07/tuning-substructure-queries-ii.html
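Applied to the fragment from your message, the same approach might look
like the sketch below. The parent molecule here is a made-up stand-in (I
don't have your SD file), picked only because it contains the fragment, and
makeDummiesQueries is set explicitly so the example does not depend on the
default value in your RDKit version.

```python
from rdkit import Chem

# Hypothetical parent molecule, chosen only so that it contains the fragment;
# substitute the molecule read from your own SD file.
mol = Chem.MolFromSmiles('CC(=O)Nc1ccc(OC)cc1C(=O)NCC')

# Fragment SMILES quoted in the original message.
frag = Chem.MolFromSmiles('[*]Nc1ccc(OC)cc1C([*])=O')

# As parsed, each '*' is an atom with atomic number zero, which only
# matches other atoms of atomic number zero.
plain_match = mol.HasSubstructMatch(frag)

# Convert the dummies into any-atom queries; degree and ring-count
# adjustments are switched off, as in the example above.
aps = Chem.AdjustQueryParameters()
aps.makeDummiesQueries = True
aps.adjustDegree = False
aps.adjustRingCount = False
query = Chem.AdjustQueryProperties(frag, aps)
adjusted_match = mol.HasSubstructMatch(query)

print('plain fragment matches:', plain_match)
print('adjusted query matches:', adjusted_match)
```

With this stand-in molecule, the plain fragment should not match and the
adjusted query should, so there is no need to edit the '[*]' out of the
SMILES by hand.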
I hope this helps,
-greg
On Thu, May 24, 2018 at 2:22 PM Larissa Pusch <[email protected]> wrote:
> Hello,
>
> I am running rdkit version 2017.09.3.
> I read an sdf with
>
> supplier = SDMolSupplier('try/try.sdf')
>
> I then took the first mol from supplier and named it mol. I then performed
>
> fragmented = Recap.RecapDecompose(mol, minFragmentSize=3)
>
> I looped through its children with:
>
> fragmented_children = fragmented.GetAllChildren()
> fragmented_children_smiles = []
> for key in list(fragmented_children):
>     fragmented_children_smiles.append(fragmented_children[key].smiles)
> smile = fragmented_children_smiles[0]
>
> Now, smile is '[*]Nc1ccc(OC)cc1C([*])=O' . Theoretically, smile should of
> course be a substructure of mol. But if I check this like this:
>
> smile = MolFromSmiles(smile)
> if mol.HasSubstructMatch(smile):
>     print('match smile!!!!')
>
> nothing gets printed. Apparently, this is because of the [*]; if I delete
> them, there is a match. But why are they there in the first place? Why does
> HasSubstructMatch not work when they are included? And, most importantly,
> can I solve this problem without going through the code and deleting all
> '[*]'? First of all, I do not know if the SMILES would still make sense if
> I did that, and also there are some structures like '([*])', and '()' is of
> course not valid, so deleting them for a large number of SMILES would be
> really bothersome...
>
> Thank you for your help!
> Regards,
> Larissa Pusch
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>