Dear all,

I am working with peptides and RNA and want to convert sequences into 2D
molecules.
Because we use non-natural and proprietary monomers, I cannot apply the usual
workflows such as MolFromHELM. Instead, I have developed my own Python code to
build the macromolecules from their building blocks (basically using
Chem.CombineMols followed by rdDepictor.Compute2DCoords; see
https://github.com/Boehringer-Ingelheim/pyPept/blob/master/src/pyPept/molecule.py).
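
For illustration, a minimal sketch of the general approach of joining two
building blocks with Chem.CombineMols and then computing a 2D layout. This is
not the pyPept implementation; the helper join_fragments, the atom indices, and
the two small SMILES fragments are hypothetical stand-ins chosen for the example.

```python
from rdkit import Chem
from rdkit.Chem import rdDepictor


def join_fragments(frag_a, frag_b, atom_a, atom_b):
    """Bond atom_a of frag_a to atom_b of frag_b and compute 2D coordinates.

    Hypothetical helper for illustration only; indices refer to each
    fragment's own atom numbering.
    """
    combined = Chem.CombineMols(frag_a, frag_b)
    editable = Chem.EditableMol(combined)
    # Atoms of frag_b are appended after those of frag_a in the combined mol.
    offset = frag_a.GetNumAtoms()
    editable.AddBond(atom_a, offset + atom_b, order=Chem.BondType.SINGLE)
    mol = editable.GetMol()
    # Recompute valences and implicit hydrogens after the new bond.
    Chem.SanitizeMol(mol)
    rdDepictor.Compute2DCoords(mol)
    return mol


# Stand-in building blocks (assumptions, not proprietary monomers):
gly_acyl = Chem.MolFromSmiles("NCC=O")    # atom 2 is the carbonyl carbon
ala = Chem.MolFromSmiles("NC(C)C(=O)O")   # atom 0 is the amine nitrogen

dipeptide = join_fragments(gly_acyl, ala, 2, 0)
print(Chem.MolToSmiles(dipeptide))
```

Repeating such a join monomer by monomer gives one large molecule that is only
laid out by the general-purpose depictor at the end.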

This works fine even for large peptides (>40 monomers), but when I do the
same for RNA I run into a problem:
beyond a certain size (about 12 or 13 nucleotides), the 2D embedding returns all
coordinates as zeros and all stereo information
is lost.
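
One way to check whether stereo information survives the embedding step is to
compare the assigned stereocenters before and after it. The sketch below uses
alanine as a small stand-in and a hypothetical helper name; it only inspects
the chiral tags stored on the molecule object, which may differ from what a
written mol block conveys when the coordinates are degenerate.

```python
from rdkit import Chem
from rdkit.Chem import rdDepictor


def assigned_stereocenters(mol):
    """Return (atom index, CIP label) pairs for assigned tetrahedral centers."""
    return Chem.FindMolChiralCenters(mol, includeUnassigned=False)


# Alanine with a defined stereocenter, used here only as a small test case.
mol = Chem.MolFromSmiles("C[C@H](N)C(=O)O")

before = assigned_stereocenters(mol)
rdDepictor.Compute2DCoords(mol)
after = assigned_stereocenters(mol)

# If the two lists differ, stereo information changed during embedding.
print("stereocenters unchanged:", before == after)
```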

I tried the same using MolFromHELM, and there I do not see the issue: I
get valid 2D coordinates for up to hundreds of nucleotides
(yes, contrary to what the documentation says, RNA and DNA work, too!).
Only if I first generate the molecule and then pass it through either
rdCoordGen.AddCoords or Chem.rdDepictor.Compute2DCoords
do I end up with all-zero coordinates. So I suppose MolFromHELM knows something
about the general structure of the building blocks and uses that information,
whereas the general-purpose embedders cannot take it into account and
subsequently fail. Even so, MolFromHELM is not an option for me, as I need
non-natural
monomers (unless there is a way to teach RDKit about non-canonical monomers,
but I haven't found anything on it).

Here is the relevant code snippet:

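from rdkit import Chem
from rdkit.Chem import rdCoordGen

n_nucleotides = 20

# Build a HELM string for a poly-A RNA strand: RNA1{R(A)P.R(A)P...}$$$$V2.0
polyA = ['R(A)P'] * n_nucleotides
polyA = '.'.join(polyA)
helm = f'RNA1{{{polyA}}}$$$$V2.0'

romol = Chem.MolFromHELM(helm)
# Uncommenting the next line reproduces the problem (all coordinates zero):
# rdCoordGen.AddCoords(romol)

mb = Chem.MolToMolBlock(romol)

# Print the start of the mol block to inspect the first atom coordinates
print(mb[1:300])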

As written, everything looks fine, but as soon as I uncomment the
rdCoordGen.AddCoords line, all coordinates are zero.
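
To narrow down the chain length at which this starts, the snippet above can be
wrapped in a small scan. The following is a sketch under assumptions: the
helper names, the tolerance, and the idea of detecting the failure by testing
whether every atom sits at the origin are illustrative choices, not part of
RDKit.

```python
from rdkit import Chem
from rdkit.Chem import rdDepictor


def poly_a_helm(n):
    """HELM string for a poly-A RNA strand of n nucleotides."""
    return "RNA1{" + ".".join(["R(A)P"] * n) + "}$$$$V2.0"


def coords_collapsed(mol, tol=1e-6):
    """True if there is no conformer or every atom sits at the origin."""
    if mol.GetNumConformers() == 0:
        return True
    positions = mol.GetConformer().GetPositions()  # N x 3 numpy array
    return bool((abs(positions) < tol).all())


def first_collapsed_length(lengths, build=None,
                           embed=rdDepictor.Compute2DCoords):
    """Return the first length whose embedding collapses, or None."""
    if build is None:
        # Default builder: poly-A RNA from HELM, as in the snippet above.
        build = lambda n: Chem.MolFromHELM(poly_a_helm(n))
    for n in lengths:
        mol = build(n)
        if mol is None:
            raise ValueError(f"could not build a molecule for length {n}")
        embed(mol)
        if coords_collapsed(mol):
            return n
    return None


# Example call (not executed here, since the result depends on the RDKit
# version in use):
#   first_collapsed_length(range(1, 31))
# To test CoordGen instead, pass embed=rdCoordGen.AddCoords.
```

Passing a custom build function makes it possible to run the same scan on
molecules assembled from one's own building blocks.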

Any ideas or suggestions as to what I could do?

Thanks,
Th.


Thomas Fox
NCE

Boehringer Ingelheim Pharma GmbH & Co. KG

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
