Dear all,

I am working with peptides and RNA and want to convert sequences into 2D molecules. As we use non-natural and proprietary monomers, I cannot apply the usual workflows like MolFromHELM, but have developed my own Python code to build the macromolecules from their building blocks (basically using Chem.CombineMols and then rdDepictor.Compute2DCoords, see https://github.com/Boehringer-Ingelheim/pyPept/blob/master/src/pyPept/molecule.py).
While this works fine even for large peptides (>40 monomers), I run into a problem when doing the same for RNA: beyond a certain size (about 12 or 13 nucleotides), the 2D embedding returns all coordinates as zeroes and all stereo information is lost.

I tried the same using MolFromHELM, and there I do not see this issue: I get valid 2D coordinates for up to hundreds of nucleotides (yes, contrary to what the documentation says, RNA and DNA work, too!). Only when I first generate the molecule and then pass it through either rdCoordGen.AddCoords or Chem.rdDepictor.Compute2DCoords do I end up with all coordinates as zero. So I suppose MolFromHELM knows something about the general structure of the building blocks and uses that information, whereas the all-purpose embedders cannot take that into account and subsequently fail. But then again, MolFromHELM is not an option, as I need non-natural monomers (unless there is a way to teach RDKit about non-canonical monomers, but I haven't found anything on that).

Here is the relevant code snippet:

    from rdkit import Chem
    from rdkit.Chem import rdCoordGen

    n_nucleotides = 20
    polyA = ['R(A)P'] * n_nucleotides
    polyA = '.'.join(polyA)
    helm = f'RNA1{{{polyA}}}$$$$V2.0'
    romol = Chem.MolFromHELM(helm)
    #rdCoordGen.AddCoords(romol)
    mb = Chem.MolToMolBlock(romol)
    print(mb[1:300])

As written, everything looks fine, but as soon as I uncomment the rdCoordGen line, the coordinates are zero.

Any ideas or suggestions as to what I could do?

Thanks,
Th.

Thomas Fox
NCE
Boehringer Ingelheim Pharma GmbH & Co. KG
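P.S. In case it helps to pin down where the embedding starts to fail, a scan over chain lengths could look roughly like the sketch below. It is only a diagnostic, not a fix. The helper names (poly_a_helm, coords_all_zero, scan_chain_lengths) are made up for this example and are not part of RDKit; the RDKit calls are the same ones as in my snippet, plus GetConformer().GetPositions() to read the coordinates back. I am assuming RDKit is installed; it is imported inside the scan function so the two small helpers can be used on their own.

```python
def poly_a_helm(n_nucleotides):
    """Build the HELM string for a poly-A RNA strand (hypothetical helper)."""
    poly_a = '.'.join(['R(A)P'] * n_nucleotides)
    return f'RNA1{{{poly_a}}}$$$$V2.0'


def coords_all_zero(positions, tol=1e-6):
    """Return True if every coordinate of every atom is zero within tol.

    positions is any iterable of (x, y, z) triples, e.g. the array
    returned by Conformer.GetPositions(). An empty input gives False.
    """
    points = [tuple(p) for p in positions]
    return bool(points) and all(abs(c) < tol for p in points for c in p)


def scan_chain_lengths(max_n=20, embedder='coordgen'):
    """Re-embed poly-A strands of length 1..max_n and flag all-zero results.

    Returns a dict {n: True/False/None}; None means MolFromHELM failed.
    embedder is 'coordgen' (rdCoordGen.AddCoords) or anything else
    (rdDepictor.Compute2DCoords).
    """
    # Imported lazily so the helpers above do not require RDKit.
    from rdkit import Chem
    from rdkit.Chem import rdCoordGen, rdDepictor

    results = {}
    for n in range(1, max_n + 1):
        mol = Chem.MolFromHELM(poly_a_helm(n))
        if mol is None:
            results[n] = None
            continue
        if embedder == 'coordgen':
            rdCoordGen.AddCoords(mol)
        else:
            rdDepictor.Compute2DCoords(mol)
        results[n] = coords_all_zero(mol.GetConformer().GetPositions())
    return results
```

Printing the dictionary returned by scan_chain_lengths(20) for both embedders should show at which length the all-zero result first appears, and whether the two embedders fail at the same size.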
_______________________________________________ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss