Hi,
I've got a problem with the InChIKeys being generated from CML for a series of adamantanes. The structures in the attached CML file were generated in torch (from a SMILES string) and then converted from SDF to CML using Open Babel.

I'm trying to use the function in the Python script below to add the InChIKey of each CML molecule to its attributes (the function takes an lxml.etree.Element representation of the molecule CML block as input, and adds the generated InChIKey). I want to be able to match these 3D structures to experimental data for them that is stored in XML, which uses the InChIKey as an id for the molecule.

In the csv file, the expected InChIKey and the canonicalised SMILES used to generate it are in the columns exp_inchikey and exp_smiles respectively. The InChIKey that was actually generated for the CML is in the cml_inchikey column.

The second part of the InChIKey is different in every case, and I was wondering why. Is it to do with some unseen stereochemistry that isn't in the SMILES used to generate the structure, is it to do with the options I'm using for the conversion, or is it something else that I haven't thought of?

Note: the expected InChIKey is taken from the ChemSpider entry for the molecule.

Thanks,
Mark Driver
PhD student
University of Cambridge
exp_inchikey                 exp_smiles                         cml_inchikey
BHTSNYQWLQLLOD-UHFFFAOYSA-N  CC(C)(C)C(=O)C12CC3CC(CC(C3)C1)C2  BHTSNYQWLQLLOD-WUQLGEGHSA-N
CPWSNJSGSXXVLD-UHFFFAOYSA-N  FC12CC3CC(CC(C3)C1)C2              CPWSNJSGSXXVLD-CHIWXEEVSA-N
DACIGVIOAFXPHW-UHFFFAOYSA-N  CC(=O)C12CC3CC(CC(C3)C1)C2         DACIGVIOAFXPHW-CDECOKDKSA-N
DKNWSYNQZKUICI-UHFFFAOYSA-N  NC12CC3CC(CC(C3)C1)C2              DKNWSYNQZKUICI-CHIWXEEVSA-N
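To make the mismatch in the rows above explicit, here is a minimal sketch that splits each key into its blocks and compares them. It relies only on the published layout of a standard InChIKey: a 14-letter hash of the connectivity, then an 8-letter hash of the remaining layers (stereo, isotopes), then flag, version and protonation letters. The value UHFFFAOY is the hash obtained when those remaining layers are empty. The function names split_inchikey and compare_keys are made up for this illustration.

```python
import re

# Layout of a standard InChIKey:
#   14 letters (connectivity hash) - 8 letters (hash of remaining layers)
#   + flag (S = standard) + version letter - protonation letter
INCHIKEY_RE = re.compile(r"^([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])$")

# Second-block hash produced when there are no stereo/isotope layers.
NO_EXTRA_LAYERS = "UHFFFAOY"


def split_inchikey(key):
    """Split an InChIKey into named blocks (hypothetical helper)."""
    match = INCHIKEY_RE.match(key.strip())
    if match is None:
        raise ValueError("not a well-formed InChIKey: %r" % key)
    skeleton, layers, flag, version, protonation = match.groups()
    return {
        "skeleton": skeleton,
        "layers": layers,
        "flag": flag,
        "version": version,
        "protonation": protonation,
    }


def compare_keys(expected, generated):
    """Report which blocks of two InChIKeys agree (hypothetical helper)."""
    exp = split_inchikey(expected)
    gen = split_inchikey(generated)
    return {
        "same_skeleton": exp["skeleton"] == gen["skeleton"],
        "same_layers": exp["layers"] == gen["layers"],
        "expected_has_extra_layers": exp["layers"] != NO_EXTRA_LAYERS,
        "generated_has_extra_layers": gen["layers"] != NO_EXTRA_LAYERS,
    }


# (exp_inchikey, cml_inchikey) pairs copied from the csv rows.
PAIRS = [
    ("BHTSNYQWLQLLOD-UHFFFAOYSA-N", "BHTSNYQWLQLLOD-WUQLGEGHSA-N"),
    ("CPWSNJSGSXXVLD-UHFFFAOYSA-N", "CPWSNJSGSXXVLD-CHIWXEEVSA-N"),
    ("DACIGVIOAFXPHW-UHFFFAOYSA-N", "DACIGVIOAFXPHW-CDECOKDKSA-N"),
    ("DKNWSYNQZKUICI-UHFFFAOYSA-N", "DKNWSYNQZKUICI-CHIWXEEVSA-N"),
]

for expected, generated in PAIRS:
    print(expected, generated, compare_keys(expected, generated))
```

For all four rows the first block matches, so the connectivity is the same. The expected keys carry no extra layers, while the keys generated from the CML do, which means the conversion is adding information beyond the connectivity. The key alone cannot say which layer that is.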
import openbabel as ob
from lxml import etree


def addStdInChIKeyToMolecule(molecule_cml):
    """Add a StdInChiKey attribute to a molecule.

    molecule_cml is an lxml.etree.Element for the molecule CML block.
    """
    # Serialise the CML element so Open Babel can parse it.
    molecule_cml_string = etree.tostring(molecule_cml)
    conversion = ob.OBConversion()
    conversion.SetInAndOutFormats("cml", "inchi")
    # "K" makes the InChI writer output the InChIKey only.
    conversion.SetOptions("K", conversion.OUTOPTIONS)
    molecule = ob.OBMol()
    conversion.ReadString(molecule, molecule_cml_string)
    inchikey = conversion.WriteString(molecule)
    # Drop the trailing newline added by the writer.
    inchikey = inchikey.strip()
    molecule_cml.set("StdInChiKey", inchikey)
    return molecule_cml
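One way to narrow this down would be to write the full InChI string instead of the key (the same conversion without the "K" option) and compare it layer by layer with the InChI of the expected structure. The sketch below shows only that comparison step in plain Python. The helper names inchi_layers and diff_layers are made up for this illustration, and the two example strings are for alanine with and without defined stereochemistry, not for the adamantanes in this post.

```python
def inchi_layers(inchi):
    """Return the '/'-separated layers of an InChI string (hypothetical helper).

    The first item is the version (e.g. '1S'), the second the formula, and
    every later item starts with its one-letter layer prefix (c, h, t, m, ...).
    """
    body = inchi.strip()
    if not body.startswith("InChI="):
        raise ValueError("not an InChI string: %r" % inchi)
    return body[len("InChI="):].split("/")


def diff_layers(first, second):
    """List the layers present in only one of two InChI strings."""
    a = inchi_layers(first)
    b = inchi_layers(second)
    only_first = [layer for layer in a if layer not in b]
    only_second = [layer for layer in b if layer not in a]
    return only_first, only_second


# Illustration only: alanine without stereo, and L-alanine.
FLAT = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"
STEREO = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"

only_flat, only_stereo = diff_layers(FLAT, STEREO)
print("only in first: ", only_flat)
print("only in second:", only_stereo)
```

In this example the extra layers are /t, /m and /s, which are the sp3 stereo layers. If the InChI generated from the CML shows the same kind of extra layers relative to the expected one, the difference in the second block of the key comes from stereochemistry perceived from the 3D coordinates.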
Attachment: adamantaneexamples.cml (XML document)
_______________________________________________ OpenBabel-discuss mailing list OpenBabel-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/openbabel-discuss