Re: [Open Babel] Converting CIFs to mols using OpenBabel
Hello.

A very interesting initiative, and closely related to what I am working on. I work on the Crystallography Open Database (COD, www.crystallography.net), a large collection of openly accessible CIF files. I have posted messages to this list a few times before. COD currently contains 245336 files, and a noticeable proportion of your 44000 are probably already included there.

My current task is extracting the chemical connectivity from the CIF files and storing it in SMILES format, so that chemical substructure searches can be performed on it. At present the SMILES collection is around 7 entries. The conversion is done through OpenBabel. Doing this, I have found that the results are rather satisfactory for organic compounds but generally not for inorganic ones. In most cases the result is perhaps not a "bug", but simply a representation of the molecule that does not coincide with the one an inorganic chemist would usually have in his/her mind (or should I say "in my mind"?). In other cases the results are wrong, especially through the appearance of spurious H atoms. Because of this, I could add a lot of material to your list of unsatisfactory conversions from CIF files.

I need to review and, in most cases, fix the SMILES strings that OpenBabel produces for inorganic compounds (either manually or semiautomatically). I am also stuck on version 2.2.3, because newer versions perform worse for inorganic compounds. These facts are understandable: the bond and valence concepts behind cheminformatics formats, and cheminformatics in general, belong mostly to valence bond theory and thus to the formalism of organic chemistry. Many of these concepts (such as the definition of "double" and "aromatic" bonds) become more dubious when metal atoms are present. Also, the behaviour of common "organic" atoms is different when they interact with metals.
A clear example of this is nitrogen: OpenBabel does not take into account that this element usually binds to metal atoms through its lone pair (thus forming four bonds), and this introduces a lot of mistakes when OpenBabel tries to keep nitrogen trivalent at all costs.

By the way, you are welcome to use the COD SMILES collection, many entries of which have been manually reviewed, if you think it may help you in any way. I do not know whether a SMILES string carries enough information to build an acceptable MOL file, though. Links from ChemSpider to COD CIFs are of course welcome.

Best wishes,
Miguel Quirós

On Mon, 09-12-2013 at 08:44 -0800, daya wrote:
> I’ve just supervised a student project in which we used OpenBabel to convert
> over 44,000 Royal Society of Chemistry CIF structures to mol files, then a
> student checked over 4,000 of these conversions so that we could upload the
> successfully processed CIFs to ChemSpider for the corresponding ChemSpider
> compounds. A summary of the results of that project is detailed here:
> http://www.chemspider.com/blog/adding-rsc-cifs-to-chemspider.html
> It seemed like a valuable opportunity to identify the most frequent
> OpenBabel bugs when doing a CIF to Mol conversion so these are documented in
> there, along with test cases to identify the problems and with a view to
> fixing them and making OpenBabel more bulletproof.
> We’re taking a bit of a break from this project for now, but in the next
> phase of the project will see if we can fix at least some of the bugs
> identified if they haven’t already been.
> But we’re sharing these results here for now though since we thought you
> would be interested in the project and the performance of OpenBabel when run
> over such a large and varied test set, possibly even enough to look into
> some of them yourselves…
> Looking forward to working with you on some of them in the future…
> Aileen Day (Informatics Analyst, RSC ChemSpider)
>
> --
> View this message in context:
> http://forums.openbabel.org/Converting-CIFs-to-mols-using-OpenBabel-tp4657031.html
> Sent from the General discussion mailing list archive at Nabble.com.
>
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss

--
Miguel Quirós Olozábal
Departamento de Química Inorgánica. Facultad de Ciencias.
Universidad de Granada. 18071 Granada. SPAIN.
email: mquirosugres
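The semiautomatic review of inorganic entries mentioned in this message could start with a quick screen that flags SMILES strings containing metal atoms. The following is a minimal sketch and not part of the original message. It assumes plain SMILES strings as input, and the names `METALS`, `metal_symbols` and `needs_review` are hypothetical. The metal list is deliberately partial and would need extending for real use.

```python
import re

# Assumption: a partial, illustrative set of metal symbols; extend as needed.
METALS = {
    "Li", "Na", "K", "Mg", "Ca", "Al", "Ti", "V", "Cr", "Mn", "Fe", "Co",
    "Ni", "Cu", "Zn", "Zr", "Mo", "Ru", "Rh", "Pd", "Ag", "Cd", "Sn", "W",
    "Re", "Os", "Ir", "Pt", "Au", "Hg", "Pb",
}

# In SMILES, elements outside the organic subset must be written as bracket
# atoms, so a metal always appears as "[<optional isotope><symbol>...]".
BRACKET_ATOM = re.compile(r"\[\d*([A-Z][a-z]?)")


def metal_symbols(smiles):
    """Return the set of metal symbols found in bracket atoms of a SMILES string."""
    return {sym for sym in BRACKET_ATOM.findall(smiles) if sym in METALS}


def needs_review(smiles):
    """Flag a SMILES string for manual review if it contains any listed metal."""
    return bool(metal_symbols(smiles))


if __name__ == "__main__":
    for smi in ["CC[N+](=O)[O-]", "Cl[Pt](Cl)([NH3])[NH3]"]:
        print(smi, needs_review(smi))
```

A screen like this only decides which entries deserve a closer look; it says nothing about whether the bonding around the metal is represented sensibly.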
Re: [Open Babel] Converting CIFs to mols using OpenBabel
A colleague has pointed out to me the likely problem with the depictions. Mol files can have either 2D or 3D coordinates. I'm guessing that 3D mol files are not interpreted correctly by the drawing program you used to view the structures; it must have set the z coordinate to 0 without any warning.

There is native nitro normalisation for the opposite direction using the "-b" option. Specifying arbitrary transformations is a bit more roundabout. You need to add the transformation to the file at (on my computer) C:\Users\noel\AppData\Roaming\OpenBabel-2.3.2\data\plugindefines.txt (or copy this file to your current directory and edit it there). Add the following to the end:

OpTransform
rsc        # ID. Commandline option to invoke is --rsc
*          # There is no datafile; the transforms are at the end of the entry
Apply RSC normalisations
TRANSFORM [N:1](=O)=[O:2] >> [N+:1](=O)[O-:2]

Once you have done this, you have extended Open Babel with new functionality. "obabel -L ops" will list "rsc" as a new option. "obabel -L rsc" will give the help text. To use it, pass "--rsc" as follows:

    > obabel -:CCN(=O)(=O) --rsc -osmi
    CC[N+](=O)[O-]

I think you can add as many transformations as you want, or list them in a separate file and give the filename.

- Noel

On 10 December 2013 17:43, daya wrote:
> Thanks for the reply Noel,
> Ah, I didn't know about the "--gen2d" option - sounds like I should have
> used that. I'll try it out...
> And it looks like I misunderstood the --unique option...
> How does the nitro normalisation work? That would definitely be something
> we're interested in...
> Thanks again, Aileen
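For anyone who prefers to script the plugindefines.txt edit described in this message, the following is a minimal sketch, not part of the original message. It builds an OpTransform entry in the layout shown in the thread and appends it to a local copy of the file. The function names `format_transform_entry` and `append_entry` are hypothetical, and the blank line placed before the new entry is an assumption about how entries are separated.

```python
from pathlib import Path


def format_transform_entry(op_id, description, transforms):
    """Build the text of an OpTransform entry in the layout quoted in the thread."""
    lines = [
        "OpTransform",
        f"{op_id}  # ID. Commandline option to invoke is --{op_id}",
        "*  # There is no datafile; the transforms are at the end of the entry",
        description,
    ]
    lines += [f"TRANSFORM {t}" for t in transforms]
    return "\n".join(lines) + "\n"


def append_entry(defines_path, entry):
    """Append an entry to a plugindefines.txt copy, creating the file if absent."""
    path = Path(defines_path)
    existing = path.read_text() if path.exists() else ""
    if existing and not existing.endswith("\n"):
        existing += "\n"
    # Assumption: a blank line separates the new entry from what precedes it.
    separator = "\n" if existing else ""
    path.write_text(existing + separator + entry)


if __name__ == "__main__":
    entry = format_transform_entry(
        "rsc",
        "Apply RSC normalisations",
        ["[N:1](=O)=[O:2] >> [N+:1](=O)[O-:2]"],
    )
    print(entry, end="")
```

Calling `append_entry("plugindefines.txt", entry)` in the working directory would then correspond to the "copy this file to your current directory and edit it there" route; whether the new option is picked up can be checked with "obabel -L ops" as described above.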
Re: [Open Babel] Bindings - first and last molecule option
I think I've pinpointed the issue in the source, in the OBConversion class. The first and last molecules are set by the SetStartAndEnd method. It is called only by the Convert method, not by Read or ReadFile, hence the bindings don't use -f and -l. I figure that if SetStartAndEnd were called in Read or ReadFile, it should work. Unfortunately, I am struggling to find the right spot: I tried adding it to ReadFile, but it does not work. Any better ideas?

Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl

2013/12/5 Geoffrey Hutchison
> This only proves, that I really want to use it. Is the proof.py script
> working correctly for anybody (attached in previous mail)? It should output
> molecules no. 5-10 out of 100 in proof.sdf
>
> No, it doesn't work.
>
> % python proof.py | wc
>      100     100    1300
>
> Hmm. Chris, any idea why this wouldn't work from Python?
>
> obconversion = OBConversion()
> obconversion.SetInFormat("sdf")
> obconversion.AddOption('f', obconversion.GENOPTIONS, "5")
> obconversion.AddOption('l', obconversion.GENOPTIONS, "10")
> obmol = OBMol()
>
> notatend = obconversion.ReadFile(obmol,"proof.sdf")
> while notatend:
>     print obmol.GetTitle()
>     obmol = OBMol()
>     notatend = obconversion.Read(obmol)
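Until the -f and -l options are honoured by Read and ReadFile, one possible workaround is to select the range on the Python side. The following is a minimal sketch, not part of the original messages. It wraps the ReadFile/Read loop from the quoted script in a generator and slices it with `itertools.islice`. The function names `iter_molecules` and `molecules_in_range` are hypothetical, and the code only relies on the conversion object having the `ReadFile` and `Read` methods used in the thread.

```python
from itertools import islice


def iter_molecules(conversion, make_mol, path):
    """Yield molecules one at a time using the ReadFile/Read loop from the thread."""
    mol = make_mol()
    not_at_end = conversion.ReadFile(mol, path)
    while not_at_end:
        yield mol
        # A fresh molecule object is created for every read, as in proof.py.
        mol = make_mol()
        not_at_end = conversion.Read(mol)


def molecules_in_range(conversion, make_mol, path, first, last):
    """Return molecules first..last (1-based, inclusive), mimicking -f and -l."""
    return islice(iter_molecules(conversion, make_mol, path), first - 1, last)
```

With the bindings from the thread, this would presumably be used by creating an `OBConversion`, calling `SetInFormat("sdf")` on it, and iterating over `molecules_in_range(obconversion, OBMol, "proof.sdf", 5, 10)`. Molecules before the first index are still parsed and then discarded, so this saves no reading time at the start of the file, but reading does stop once the last requested molecule has been returned.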
Re: [Open Babel] Converting CIFs to mols using OpenBabel
This is very interesting - it sounds like you came across the same issues as we did, and came to the same conclusions too. We're also very interested to hear that you've got SMILES for so many of your structures, which would indeed make it straightforward to link ChemSpider to COD. We'll be in touch! Thanks for getting in touch about this...

--
View this message in context: http://forums.openbabel.org/Converting-CIFs-to-mols-using-OpenBabel-tp4657031p4657056.html