Re: [Open Babel] Converting CIFs to mols using OpenBabel

2013-12-12 Thread Miguel Quirós Olozábal
Hello. A very interesting initiative, and closely related to what I am
working on.

I work on the Crystallography Open Database (COD,
www.crystallography.net), a large collection of openly accessible CIF
files. I have posted to this list a few times before. COD currently
contains 245336 files, and a noticeable proportion of your 44000
structures are probably already included.

My current task is extracting the chemical connectivity from the CIF
files and storing it in SMILES format, so that chemical substructure
searches can be performed on it. At present the SMILES collection is
around 7 entries. The conversion is done through OpenBabel.
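
A minimal sketch of such a conversion step, for readers who want to try it. It assumes the `obabel` command-line tool is installed and on the PATH; the helper names `cif_to_smiles_command` and `cif_to_smiles` are hypothetical and say nothing about how the COD pipeline is actually implemented.

```python
import shutil
import subprocess


def cif_to_smiles_command(cif_path):
    """Build the obabel command line that reads a CIF file and writes SMILES to stdout."""
    # -icif selects the CIF reader, -osmi the SMILES writer.
    return ["obabel", "-icif", cif_path, "-osmi"]


def cif_to_smiles(cif_path):
    """Run obabel on one CIF file and return its non-empty output lines.

    Returns None when obabel is not installed, so the sketch fails softly.
    """
    if shutil.which("obabel") is None:
        return None
    result = subprocess.run(
        cif_to_smiles_command(cif_path),
        capture_output=True,
        text=True,
        check=False,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]
```

Building the command separately from running it makes it easy to log or review exactly what was executed for each file, which helps when the output later has to be checked by hand.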

In doing this, I have found that the results are rather satisfactory for
organic compounds but generally not for inorganic ones. In most cases the
result is not really a "bug", but simply a representation of the molecule
that does not match the one an inorganic chemist usually has in mind (or
should I say "in my mind"?). In other cases the results are wrong,
especially through the appearance of spurious H atoms. Because of this, I
could add a lot to your list of unsatisfactory conversions from CIF
files.

I need to review and, in most cases, fix the SMILES strings that
OpenBabel produces for inorganic compounds (either manually or
semi-automatically). I am also stuck on version 2.2.3 because newer
versions perform worse for inorganic compounds.

This is understandable, since the bond and valence concepts behind
cheminformatics formats, and cheminformatics in general, belong mostly to
valence bond theory and thus to the formalism of organic chemistry. Many
of these concepts (such as the definition of "double" and "aromatic"
bonds) become more dubious when metal atoms are present. Also, common
"organic" atoms behave differently when they interact with metals. A
clear example is nitrogen: OpenBabel does not take into account that this
element usually binds to metal atoms through its lone pair (thus forming
four bonds), and this introduces a lot of mistakes when OpenBabel tries
to keep it trivalent at all costs.
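
To make the nitrogen case concrete, here is a small sketch of a pre-check that flags nitrogen atoms with four neighbours, one of them a metal. The connectivity dictionary, the `METALS` subset and the function name are all hypothetical; this is not how OpenBabel perceives bonds, only an illustration of the situation described above.

```python
# Illustrative subset of metal symbols only; a real check would need the full list.
METALS = {"Cu", "Zn", "Fe", "Co", "Ni", "Pd", "Pt", "Ru", "Mn", "Ag"}


def metal_bound_nitrogens(atoms):
    """Return indices of N atoms with four neighbours, at least one of them a metal.

    `atoms` maps an atom index to (element symbol, list of bonded atom indices).
    """
    flagged = []
    for index, (element, neighbours) in sorted(atoms.items()):
        if element != "N":
            continue
        has_metal = any(atoms[n][0] in METALS for n in neighbours)
        if has_metal and len(neighbours) == 4:
            flagged.append(index)
    return flagged


# Ammine ligand on copper: N carries three H atoms plus a bond to Cu via its lone pair.
ammine = {
    0: ("Cu", [1]),
    1: ("N", [0, 2, 3, 4]),
    2: ("H", [1]),
    3: ("H", [1]),
    4: ("H", [1]),
}
print(metal_bound_nitrogens(ammine))
```

Atoms flagged this way are the ones where a strictly trivalent nitrogen model is most likely to produce a wrong charge or hydrogen count, so they are good candidates for manual review.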

By the way, you are welcome to use the COD SMILES collection, many
entries of which have been manually revised, if you think it may help
with your task. I do not know whether a SMILES string contains enough
information to build an acceptable MOL file, though.

Links in ChemSpider to COD CIFs are of course welcome.

Best wishes,
Miguel Quirós

On Mon, 09-12-2013 at 08:44 -0800, daya wrote:
> I’ve just supervised a student project in which we used OpenBabel to convert
> over 44,000 Royal Society of Chemistry CIF structures to mol files, then a
> student checked over 4,000 of these conversions so that we could upload the
> successfully processed CIFs to ChemSpider for the corresponding ChemSpider
> compounds. A summary of the results of that project is given here:
> http://www.chemspider.com/blog/adding-rsc-cifs-to-chemspider.html 
> It seemed like a valuable opportunity to identify the most frequent
> OpenBabel bugs when doing a CIF to Mol conversion so these are documented in
> there, along with test cases to identify the problems and with a view to
> fixing them and making OpenBabel more bulletproof.
> We’re taking a bit of a break from this project for now, but in the next
> phase of the project will see if we can fix at least some of the bugs
> identified if they haven’t already been. 
> For now, though, we’re sharing these results here since we thought you
> would be interested in the project and the performance of OpenBabel when run
> over such a large and varied test set, possibly even enough to look into
> some of them yourselves… 
> Looking forward to working with you on some of them in the future…
> Aileen Day (Informatics Analyst, RSC ChemSpider)
> 
> 
> 
> --
> View this message in context: 
> http://forums.openbabel.org/Converting-CIFs-to-mols-using-OpenBabel-tp4657031.html
> Sent from the General discussion mailing list archive at Nabble.com.
> 
> ___
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss

-- 
Miguel Quirós Olozábal
Departamento de Química Inorgánica. Facultad de Ciencias.
Universidad de Granada. 18071 Granada. SPAIN.
email: mquirosugres
   mquirosugres






Re: [Open Babel] Converting CIFs to mols using OpenBabel

2013-12-12 Thread Noel O'Boyle
A colleague has pointed out to me the likely problem with the
depictions. Mol files can have either 2D or 3D coordinates. I'm
guessing that 3D mol files are not interpreted correctly by the
drawing program you used to view the structures; it must have set the
z coordinate to 0 without any warning.

There is native nitro normalisation for the opposite direction using
the "-b" option. To specify arbitrary transformations, the process is a
bit more roundabout. You need to add the transformation to the file at
(on my computer)
C:\Users\noel\AppData\Roaming\OpenBabel-2.3.2\data\plugindefines.txt
(or copy this file to your current directory and edit it there).

Add the following to the end:
OpTransform
rsc   # ID. Commandline option to invoke is --rsc
*   # There is no datafile; the transforms are at the end of the entry
Apply RSC normalisations
TRANSFORM [N:1](=O)=[O:2] >> [N+:1](=O)[O-:2]

Once you have done this, you have extended Open Babel with new
functionality. "obabel -L ops" will list "rsc" as a new option.
"obabel -L rsc" will give the help text. To use it, use "--rsc" as
follows:

> obabel -:CCN(=O)(=O) --rsc -osmi
CC[N+](=O)[O-]

I think you can add as many transformations as you want, or list them
in a separate file and give the filename.
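
If several transformations are needed, the entry text can be generated rather than typed by hand. The sketch below simply reproduces the layout of the entry shown above; the function name is hypothetical, and the layout has not been checked here against any particular Open Babel version.

```python
def op_transform_entry(option_id, description, transforms):
    """Render an OpTransform entry in the layout used in this message."""
    lines = [
        "OpTransform",
        "%s   # ID. Commandline option to invoke is --%s" % (option_id, option_id),
        "*   # There is no datafile; the transforms are at the end of the entry",
        description,
    ]
    # One TRANSFORM line per reaction SMARTS, in the order given.
    lines.extend("TRANSFORM %s" % transform for transform in transforms)
    return "\n".join(lines) + "\n"


entry = op_transform_entry(
    "rsc",
    "Apply RSC normalisations",
    ["[N:1](=O)=[O:2] >> [N+:1](=O)[O-:2]"],
)
print(entry)
```

The printed text can then be appended to a local copy of plugindefines.txt, after which "obabel -L ops" should show whether the new option was picked up.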

- Noel

On 10 December 2013 17:43, daya wrote:
> Thanks for the reply Noel,
> Ah, I didn't know about the "--gen2d" option - sounds like I should have
> used that. I'll try it out...
> And it looks like I misunderstood the --unique option...
> How does the nitro normalisation work? That would definitely be something
> we're interested in...
> Thanks again, Aileen
>
>
>
>
> --
> View this message in context: 
> http://forums.openbabel.org/Converting-CIFs-to-mols-using-OpenBabel-tp4657031p4657043.html
> Sent from the General discussion mailing list archive at Nabble.com.
>



Re: [Open Babel] Bindings - first and last molecule option

2013-12-12 Thread Maciek Wójcikowski
I think I've pinpointed the issue in the source, in the OBConversion class.
The first and last molecule are set by the SetStartAndEnd method. It is
called only by the Convert method, not by Read or ReadFile, hence the
bindings don't use -f and -l. I figure that if SetStartAndEnd were called
in Read or ReadFile, it should work. Unfortunately I am struggling to find
the right spot. I tried adding it to ReadFile, but it does not work.

Any better ideas?
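
Until the bindings honour -f and -l, one possible workaround is to do the selection on the Python side. The sketch below is plain Python and uses a stand-in list of titles; the name `select_range` is hypothetical. With Open Babel one would pass it a generator that wraps the ReadFile/Read loop from the quoted script.

```python
from itertools import islice


def select_range(molecules, first, last):
    """Return an iterator over items number first..last (1-based, inclusive)."""
    if first < 1 or last < first:
        raise ValueError("need 1 <= first <= last")
    # islice uses 0-based, end-exclusive indices, hence first - 1 and last.
    return islice(molecules, first - 1, last)


# Stand-in for a stream of molecule titles read from a 100-entry SDF file.
titles = ("mol%d" % i for i in range(1, 101))
print(list(select_range(titles, 5, 10)))
```

The skipped molecules are still parsed, so this is slower than a native -f would be, but reading stops once the last requested molecule has been returned.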


Pozdrawiam,  |  Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl


2013/12/5 Geoffrey Hutchison 

> This only proves that I really want to use it. Is the proof.py script
> working correctly for anybody (attached in previous mail)? It should output
> molecules no. 5-10 out of 100 in proof.sdf
>
>
> No, it doesn't work.
>
> % python proof.py | wc
>      100     100    1300
>
> Hmm. Chris, any idea why this wouldn't work from Python?
>
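> from openbabel import OBConversion, OBMol  # Open Babel 2.x Python bindings
>
> obconversion = OBConversion()
> obconversion.SetInFormat("sdf")
> # Request molecules 5 to 10 only (the -f and -l general options).
> obconversion.AddOption('f', obconversion.GENOPTIONS, "5")
> obconversion.AddOption('l', obconversion.GENOPTIONS, "10")
> obmol = OBMol()
>
> # As reported in this thread, ReadFile/Read ignore -f and -l, so all 100 titles print.
> notatend = obconversion.ReadFile(obmol, "proof.sdf")
> while notatend:
>     print(obmol.GetTitle())
>     obmol = OBMol()
>     notatend = obconversion.Read(obmol)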
>
>
>


Re: [Open Babel] Converting CIFs to mols using OpenBabel

2013-12-12 Thread daya
This is very interesting - sounds like you came across the same issues as we
did and came to the same conclusions too.

We're also very interested to hear that you've got SMILES for so many of
your structures, which would indeed make it straightforward to link
ChemSpider to COD. We'll be in touch!

Thanks for getting in touch about this...



--
View this message in context: 
http://forums.openbabel.org/Converting-CIFs-to-mols-using-OpenBabel-tp4657031p4657056.html
Sent from the General discussion mailing list archive at Nabble.com.
