On 8 June 2010 17:03, Andrew Dalke <da...@dalkescientific.com> wrote:
> On Jun 8, 2010, at 6:31 AM, Geoffrey Hutchison wrote:
>> As Noel said, we do have support for stereochemistry around double bonds in 
>> SMILES. Stereochemistry is much improved thanks to Noel and Tim 
> >> Vandermeersch in the soon-to-be-released v2.3. (SMARTS support for 
>> double-bond stereo is another matter.)
>
> Is there an expected date for that? If it's within the next couple of weeks 
> then I can put in the 2.3 information.
>
>> Yes. On any file format like PDB or XYZ which does not support bond types, 
>> perception is run to determine connectivity and bond order. For PDB, this is 
>> also done first via residue names. Bond perception can also be turned off 
>> via the command-line or programmatically (e.g., some users run MD 
>> simulations and have their own topology file).
>
> Where is this documented? Searching for "PDB perception" on the OpenBabel 
> site doesn't find much of anything about the algorithm used. My own history 
> of working with the PDB says it's a lot of work to get those details, and a 
> quick look at the OEChem release notes has things like:
>
>  - Added PDB support for the following:
>   - sidechain recognition for the RNA residue ‘YG’ and ‘H2U’
>   - naming of PDB residue ‘BME’
>   - the N-terminal modification ‘FOR’
>   - the cofactor ‘FMT’ (which is “formic acid” or “formate”)
>
> but I don't see mention of the quality/robustness of the PDB reader in OB.
>
>
>> Less than 1% of the time do I have to specify a format type manually. 
>> Formats can be guessed from file extensions, and for some file types (e.g., 
>> quantum packages that like the .out, .log, or .dat extensions), OB will 
>> attempt to guess the format from contents.
>
>
> The examples I've seen are all like
>
> ====== straight openbabel
> import openbabel as ob
>
> obconversion = ob.OBConversion()
> obconversion.SetInFormat("sdf")
>
>
> obmol = ob.OBMol()
>
> notatend = obconversion.ReadFile(obmol, "benzodiazepine.sdf.gz")
> while notatend:
>    ...
>    notatend = obconversion.Read(obmol)
>
>
> ==== pybel
>
> import pybel
>
>
> for mol in pybel.readfile("sdf", "benzodiazepine.sdf.gz"):
>    ...
>
> ===
>
>
> where the format is explicitly specified. The documentation at
>
> http://openbabel.org/dev-api/classOpenBabel_1_1OBConversion.shtml
> under "To add automatic format conversion to an existing program."
>
> uses
>
>      ifstream ifs(filename); //Original code
>      OBConversion conv;
>      OBFormat* inFormat = conv.FormatFromExt(filename);
>      OBFormat* outFormat = conv.GetFormat("ORIG");
>      istream* pIn = &ifs;
>      stringstream newstream;
>      if (inFormat && outFormat)
>      {
>         conv.SetInAndOutFormats(inFormat,outFormat);
>         conv.Convert(pIn,&newstream);
>         pIn=&newstream;
>      }
>
> which allows automatic format detection based on the extension, but it's a 
> lot of boilerplate code.
>
> I didn't realize I could leave the format name out and the code would 
> autodetect.
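
For the Pybel case, where readfile() takes an explicit format name, a
minimal sketch of deriving that name from the filename. The helper names
guess_format and read_any are hypothetical, not part of OpenBabel or Pybel,
and the sketch assumes the file extension matches an OpenBabel format ID
and that ".gz" is the only compression suffix to skip. It is a pure-Python
stand-in for the FormatFromExt call in the C++ example, not the library's
own detection logic.

```python
import os

# Assumption: only gzip-compressed files need their outer suffix skipped.
COMPRESSION_SUFFIXES = {".gz"}


def guess_format(filename):
    """Guess a format name from the file extension (hypothetical helper).

    Returns the lowercased extension without the dot, looking past a
    trailing compression suffix such as ".gz".
    """
    root, ext = os.path.splitext(os.path.basename(filename))
    if ext.lower() in COMPRESSION_SUFFIXES:
        root, ext = os.path.splitext(root)
    if not ext:
        raise ValueError("cannot guess a format for %r" % (filename,))
    return ext[1:].lower()


def read_any(filename):
    """Iterate over the molecules in a file without naming the format.

    The import is deferred so guess_format() works without OpenBabel.
    """
    try:
        import pybel  # module layout used in this thread
    except ImportError:
        from openbabel import pybel  # assumption: newer package layout
    return pybel.readfile(guess_format(filename), filename)
```

With this, "for mol in read_any("benzodiazepine.sdf.gz"):" would pass "sdf"
to readfile(). Extensions that are not valid format names still have to be
given by hand.
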
>
>>> OB also supports using a molecule as the query rather than a SMARTS.
>>
>> Well, you can output a SMILES from a molecule and use that as a SMARTS. 
>> That's a unit test, so we can guarantee that always works. As Chris said, 
>> there's also the fastsearch format.
>
> Perhaps that's what I was looking at. I'll have to dig into that again.
>
>> It's not currently exposed to users, but the OBChemTsfm class is used to 
>> handle pH-dependent protonation. It can handle this task too. The syntax is 
>> basically reaction SMILES.
>
> I'm assuming this will be exposed when someone volunteers to do it? ;)
>
>> We're always open to feedback about areas of documentation needing 
>> clarification. Telling us it's sketchy and/or incomplete doesn't help much. 
>> Pointers to areas needing improvement will be met with applause (and fixes).
>
> I thought it was pretty clear that the documentation in OpenBabel was 
> sketchy, in comparison to some of the other toolkits, like OEChem or ChemAxon. 
> I've mentioned a few of these places in this and my other responses.
>
>
>> As Noel mentioned, we *are* pybel. So I think we win that comparison. Yes, 
>> the C++ interface is slightly more verbose, but that's also true of C++ 
>> versus Python in general.
>
> I've been thinking about this since replying to Noel's email. OpenBabel 
> publishes two different Python APIs - the one with the C++ interface and the 
> Pybel interface.
>
> The Zen of Python includes
>
>  There should be one-- and preferably only one --obvious way to do it.

I'm well aware of the Zen of Python - it governed my design decisions.
If you think that this means Pybel should not exist, I would disagree,
and I think many users would too.

> Is pybel the preferred way to do things on the Python level?

That requires a poll of users. Personally, I don't know why you would
use the bindings directly when you could use Pybel. But I usually end
up using a combination of the two.

>> We keep an audit log. From the command-line you get a summary:
>>
>> [ghutc...@iridium]: babel tpy-Ru.sdf tpy.mol2
>> 1 molecule converted
>> 1 info messages 23 audit log messages
>>
>> You can programmatically interrogate the error log to get the warnings, 
>> severity level, etc. The audit level is intended to cover any code which may 
>> change chemical interpretation (e.g., Kekulization, adding implicit 
>> hydrogens, bond perception, etc.).
>
> That's also what OpenEye does, but getting access to the error log, 
> synchronized with the reader, is nasty hard. Can someone show me how to get 
> that? For example, if Pybel is the preferred way to get this data, then how 
> do I get the error logs for each molecule in
>
> for mol in pybel.readfile("sdf", "benzodiazepine.sdf.gz"):
>
>  ?

This will take me a while to dig into.
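
In the meantime, a rough sketch of one possible approach, not a tested
recipe: clear the log before each molecule is read and collect whatever
has accumulated right after. The generator name with_messages is
hypothetical. The sketch assumes the reader is lazy (one molecule parsed
per iteration step, with no read-ahead) and that the global log offers
"clear" and "get messages" operations, as I believe ob.obErrorLog does
through ClearLog() and GetMessagesOfLevel().

```python
def with_messages(molecules, clear_log, get_messages):
    """Pair each molecule with the log messages emitted while reading it.

    molecules    -- a lazy iterable that parses one molecule per step
    clear_log    -- callable that empties the shared log
    get_messages -- callable that returns the messages currently logged
    """
    iterator = iter(molecules)
    while True:
        clear_log()  # drop anything left over from the previous molecule
        try:
            mol = next(iterator)  # parsing happens here, filling the log
        except StopIteration:
            return
        yield mol, list(get_messages())


# Intended use with OpenBabel (assumed names, not verified here):
#
#   import openbabel as ob
#   import pybel
#   log = ob.obErrorLog
#   reader = pybel.readfile("sdf", "benzodiazepine.sdf.gz")
#   for mol, msgs in with_messages(
#           reader, log.ClearLog,
#           lambda: log.GetMessagesOfLevel(ob.obWarning)):
#       ...
```

Messages produced later, for example by perception that only runs when a
property is first requested, would not be attributed to the right molecule
this way.
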

>> Hope that helps,
>
> It does. Thanks!
>
>
>                                Andrew
>                                da...@dalkescientific.com
>
>
>
> ------------------------------------------------------------------------------
> ThinkGeek and WIRED's GeekDad team up for the Ultimate
> GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
> lucky parental unit.  See the prize list and enter to win:
> http://p.sf.net/sfu/thinkgeek-promo
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>
