On Jun 8, 2010, at 6:31 AM, Geoffrey Hutchison wrote:
> As Noel said, we do have support for stereochemistry around double bonds in
> SMILES. Stereochemistry is much improved thanks to Noel and Tim Vandermeersch
> in the soon-to-be-released v2.3. (SMARTS support for double-bond stereo is
> another matter.)

Is there an expected date for that? If it's within the next couple
of weeks then I can put in the 2.3 information.

> Yes. On any file format like PDB or XYZ which does not support bond types,
> perception is run to determine connectivity and bond order. For PDB, this is
> also done first via residue names. Bond perception can also be turned off via
> the command-line or programmatically (e.g., some users run MD simulations and
> have their own topology file).

Where is this documented? Searching for "PDB perception" on the
OpenBabel site doesn't find much of anything about the algorithm used.

My own history of working with the PDB says it's a lot of work
to get those details, and a quick look at the OEChem release notes
has things like:

  - Added PDB support for the following:
      - sidechain recognition for the RNA residue ‘YG’ and ‘H2U’
      - naming of PDB residue ‘BME’
      - the N-terminal modification ‘FOR’
      - the cofactor ‘FMT’ (which is “formic acid” or “formate”)

but I don't see mention of the quality/robustness of the PDB
reader in OB.

> Less than 1% of the time do I have to specify a format type manually. Formats
> can be guessed from file extensions, and for some file types (e.g., quantum
> packages that like the .out, .log, or .dat extensions), OB will attempt to
> guess the format from contents.

The examples I've seen are all like

====== straight openbabel
import openbabel as ob

obconversion = ob.OBConversion()
obconversion.SetInFormat("sdf")
obmol = ob.OBMol()

notatend = obconversion.ReadFile(obmol, "benzodiazepine.sdf.gz")
while notatend:
    ...
    notatend = obconversion.Read(obmol)
==== pybel
import pybel

for mol in pybel.readfile("sdf", "benzodiazepine.sdf.gz"):
    ...
===

where the format is explicitly specified. The documentation at
  http://openbabel.org/dev-api/classOpenBabel_1_1OBConversion.shtml
under "To add automatic format conversion to an existing program."
uses

  ifstream ifs(filename); //Original code

  OBConversion conv;
  OBFormat* inFormat = conv.FormatFromExt(filename);
  OBFormat* outFormat = conv.GetFormat("ORIG");
  istream* pIn = &ifs;
  stringstream newstream;
  if (inFormat && outFormat)
  {
    conv.SetInAndOutFormats(inFormat,outFormat);
    conv.Convert(pIn,&newstream);
    pIn=&newstream;
  }

which allows automatic format detection based on the extension,
but it's a lot of boilerplate code.

I didn't realize I could leave the format name out and the
code would autodetect.

>> OB also supports using a molecule as the query rather than a SMARTS.
>
> Well, you can output a SMILES from a molecule and use that as a SMARTS.
> That's a unit test, so we can guarantee that always works. As Chris said,
> there's also the fastsearch format.

Perhaps that's what I was looking at. I'll have to dig into that again.

> It's not currently exposed to users, but the OBChemTsfm class is used to
> handle pH-dependent protonation. It can handle this task too. The syntax is
> basically reaction SMILES.

I'm assuming this will be exposed when someone volunteers to do it? ;)

> We're always open to feedback about areas of documentation needing
> clarification. Telling us it's sketchy and/or incomplete doesn't help much.
> Pointers to areas needing improvement will be met with applause (and fixes).

I thought it was pretty clear that the documentation in OpenBabel
was sketchy, in comparison to some of the other toolkits, like OEChem
or ChemAxon. I've mentioned a few of these places in this and my
other responses.

> As Noel mentioned, we *are* pybel. So I think we win that comparison. Yes,
> the C++ interface is slightly more verbose, but that's also true of C++
> versus Python in general.

I've been thinking about this since replying to Noel's email.

OpenBabel publishes two different Python APIs - the one with
the C++ interface and the Pybel interface. The Zen of Python includes

   There should be one-- and preferably only one --obvious way to do it.

Is pybel the preferred way to do things on the Python level?

> We keep an audit log. From the command-line you get a summary:
>
> [ghutc...@iridium]: babel tpy-Ru.sdf tpy.mol2
> 1 molecule converted
> 1 info messages 23 audit log messages
>
> You can programmatically interrogate the error log to get the warnings,
> severity level, etc. The audit level is intended to cover any code which may
> change chemical interpretation (e.g., Kekulization, adding implicit
> hydrogens, bond perception, etc.).

That's also what OpenEye does, but getting access to the error log,
synchronized with the reader, is nasty hard.

Can someone show me how to get that? For example, if Pybel is the
preferred way to get this data, then how do I get the error logs
for each molecule in

  for mol in pybel.readfile("sdf", "benzodiazepine.sdf.gz"):

?

> Hope that helps,

It does. Thanks!

				Andrew
				da...@dalkescientific.com

_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss