On 8 June 2010 06:14, Andrew Dalke <da...@dalkescientific.com> wrote:
> Hi all,
>
> I've been asked by a company to evaluate different toolkits for them to use
> in-house. Geoff thought it might be better to ask on the list than ask him
> directly and privately, so I'm doing that.
>
> I've gone through their internal requirements and now I'm seeing how they
> match up to OB. Some of my conclusions are likely wrong and based on an
> older version of OB, so I'm hoping people here can correct me.
>
> I'll apologize in advance that I won't be that accessible over the next
> couple of days, and likely won't be able to respond until Saturday or so. But
> I felt it was best to get it out now rather than wait.
>
> =====

I don't know the answers to most of these, but I've answered the ones I do know inline:

> As with all cheminformatics toolkits, OB has ways to get access to the atoms
> and bonds of a molecule. It supports molecular editing, so that atoms and
> bonds can be added, deleted, or modified as desired. Atoms, bonds, and
> molecules may have additional user-defined data associated with them.
>
> OB supports coordinates as part of the molecule (meaning that deleting the
> atom deletes the associated coordinates), and it supports multiple conformer
> structures.
>
> OB follows the Daylight approach where it has a standard chemistry model and
> all input structures are reperceived based on that model. There is no way to
> disable that option.
>
> OB is the foremost program for structure format support and interconversion.
>
> SD file support is complete for the v2000 and v3000, both for reading and
> writing. What I'm not sure about is the level of support for v3000. It's
> mostly support for the chemistry which is in v2000 but expressed differently
> in v3000, and for support for more than 999 atoms?
>
> SMILES support is good, although it doesn't have support for stereochemistry
> around double bonds. Excepting this lack, canonicalization is also good and
> widely used.

SMILES does support stereochemistry around double bonds. Stereochemistry support for the other formats is much improved in the development code, though.

> OB does have PDB file support. I can't tell how good the chemistry
> perception is. For example, can it detect that a C-C bond is a double or
> triple bond instead of a single (eg, by looking at the bond length, or by
> understanding the residue names)?
>
> While OB does have a nearly uniform reader API (ie, I can point it to an SD
> file, SMILES file, etc and get molecules), and built-in gzip support, I do
> have to specify the format type manually. That is, there's no support for
> guessing the format based, for example, on the extension.

In most cases the format ID is simply the file extension, but I suppose you're right that it still has to be passed explicitly.

> OpenBabel has SMARTS support, but I can't tell how complete it is. I know it
> doesn't support double bond stereochemistry, but I think it's otherwise
> complete, including recursive SMARTS. Is there anything missing?
>
> OB also supports using a molecule as the query rather than a SMARTS.

Hmm...not sure about this. Does it?

> Once the match is made, it's easy to get access to the matched atoms and
> bonds, and match them up to the corresponding query atoms and bonds.
>
> The topic I know the least about is reactions. OB supports reaction SMILES
> and SMARTS, as well as RXN files. I don't have a good idea for how good that
> support is, and it's not something I used much, although my client does.
>
> In addition to the support for the query languages/formats, I can't tell how
> to use the reactions. How would I do a unimolecular reaction (eg, convert all
> of the carbons in CCCN to OOON)? How would I use a reaction for library
> generation (eg, convert CCC to first OCCN, then COCN, and lastly CCON)? Is it
> even possible?

I looked but didn't find anything. Perhaps OBChemTsfm does something here?
http://openbabel.org/api/2.2.0/classOpenBabel_1_1OBChemTsfm.shtml

> OB does support some fingerprints. There's a linear hash fingerprint similar
> to Daylight's and two feature fingerprint implementations, although only one
> is suggested. There's no MACCS key implementation. There is no support for
> large/sparse fingerprints, and the only implemented comparison method is the
> Tanimoto similarity.

MACCS keys are in there. There is also support for user-defined fingerprints.

> OB does not do depiction. For that case people should turn to other
> libraries, such as OASA.

OB can do depiction, at least in the development version.

> There's no MCS or scaffold identification code in OB.
> There is a descriptor framework system, support for different forcefields
> and minimization, and InChI support. There's no nomenclature support.
>
> OB is cross-platform (here meaning "Windows and Linux"), with access to the
> library from C++, Python, .Net and Java. The documentation is incomplete and
> sketchy, but because OB is used by a large number of people, there is support
> both through the mailing list and by doing a web search for others who have
> used the code.

It also runs on MacOSX, Cygwin and MinGW, has Ruby bindings, and works with all of G++, MSVC and the Intel compiler. Support is also available from a number of independent consultants (as far as I am aware).

> I have a metric for testing usability, and that's the number of lines of
> code needed to count the total number of atoms of all of the records in an
> input file, using one toolkit vs. pybel. OpenBabel suffers because of the
> overhead of creating an OBConversion.

Don't forget Pybel is part of OpenBabel.

> I have another metric for comparing error handling, which is to read an SD
> file with records containing errors (format errors and chemistry errors) and
> seeing if I can find the number of records which failed to be read in and the
> reason for the failure. I haven't figured out how to do that with OB.

One other feature you haven't mentioned is the plugin architecture for fingerprints, formats, operations, charge models and so forth (it's the same architecture in each case). This means that a company could write a single .cpp file internally and compile it in as a format, an operation or whatever. The plugin can then easily be called from babel and can do pretty much anything.

> ====
>
> Thanks in advance!
>
> Andrew
> da...@dalkescientific.com

_______________________________________________
OpenBabel-discuss mailing list
OpenBabel-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
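
P.S. On the format-guessing point above: since the format ID is the file extension in most cases, a caller can derive it from the file name without much effort. Below is a minimal sketch of that idea in plain Python. The helper name guess_format and the set of compression suffixes are my own assumptions for illustration, not part of OB, and it simply assumes that the extension is a valid format ID, which will not hold for every format.

```python
import os

# Hypothetical helper for illustration; this is not an OpenBabel function.
# Assumption: ".gz" is the only compression suffix we need to look through.
COMPRESSION_SUFFIXES = {".gz"}


def guess_format(filename):
    """Guess a format ID from the file name extension.

    Returns the lowercased extension without the dot, or None when the
    name has no usable extension.
    """
    base = os.path.basename(filename)
    root, ext = os.path.splitext(base)
    if ext.lower() in COMPRESSION_SUFFIXES:
        # Look through the compression suffix: "mols.sdf.gz" -> "mols.sdf"
        root, ext = os.path.splitext(root)
    if not ext:
        return None
    return ext[1:].lower()
```

The result could then be passed wherever the format name is required, for example as the first argument to pybel.readfile, with a fallback to an explicit format when guess_format returns None.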