Some extra info on bits of OB I know about.

On 08/06/2010 06:14, Andrew Dalke wrote:
> Hi all,
>
> I've been asked by a company to evaluate different toolkits for them to
> use in-house. Geoff thought it might be better to ask on the list than ask
> him directly and privately, so I'm doing that.
>
> I've gone through their internal requirements now I'm seeing how it
> matched up to OB. Some of my conclusions are likely wrong and based on older
> version of OB, so I'm hoping people here can correct me.
>
> I'll apologize in advance that I won't be that accessible over the next
> couple of days, and likely won't be able to respond until Saturday or so. But
> I felt it was best to get it out now rather than wait.
>
> =====
>
> As with all cheminformatics toolkits, OB has ways to get access to the
> atoms and bonds of a molecule. It supports molecular editing, so that atoms
> and bonds can be added, deleted, or modified as desired. Atoms, bonds, and
> molecules may have additional user-defined data associated with them.
>
> OB supports coordinates as part of the molecule (meaning that deleting the
> atom deletes the associated coordinates), and it supports multiple conformer
> structures.
>
> OB follows the Daylight approach where it has a standard chemistry model
> and all input structures are reperceived based on that model. There is no way
> to disable that option.
>
> OB is the foremost program for structure format support and
> interconversion.
>
> SD file support is complete for the v2000 and v3000, both for reading and
> writing. What I'm not sure about is the level of support for v3000. It's
> mostly support for the chemistry which is in v2000 but expressed differently
> in v3000, and for support for more than 999 atoms?

I think this is right for v3000, for which the support is fairly basic. v2000 has more but is not complete, e.g. no S groups.

> SMILES support is good, although it doesn't have support for
> stereochemistry around double bonds.
> Excepting this lack, canonicalization is
> also good and widely used.
>
> OB does have PDB file support. I can't tell how good the chemistry
> perception is. For example, can it detect that a C-C bond is a double or
> triple bond instead of a single (eg, by looking at the bond length, or by
> understanding the residue names)?
>
> While OB does have a nearly uniform reader API (ie, I can point it to an
> SD file, SMILES file, etc and get molecules), and built-in gzip support, I do
> have to specify the format type manually. That is, there's no support for
> guessing the format based, for example, on the extension.

There is automatic recognition of which of several computational chemistry programs .out and .log files came from.

> OpenBabel has SMARTS support, but I can't tell how complete it is. I know
> it doesn't support double bond stereochemistry, but I think it's otherwise
> complete, including recursive SMARTS. Is there anything missing?
>
> OB also supports using a molecule as the query rather than a SMARTS.

This is true for fastsearch (indexed by fingerprints) but not, as far as I know, for ordinary SMARTS, without an explicit conversion to SMILES.

> Once the match is made, it's easy to get access to the matched atoms and
> bonds, and match them up to the corresponding query atoms and bonds.
>
> The topic I know the least about is reactions. OB supports reaction SMILES
> and SMARTS, as well as RXN files. I don't have a good idea for how good that
> support is, and it's not something I used much, although my client does.

Reactions are also supported in CML.

> In addition to the support for the query languages/formats, I can't tell
> how to use the reactions. How would I do a unimolecular reaction (eg, convert
> all of the carbons in CCCN to OOON)? How would I use a reaction for library
> generation (eg, convert CCC to first OCCN, then COCN, and lastly CCON)? Is it
> even possible? I looked but didn't find it.
>
> OB does support some fingerprints.
> There's a linear hash fingerprint
> similar to Daylight's and two feature fingerprint implementations, although
> only one is suggested. There's no MACCS key implementation. There is no
> support for large/sparse fingerprints, and the only implemented comparison
> method is the Tanimoto similarity.

MACCS keys are supported (using a datafile from RDKit).

> OB does not do depiction. For that case people should turn to other
> libraries, such as OASA.

It is beginning to. It has 2D coordinate generation/layout. The next version, v2.3.0, will have SVG depiction of single and multiple molecules.

> There's no MCS or scaffold identification code in OB. There is a descriptor
> framework system, support for different forcefields and minimization, and
> InChI support. There's no nomenclature support.
>
> OB is cross-platform (here meaning "Windows and Linux"), with access to
> the library from C++, Python, .Net and Java. The documentation is incomplete
> and sketchy, but because OB is used by a large number of people, there is
> support both through the mailing list and by doing a web search for others
> who have used the code.
>
> I have a metric for testing usability, and that's the number of lines of
> code needed to count the total number of atoms of all of the records in an
> input file, using one toolkit vs. pybel. OpenBabel suffers because of the
> overhead of creating an OBConversion.
>
> I have another metric for comparing error handling, which is to read an SD
> file with records containing errors (format errors and chemistry errors) and
> seeing if I can find the number of records which failed to be read in and the
> reason for the failure. I haven't figured out how to do that with OB.
>
> ====
>
> Thanks in advance!
>
> Andrew
> da...@dalkescientific.com
>
>
>
> ------------------------------------------------------------------------------
> ThinkGeek and WIRED's GeekDad team up for the Ultimate
> GeekDad Father's Day Giveaway.
> ONE MASSIVE PRIZE to the
> lucky parental unit. See the prize list and enter to win:
> http://p.sf.net/sfu/thinkgeek-promo
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
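
On the point about having to specify the format manually: where extension-based guessing is wanted on the caller's side, a small helper can work out the format name (and whether the file is gzipped) before it is handed to the reader. This is only a sketch. The function name `guess_format` and the extension table are assumptions made for illustration and are not part of OB.

```python
import os

# Hypothetical extension table (an assumption for this sketch); the values
# are meant to be format names to pass on to the reader. Extend as needed.
EXTENSION_TO_FORMAT = {
    ".sdf": "sdf",
    ".sd": "sdf",
    ".mol": "mol",
    ".smi": "smi",
    ".pdb": "pdb",
    ".cml": "cml",
}


def guess_format(filename):
    """Return (format_name, is_gzipped) guessed from the file extension.

    Raises ValueError when the extension is not in the table.
    """
    name = filename.lower()
    is_gzipped = name.endswith(".gz")
    if is_gzipped:
        name = name[:-3]  # drop ".gz" and look at the inner extension
    ext = os.path.splitext(name)[1]
    if ext not in EXTENSION_TO_FORMAT:
        raise ValueError("cannot guess format of %r" % (filename,))
    return EXTENSION_TO_FORMAT[ext], is_gzipped


if __name__ == "__main__":
    print(guess_format("ligands.sdf.gz"))
```

The returned format name would then be used wherever the format currently has to be typed in by hand.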
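
On the fingerprint comparison mentioned above: the Tanimoto similarity is the number of bits set in both fingerprints divided by the number of bits set in either. A minimal sketch, treating each fingerprint as a collection of set-bit positions; the function name and the choice to return 0.0 for two empty fingerprints are assumptions made here, not OB behaviour.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as set-bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return float(len(a & b)) / union


if __name__ == "__main__":
    # Bits 2 and 3 are shared; bits 1, 2, 3, 4 are set in at least one.
    print(tanimoto([1, 2, 3], [2, 3, 4]))  # 2 shared / 4 total = 0.5
```

Other comparison methods (for example Dice or Tversky) could be written the same way from the same two counts, which is one way around having only Tanimoto built in.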
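
On the usability metric (lines of code needed to count all atoms over all records of an input file): with a reader that yields molecule objects, the count itself is a single sum. The sketch below uses stand-in molecule objects so that it runs without OB installed. `count_atoms` and `FakeMol` are names invented for this example; the only assumption about a real molecule object is that it has an `atoms` sequence, as pybel's Molecule does.

```python
def count_atoms(molecules):
    """Total number of atoms over an iterable of molecule-like objects.

    Each object only needs an `atoms` sequence.
    """
    return sum(len(mol.atoms) for mol in molecules)


class FakeMol(object):
    """Hypothetical stand-in for a toolkit molecule, for illustration only."""

    def __init__(self, n_atoms):
        self.atoms = list(range(n_atoms))


if __name__ == "__main__":
    print(count_atoms([FakeMol(3), FakeMol(5)]))  # 3 + 5 = 8
```

With pybel the same function could be fed straight from the reader, e.g. `count_atoms(pybel.readfile("sdf", "input.sdf"))`, where the filename is a placeholder.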