On 8 June 2010 06:14, Andrew Dalke <da...@dalkescientific.com> wrote:
> Hi all,
>
>  I've been asked by a company to evaluate different toolkits for them to use 
> in-house. Geoff thought it might be better to ask on the list than ask him 
> directly and privately, so I'm doing that.
>
>  I've gone through their internal requirements and am now seeing how they match 
> up to OB. Some of my conclusions are likely wrong and based on older versions 
> of OB, so I'm hoping people here can correct me.
>
>  I'll apologize in advance that I won't be that accessible over the next 
> couple of days, and likely won't be able to respond until Saturday or so. But 
> I felt it was best to get it out now rather than wait.
>
>  =====

I don't know the answers to most of these, but I've answered the ones I do know:

>  As with all cheminformatics toolkits, OB has ways to get access to the atoms 
> and bonds of a molecule. It supports molecular editing, so that atoms and 
> bonds can be added, deleted, or modified as desired. Atoms, bonds, and 
> molecules may have additional user-defined data associated with them.
>
>  OB supports coordinates as part of the molecule (meaning that deleting the 
> atom deletes the associated coordinates), and it supports multiple conformer 
> structures.
>
>  OB follows the Daylight approach where it has a standard chemistry model and 
> all input structures are reperceived based on that model. There is no way to 
> disable that option.
>
>  OB is the foremost program for structure format support and interconversion.
>
>  SD file support is complete for v2000 and v3000, both for reading and 
> writing. What I'm not sure about is the level of support for v3000. Is it 
> mostly support for the chemistry that is in v2000 but expressed differently 
> in v3000, plus support for more than 999 atoms?
>
>  SMILES support is good, although it doesn't have support for stereochemistry 
> around double bonds. Excepting this lack, canonicalization is also good and 
> widely used.

It does have support for stereochemistry around double bonds.
Stereochemistry support is much improved for other formats in the
development code though.

>  OB does have PDB file support. I can't tell how good the chemistry 
> perception is. For example, can it detect that a C-C bond is a double or 
> triple bond instead of a single (eg, by looking at the bond length, or by 
> understanding the residue names)?
>
>  While OB does have a nearly uniform reader API (ie, I can point it to an SD 
> file, SMILES file, etc and get molecules), and built-in gzip support, I do 
> have to specify the format type manually. That is, there's no support for 
> guessing the format based, for example, on the extension.

In most cases, the format type is the extension, but I suppose what
you say is correct.
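
Since the format ID usually matches the extension, guessing it yourself is a few lines of Python. This is only a sketch: guess_format is a hypothetical helper of mine, not part of OB, and it assumes ".gz" is the only compression suffix you care about.

```python
import os

# Hypothetical helper, not part of OpenBabel.
COMPRESSION_SUFFIXES = {".gz"}


def guess_format(filename):
    """Return the lower-cased file extension, skipping a trailing .gz."""
    root, ext = os.path.splitext(filename)
    if ext.lower() in COMPRESSION_SUFFIXES:
        # Look one level deeper: "ligands.sdf.gz" -> "ligands.sdf"
        root, ext = os.path.splitext(root)
    if not ext:
        raise ValueError("no extension to guess a format from: %r" % filename)
    return ext[1:].lower()


# Assumed usage with Pybel (needs the OpenBabel Python bindings):
#     for mol in pybel.readfile(guess_format(path), path):
#         ...
```

The result is only a guess; an extension OB does not recognise would still need to be mapped by hand.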

>  OpenBabel has SMARTS support, but I can't tell how complete it is. I know it 
> doesn't support double bond stereochemistry, but I think it's otherwise 
> complete, including recursive SMARTS. Is there anything missing?
>
>  OB also supports using a molecule as the query rather than a SMARTS.

Hmm...not sure about this. Does it?

>  Once the match is made, it's easy to get access to the matched atoms and 
> bonds, and match them up to the corresponding query atoms and bonds.
>
>  The topic I know the least about is reactions. OB supports reaction SMILES 
> and SMARTS, as well as RXN files. I don't have a good idea for how good that 
> support is, and it's not something I used much, although my client does.
>
>  In addition to the support for the query languages/formats, I can't tell how 
> to use the reactions. How would I do a unimolecular reaction (eg, convert all 
> of the carbons in CCCN to OOON)? How would I use a reaction for library 
> generation (eg, convert CCC to first OCCN, then COCN, and lastly CCON)? Is it 
> even possible? I looked but didn't find it.

Perhaps OBChemTsfm does something here?
http://openbabel.org/api/2.2.0/classOpenBabel_1_1OBChemTsfm.shtml.

>  OB does support some fingerprints. There's a linear hash fingerprint similar 
> to Daylight's and two feature fingerprint implementations, although only one 
> is suggested. There's no MACCS key implementation. There is no support for 
> large/sparse fingerprints, and the only implemented comparison method is the 
> Tanimoto similarity.

The MACCS keys are in there. There is also support for user-defined fingerprints.
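
For reference, the Tanimoto comparison itself is simple enough to write by hand if you have the on-bits of two fingerprints. The sketch below is plain Python and does not call OB; representing a fingerprint as a set of bit indices is my assumption for illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen here: two empty fingerprints score 0.0
        return 0.0
    return len(a & b) / float(union)
```

Other measures (Dice, Tversky and so on) could be written the same way on top of the set intersection and union.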

>  OB does not do depiction. For that case people should turn to other 
> libraries, such as OASA.

OB can do depiction, at least in the development version.

>  There's no MCS or scaffold identification code in OB. There is a descriptor 
> framework system, support for different forcefields and minimization, and 
> InChI support. There's no nomenclature support.

>  OB is cross-platform (here meaning "Windows and Linux"), with access to the 
> library from C++, Python, .Net and Java. The documentation is incomplete and 
> sketchy, but because OB is used by a large number of people, there is support 
> both through the mailing list and by doing a web search for others who have 
> used the code.

It also runs on Mac OS X, and under Cygwin and MinGW, and there are Ruby
bindings as well. It builds with G++, MSVC and the Intel compiler. Support
is also available from a number of independent consultants (as far as I am
aware).

>  I have a metric for testing usability, and that's the number of lines of 
> code needed to count the total number of atoms of all of the records in an 
> input file, using one toolkit vs. pybel. OpenBabel suffers because of the 
> overhead of creating an OBConversion.

Don't forget Pybel is part of OpenBabel.
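
With Pybel the atom-counting test comes out at roughly the following. The function only relies on each molecule having an "atoms" list, as Pybel molecules do; the file name and the import line in the comment are placeholders that depend on your installation.

```python
def count_atoms(molecules):
    """Sum the atom counts over an iterable of Pybel-style molecules."""
    return sum(len(mol.atoms) for mol in molecules)


# Assumed usage (needs the OpenBabel Python bindings; "input.sdf" is a
# placeholder file name):
#     import pybel          # newer releases: from openbabel import pybel
#     print(count_atoms(pybel.readfile("sdf", "input.sdf")))
```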

>  I have another metric for comparing error handling, which is to read an SD 
> file with records containing errors (format errors and chemistry errors) and 
> seeing if I can find the number of records which failed to be read in and the 
> reason for the failure. I haven't figured out how to do that with OB.
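
I can't say how to get this out of OB itself. As a toolkit-independent sketch of the bookkeeping you describe, the following splits SD text into records and notes which ones fail and why. It only looks at the V2000 counts line, so it catches format errors but no chemistry errors; the function name and the reason strings are made up for illustration.

```python
def tally_sd_records(text):
    """Return (ok_count, failures) for the text of an SD file.

    failures is a list of (record_number, reason) tuples. Only the
    V2000 counts line is checked; this is not a full SD parser.
    """
    ok, failures = 0, []
    records = text.replace("\r\n", "\n").split("$$$$\n")
    if records and not records[-1].strip():
        records.pop()  # drop the empty tail after the final delimiter
    for number, record in enumerate(records, 1):
        lines = record.splitlines()
        if len(lines) < 4:
            failures.append((number, "fewer than 4 lines"))
            continue
        counts = lines[3]  # line 4 holds the atom and bond counts
        try:
            natoms, nbonds = int(counts[0:3]), int(counts[3:6])
        except ValueError:
            failures.append((number, "unreadable counts line"))
            continue
        if len(lines) < 4 + natoms + nbonds:
            failures.append((number, "truncated atom/bond block"))
            continue
        ok += 1
    return ok, failures
```

Calling it on the contents of a file gives the number of readable records plus a list such as [(2, "unreadable counts line")] for the rest.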

One other feature you haven't mentioned is that it has a plugin
architecture for fingerprints, formats, operations, charge models and
so forth (it's the same architecture in each case). This means that
internally a company could create a single .cpp file and compile it in
as a format, operation or other plugin type. The result can easily be
called from babel, and the plugin code is free to do whatever it needs.
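
To give a feel for the idea (not for OB's actual C++ code), here is a generic registry sketch in Python: every plugin is filed under a type and an ID and looked up the same way regardless of type. All the names here, including the "countheavy" operation, are hypothetical.

```python
PLUGINS = {}  # plugin type -> {plugin id -> plugin instance}


def register(plugin_type, plugin_id):
    """Class decorator that files one instance under (type, id)."""
    def decorator(cls):
        PLUGINS.setdefault(plugin_type, {})[plugin_id] = cls()
        return cls
    return decorator


def find(plugin_type, plugin_id):
    """Look up a registered plugin; raises KeyError if it is unknown."""
    return PLUGINS[plugin_type][plugin_id]


@register("ops", "countheavy")  # hypothetical in-house operation
class CountHeavyAtoms(object):
    def do(self, symbols):
        """Count the element symbols that are not hydrogen."""
        return sum(1 for s in symbols if s != "H")
```

A caller would then write find("ops", "countheavy").do(symbols) without needing to know where the class was defined.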

>  ====
>
> Thanks in advance!
>
>                                Andrew
>                                da...@dalkescientific.com
>
>
>
> ------------------------------------------------------------------------------
> ThinkGeek and WIRED's GeekDad team up for the Ultimate
> GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
> lucky parental unit.  See the prize list and enter to win:
> http://p.sf.net/sfu/thinkgeek-promo
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>
