Some extra info on bits of OB I know about.

On 08/06/2010 06:14, Andrew Dalke wrote:
> Hi all,
>
>    I've been asked by a company to evaluate different toolkits for them to 
> use in-house. Geoff thought it might be better to ask on the list than ask 
> him directly and privately, so I'm doing that.
>
>    I've gone through their internal requirements and now I'm seeing how 
> they match up to OB. Some of my conclusions are likely wrong and based on an 
> older version of OB, so I'm hoping people here can correct me.
>
>    I'll apologize in advance that I won't be that accessible over the next 
> couple of days, and likely won't be able to respond until Saturday or so. But 
> I felt it was best to get it out now rather than wait.
>
>    =====
>
>    As with all cheminformatics toolkits, OB has ways to get access to the 
> atoms and bonds of a molecule. It supports molecular editing, so that atoms 
> and bonds can be added, deleted, or modified as desired. Atoms, bonds, and 
> molecules may have additional user-defined data associated with them.
>
>    OB supports coordinates as part of the molecule (meaning that deleting the 
> atom deletes the associated coordinates), and it supports multiple conformer 
> structures.
>
>    OB follows the Daylight approach where it has a standard chemistry model 
> and all input structures are reperceived based on that model. There is no way 
> to disable that option.
>
>    OB is the foremost program for structure format support and 
> interconversion.
>
>    SD file support is complete for the v2000 and v3000, both for reading and 
> writing. What I'm not sure about is the level of support for v3000. It's 
> mostly support for the chemistry which is in v2000 but expressed differently 
> in v3000, and for support for more than 999 atoms?
I think this is right for v3000, for which the support is fairly 
basic. v2000 has more but is not complete, e.g. no S groups.
>
>    SMILES support is good, although it doesn't have support for 
> stereochemistry around double bonds. Excepting this lack, canonicalization is 
> also good and widely used.
>
>    OB does have PDB file support. I can't tell how good the chemistry 
> perception is. For example, can it detect that a C-C bond is a double or 
> triple bond instead of a single (eg, by looking at the bond length, or by 
> understanding the residue names)?
>
>    While OB does have a nearly uniform reader API (ie, I can point it to an 
> SD file, SMILES file, etc and get molecules), and built-in gzip support, I do 
> have to specify the format type manually. That is, there's no support for 
> guessing the format based, for example, on the extension.
There is automatic recognition of which of several computational 
chemistry programs a .out or .log file came from.
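
Guessing from the extension is also easy to do on the caller's side. A 
minimal sketch, assuming a small hand-written lookup table: the table and 
the guess_format name are made up for illustration and are not part of the 
OB API, although the format IDs on the right are ones OB uses.

```python
import os

# Hypothetical lookup table: file extension -> Open Babel format ID.
# Extend it with whatever formats are used in-house.
EXT_TO_FORMAT = {
    ".sdf": "sdf",
    ".sd": "sdf",
    ".mol": "mol",
    ".smi": "smi",
    ".pdb": "pdb",
    ".cml": "cml",
    ".rxn": "rxn",
}


def guess_format(filename):
    """Return (format_id, is_gzipped) guessed from the filename.

    Raises ValueError when the extension is not in the table.
    """
    name = filename.lower()
    is_gzipped = name.endswith(".gz")
    if is_gzipped:
        name = name[:-3]  # drop ".gz" and look at the inner extension
    ext = os.path.splitext(name)[1]
    try:
        return EXT_TO_FORMAT[ext], is_gzipped
    except KeyError:
        raise ValueError("cannot guess format of %r" % filename)
```

The returned format ID can then be handed to whichever reader is used, so 
the caller no longer has to spell it out for every file.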
>
>    OpenBabel has SMARTS support, but I can't tell how complete it is. I know 
> it doesn't support double bond stereochemistry, but I think it's otherwise 
> complete, including recursive SMARTS. Is there anything missing?
>
>    OB also supports using a molecule as the query rather than a SMARTS.
This is true for fastsearch (indexed by fingerprints) but not, as far 
as I know, for ordinary SMARTS matching, unless the molecule is first 
explicitly converted to SMILES.
>
>    Once the match is made, it's easy to get access to the matched atoms and 
> bonds, and match them up to the corresponding query atoms and bonds.
>
>    The topic I know the least about is reactions. OB supports reaction SMILES 
> and SMARTS, as well as RXN files. I don't have a good idea for how good that 
> support is, and it's not something I used much, although my client does.
Reactions are also supported in CML.
>
>    In addition to the support for the query languages/formats, I can't tell 
> how to use the reactions. How would I do a unimolecular reaction (eg, convert 
> all of the carbons in CCCN to OOON)? How would I use a reaction for library 
> generation (eg, convert CCC to first OCCN, then COCN, and lastly CCON)? Is it 
> even possible? I looked but didn't find it.
>
>    OB does support some fingerprints. There's a linear hash fingerprint 
> similar to Daylight's and two feature fingerprint implementations, although 
> only one is suggested. There's no MACCS key implementation. There is no 
> support for large/sparse fingerprints, and the only implemented comparison 
> method is the Tanimoto similarity.
MACCS keys are supported (using a data file from RDKit).
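
For reference, the Tanimoto similarity mentioned above is just the number 
of bits set in both fingerprints divided by the number set in either. A 
minimal sketch in plain Python, assuming each fingerprint is given as a set 
of on-bit indices (this is not how OB stores fingerprints internally, and 
the value returned for two empty fingerprints is a convention picked here):

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / union
```

Other comparison measures (Dice, Tversky and so on) can be written the 
same way on top of the same two set operations.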
>
>   OB does not do depiction. For that case people should turn to other 
> libraries, such as OASA.
It is beginning to. It has 2D coordinate generation/layout. The next 
version, v2.3.0, will have SVG depiction of single and multiple molecules.
>
>   There's no MCS or scaffold identification code in OB. There is a descriptor 
> framework system, support for different forcefields and minimization, and 
> InChI support. There's no nomenclature support.
>
>    OB is cross-platform (here meaning "Windows and Linux"), with access to 
> the library from C++, Python, .Net and Java. The documentation is incomplete 
> and sketchy, but because OB is used by a large number of people, there is 
> support both through the mailing list and by doing a web search for others 
> who have used the code.
>
>    I have a metric for testing usability, and that's the number of lines of 
> code needed to count the total number of atoms of all of the records in an 
> input file, using one toolkit vs. pybel. OpenBabel suffers because of the 
> overhead of creating an OBConversion.
>
>    I have another metric for comparing error handling, which is to read an SD 
> file with records containing errors (format errors and chemistry errors) and 
> seeing if I can find the number of records which failed to be read in and the 
> reason for the failure. I haven't figured out how to do that with OB.
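
I can't say how to get this out of OB itself, but the format-error half of 
the metric can be approximated outside the toolkit. A rough sketch, 
assuming V2000 records separated by "$$$$"; it only checks that the counts 
line parses and that enough atom and bond lines follow, so chemistry errors 
are not detected at all. The function names are made up for illustration.

```python
def _check_molblock(lines):
    """Return None if the molblock passes the basic check, else a reason string."""
    if len(lines) < 4:
        return "header block truncated"
    counts = lines[3]
    try:
        # V2000 counts line: atom count in columns 1-3, bond count in 4-6.
        n_atoms = int(counts[0:3])
        n_bonds = int(counts[3:6])
    except ValueError:
        return "unreadable counts line"
    if len(lines) < 4 + n_atoms + n_bonds:
        return "atom/bond block truncated"
    return None


def check_sd_records(text):
    """Split SD file text into records and run the basic check on each.

    Returns (n_ok, failures), where failures is a list of
    (record_index, reason) tuples with 0-based record indices.
    """
    n_ok = 0
    failures = []
    record = []
    index = 0
    for line in text.splitlines():
        if line.strip() == "$$$$":
            reason = _check_molblock(record)
            if reason is None:
                n_ok += 1
            else:
                failures.append((index, reason))
            record = []
            index += 1
        else:
            record.append(line)
    # Anything left over was never closed by a "$$$$" line.
    if any(line.strip() for line in record):
        failures.append((index, "missing $$$$ terminator"))
    return n_ok, failures
```

Comparing n_ok with the number of molecules the toolkit actually returns 
for the same file gives at least a count of the records it dropped.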
>
>    ====
>
> Thanks in advance!
>
>                               Andrew
>                               da...@dalkescientific.com
>
>
>
> ------------------------------------------------------------------------------
> ThinkGeek and WIRED's GeekDad team up for the Ultimate
> GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
> lucky parental unit.  See the prize list and enter to win:
> http://p.sf.net/sfu/thinkgeek-promo
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>

