First of all, I'd recommend using the development code rather than the latest release. Also, ensure that hydrogens are suppressed. You could also avoid canonicalising the same graph multiple times: write out non-canonical SMILES during the search, and only at the end collate them and canonicalise.
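[Editor's note: the deferred-canonicalisation idea above can be sketched as follows. This is a minimal illustration, not Open Babel's API: `canonical_smiles` is a hypothetical stand-in for a real canonicaliser (e.g. Open Babel's "can" output format), implemented here as a character sort only so the sketch runs standalone.]

```python
def canonical_smiles(raw):
    # HYPOTHETICAL placeholder canonicaliser: in the real program this would
    # be a call into Open Babel; any function that maps equivalent raw
    # strings to a single representative works for this sketch.
    return "".join(sorted(raw))

def search_step():
    # Stand-in for one reaction-path-search step that emits cheap,
    # non-canonical SMILES strings.
    return ["OCC", "CCO", "CC(=O)O"]

raw_strings = []
for _ in range(3):                     # many search iterations
    raw_strings.extend(search_step())

# Canonicalise each distinct raw string exactly once, only at the end,
# instead of canonicalising every intermediate during the search.
unique = {canonical_smiles(s) for s in set(raw_strings)}
print(sorted(unique))                  # → ['()=CCOO', 'CCO']
```

Note that deduplicating the raw strings first (`set(raw_strings)`) already removes exact duplicates cheaply, so the expensive canonicalisation runs only on the survivors.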
While I agree with Andrew, it is also true that Open Babel's canonicalisation implementation is slow. This is partly because we calculate all of the graph invariants up-front, rather than lazily to split ties. In particular, we calculate the pairwise distance of every atom from every other atom (as far as I remember), which is O(N**2).

Regards,
- Noel

On Fri, 2 Nov 2018 at 08:02, Andrew Dalke <da...@dalkescientific.com> wrote:
> On Nov 2, 2018, at 01:33, Xianghai Sheng <xsh...@ucmerced.edu> wrote:
> > I am trying to figure out the time complexity (Big O) of converting from OBMol to canonical smiles.
>
> I don't think that's a useful way to think about your problem.
>
> Graph canonicalization is "at least as computationally hard as the graph isomorphism problem" (https://en.wikipedia.org/wiki/Graph_canonization), and graph isomorphism is believed to be quasi-polynomial.
>
> That said, I believe the actual algorithm Open Babel uses may have exponential time for some graphs. However, most molecular graphs don't reach the worst-case behavior.
>
> Furthermore, time complexity is mostly useful when thinking about large N. Molecules typically have small N, and there may be lower-order terms which dominate the times for small N.
>
> If you are interested in the topic of graph canonicalization algorithms in general, you might start with the nauty papers; see http://pallini.di.uniroma1.it/
>
> > I want to know this because getting canonical smiles takes up half of the running time of my reaction path searching program.
>
> How would knowing the time complexity help you? While this may be obvious, if canonicalization takes half the time, then you'll only get a 2x speedup even if that time goes to zero. How much work are you willing to do for a 2x speedup?
>
> Might it be possible to do about the same amount of work to parallelize your algorithm? For example, if your reaction path search generates a set of nodes which must all be canonicalized in order to evaluate them, then you can do all of those canonicalizations in parallel.
>
> Cheers,
> Andrew
> da...@dalkescientific.com
>
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
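[Editor's note: Andrew's suggestion to canonicalise a whole generation of search nodes in parallel might be sketched as below. The names are illustrative, not from Open Babel: `canonicalise` is a hypothetical placeholder (again a character sort, so the sketch is self-contained) for the real per-molecule canonicalisation call, which must live at module level so `multiprocessing` can pickle it.]

```python
from multiprocessing import Pool

def canonicalise(raw):
    # HYPOTHETICAL placeholder for the real canonicaliser (e.g. converting
    # an OBMol to canonical SMILES); sorting characters keeps this runnable.
    return "".join(sorted(raw))

def canonicalise_batch(raw_list, workers=4):
    # Each molecule is independent, so a generation of search nodes is
    # embarrassingly parallel: one worker per canonicalisation task.
    with Pool(workers) as pool:
        return pool.map(canonicalise, raw_list)

if __name__ == "__main__":
    batch = ["OCC", "CCO", "NCC"]      # one generation of search nodes
    print(canonicalise_batch(batch))   # → ['CCO', 'CCO', 'CCN']
```

Since canonicalisation is CPU-bound (it runs in Open Babel's C++ core), processes rather than threads are the natural choice here; the per-batch `Pool` overhead is only worthwhile when each generation contains more than a handful of molecules.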