Yes - just the offending SMILES will be fine. The "reason: " part would be
ideal, but it is not necessary ...
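
For what it's worth, here is a rough sketch of how I would pull the failures
out of the log files, assuming the one-line format suggested in the quoted
message below. That format is only a proposal in this thread, not current
RDKit output, and the names PARSE_ERROR and extract_failed_smiles are made up
for illustration. The "(reason: ...)" part is treated as optional, so the same
pattern would work if only the SMILES is printed.

```python
import re

# Assumed (proposed, not actual) log line format:
#   [HH:MM:SS] SMILES Parse Error: <smiles> (reason: <text>)
# The "(reason: ...)" suffix is optional. A SMILES string contains no
# whitespace, so \S+ captures it even when it contains parentheses.
PARSE_ERROR = re.compile(
    r"^\[(?P<time>\d{2}:\d{2}:\d{2})\] SMILES Parse Error: "
    r"(?P<smiles>\S+)"
    r"(?: \(reason: (?P<reason>.*)\))?\s*$"
)


def extract_failed_smiles(lines):
    """Return (smiles, reason) pairs for every parse-error line.

    reason is None when the line carries no "(reason: ...)" suffix.
    Lines that do not match the assumed format are skipped.
    """
    failures = []
    for line in lines:
        match = PARSE_ERROR.match(line)
        if match:
            failures.append((match.group("smiles"), match.group("reason")))
    return failures


if __name__ == "__main__":
    sample_log = [
        "[06:06:25] SMILES Parse Error: Ccc1XXXcCCC (reason: unknown atoms X)",
        "[06:06:28] SMILES Parse Error: C1C (reason: unclosed ring for input)",
        "[06:06:30] some unrelated message",
        "[06:06:31] SMILES Parse Error: CC(C",
    ]
    for smiles, reason in extract_failed_smiles(sample_log):
        print(smiles, reason)
```

Since the function takes any iterable of lines, an open log file can be
passed to it directly.
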
On 11 May 2011 06:37, Greg Landrum <[email protected]> wrote:
> Dear JP,
>
> On Tue, May 10, 2011 at 11:30 AM, JP <[email protected]> wrote:
> > Thanks for this.
> > I, for one, think it is useful - not in a "parse this particular smiles
> > string" fashion.
> > But consider this use case. I have 8,000,000 molecules in a few hundred
> > smiles files on which I am calculating descriptors on the cloud. I only
> > have access to log files.
> > I get some ten thousand "SMILES Parse Error" without any additional
> > info.
> > Also, I think this error should be just one line (no need to bloat log
> > files with redundant static data).
>
> Yeah, the use case is clear.
>
> > These should have a bit of static info which is the same for both (so you
> > can grep on that) and must have (on the same line) the offending smiles
> > string, which you could extract easily with regex, so I suggest something
> > structured like:
> > In [2]: Chem.MolFromSmiles('Ccc1XXXcCCC')
> > [06:06:25] SMILES Parse Error: Ccc1XXXcCCC (reason: unknown atoms X)
> > In [3]: Chem.MolFromSmiles('C1C')
> > [06:06:28] SMILES Parse Error: C1C (reason: unclosed ring for input)
>
> Providing a good reason for the failure would certainly sometimes be
> useful. It is theoretically possible, but it will require a lot of
> work (there are many, many reasons a SMILES could fail to parse). I
> think the initial version of this is going to have to just include the
> SMILES that caused the failure. Adding explanations is something that
> will need to wait.
>
> Best,
> -greg
>