Dear Jan,

On Mon, Apr 29, 2013 at 8:03 AM, Jan Holst Jensen <[email protected]> 
wrote:
> Hi RDKitters,
>
> I wonder why the InChI strings generated by RDKit differ from the ones
> generated by the standard IUPAC inchi-1 executable.

At least some of the differences were due to an RDKit bug that was
fixed some time ago (the fix is in the 2013.03 release). The fix isn't
reflected in the KNIME nodes because we haven't updated the KNIME
binaries in a while; an update is coming in the next day or so.

> I ran the standard InChI example file Samples.sdf through the KNIME workflow
> and compared with the InChIs generated from the IUPAC executable. A number
> of InChI strings are different; it seems to be almost all stereo-related.

Here's what I get from Python:

>
> For example: InChI strings generated for spiro.mol (spiro.mol - attached):
>
> IUPAC:
> InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3/t2*7-,8-,9-/m10/s1
> RDKit: InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3

This one still doesn't recognize the stereo. I'll file a bug for it:
In [2]: Chem.MolToInchi(Chem.MolFromMolFile('spiro.mol'))
[09:53:16] WARNING: Omitted undefined stereo
Out[2]: 'InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9/h2*3-6H2,1-2H3'
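To see exactly which layers are missing, the two strings can be split on
"/" and compared layer by layer. This is only a minimal sketch in plain
Python (no RDKit needed); the helper names inchi_layers and diff_layers
are made up for this example, and it assumes each layer prefix occurs
only once, which is true for the strings in this thread.

```python
def inchi_layers(inchi):
    """Split an InChI string into {layer prefix: layer body}.

    Assumes a formula layer is present and that each prefix occurs only
    once; InChIs with isotopic or fixed-H sublayers can repeat prefixes.
    """
    parts = inchi.split("/")
    layers = {"version": parts[0], "formula": parts[1]}
    for part in parts[2:]:
        layers[part[0]] = part[1:]
    return layers


def diff_layers(a, b):
    """Return {prefix: (value in a, value in b)} for layers that differ."""
    la, lb = inchi_layers(a), inchi_layers(b)
    return {k: (la.get(k), lb.get(k))
            for k in sorted(set(la) | set(lb))
            if la.get(k) != lb.get(k)}


# The two spiro.mol strings quoted above.
iupac_inchi = ("InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9"
               "/h2*3-6H2,1-2H3/t2*7-,8-,9-/m10/s1")
rdkit_inchi = ("InChI=1S/2C9H14Cl2/c2*1-7(10)3-9(4-7)5-8(2,11)6-9"
               "/h2*3-6H2,1-2H3")

for prefix, (iupac_value, rdkit_value) in diff_layers(iupac_inchi, rdkit_inchi).items():
    print(prefix, iupac_value, rdkit_value)
```

For spiro.mol only the t, m, and s layers show up in the diff; the
formula, connectivity, and hydrogen layers agree.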

>
> and stertaut.mol (stertaut.mol - attached):
>
> IUPAC:
> InChI=1S/C6H6O5/c7-1-2-3(5(8)9)4(2)6(10)11/h1,3-4,7H,(H,8,9)(H,10,11)/b2-1-/t3-,4+/m0/s1
> RDKit:
> InChI=1S/C6H6O5/c7-1-2-3(5(8)9)4(2)6(10)11/h1,3-4,7H,(H,8,9)(H,10,11)/t3-,4-/m1/s1

In [3]: Chem.MolToInchi(Chem.MolFromMolFile('stertaut.mol'))
Out[3]: 
'InChI=1S/C6H6O5/c7-1-2-3(5(8)9)4(2)6(10)11/h1,3-4,7H,(H,8,9)(H,10,11)/b2-1-/t3-,4+/m0/s1'

This one looks fine: it matches the IUPAC string.
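A quick way to check whether a mismatch like the one quoted from the KNIME
nodes is purely stereo-related is to drop the stereo layers (/b, /t, /m,
/s) from both strings and see if what remains is identical. The following
is a rough sketch under that assumption; strip_stereo and classify are
hypothetical helper names, not RDKit functions.

```python
STEREO_PREFIXES = ("b", "t", "m", "s")


def strip_stereo(inchi):
    """Return the InChI with its /b, /t, /m and /s layers removed."""
    parts = inchi.split("/")
    head, layers = parts[:2], parts[2:]  # version and formula are always kept
    kept = [p for p in layers if not p.startswith(STEREO_PREFIXES)]
    return "/".join(head + kept)


def classify(a, b):
    """Label a pair of InChIs as 'identical', 'stereo-only' or 'other'."""
    if a == b:
        return "identical"
    if strip_stereo(a) == strip_stereo(b):
        return "stereo-only"
    return "other"


# The two stertaut.mol strings quoted above (IUPAC vs. the KNIME nodes).
iupac_inchi = ("InChI=1S/C6H6O5/c7-1-2-3(5(8)9)4(2)6(10)11"
               "/h1,3-4,7H,(H,8,9)(H,10,11)/b2-1-/t3-,4+/m0/s1")
knime_inchi = ("InChI=1S/C6H6O5/c7-1-2-3(5(8)9)4(2)6(10)11"
               "/h1,3-4,7H,(H,8,9)(H,10,11)/t3-,4-/m1/s1")

print(classify(iupac_inchi, knime_inchi))
```

Running the same check over a whole file of string pairs gives a count of
how many mismatches are stereo-only and how many need a closer look.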

>
> OK, now those InChI samples look like they are heavy on fringe cases and
> perhaps thus likely to really stress toolkits.

These are the best kind. :-)

>
> So I took something more peaceful and ran a peptide from PubChem through
> (pubchem_71296070.mol - attached).
>
> IUPAC:
> InChI=1S/C33H55N9O10/c1-18(43)26(37)31(49)40-23(16-20-10-4-3-5-11-20)30(48)39-21(12-6-8-14-34)28(46)38-22(13-7-9-15-35)29(47)42-27(19(2)44)32(50)41-24(33(51)52)17-25(36)45/h3-5,10-11,18-19,21-24,26-27,43-44H,6-9,12-17,34-35,37H2,1-2H3,(H2,36,45)(H,38,46)(H,39,48)(H,40,49)(H,41,50)(H,42,47)(H,51,52)/t18-,19-,21+,22+,23+,24+,26+,27+/m1/s1
> RDKit:
> InChI=1S/C33H55N9O10/c1-18(43)26(37)31(49)40-23(16-20-10-4-3-5-11-20)30(48)39-21(12-6-8-14-34)28(46)38-22(13-7-9-15-35)29(47)42-27(19(2)44)32(50)41-24(33(51)52)17-25(36)45/h3-5,10-11,18-19,21-24,26-27,43-44H,6-9,12-17,34-35,37H2,1-2H3,(H2,36,45)(H,38,46)(H,39,48)(H,40,49)(H,41,50)(H,42,47)(H,51,52)/t18-,19-,21+,22+,23+,24+,26+,27+/m0/s1
>

In [4]: Chem.MolToInchi(Chem.MolFromMolFile('pubchem_71296070.mol'))
Out[4]: 
'InChI=1S/C33H55N9O10/c1-18(43)26(37)31(49)40-23(16-20-10-4-3-5-11-20)30(48)39-21(12-6-8-14-34)28(46)38-22(13-7-9-15-35)29(47)42-27(19(2)44)32(50)41-24(33(51)52)17-25(36)45/h3-5,10-11,18-19,21-24,26-27,43-44H,6-9,12-17,34-35,37H2,1-2H3,(H2,36,45)(H,38,46)(H,39,48)(H,40,49)(H,41,50)(H,42,47)(H,51,52)/t18-,19-,21+,22+,23+,24+,26+,27+/m1/s1'

This one also looks fine: it matches the IUPAC string, including /m1.
-greg
