On Wed, May 30, 2012 at 4:50 PM, Greg Landrum <[email protected]> wrote:
> On Wed, May 30, 2012 at 4:13 PM, Jan Holst Jensen <[email protected]>
> wrote:
>> My failing Linux Mint is 32-bit like George's 12.04. Don't know if it is
>> significant but it could be that the problem only occurs on 32-bit.
>>
>> Greg mentioned that he has successfully built and tested on Ubuntu 12.04 -
>> was that 64-bit or 32-bit ?
>
> 64bit. I'll try it on a 32bit VM tonight.

This morning I installed the RDKit on a clean 32-bit Ubuntu 12.04 VM.
While installing the cartridge, I was able to reproduce the problems
you guys observed. I was also able to track down the problem and find
a provisional fix for it. To apply it, edit guc.c and replace the
calls to DefineCustomRealVariable() with the following:

    DefineCustomRealVariable(
        "rdkit.tanimoto_threshold",
        "Lower threshold of Tanimoto similarity",
        "Molecules with similarity lower than threshold are not similar by % operation",
        &rdkit_tanimoto_smlar_limit,
        0.5,            /* boot value */
        0.0,            /* minimum */
        1.0,            /* maximum */
        PGC_USERSET,
        0,
    #if PG_VERSION_NUM >= 90100
        //(GucRealCheckHook)TanimotoLimitAssign,
        NULL,           /* check hook */
        NULL,           /* assign hook */
    #else
        TanimotoLimitAssign,
    #endif
        NULL            /* show hook */
    );
    DefineCustomRealVariable(
        "rdkit.dice_threshold",
        "Lower threshold of Dice similarity",
        "Molecules with similarity lower than threshold are not similar by # operation",
        &rdkit_dice_smlar_limit,
        0.5,            /* boot value */
        0.0,            /* minimum */
        1.0,            /* maximum */
        PGC_USERSET,
        0,
    #if PG_VERSION_NUM >= 90100
        //(GucRealCheckHook)DiceLimitAssign,
        NULL,           /* check hook */
        NULL,           /* assign hook */
    #else
        DiceLimitAssign,
    #endif
        NULL            /* show hook */
    );

The cartridge should build successfully and the tests should all pass
except bfpgist-91. That one will fail because two molecules with
identical similarity come back in a different order. I just checked in
a fix for that.

I have not checked in the changes to guc.c because I haven't convinced
myself that what I've done is really the right thing to be doing. Once
I've done a bit more digging I will check in the guc.c changes.

For those who want some technical detail: on the 32-bit system it
looks like TanimotoLimitAssign and DiceLimitAssign, which are there to
provide extra checks that the parameters are valid, end up being
called with incorrect values. Instead of seeing 0.5 as the threshold
value, I was seeing values like -1.14 (or some such thing), which is
clearly bogus. Removing those hooks seems to have no practical effect,
because all they do is check that the thresholds are between 0 and 1,
which PostgreSQL already does. I didn't write that code, so I'm not
sure if it's there because old versions of PostgreSQL behaved
differently or if it's been redundant all along.
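
One possible explanation, which I have not verified: as of PostgreSQL
9.1 the check hook for a real-valued setting receives a pointer to the
new value, whereas the older assign hook received the double by value.
The sketch below uses a stand-in type instead of the real PostgreSQL
headers (GucSourceStub and both function names are hypothetical, not
taken from guc.c) just to show the same 0-to-1 range check written for
each calling convention.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for PostgreSQL's GucSource enum; hypothetical, illustration only. */
typedef int GucSourceStub;

/* Pre-9.1 style assign hook: the new value is passed by value. */
bool
old_style_limit_assign(double newval, bool doit, GucSourceStub source)
{
    (void) doit;     /* unused in this sketch */
    (void) source;   /* unused in this sketch */
    return newval >= 0.0 && newval <= 1.0;
}

/* 9.1+ style check hook: the new value is passed through a pointer. */
bool
new_style_limit_check(double *newval, void **extra, GucSourceStub source)
{
    (void) extra;    /* unused in this sketch */
    (void) source;   /* unused in this sketch */
    return *newval >= 0.0 && *newval <= 1.0;
}
```

If a function of the first shape is cast to the second, as the
commented-out (GucRealCheckHook) cast would do, it would read its
arguments with the wrong layout. That could plausibly produce garbage
like the -1.14 above, but treat this as a guess rather than a
diagnosis.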
Best,
-greg