Re: [Rdkit-discuss] building models using descriptors

Greg Landrum Tue, 08 May 2012 21:07:43 -0700

On Mon, May 7, 2012 at 7:05 PM, Igor Filippov <[email protected]> wrote:
>
>> Here's the example output:
>>
>>
>>         *** Vote Results ***
>> misclassified: 93/242 (%38.43)  93/242 (%38.43)
>>
> Why the same set of numbers is printed twice?


If you run the predictions with a confidence threshold, the two numbers
will differ. One is then measured relative to the predictions actually
made, while the other is measured relative to the whole data set.
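
To make the distinction concrete, here is a minimal Python sketch. It is
not the RDKit code: the function name, its arguments and the toy data are
all made up for illustration, and it assumes that a point whose confidence
falls below the threshold simply gets no prediction.

```python
def misclassification_rates(true_labels, predicted_labels, confidences,
                            threshold=0.0):
    """Return (rate among predictions made, rate over the whole data set).

    Hypothetical helper: points with confidence below `threshold` are
    treated as "no prediction made".
    """
    total = len(true_labels)
    # Keep only the points that clear the confidence threshold.
    kept = [(t, p)
            for t, p, c in zip(true_labels, predicted_labels, confidences)
            if c >= threshold]
    wrong = sum(1 for t, p in kept if t != p)
    rate_made = wrong / len(kept) if kept else 0.0
    rate_all = wrong / total if total else 0.0
    return rate_made, rate_all


# Toy data, invented for this example.
true_labels = [1, 1, 0, 0, 1]
predicted_labels = [1, 0, 0, 1, 1]
confidences = [0.9, 0.6, 0.8, 0.95, 0.4]

no_threshold = misclassification_rates(true_labels, predicted_labels,
                                       confidences)
with_threshold = misclassification_rates(true_labels, predicted_labels,
                                         confidences, threshold=0.7)
print(no_threshold)
print(with_threshold)
```

Without a threshold every point gets a prediction, so both denominators
are the same and the two numbers coincide, as in the output quoted above.
With a threshold the first denominator shrinks and the numbers diverge.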

>
>> average correct confidence:    0.8520
>> average incorrect confidence:  0.7673
>>
>>         Results Table:
>>
>>           72      61      |  68.57
>>           32      77      |  55.40
>>      ------- -------
>>        69.23   55.80
>>
>
> If I try to compute percentages I'm getting for example
> 72/(72+61) = 54.1%  not 68.57% or any other percentage I see there?
> However 72/(72+32) = 69.23% just as it should...

Looks like that's a bug: the row percentages (68.57 and 55.40) don't
match the counts in the table.
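
The arithmetic is easy to check by hand. A minimal Python sketch, assuming
the diagonal entries are the correctly classified points (the output does
not say which axis is "actual" and which is "predicted", and the helper
names here are made up):

```python
# Counts copied from the results table above.
table = [[72, 61],
         [32, 77]]


def row_percentages(tbl):
    """Diagonal entry as a percentage of its row total."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(tbl)]


def column_percentages(tbl):
    """Diagonal entry as a percentage of its column total."""
    n = len(tbl)
    return [100.0 * tbl[j][j] / sum(tbl[i][j] for i in range(n))
            for j in range(n)]


n_total = sum(sum(row) for row in table)
n_wrong = n_total - sum(table[i][i] for i in range(len(table)))
rows = row_percentages(table)
cols = column_percentages(table)

print("misclassified: %d/%d (%.2f%%)" % (n_wrong, n_total,
                                         100.0 * n_wrong / n_total))
print("row percentages:    " + "  ".join("%.2f" % v for v in rows))
print("column percentages: " + "  ".join("%.2f" % v for v in cols))
```

The column values come out as 69.23 and 55.80, matching what was printed.
The row values come out as 54.14 and 70.64, which is what the right-hand
column of the table would be expected to show instead of 68.57 and 55.40.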

FYI: I've recently spent some time looking at scikit-learn and have
been quite impressed. There's a short writeup here:
http://code.google.com/p/rdkit/wiki/WorkingWithSciKitLearn
It's definitely worth taking a look at.

-greg

