James Stroud wrote:
> Now with one test positive for Int, you are getting pretty certain you
> have an Int column. Now we take a second cell randomly from the same
> column and find that it too casts to Int.
>
> P_2(H) = 0.9607843 --> Confidence its an Int column from round 1
> P(D|H) = 0.98
> P(D|H') = 0.02
>
> P_2(H|D) = 0.9995836
>
> Yikes! But I'm still not convinced its an Int because I haven't even had
> to wait a millisecond to get the answer. Lets burn some more clock cycles.
>
> Lets say we really have an Int column and get "lucky" with our tests (P
> = 0.98**4 = 92% chance) and find two more random cells successfully cast
> to Int:
>
> P_4(H) = 0.9999957
> P(D|H) = 0.98
> P(D|H') = 0.02
>
> P(H|D) = 0.9999999

I had typos. P(D|H') should be 0.01 for all rounds. I should also clarify
that 4 of 4 tests are positive, with no failures observed. Integrating
failures would work the same way, using the last posterior as the prior
for the next round.

Also, given a 1% false positive rate, after only 4 rounds you are
1 - (0.01**4) = 99.999999% sure your observations aren't the result of
accidentally pulling 4 false positives in succession.

James
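
For anyone who wants to check the numbers, here is a minimal sketch of the
round-by-round update, with each posterior fed back in as the next prior.
Two things in it are assumptions rather than part of the post: the starting
prior of 0.2 is not stated here and was chosen because it reproduces the
0.9607843 figure, and a failed cast is modelled with the complementary
likelihoods (1 - 0.98 and 1 - 0.01). The function names are made up for
illustration.

```python
def update(prior, success=True, p_pos_given_h=0.98, p_pos_given_not_h=0.01):
    """Apply Bayes' rule for one tested cell and return P(H|D).

    H  = "this is an Int column"
    D  = the outcome of casting one randomly chosen cell to Int
    """
    if success:
        like_h, like_not_h = p_pos_given_h, p_pos_given_not_h
    else:
        # Assumption: a test either passes or fails, so the likelihoods
        # of a failure are the complements of the success likelihoods.
        like_h, like_not_h = 1.0 - p_pos_given_h, 1.0 - p_pos_given_not_h
    numerator = like_h * prior
    evidence = numerator + like_not_h * (1.0 - prior)
    return numerator / evidence


def run_rounds(prior, outcomes):
    """Feed each posterior back in as the prior for the next round."""
    history = []
    p = prior
    for outcome in outcomes:
        p = update(p, success=outcome)
        history.append(p)
    return history


# Assumed starting prior of 0.2; four successful casts in a row.
four_hits = run_rounds(0.2, [True, True, True, True])
for round_no, p in enumerate(four_hits, start=1):
    print("after test {}: P(H|D) = {:.7f}".format(round_no, p))

# Chance that four positives in a row were all false positives.
print("1 - 0.01**4 = {:.8f}".format(1 - 0.01 ** 4))
```

The first two rounds agree with the quoted 0.9607843 and 0.9995836 to
within rounding. Passing False in the outcomes list shows how a single
failed cast pulls the posterior back down.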