All the more so since you're testing the same idea on two bots
simultaneously. So if you want to be wrong at most five percent of the
time, and you count the change as an improvement as soon as one of the
bots gets better, you have to run the individual tests at the 2.5% level.
And I'm not even taking into account the fact that you want to continue
testing until you reach significance. That would again require you to
take a lower level.
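
To make the 2.5% figure concrete, here is a minimal sketch of the
correction for two simultaneous tests. The function names are
illustrative, and the family error rate formula assumes the two tests
are independent, which may not hold exactly for two bots sharing the
same change. Dividing the level by the number of tests (the Bonferroni
correction) is valid either way.

```python
def bonferroni_level(family_alpha: float, num_tests: int) -> float:
    """Per-test level that keeps the overall error rate at or below family_alpha."""
    return family_alpha / num_tests


def family_error_rate(per_test_alpha: float, num_tests: int) -> float:
    """Chance that at least one test gives a false positive.

    Assumes the tests are independent (an assumption, not stated in the thread).
    """
    return 1.0 - (1.0 - per_test_alpha) ** num_tests


if __name__ == "__main__":
    per_test = bonferroni_level(0.05, 2)
    print(f"per-test level for two bots: {per_test:.4f}")
    # Running both tests at 5% inflates the overall error rate.
    print(f"overall rate at 5% each:     {family_error_rate(0.05, 2):.4f}")
    # Running both at the corrected level keeps it under 5%.
    print(f"overall rate when corrected: {family_error_rate(per_test, 2):.4f}")
```

This does not cover the second point, testing until significance is
reached, which needs a still lower level or a proper sequential test.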
Jonas
On Thu, Aug 4, 2011 at 6:57 PM, Vlad Dumitrescu <[email protected]> wrote:
The scores against gnugo are almost
identical, but the two fuegos score 449-415, which is 52%, and the 95%
confidence is ~3%, i.e. ~10 Elo.
That 3% is not a 95% confidence interval, more like 1 standard
deviation... (so nothing with high confidence yet)
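
For anyone who wants to check the 449-415 figures themselves, a rough
sketch follows. It assumes independent games with no draws, a normal
approximation to the binomial, and the usual logistic Elo model; the
function and key names are only illustrative.

```python
import math


def win_rate_summary(wins: int, losses: int, z: float = 1.96) -> dict:
    """Summarise a head-to-head result under a normal approximation."""
    games = wins + losses
    rate = wins / games
    # Standard error of the observed win rate.
    std_err = math.sqrt(rate * (1.0 - rate) / games)
    # Distance of the observed rate from 50%, in standard deviations
    # computed under the null hypothesis of equal strength.
    z_score = (rate - 0.5) / math.sqrt(0.25 / games)
    # Elo difference implied by the win rate (logistic Elo model, an assumption).
    elo = 400.0 * math.log10(rate / (1.0 - rate))
    return {
        "games": games,
        "rate": rate,
        "std_err": std_err,
        "ci_low": rate - z * std_err,
        "ci_high": rate + z * std_err,
        "z_score": z_score,
        "elo": elo,
    }


if __name__ == "__main__":
    summary = win_rate_summary(449, 415)
    for key, value in summary.items():
        print(f"{key:8s} {value:.4f}")
```

If the interval from ci_low to ci_high contains 0.5, the lead is not
significant at that level.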
Erik
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go