On Mon, Nov 26, 2012 at 4:05 AM, "Ingo Althöfer" <[email protected]> wrote:
> One general comment:
>
> Ratings are not transitive. For instance,
> A1 may score 25 % against B,
> and A2 may score 22 % against B.
> Then it cannot be concluded that A1 will score more than 50 %
> in a direct duel with A2.
>
> It is rather easy to construct triples of "semi-simple" agents A, B, C
> for some "normal" game where
> A scores 95+ percent against B,
> B scores 95+ percent against C,
> C scores 95+ percent against A.
>

Hi Ingo,

The Elo system, which tries to model game-playing skill mathematically,
makes some assumptions that are not completely true but are approximations
of reality. One assumption made by the Elo system is that skill IS
transitive. It works quite well because in practice human skill and
program skill are nearly transitive. So it has proven to be a very good
model indeed.

As you say, it is not difficult to artificially construct classes of
players who do not have transitive relationships with each other. One
very simple way to do this is to take 3 equal players and give each of
them a different opening book, such that the books quickly get them into
losing or winning situations against each other. You can create your own
"rock/paper/scissors" non-transitive relationship this way.

You can also do it with the playing algorithm; that is a bit more
difficult but certainly possible. You give each program a serious
weakness that one of the other two can easily exploit but that the
remaining one cannot, so each program has a unique weakness that only one
of the other two programs can exploit.

Don

>
> Ingo.
>
> -------- Original Message --------
> > Date: Sun, 25 Nov 2012 17:03:33 -0800
> > From: Leandro Marcolino <[email protected]>
> > To: [email protected]
> > Subject: [Computer-go] Practical significance?
>
> > Hello all!..
> >
> > I am currently doing research on Computer Go. I can't tell the details
> > about it yet, but I will post them here after (if) my paper is
> > accepted...
> >
> > In my research I compare many systems (An), playing against a fixed
> > strong adversary (B). So A1 would have a percentage of victory x1
> > against B, while A2 would have a percentage of victory x2, etc... Then
> > I compare the percentages of victories, and for most cases I can show
> > that one system is better than another with 95% confidence. However,
> > my adviser is asking me about not only the STATISTICAL significance of
> > the results, but also the PRACTICAL significance of them. I mean, if
> > one system is, for example, only 1% better than another, with 99%
> > confidence, the result would have statistical significance, but
> > wouldn't really matter in a practical sense.
> >
> > In my case, the difference between the systems can range from about 4%
> > to about 23%. That doesn't seem to be enough to argue that one system
> > would be one handicap stone better than another. But what would be the
> > minimum difference for me to argue that one system is significantly
> > better than another, in a practical sense? (or they are not, in the
> > end?..) Would calculating ELO-ratings help me in answering this
> > question?
> >
> > I think it gets even more complex if we think that, let's say,
> > changing the percentage of victory from 95% to 100% seems to be much
> > more significant (in a practical sense) than changing from 30% to 35%,
> > even though the difference between the two systems is still only 5%.
> > In my case, I am dealing with percentages of victories that range from
> > around 30% to around 53%.
> >
> > What do you guys think?..
> >
> > Thanks for your help!..
> >
> > Regards,
> > Leandro
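
On the question above of whether Elo ratings help with practical
significance: a minimal sketch, assuming the standard logistic Elo model
(expected score 1 / (1 + 10^(-D/400)) for a rating difference D) and no
draws. The function names elo_diff and elo_gain are made up for this
illustration. It converts a win rate against the fixed opponent B into
an implied rating difference, so that two changes of equal size in
percentage points can be compared on the Elo scale.

```python
import math


def elo_diff(win_rate):
    """Rating difference implied by a win rate under the logistic Elo model.

    Assumption: draws are ignored and win_rate is the observed fraction of
    wins against one fixed opponent.
    """
    if not 0.0 < win_rate < 1.0:
        # 0% and 100% correspond to an infinite rating difference.
        raise ValueError("win rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


def elo_gain(rate_before, rate_after):
    """Change in implied rating when the win rate against the same
    opponent moves from rate_before to rate_after."""
    return elo_diff(rate_after) - elo_diff(rate_before)


if __name__ == "__main__":
    for before, after in [(0.30, 0.35), (0.30, 0.53), (0.95, 0.99)]:
        gain = elo_gain(before, after)
        print(f"{before:.0%} -> {after:.0%}: {gain:+.1f} Elo")
```

Under these assumptions, going from 30% to 35% is worth roughly 40 Elo,
while going from 95% to 99% is worth close to 290, which matches the
intuition that gains near the top of the scale matter more. A 100% score
has no finite Elo equivalent. The conversion also inherits the
transitivity assumption discussed above: a difference measured only
against B need not predict the result of A1 playing A2 directly.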
_______________________________________________ Computer-go mailing list [email protected] http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
