It is a good idea to make your prior knowledge completely fair (in the sense of not favouring any player) when doing Bayesian calculations. Whatever estimator is being used will reshape that knowledge on the fly as the evidence comes in.
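To make that concrete, here is a minimal sketch (plain Python on a grid of candidate Elo differences; this is only an illustration of the idea, not how bayeselo is actually implemented) of a flat prior being reshaped by game results under the standard Elo win-probability model:

# A minimal sketch (not bayeselo itself): start from a flat prior over the
# Elo difference between two players and update it with game results using
# the standard Elo win-probability model.  The grid range and step are
# arbitrary choices for illustration.
import numpy as np

def win_prob(elo_diff):
    """P(A beats B) under the usual Elo logistic model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# Flat ("fair") prior: every Elo difference in the grid is equally likely.
grid = np.arange(-1000, 1001, 1.0)          # candidate Elo differences
posterior = np.ones_like(grid) / grid.size  # uniform prior

def update(posterior, a_wins, b_wins):
    """Multiply in the likelihood of the observed results and renormalise."""
    like = win_prob(grid) ** a_wins * (1.0 - win_prob(grid)) ** b_wins
    posterior = posterior * like
    return posterior / posterior.sum()

# Example: A beats B 7-3; the posterior concentrates around roughly +150 Elo.
posterior = update(posterior, a_wins=7, b_wins=3)
print("estimated Elo difference:", grid[np.argmax(posterior)])

With a flat prior the peak of the posterior is exactly the maximum-likelihood estimate, which ties in with the point about bayeselo below.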
The idea is that your prior knowledge of the Elo rating should be about the same for every single version of the program, and as you obtain evidence (i.e. wins and losses) between pairs of programs, you get a more and more confident belief about the actual Elo. So the estimates converge to the correct values, and should do so reasonably rapidly. I believe bayeselo uses maximum likelihood as an estimator of the "actual" Elo; other estimators could be used, but this whole approach seems much better than the usual Elo rating scheme.

Note that if you play 10 games between a particular pair of players and, say, player A wins 7 and player B wins 3, and you then repeat the experiment for another 10 games with the same 7-3 outcome, that is *different* (and richer!) information than simply being told that player A won 14 out of 20 games against player B. There is even more information in knowing who won which games and in what order. Bayesian inference can take this into account, which is nice.

s.

----- Original Message -----
From: Don Dailey <[EMAIL PROTECTED]>
To: computer-go <computer-go@computer-go.org>
Sent: Tuesday, January 29, 2008 1:46:56 PM
Subject: Re: [computer-go] 19x19 Study

They seem under-rated to me also. Bayeselo pushes the ratings together because that is apparently a valid initial assumption. With enough games I believe that effect goes away. I could test that theory with some work.

Unless there is a way to turn that off in bayeselo (I don't see it) I could rate them with my own program. Perhaps I will do that test.

- Don

Sylvain Gelly wrote:
>> but not linearly and you can see a nice gradual curve in the plot.
>> Now we have something we can argue about for weeks. Why is it not
>> mostly linear? Could it be the memory issue I just mentioned?
>
> Hi Don and all participants to that study, that is very interesting!
>
> The memory constraints are certainly a valid hypothesis, especially since
> the default settings of the release are rather conservative on that side:
> it seemed better to have a weaker player than to make the player's machine
> start swapping... Those settings fit your memory constraints as well, so
> that is fine.
>
> Reading your email and looking at the curve, I wonder if one possible
> explanation could be an artifact of how the ratings are computed? My
> question is: what curve would we see for this study if the players
> involved were exactly linearly scalable? That may sound silly, but I
> wondered whether the higher levels are underestimated because of the way
> bayeselo works. I am also looking only at the curve after the 5th-6th
> level (~gnugo), as behavior may be different for very low levels.
>
> I don't know if my hypothesis makes sense.
> Sylvain
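(Aside on Don's point above about bayeselo pushing the ratings together: here is a rough sketch, again in Python and very much not bayeselo's actual code, of the standard Elo model combined with a Gaussian prior centred on a zero rating difference; the prior width of 200 Elo is an arbitrary choice. With only a few games the prior noticeably shrinks the estimated gap, and with enough games the effect essentially disappears, which is what Don expects.)

# Compare the plain maximum-likelihood Elo gap with a MAP estimate that uses
# a prior pulling both players toward equal strength.
import numpy as np

def win_prob(elo_diff):
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def estimate(a_wins, b_wins, prior_sigma=None):
    """Grid search for the Elo difference maximising the likelihood (or posterior)."""
    grid = np.arange(-1000, 1001, 1.0)
    log_post = a_wins * np.log(win_prob(grid)) + b_wins * np.log(1 - win_prob(grid))
    if prior_sigma is not None:                      # Gaussian prior centred on 0
        log_post = log_post - grid ** 2 / (2 * prior_sigma ** 2)
    return grid[np.argmax(log_post)]

for games in (10, 100, 1000):
    a = int(games * 0.7)                             # A wins 70% at every sample size
    print(games, "games:",
          "ML =", estimate(a, games - a),
          " MAP =", estimate(a, games - a, prior_sigma=200.0))
# With 10 games the prior pulls the MAP estimate well below the ML value;
# with 1000 games the two nearly coincide, i.e. the "pushing together" fades.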