Don Dailey wrote:
Another example I found is the impressive Valkyria program. Version
2.7 won 92% of its games, more than even the top-rated greenpeep0.5.1.
However, the average rating of Valkyria's opponents was only 1722.
This is quite a difference. So Valkyria is rated only 2222 compared to
greenpeep 2621 despite the fact that it wins more!
Of course 2222 is still impressive, and Valkyria is number 17 among all
the programs that have played over 200 games (over 200 of them).
- Don
Hi,
I would like to add a note to this discussion to explain that the
computed rating is not a function of winning rate and average opponent
rating. Consider this simple example:
player A:
1 win and 1 loss against a player of rating 1500
1 win against a player of rating 500
1 loss against a player of rating 4500
-> 50% against an average of 2000
player B:
1 win and 1 loss against a player of rating 2000
1 win against a player of rating 1000
1 loss against a player of rating 3000
-> 50% against an average of 2000
Although they have the same average opponent rating and the same winning
rate, player A's rating should be much lower than player B's.
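To make the numbers concrete, here is a small Python sketch (my own, not
bayeselo) that finds the maximum-likelihood rating for each player. It
assumes the opponents' ratings are fixed and exact, and uses the standard
logistic Elo win-probability model:

import math

def win_prob(r_player, r_opponent):
    """Elo expected score: P(player beats opponent)."""
    return 1.0 / (1.0 + 10.0 ** ((r_opponent - r_player) / 400.0))

def log_likelihood(r, results):
    """results: list of (opponent_rating, score), score 1 = win, 0 = loss."""
    total = 0.0
    for opp, score in results:
        p = win_prob(r, opp)
        total += math.log(p if score == 1 else 1.0 - p)
    return total

def mle_rating(results, lo=0.0, hi=4000.0, steps=40000):
    """Brute-force 1-D search for the rating maximizing the likelihood."""
    best_r, best_ll = lo, float("-inf")
    for i in range(steps + 1):
        r = lo + (hi - lo) * i / steps
        ll = log_likelihood(r, results)
        if ll > best_ll:
            best_r, best_ll = r, ll
    return best_r

player_a = [(1500, 1), (1500, 0), (500, 1), (4500, 0)]
player_b = [(2000, 1), (2000, 0), (1000, 1), (3000, 0)]

print("Player A MLE rating: %.0f" % mle_rating(player_a))  # close to 1500
print("Player B MLE rating: %.0f" % mle_rating(player_b))  # close to 2000

Under this model player A comes out near 1500 and player B near 2000,
even though both scored 50% against an average opponent of 2000: the
nearly certain win against the weak opponent and the nearly certain loss
against the strong one carry almost no information.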
Maybe this was already clear to Don, but his message sounds a little as
if it were possible to estimate a rating from winning rate and average
opponent rating. It is not.
Some rating algorithms try to do it anyway (like EloStat and the rating
system of the French Chess Federation), but they are very badly flawed.
Real-life examples where they fail badly can be found on bayeselo's page:
http://remi.coulom.free.fr/Bayesian-Elo/
(look for "average of ratings" in the page)
Rémi