I just want to make some comments about MC evaluation to clear up some
common misunderstandings.
I have seen complaints about misevaluation, such as a program
reporting a 65% chance of winning in a game that is already lost, or
the other way around. Arguments have been proposed along the lines of
"since evaluation is so bad, there has to be a better way of doing it".
I just want to point out one thing: any winrate except 0% and 100% is
wrong assuming perfect play. 1% or 99% (or anything in between) means
that the program is not sophisticated enough to either a) always play
a winning move in the simulations, or b) search deep enough to solve
the game, proving a win or loss.
*BUT*
Having an incorrect absolute evaluation is unimportant, as long as the
program plays the best move. What really matters is which move has the
highest winrate relative to all the other candidate moves.
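To make that concrete, here is a minimal sketch in Python; the
Position class and its methods (legal_moves, play, to_move,
random_playout_winner) are hypothetical stand-ins for whatever your
engine provides:

    def mc_winrate(position, move, n_playouts=1000):
        # Estimate the winrate of `move`: play it, then finish the game
        # with random playouts and count how often our side wins.
        # The estimate is essentially never exactly 0.0 or 1.0, because
        # random playouts misplay both won and lost positions.
        wins = 0
        for _ in range(n_playouts):
            p = position.play(move)              # hypothetical engine call
            if p.random_playout_winner() == position.to_move:
                wins += 1
        return wins / n_playouts

    def choose_move(position):
        # Only the relative ordering matters: even if every estimate is
        # "wrong" in absolute terms, the program plays well as long as
        # the best move gets the highest number.
        return max(position.legal_moves(),
                   key=lambda m: mc_winrate(position, m))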
MC programs with no knowledge except avoiding eye filling often
evaluate all positions as being very close to 50%. As soon as one adds
appropriate knowledge, there is a much larger range between the best
and the worst move on the board. Normally this range is correlated
with the strength of the program. (Be aware that buggy programs might
have even larger ranges, though.)
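As a sketch of the difference (fills_own_eye and pattern_weight are
hypothetical placeholders for whatever knowledge one implements):

    import random

    def random_policy(position):
        # Knowledge-free playouts, except the one rule everybody needs:
        # never fill your own eyes, or groups die and playouts make no sense.
        moves = [m for m in position.legal_moves()
                 if not position.fills_own_eye(m)]
        return random.choice(moves)

    def knowledge_policy(position):
        # Weighting moves by simple knowledge (patterns, captures, ...)
        # makes playouts less uniformly bad, so the winrate gap between
        # the best and worst root moves widens.
        moves = [m for m in position.legal_moves()
                 if not position.fills_own_eye(m)]
        weights = [position.pattern_weight(m) for m in moves]  # hypothetical
        return random.choices(moves, weights=weights)[0]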
Also, with UCT the winrate at the root has little to do with any
objective probability of the program winning. If you look at the
principal variation at the end of the game, you will notice a large
difference in winrate at each depth. The winrates at the root change
very slowly even if they are 0% or 100% at the leaves, but still the
relative ordering of the moves at the root is often correct.
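The reason the root moves slowly is just arithmetic: each node's
winrate is an average over all simulations that passed through it. A
sketch of standard UCT bookkeeping (UCB1 selection as in Kocsis and
Szepesvari; the exploration constant is arbitrary, and per-ply
perspective flipping is omitted for brevity):

    from math import log, sqrt

    class Node:
        def __init__(self):
            self.wins = 0.0
            self.visits = 0
            self.children = {}  # move -> Node

        def winrate(self):
            return self.wins / self.visits if self.visits else 0.5

        def select_child(self, c=1.4):
            # UCB1: exploit the current winrate, plus an exploration bonus.
            return max(self.children.values(),
                       key=lambda ch: ch.winrate()
                       + c * sqrt(log(self.visits) / (ch.visits + 1)))

    def backup(path_from_root, result):
        # One simulation adds one sample to every node it passed through.
        # A root with 10000 visits moves about 0.01% per simulation, so
        # even if the leaves along the principal variation are already at
        # 0% or 100%, the root winrate drifts toward them very slowly.
        for node in path_from_root:
            node.visits += 1
            node.wins += result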
*FINALLY*
The use of MC-eval and UCT is *completely orthogonal* to using go
knowledge in the program. You can add any kind of knowledge at all
stages of the basic search algorithm and possibly benefit from it.
The problem is engineering: if your implementation of the knowledge is
too slow, the program gets weaker under fixed time limits. And the
knowledge you added might make the program weaker for a number of
other reasons.
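To illustrate "all stages", a sketch of one common hook point at tree
expansion, reusing the Node class from the sketch above
(virtual_visits and winrate_guess are hypothetical, and the
playouts-per-second numbers below are made up to show the tradeoff,
not measured):

    def expand(node, position, knowledge):
        # Knowledge at the tree level: initialize each new child as if
        # it had already been visited a few times with the winrate the
        # knowledge predicts, so search starts from an informed guess
        # instead of from 50%.
        for move in position.legal_moves():
            child = Node()
            child.visits = knowledge.virtual_visits(move)      # hypothetical
            child.wins = child.visits * knowledge.winrate_guess(move)
            node.children[move] = child

The same knowledge can also go into the playout policy (as in the
earlier policy sketch) or into pruning which moves enter the tree at
all; none of this changes the search algorithm itself. The engineering
catch: with a fixed time budget, e.g. (made-up numbers) 50000 random
playouts per second versus 5000 with heavy patterns, the knowledge
must buy back more strength per playout than the factor-of-10 loss in
samples costs.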
Arguments such as "I saw program X misplay situation Y, and therefore
MC-eval is flawed" are just plain wrong. They only mean that the
specific program X has a flaw, nothing else.
What one can argue is: "I wrote a program with a nice new method for
evaluation and a new approach to searching that plays situation Y
correctly, and that also happens to beat program X all the time on the
same hardware." Until I see that argument, I will continue to believe
that methods similar to MC-eval and UCT search are the future of
computer go.
-Magnus