I just want to make some comments about MC evaluation to clear up some common misunderstandings.

I have seen some complaints about misevaluation, such as a program reporting a 65% chance of winning in a game which is lost, and the other way around. For example, arguments have been proposed along the lines of "since evaluation is so bad there has to be a better way of doing it".

I just want to point out one thing: any winrate except 0% and 100% is wrong assuming perfect play. 1% and 99% (or anything in between) means that the program is not sophisticated enough to either a) always play a winning move in the simulations, or b) search deep enough to solve the game, proving a win/loss.

*BUT*

Having an incorrect absolute evaluation is unimportant, as long as the program plays the best move. What really matters is which move has the highest winrate relative to all other candidate moves.
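To make that concrete, here is a minimal sketch (in Python, not taken from any particular program) showing that the move actually played depends only on the relative ordering of the root winrates, not on their absolute values:

  # Hypothetical illustration: pick the root move with the highest
  # winrate; miscalibrated absolute values do no harm by themselves.
  def choose_move(root_stats):
      # root_stats: {move: (wins, visits)} collected from the simulations
      def winrate(stats):
          wins, visits = stats
          return wins / visits if visits > 0 else 0.0
      return max(root_stats, key=lambda move: winrate(root_stats[move]))

Whether the best move reads 52% or 97% makes no difference here, as long as it stays above the alternatives.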

MC programs with no knowledge except avoiding filling their own eyes often evaluate all positions as being very close to 50%. As soon as one adds appropriate knowledge there is a much larger range between the best and the worst move on the board. Normally this range is correlated with the strength of the program. (Be aware that buggy programs might have even larger ranges, though.)

Also, with UCT the winrate at the root has little to do with any objective probability of the program winning. If one looks at the principal variation at the end of the game, one will notice a large difference in winrate at each depth. The winrate at the root changes very slowly even when it is 0% or 100% at the leaves, but the relative ordering of the moves at the root is still often correct.
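The reason is visible in the bookkeeping. Below is a stripped-down sketch of UCT node statistics and backup (a toy illustration only; the result is taken from one player's point of view and the usual sign flip between plies is left out). The root averages over all simulations made so far, so it drifts slowly, while a nearly solved leaf with only a few visits jumps straight to 0% or 100%:

  import math

  class Node:
      def __init__(self):
          self.wins = 0.0
          self.visits = 0
          self.children = {}   # move -> Node

      def winrate(self):
          return self.wins / self.visits if self.visits else 0.5

      def uct_child(self, c=1.0):
          # standard UCT selection: exploitation term plus exploration bonus
          return max(self.children.items(),
                     key=lambda kv: kv[1].winrate()
                     + c * math.sqrt(math.log(self.visits) / (kv[1].visits + 1)))

  def backup(path, result):
      # the same 0-or-1 playout result is added to every node on the path;
      # a root with thousands of visits barely moves, a leaf with a handful
      # of visits swings quickly toward 0% or 100%
      for node in path:
          node.visits += 1
          node.wins += result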

*FINALLY*

The use of MC-eval and UCT is *completely orthogonal* to using Go knowledge in the program. You can add any kind of knowledge at all stages of the basic search algorithm and possibly benefit from it.
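As a sketch of what I mean by "all stages" (pattern_prior and playout_weight below are placeholder stubs standing for whatever knowledge one actually has, not any real program's API), knowledge can for example seed the statistics of newly expanded children, or bias the move choice inside the playouts, without touching the search itself:

  import random

  def pattern_prior(move):
      # placeholder stub: pretend every move gets a weak prior of 5 wins in 10 visits
      return 5.0, 10

  def playout_weight(board, move):
      # placeholder stub: uniform weights, i.e. no knowledge at all
      return 1.0

  def init_child_stats(legal_moves):
      # knowledge hook (a): seed each new child with prior wins/visits so
      # that moves the knowledge likes are explored first
      return {move: list(pattern_prior(move)) for move in legal_moves}

  def playout_move(board, legal_moves):
      # knowledge hook (b): bias the playout policy toward the moves the
      # knowledge prefers instead of picking uniformly at random
      weights = [playout_weight(board, move) for move in legal_moves]
      return random.choices(legal_moves, weights=weights, k=1)[0]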

The problem is engineering. If your implementation of the knowledge is too slow, the program gets weaker with fixed time limits. The knowledge you added might also make the program weaker for a number of other reasons.

Arguments such as "I saw program X misplay situation Y and therefore MC-eval is flawed" are just plain wrong. That just means the specific program X has a flaw, nothing else.

What one can argue is: "I wrote a program with a nice new method for evaluation and a new approach to searching that plays situation Y correctly, and it also happens to beat program X all the time using the same hardware." Until I see that argument, I will continue to believe that methods similar to MC-eval and UCT search are the future of computer Go.

-Magnus
