David Silver wrote:
Hi Michael,
But one thing confuses me: you are using the value from Fuego's 10k simulations as an approximation of the actual value of the position. But isn't the actual value of the position either a win or a loss? On such small boards, can't you assume that Fuego is able to correctly determine who is winning, and round its evaluation to the nearest win/loss? I.e. if it evaluates the position to 0.674, that gets rounded to 1. If such an assumption about Fuego's ability to read the position on a small board is valid, then it should improve the results of your balanced simulation strategy, right? Or am I missing something?
It's true that 5x5 Go is solved, so in principle we could have used the true minimax values. However, we chose an approach that can scale to larger boards, which means we should treat the expert evaluations as approximate. And in fact Fuego was not always accurate on 6x6 boards, since we used only 10k simulations in our training set.
Also, I think it really helps to have "soft" rather than "hard" expert evaluations. We want a simulation policy that can differentiate, e.g., a 90% winning position from an 85% winning position. Rounding all the expert evaluations to 0 or 1 would lose much of this advantage.
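Concretely, here is a toy sketch of the two kinds of target (Position and simulate_game are hypothetical stand-ins for a Fuego playout, not our actual code):

    import random

    class Position:
        # Hypothetical stand-in: a position with a known underlying
        # win rate, used only to make the sketch self-contained.
        def __init__(self, true_win_rate):
            self.true_win_rate = true_win_rate

    def simulate_game(position):
        # Stand-in for one playout; returns 1 for a win, 0 for a loss.
        return 1 if random.random() < position.true_win_rate else 0

    def soft_evaluation(position, n_simulations=10000):
        # "Soft" target: the empirical win rate, e.g. 0.674.
        wins = sum(simulate_game(position) for _ in range(n_simulations))
        return wins / n_simulations

    def hard_evaluation(position, n_simulations=10000):
        # "Hard" target: round to the nearest win/loss.
        return round(soft_evaluation(position, n_simulations))

With soft targets, a 90% position and an 85% position produce different training signals; rounding maps both to 1, and the distinction is lost.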
-Dave
By this argument (your last paragraph), you need some magical number of simulations for the training data: too few and you have too much noise, while infinite simulations give you hard 0 or 1 results anyway. But I can't argue with your results.
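To put a number on the noise side of that trade-off, a back-of-the-envelope sketch, assuming each position's rollout results are i.i.d. coin flips with win probability p (Fuego's search value isn't quite that simple, but the scaling is similar):

    import math

    def stderr(p, n):
        # Standard error of the empirical win rate after n
        # independent simulations of a position with true win rate p.
        return math.sqrt(p * (1 - p) / n)

    for n in (100, 1000, 10000):
        print(n, round(stderr(0.9, n), 4))
    # 100    0.03
    # 1000   0.0095
    # 10000  0.003

At 10k simulations the noise (~0.003 for a 90% position) is already well below the 0.05 gap between a 90% and an 85% position, so the soft targets are meaningful at that budget.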