On 7/11/07, Jacques Basaldúa <[EMAIL PROTECTED]> wrote:
I will try to explain it better. Say the game is in a position where Black is in danger. That position is the root node. All stones in the root node are inherited by every node below it, except those that are captured. Your trick is intended to favor defense, so Black has a higher probability of survival than with uniformly random playouts. Since all nodes in the tree descend from the root, they are mostly in the same situation. The simulation is a stochastic estimator of either the territorial value of the game or the winning percentage (which is determined by comparing the former with some threshold). Since you are systematically favoring one of the players (the one in danger at the root is always the same player), you are biasing the estimate. The variance of a random playout is very large, while the difference in conditional probability, P(win | a good move) - P(win | a bad move), is a very small number. So even the smallest bias is too much bias, and the program gets weaker.
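A rough way to see the effect described above is a toy simulation. This is only a sketch under made-up assumptions, not code from any Go program: each playout is modelled as a coin flip, the win probabilities (0.52 and 0.50) and the size of the bias (0.05) are invented numbers, and the names estimate_win_rate and pick_move are hypothetical. The bias is applied only to the position where Black is still in danger, which is where a defense-favoring playout rule would help Black.

```python
import random


def estimate_win_rate(true_p, bias, n_playouts, rng):
    """Estimate Black's win rate from n_playouts coin-flip playouts.

    true_p is the (invented) win probability under uniformly random
    playouts; bias is the shift caused by a playout rule that happens
    to help Black in this particular position.
    """
    p = min(1.0, max(0.0, true_p + bias))
    wins = sum(rng.random() < p for _ in range(n_playouts))
    return wins / n_playouts


def pick_move(candidates, n_playouts, rng):
    """Return (best_move, estimates) for a dict of name -> (true_p, bias)."""
    estimates = {
        name: estimate_win_rate(true_p, bias, n_playouts, rng)
        for name, (true_p, bias) in candidates.items()
    }
    best = max(estimates, key=estimates.get)
    return best, estimates


if __name__ == "__main__":
    rng = random.Random(0)
    # "good" resolves the danger; "bad" leaves Black in danger.
    unbiased = {"good": (0.52, 0.0), "bad": (0.50, 0.0)}
    # The defensive rule only changes playouts from the "bad" position.
    biased = {"good": (0.52, 0.0), "bad": (0.50, 0.05)}
    for label, candidates in (("unbiased", unbiased), ("biased", biased)):
        best, estimates = pick_move(candidates, 50_000, rng)
        print(label, best, estimates)
```

With these invented numbers the true gap between the two moves is 0.02, so a position-dependent bias of 0.05 is enough to make the worse move look better, no matter how many playouts are run. More playouts reduce the variance but do nothing about the bias.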
I'm still having trouble understanding this, but I will try to say what I got out of it. It seems that when Black is in trouble, a bias towards defensive moves on both sides means that Black would be playing well and White would be playing poorly, because the best move for Black is likely to be one of those moves, while the best move for White is probably not a defensive move. This would mean that a position where Black is in trouble looks stronger for Black than it would in a random playout (because Black plays well only in this kind of situation), which makes it harder to tell which positions are actually good.

Or, in general: an improvement in play that only works for some positions will tend to make those positions look good, and so make it hard to tell which positions actually are good.

- Brian
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/