On Sat, Jan 17, 2009 at 08:29:32PM +0100, Sylvain Gelly wrote:
> ChooseMove(node, board) {
>   bias = 0.015 // I put a random number here, to be tuned
>   b = bias * bias / 0.25
>   best_value = -1
>   best_move = PASSMOVE
>   for (move in board.allmoves) {
>     c = node.child(move).counts
>     w = node.child(move).wins
>     rc = node.rave_counts[move]
>     rw = node.rave_wins[move]
>     coef = 1 - rc / (rc + c + rc * c * b)
>     value = w / c * coef + rw / rc * (1 - coef) // please here take care of
>                                                 // the c == 0 and rc == 0 cases
>     if (value > best_value) {
>       best_value = value
>       best_move = move
>     }
>   }
>   return best_move
> }
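
For reference, a minimal runnable C++ sketch of the selection rule quoted above. The Child struct, the first-play-urgency (fpu) fallback used when c == 0 or rc == 0, and the move encoding are assumptions made for illustration, not part of Sylvain's code:

// Sketch of the quoted RAVE-blended selection rule (illustrative only).
#include <vector>
#include <limits>

struct Child {
    int move;           // encoded move; PASS assumed to be -1
    double counts;      // playouts through this child (c)
    double wins;        // wins through this child (w)
    double rave_counts; // AMAF/RAVE playout count (rc)
    double rave_wins;   // AMAF/RAVE win count (rw)
};

int ChooseMove(const std::vector<Child>& children) {
    const double bias = 0.015;            // to be tuned, as in the quote
    const double b = bias * bias / 0.25;  // = 4 * bias^2
    const double fpu = 1.1;               // assumed first-play-urgency value
    double best_value = -std::numeric_limits<double>::infinity();
    int best_move = -1;                   // PASSMOVE stand-in

    for (const Child& ch : children) {
        const double c  = ch.counts,      w  = ch.wins;
        const double rc = ch.rave_counts, rw = ch.rave_wins;
        double value;
        if (c == 0 && rc == 0) {
            value = fpu;                  // never seen: try it early
        } else {
            // coef -> 1 as real playouts accumulate, so the RAVE term fades out
            const double coef = 1.0 - rc / (rc + c + rc * c * b);
            const double mc   = (c  > 0) ? w  / c  : fpu;
            const double rave = (rc > 0) ? rw / rc : fpu;
            value = mc * coef + rave * (1.0 - coef);
        }
        if (value > best_value) { best_value = value; best_move = ch.move; }
    }
    return best_move;
}

Note that coef tends to 1 as real playouts accumulate, so the RAVE estimate only dominates while the move has been tried rarely.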
Hi,

It seems to me that, when you select a play in the tree, you don't have an exploration component: you use just a weighted average of the score and the RAVE score. So if:

- the best play is good only if played immediately and very bad if played later in the game, and
- the first playout for this play resulted in a loss,

then both its score and its RAVE score will be very low, and this play will not be considered again for a very long time.

Is this simplified code, and in reality you replace w/c and rw/rc by scores with an exploration component, or did you really use it as is? (A sketch of what such an exploration term could look like follows below.)

Tom

--
Thomas Lavergne

"Entia non sunt multiplicanda praeter necessitatem."
(Guillaume d'Ockham)

thomas.laver...@reveurs.org
http://oniros.org
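
Below is a small sketch of the kind of exploration term Thomas is asking about: the usual UCB1 bonus added on top of the blended MC/RAVE value. The exploration constant, the infinite urgency for unvisited children, and the function shape are illustrative assumptions; the quoted code, as written, contains no such term.

// Sketch only: UCB1-style exploration bonus on top of the blended value.
#include <cmath>
#include <limits>

// blended: the coef-weighted mix of w/c and rw/rc from the code above.
// child_counts / parent_counts: playouts through this child and through its parent.
double ValueWithExploration(double blended, double child_counts,
                            double parent_counts) {
    const double explore_c = 0.5;  // exploration constant, to be tuned
    if (child_counts == 0) {
        // Give an unvisited child infinite urgency so it is tried at least once.
        return std::numeric_limits<double>::infinity();
    }
    return blended + explore_c * std::sqrt(std::log(parent_counts) / child_counts);
}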