On Mon, 2006-10-30 at 19:34 -0800, Peter Drake wrote: > I'm running into a problem where my Monte Carlo program is very slow > to acknowledge that its favorite move has a strong counter. > Part of the problem is that the value of a move is based on how many > of the runs through that move have succeeded. If there were a lot of > them before the correct reply was discovered, the move has > considerable "inertia" and it takes time to recover.
The inertia effect is minor, once it finds the right move it tends to quickly recover. The reason this works so well is that you are taking very large samples, 100 samples wouldn't begin to be enough. If you want to experiment, the best way to solve this problem in my opinion is to emphasize the later moves more than the early moves but in a very smooth way. The way I did this was to hit each move I encounter as well as it's siblings, with a "degrade factor". I multiply the scores and the moves by the same constant. Something very close to 1.0 such as 0.99999. You have to use floating point values to do this. Consider how this works - if you play 100 games and win half of them, with my method it matters which ones you win. With the normal method it doesn't matter. With my method you will score much higher if you win more of the games towards the end. Conversely, if this move is refuted and you lose most of the later games, your score will drop much more quickly. - Don > Has anyone tried only counting the most recent games through a move > (say, the last 100), rather than the entire history? > > Peter Drake > Assistant Professor of Computer Science > Lewis & Clark College > http://www.lclark.edu/~drake/ > > > > _______________________________________________ > computer-go mailing list > computer-go@computer-go.org > http://www.computer-go.org/mailman/listinfo/computer-go/ _______________________________________________ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/