I'd like to start a more specific discussion about ways to combine tactical information with MC-UCT. Here's the scenario.
It's the bot's turn and, prior to starting any playouts, it runs a tactical analyzer (for want of a better name) that labels each string as unconditionally alive, unconditionally dead, conditionally alive, uncertain, or too strong to bother testing. For each conditionally alive string, it finds the move(s) that kill or rescue it. Each possible move has a critical score indicating the net number of stones whose life/death status would be changed for the better if the bot moved there. Moves with a non-zero critical score are "critical moves."

I can only afford to run the tactical analyzer at the root node (if that!). The tactical analyzer (also MC-UCT based) makes mistakes. There are several parameters that trade off reliability against speed, and they can be adjusted on the fly. For uncertain strings, a meaningful probability score is available. Finding the right trade-offs to use in the tactical analyzer must be done in the context of how the information will be used. (Replacing the tactical analyzer with a better one would doubtless help.)

So how should I make use of this, and perhaps additional related information? Here are some things I have tried, or at least considered.

- SINS OF OMISSION

* If the highest critical score passes some threshold, force UCT to consider only moves (at the root, remember) whose critical score is above that threshold. (Please don't give me a helpful suggestion for when there is only one such move ;-).) If the threshold is pretty high and the parameters are set for good reliability, this never seems to produce worse moves and occasionally saves the bot from game-losing blunders. Even when global UCT is cranked up to a high number of playouts, this technique can still rescue it from itself. The downside is that this rule doesn't trigger very often. If the threshold is a bit lower, the quality of moves suffers. My gut feeling is that, for 9x9 at least, it's best to apply this rule only for game-swinging moves.
* Put critical moves first in progressive widening. They are almost always already there from other rules, so this makes no difference for me.

* Identify important strings and add rules in the playouts to encourage contact moves with these strings. This did not work for me and I'm sick of dealing with it.

* Add a small bias to the UCT score so that critical moves get tested more often, or just use the critical score directly as a tie breaker. This doesn't seem to help; maybe with careful tweaking...

- SINS OF COMMISSION

* Disallow moves that would kill your own string. My guess is that this rule should only be applied when the move would kill a pretty large string.

* Disallow moves inside live strings or contact moves with dead strings unless those moves are also critical moves. I don't see much impact.

- OTHER

* Serious disagreements between the tactical analyzer and the global UCT might be a warning sign that UCT isn't going to be able to understand the current position even with more playouts.

* When to pass. On KGS, Antigo might know it has won, but not the status of every group. That would be awkward in the scoring phase, so it only passes when it can prove life or death for every string.

* Connectivity. I confess to having no clue, but wise people on this list doubtless will.

* Collect information from all those local searches and somehow use it in the global search. This is my long-term goal. Maybe n-grams...

I think there is a crossover point: once the CPU time needed to do a decent tactical analysis at the root becomes a small fraction of the total available for a move, it becomes worthwhile to do so. My program/hardware isn't there yet, but it's just a matter of time.

- Dave Hillis
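
P.S. To make the first rule under SINS OF OMISSION concrete, here is a rough Python sketch of critical scores and the root-level threshold filter. Everything in it is an assumption made for illustration: the names (ConditionalString, critical_scores, root_candidates), the move labels, and the simplification that a move's score is just the total size of the conditionally alive strings it rescues or kills. It ignores the "net" part (stones a move might make worse off) and it treats "passes the threshold" as greater-than-or-equal. It is not taken from Antigo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionalString:
    """Hypothetical analyzer output for one conditionally alive string."""
    size: int                   # number of stones in the string
    deciding_moves: frozenset   # moves that rescue it (mine) or kill it (opponent's)


def critical_scores(strings):
    """Sum, per move, the stones whose status that move would change for the better.

    Simplification (assumption): stones a move would make worse off are ignored.
    """
    scores = {}
    for s in strings:
        for move in s.deciding_moves:
            scores[move] = scores.get(move, 0) + s.size
    return scores


def root_candidates(legal_moves, scores, threshold):
    """Return the moves UCT is allowed to consider at the root.

    If the best critical score reaches the threshold (>= is an assumption),
    keep only the moves that reach it; otherwise leave the move list alone.
    """
    best = max((scores.get(m, 0) for m in legal_moves), default=0)
    if best < threshold:
        return list(legal_moves)  # rule does not trigger
    return [m for m in legal_moves if scores.get(m, 0) >= threshold]


# Made-up position: a 6-stone string decided by C3 or D2,
# and a 2-stone string decided by D2 or G7.
strings = [
    ConditionalString(size=6, deciding_moves=frozenset({"C3", "D2"})),
    ConditionalString(size=2, deciding_moves=frozenset({"D2", "G7"})),
]
scores = critical_scores(strings)
legal = ["A1", "C3", "D2", "G7"]
high_stakes = root_candidates(legal, scores, threshold=5)
no_trigger = root_candidates(legal, scores, threshold=10)
```

With a threshold of 5, only C3 and D2 survive at the root; with a threshold of 10, no move reaches it, so the rule stays out of the way and UCT sees every legal move. That mirrors the behaviour described above: a high threshold rarely triggers, and a lower one prunes more aggressively.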