I'm just digging through the many posts I didn't have time to read. This one is particularly interesting to me. Thanks for sharing this idea. I don't have a slow tactical reader, but it helps me understand why heavy playouts work while directly optimizing the strength of the playout doesn't help.
I wonder what would happen if we limit capture and atari heuristics to only prehistoric stones.

Lukasz

On Mon, Feb 2, 2009 at 13:36, Mark Boon <tesujisoftw...@gmail.com> wrote:
> I haven't gotten very far yet in incorporating many of the suggestions
> published on this mailing-list into the MCTS ref-bot. As such I feel I still
> have a lot of catching up to do when it comes to MC programs, mostly due to
> lack of time.
>
> But I had an idea I wanted to share as I haven't seen anything like it
> described here. It comes forth from an observation I had when looking at
> playouts and what effects some of the patterns had on it. So far it's my
> opinion that guiding playouts is mostly useful in order to maintain certain
> features of the original position and prevent the random walk from stumbling
> into an unreasonable result. As an example I'm going to use the simple case
> of a stone in atari that cannot escape. When random play tries an escaping
> move, I make the program automatically play the capturing move to maintain
> the status of the stone(s) (now more than one) in atari. When implementing
> something like that in the playouts however, more often than not this
> 'pattern' arises not based on the original position but purely from the
> random play. I figured it doesn't help the program at all trying to maintain
> the captured status of a stone or stones that weren't even on the board at
> the start of the playout.
>
> So I tried a simple experiment: whenever a single stone is placed on the
> board I record the time (move-number really) it was created in an array I
> call stoneAge. When more stones are connected to the original they get the
> same age. When two chains merge I pick an arbitrary age of the two (I could
> have picked the smallest number, but it doesn't really matter). So for each
> chain of stones the array marks the earliest time of creation.
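For anyone who wants to try this, here is a minimal sketch of the bookkeeping described above. It is not Mark's code: the class `StoneAgeSketch`, the union-find chain representation, and the methods `play`, `chainOf` and `ageOf` are all made up for illustration. It ignores colours and captures (the caller passes in the same-colour neighbours), and on a merge it keeps the smaller age, which is the alternative Mark mentions rather than the arbitrary pick he actually uses.

```java
import java.util.Arrays;

// Hypothetical sketch of stone-age bookkeeping; names are illustrative
// and not taken from the ref-bot. Colours and captures are not modelled.
final class StoneAgeSketch {
    private final int[] parent;   // union-find parent per point, -1 = empty
    private final int[] stoneAge; // creation move number, valid at chain roots
    private int moveNumber = 0;

    StoneAgeSketch(int points) {
        parent = new int[points];
        stoneAge = new int[points];
        Arrays.fill(parent, -1);
    }

    // Returns the chain id (root point) of the stone at the given point.
    int chainOf(int point) {
        if (parent[point] < 0) {
            throw new IllegalArgumentException("empty point " + point);
        }
        while (parent[point] != point) {
            point = parent[point];
        }
        return point;
    }

    // Age of the chain that the stone at the given point belongs to.
    int ageOf(int point) {
        return stoneAge[chainOf(point)];
    }

    // Places a stone and merges it with the given same-colour neighbours.
    void play(int point, int... friendlyNeighbours) {
        moveNumber++;
        parent[point] = point;         // a new single-stone chain
        stoneAge[point] = moveNumber;  // record its time of creation
        for (int n : friendlyNeighbours) {
            int a = chainOf(point);
            int b = chainOf(n);
            if (a == b) {
                continue;              // already the same chain
            }
            parent[b] = a;
            // Assumption: keep the earliest age on merge.
            stoneAge[a] = Math.min(stoneAge[a], stoneAge[b]);
        }
    }

    public static void main(String[] args) {
        StoneAgeSketch board = new StoneAgeSketch(9);
        board.play(0);       // move 1
        board.play(4);       // move 2
        board.play(3, 0, 4); // move 3 joins both chains
        System.out.println(board.ageOf(3)); // prints 1
    }
}
```

The extra cost per move is one array write plus a `min` per merge, which fits the remark that the information is cheap to maintain.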
> Next, when a playout starts, I mark the starting time in a variable I call
> 'playoutStart' and there's a simple function:
>
> boolean isPrehistoric(int chain)
> {
>     return stoneAge[chain]<=playoutStart;
> }
>
> During playout, I only apply the tactical reader to chains for which the
> isPrehistoric() function returns true. Tests show that using this method
> doesn't affect the strength of the program at all. But the amount of time
> spent in the tactical reader is cut to less than half.
>
> I'm suspecting the same holds true to a large degree for other patterns, but
> I haven't had the time yet to test that. Other cases may not provide as much
> gain because they are cheaper to compute. But I think in general it's better
> to let the random play run its course as much as possible and restrict moves
> guided by patterns as much as possible to situations relevant to the
> original position. The stone-age information is very cheap to maintain so
> it's hardly a burden to use.
>
> Hope this helps anyone, especially those with slow tactical readers :) If
> anyone manages to use this successfully in other situations than tactical
> readers I'd be interested to hear it, as so far it's only a hunch that this
> has wider applicability than just tactics. I was going to wait with posting
> this until I had time to try it out for myself but lately I didn't have the
> time.
>
> Mark
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
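P.S. The gating step in Mark's message could be sketched roughly as follows. Only `stoneAge`, `playoutStart` and `isPrehistoric` come from his post. The class `PrehistoricGate`, the methods `startPlayout`, `readIfPrehistoric` and `getReaderCalls`, and the idea of passing the tactical reader in as an `IntPredicate` are my own assumptions, added so the example runs on its own. I also assume `playoutStart` holds the number of the last move made before the playout begins, so that `<=` covers every stone of the original position.

```java
import java.util.function.IntPredicate;

// Hypothetical wrapper around a tactical reader; only stoneAge,
// playoutStart and isPrehistoric are names from the original post.
final class PrehistoricGate {
    private final int[] stoneAge; // creation move number per chain id
    private int playoutStart;     // assumed: last move before the playout
    private int readerCalls = 0;  // how often the reader actually ran

    PrehistoricGate(int[] stoneAge) {
        this.stoneAge = stoneAge.clone();
    }

    void startPlayout(int currentMove) {
        playoutStart = currentMove;
    }

    boolean isPrehistoric(int chain) {
        return stoneAge[chain] <= playoutStart;
    }

    // Consults the reader only for chains that predate the playout.
    // Chains created during the playout are skipped and yield false.
    boolean readIfPrehistoric(int chain, IntPredicate tacticalReader) {
        if (!isPrehistoric(chain)) {
            return false;
        }
        readerCalls++;
        return tacticalReader.test(chain);
    }

    int getReaderCalls() {
        return readerCalls;
    }

    public static void main(String[] args) {
        // Chains 0, 1 and 3 existed by move 10; chain 2 appeared at move 12.
        PrehistoricGate gate = new PrehistoricGate(new int[] {3, 10, 12, 7});
        gate.startPlayout(10);
        for (int chain = 0; chain < 4; chain++) {
            gate.readIfPrehistoric(chain, c -> true); // stand-in reader
        }
        System.out.println(gate.getReaderCalls()); // prints 3
    }
}
```

Counting the reader calls this way makes it easy to check on your own program how much tactical reading the gate actually saves.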