Petri Pitkanen wrote:
> I think Monte-Carlo is attempting to solve a different issue
> altogether. Sure it is a search tree, but the main problem is the
> evaluation function. Currently we do not know any good way to evaluate
> the situation on a go board until the game is at very late stages. And I
> think - not that I could support this with any testing - that most of
> the current evaluation functions would not play better if they had
> deeper global search, and may actually play worse with wider global
> search.

Yes, evaluation is a difficult problem in Go. UCT neatly "solves" that problem because UCT does not require a formal evaluation function.
> Relatively speaking, the chess eval of adding piece values together and
> doing nothing else is far closer to an optimal evaluation function than
> what is currently available in Go.

I think the go equivalent is to add up the number of stones and the total number of liberties for each side, where each liberty counts only once. This evaluation function has the characteristic that it returns the correct score when the game is complete, if you use logical rules.

Although some evaluation functions can be constructed which play worse with depth, that is not the general case. In general, any "reasonable" (not good - just reasonable) evaluation function will play better and better with increasing search depth. For instance, the evaluation function I just suggested improves with depth. Even with no quiescence search of any kind - just using the raw evaluation above - you can extrapolate and see that this must be true.

Obviously, since it knows nothing about dead groups, it's quite naive and plays very poorly at 1 ply. However, even at this depth it won't play on the edges of the board, which already makes it better than random. At 1 ply it will be just barely smart enough to make a capture (which also makes it better than random.) At 2 ply it will be smart enough to defend against atari. It will get this wrong a lot; for instance, it will try to defend groups that cannot be defended. At 2 ply it would probably also play out all ladders to the end from the defender's side - win or lose.

But after each additional ply, it will be a little wiser than before, capable of something it was not capable of before. It will also refine what it already "knows", such as being a bit smarter about ladders. For instance, at 4 ply it will know whether to defend a 2 move ladder or whether it's futile, and thereby use its moves elsewhere. If you imagine such a search doing 30 or 40 ply (on a 9x9 board) it might actually be playing pretty strong. It would certainly play endings very well at this point.
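For concreteness, here is a minimal Python sketch of the naive evaluation described above. The board representation is an assumption on my part: a dict mapping (row, col) to 'b' or 'w', with empty points simply absent. The score for a side is its stone count plus its number of distinct liberties, so a shared liberty counts only once.

```python
def neighbors(point, size):
    """Orthogonally adjacent points that lie on the board."""
    r, c = point
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < size and 0 <= nc < size:
            yield (nr, nc)

def side_score(board, size, color):
    """Stones plus distinct liberties for one color.

    Using a set means each empty point counts as a liberty
    at most once, even if several friendly stones touch it.
    """
    stones = [p for p, col in board.items() if col == color]
    liberties = set()
    for p in stones:
        for n in neighbors(p, size):
            if n not in board:      # empty point -> a liberty
                liberties.add(n)
    return len(stones) + len(liberties)

def evaluate(board, size, to_move):
    """Naive eval: own (stones + liberties) minus opponent's."""
    opponent = 'w' if to_move == 'b' else 'b'
    return side_score(board, size, to_move) - side_score(board, size, opponent)
```

Plugged into any plain fixed-depth search, this is all the sketch requires; note that a lone stone in the center scores 1 + 4 while a stone in the corner scores only 1 + 2, which is why even the 1-ply player avoids the edges.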
At low depth levels in the opening, such a program would scatter the stones randomly, trying to get 1 space and 4 liberties for each stone. Although it wouldn't play on the edge or in a corner for no reason, it would likely play a lot of second rank moves, as you mention. There is some depth level, however, where this behavior will start to change. At some point it will learn that a move like b2 runs you out of options later: the opponent gains time attacking this stone from the outside, and whether you are able to defend it or not, the opponent has placed a lot of stones where they count. On 9x9 boards I think it would start playing what look like fairly normal moves at around 15-30 ply. At some point, probably well before it can perform an exhaustive search of the whole game, it will play very strongly, perhaps as well as the best players.

I'm not claiming this is a "good" way to proceed; I'm just trying to refute the idea that it would play worse with depth - this is clearly not true. Of course it's possible to refine this evaluation considerably - it's pretty lousy as I described it - but I used it as an example because it's easy to understand and describe.

- Don

> Also there is not much published information on evaluation functions in
> Go. Obviously go programming is a business, and giving out such
> information does not make sense. The best publicly available thing is
> GnuGo, and it does not even have one.
>
> Any simplistic Go eval would probably result in very bad choices in the
> early stages of the game - like playing on the second row without proper
> reason. So selective search is part of the eval. And this shortcoming is
> pretty obvious in MC programs: when they play on a full sized board they
> make extremely funny moves, and so get good results against humans only in
> ultra-blitz conditions - humans scale better, I guess.
>
> Petri
_______________________________________________ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/