> I think it would be much more informative to compare evaluator A and
> evaluator B in the following way.
> Make a bot that searches to a fixed depth d before calling a
> static evaluator (maybe this depth is 1 or 2 or something small). Try
> to determine the strength of a bot using A and a bot using B as
> accurately as possible against a variety of opponents. The better
> evaluator is defined to be the one that results in the stronger bot.
If you do this I'd suggest also including Monte Carlo as one of your
static evaluators. You want a score, but Monte Carlo usually returns
information like "17 black wins, 3 white wins". However, you can instead
sum ownership over the terminal positions: if A1 is owned by black 15
times and by white 5 times, count that as a point for black; if
ownership is exactly equal, count the point for neither side.
(Alternatively, just sum the black and white scores of each terminal
position.) Rough sketches of the ownership counting, and of the
fixed-depth search harness proposed above, are below, after the sig.

You could have two or three versions using different numbers of playouts
(with the resulting trade-off that more playouts means fewer nodes
visited in the global search); I suspect 20-50 playouts will be the
optimum.

My hunch is that the Monte Carlo version will always outperform any
static evaluation, given the same overall time (*). But it would be
interesting to know.

Darren

*: Or run the experiment giving the static evaluation four times the
clock time, on the assumption there is more potential for optimization
in complex code.

-- 
Darren Cook, Software Researcher/Developer
http://dcook.org/mlsn/ (English-Japanese-German-Chinese-Arabic
open source dictionary/semantic network)
http://dcook.org/work/ (About me and my work)
http://dcook.org/blogs.html (My blogs and articles)
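
For what it's worth, here is a rough Python sketch of the ownership
counting. Everything about the board interface (a GoBoard with copy(),
points(), legal_moves(), play() and owner()) is invented for
illustration, and the playout policy is just uniformly random legal
moves; substitute whatever your own code provides.

import random

# NOTE: the GoBoard interface used here (copy(), points(),
# legal_moves(colour), play(move, colour), owner(point)) is made up for
# illustration; it is not from any particular program.

BLACK, WHITE, NEUTRAL = 1, -1, 0

def random_playout(board, to_move, max_moves=400):
    # Uniformly random legal moves until two consecutive passes
    # (with a move cap so a playout always terminates).
    b = board.copy()
    passes = 0
    for _ in range(max_moves):
        moves = b.legal_moves(to_move)
        if not moves:
            passes += 1
            if passes == 2:
                break
        else:
            passes = 0
            b.play(random.choice(moves), to_move)
        to_move = -to_move
    return b

def mc_ownership_eval(board, to_move, playouts=20):
    # Sum ownership over the terminal positions of `playouts` random
    # games: a point owned by black more often than white counts for
    # black, the reverse counts for white, and exactly equal ownership
    # counts for neither side.  Returns (black points - white points).
    black_count = {p: 0 for p in board.points()}
    white_count = {p: 0 for p in board.points()}
    for _ in range(playouts):
        terminal = random_playout(board, to_move)
        for p in board.points():
            o = terminal.owner(p)
            if o == BLACK:
                black_count[p] += 1
            elif o == WHITE:
                white_count[p] += 1
    score = 0
    for p in board.points():
        if black_count[p] > white_count[p]:
            score += 1
        elif white_count[p] > black_count[p]:
            score -= 1
    return score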
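
And a sketch of the fixed-depth-d wrapper from the quoted proposal,
again assuming the same made-up board interface. Any static evaluator
returning a score from black's point of view plugs in here: evaluator A,
evaluator B, or the Monte Carlo one above; the resulting bots are what
you would then play against a variety of opponents.

def negamax(board, to_move, depth, evaluate):
    # Search to a fixed depth, then call the static evaluator; the
    # score is returned from the point of view of the side to move.
    moves = board.legal_moves(to_move)
    if depth == 0 or not moves:
        return to_move * evaluate(board, to_move)
    best = float("-inf")
    for m in moves:
        child = board.copy()
        child.play(m, to_move)
        best = max(best, -negamax(child, -to_move, depth - 1, evaluate))
    return best

def pick_move(board, to_move, depth, evaluate):
    # The "bot built from evaluator X": choose the move whose
    # depth-limited negamax value is best for the side to move
    # (None means pass).
    best_move, best_score = None, float("-inf")
    for m in board.legal_moves(to_move):
        child = board.copy()
        child.play(m, to_move)
        score = -negamax(child, -to_move, depth - 1, evaluate)
        if score > best_score:
            best_move, best_score = m, score
    return best_move

# e.g. pick_move(board, BLACK, depth=2,
#                evaluate=lambda b, c: mc_ownership_eval(b, c, playouts=20))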