Hi,

Today I had an interesting discussion about bots that learn from expert (pro/strong KGS) games to pre-bias the tree search and/or (soft-)prune parts of the tree. The point was that playing situational moves out of their usual context can throw the bot off and force it to 'look' in the wrong direction first. No doubt the bot can recover from this misjudgement with some playouts, but it is still sent in the wrong direction first.

Example: Imagine cutting a one-space jump. The bot, looking into its pattern database, will usually find this situation only in games where the cut was somehow reasonable. In those cases the answer is often difficult and sacrifices have to be made. The most punishing answers won't even be in its database, as it has never seen a pro game in which the move was clearly punishable. So the bot's tree search will first check the standard answers for the difficult cases instead of the clear punishments. It may then choose a submissive answer (because that is what usually happens against the reasonable version of the move) instead of the good move/punishment.

Surely this example isn't perfect, but I hope it illustrates the problem I see. Similar things happen with joseki, which the bot can play correctly but most likely cannot punish properly, as the wrong variations are not in the database except where they are contextually possible.
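To make the effect concrete, here is a toy sketch of a single tree node whose move selection is biased by database priors. Everything in it is an assumption for illustration: the three move names, the win rates and the priors are made up, the selection rule is a generic PUCT-like formula and not the rule of any particular bot, and each playout simply returns the move's expected result so that the run is deterministic.

```python
import math

# Hypothetical numbers for one position: the opponent has just played an
# unreasonable cut. "punish" is the refutation the database has hardly seen.
WIN_RATES = {"submissive": 0.45, "standard_fight": 0.40, "punish": 0.70}
DATABASE_PRIORS = {"submissive": 0.90, "standard_fight": 0.09, "punish": 0.01}
FLAT_PRIORS = {move: 1.0 / 3.0 for move in WIN_RATES}


def first_visit(priors, win_rates, target, c=1.0, max_playouts=10_000):
    """Return the playout number at which `target` is first selected, or None."""
    visits = {move: 0 for move in priors}
    value_sum = {move: 0.0 for move in priors}

    for t in range(1, max_playouts + 1):
        def score(move):
            n = visits[move]
            mean = value_sum[move] / n if n else 0.0
            # Prior-weighted exploration bonus (PUCT-like; an assumed formula).
            return mean + c * priors[move] * math.sqrt(t) / (1 + n)

        move = max(priors, key=score)
        if move == target:
            return t
        visits[move] += 1
        # Simplification: each playout returns the move's expected result.
        value_sum[move] += win_rates[move]
    return None


biased_t = first_visit(DATABASE_PRIORS, WIN_RATES, "punish")
flat_t = first_visit(FLAT_PRIORS, WIN_RATES, "punish")
print("first look at 'punish' with database priors:", biased_t)
print("first look at 'punish' with flat priors:", flat_t)
```

With these made-up numbers the refutation is first examined after a handful of playouts under flat priors. Under the database priors it cannot be examined before playout 2000, because its bonus of 0.01 * sqrt(t) has to exceed the 0.45 that the submissive answer already scores. The search does recover, but only after spending its early playouts on the 'usual' answers.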
What makes this problem even worse is that it won't be noticed with the standard methods of playtesting. In tests against (own or other) engines, if both use a similar database, those moves won't appear out of context. And even playtesting against random opponents on KGS won't show those weaknesses clearly: even if single players identify the weak spots, their number of games usually won't be significant. I'm not even sure how one could systematically check for such misjudgements by the bot.

Overall, I'm in no way against learning from expert games, and I think there is no doubt that it is a significant source of improvement for the bots. But the question remains how those weaknesses could be fixed. The bots have learned how to answer proper play. But how do they learn to answer unusual/bad play?

Marc
_______________________________________________
Computer-go mailing list
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
