Hi,
Today I had an interesting discussion about bots learning from expert
(Pro/strong KGS) games to prebias the tree search and/or (soft-)prune parts
of the tree.
The point was that playing situational moves out of their usual context can
throw the bot off and force it to 'look' in the wrong direction first.
No doubt the bot can recover from this misjudgement with some playouts,
but it is still sent in the wrong direction first.
Example: Imagine cutting a one-space jump. The bot, looking into its
pattern database, will usually only find this situation where the cut is
somehow reasonable. In those cases the answer is often difficult and
sacrifices have to be made. The most punishing answers won't even be in
its database, as it has never seen a pro game in which the move is
clearly punishable. So the bot's tree search will first check the standard
answers for the difficult cases instead of the clear punishments.
It may then happen that the bot chooses a submissive answer (because that
is the usual reply to the reasonable version of the move) instead of
the good move/punishment.
Surely this example isn't perfect, but I hope it illustrates the problem I
see. Similar things happen with joseki: the bot can play them correctly,
but most likely cannot properly punish deviations, as the wrong variations
are not in the database except where they are contextually possible.
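
To make the effect concrete, here is a minimal sketch in Python. It is not
how any particular bot does it: I assume the database only contributes
virtual wins/visits per candidate move and that selection is plain UCB1.
The move names, reply counts and win rates are invented for illustration.

```python
import math

# Hypothetical counts of expert replies to a cut through a one-space jump.
# "punish" never occurs, because pros only play the cut when it works.
EXPERT_REPLIES = {"extend": 70, "atari_below": 30}

# Made-up true playout win rates in a position where the cut is an overplay.
TRUE_WINRATE = {"extend": 0.45, "atari_below": 0.40, "punish": 0.80}


def priors_from_counts(counts, candidates, virtual_visits=10):
    """Seed each candidate with virtual [wins, visits] from expert frequency."""
    total = sum(counts.get(m, 0) for m in candidates)
    stats = {}
    for m in candidates:
        freq = counts.get(m, 0) / total if total else 0.0
        stats[m] = [freq * virtual_visits, float(virtual_visits)]
    return stats


def select(stats, c=1.0):
    """Pick the candidate with the highest UCB1 value."""
    n = sum(visits for _, visits in stats.values())

    def ucb(m):
        wins, visits = stats[m]
        return wins / visits + c * math.sqrt(math.log(n) / visits)

    return max(stats, key=ucb)


def search(budget):
    """Run `budget` playouts; return the selection order and final stats."""
    stats = priors_from_counts(EXPERT_REPLIES, list(TRUE_WINRATE))
    order = []
    for _ in range(budget):
        move = select(stats)
        order.append(move)
        # Deterministic stand-in for a playout: credit the expected result.
        stats[move][0] += TRUE_WINRATE[move]
        stats[move][1] += 1.0
    return order, stats


if __name__ == "__main__":
    order, stats = search(5000)
    print("first move examined:", order[0])
    print("playouts before 'punish' is first tried:", order.index("punish"))
    print("most visited:", max(stats, key=lambda m: stats[m][1]))
```

In this toy setup the search starts with the database's favourite reply and
only reaches the punishing move once the exploration term has grown enough,
which is the detour described above.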

What makes this problem even worse is that the standard methods of
playtesting won't reveal it. In tests against (own or other) engines,
if both use a similar database, those moves won't appear out of context.
Even playtesting against random opponents on KGS won't show those
weaknesses clearly: even if single players identify the weak spots,
their number of games usually won't be significant. I'm not even sure how
one could systematically check for such misjudgements by the bot.

Overall, I'm in no way against learning from expert games, and I think
there is no doubt that it is a significant source of improvement for the
bots. But the question remains how those weaknesses could be fixed. The
bots have learned how to answer proper play. But how do they learn to
answer unusual/bad play?

Marc
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
