Łukasz Lew lukasz....@gmail.com:

>Another one is to have a set of labeled positions (win/loss) and make
>your program predict the labels. (This is what the MoGo guys did.)
>It is much faster. But how well is it correlated with true strength?

I think it could be good if the set of positions is well-chosen. But the
correct choice of positions will depend on the program's strength and
playing style, and will have to change over time as the program
develops. Plus you have to avoid overfitting to your test positions.
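
To make the overfitting point concrete, here is a minimal sketch in Python.
It assumes a hypothetical predict_win(position) function that returns True
when the program thinks the side to move wins, and a list of
(position, won) pairs. Both names are mine, for illustration only. The idea
is to tune against one part of the set and report accuracy only on the part
you never tuned against.

```python
import random

def prediction_accuracy(predict_win, labeled):
    """Fraction of labeled positions whose win/loss label is predicted correctly.

    predict_win is a hypothetical callable: position -> bool.
    labeled is a sequence of (position, won) pairs.
    """
    if not labeled:
        return 0.0
    correct = sum(1 for pos, won in labeled if predict_win(pos) == won)
    return correct / len(labeled)

def split_positions(labeled, held_out_fraction=0.25, seed=0):
    """Shuffle once with a fixed seed, then split into (tuning, held_out)."""
    shuffled = list(labeled)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - held_out_fraction))
    return shuffled[:cut], shuffled[cut:]
```

If accuracy on the tuning part keeps rising while the held-out part stalls,
that is probably the overfitting I mean. It does not solve the other
problem, that the right positions change as the program gets stronger.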

>I had an idea to further refine this measure to predict ownership of
>each intersection. This is much faster and completed games don't need
>labeling.

Programs have been written that play using that criterion.

Here's an idea that I've proposed before: Play games, but instead of
looking only at win/loss, look at the evolution of the score over the
game, using some kind of average over the temporal differences. Score
drops indicate that the program was surprised, and you want the program
to avoid surprises. (Score increases mostly mean that the program thinks
the opponent made a mistake. If it's wrong and the opponent continues to
play well, a score drop will follow eventually!)
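
A rough sketch of what I mean, in Python. It assumes you have logged the
program's own score estimate after each move of one game, from the
program's point of view; the function names and the choice of averaging
the drops over all steps are my assumptions, one of several reasonable
kinds of average.

```python
def temporal_differences(scores):
    """Differences between consecutive score estimates within one game."""
    return [later - earlier for earlier, later in zip(scores, scores[1:])]

def mean_score_drop(scores):
    """Average size of the score drops, counting non-drops as zero.

    Lower is better: the program was surprised less often, or less badly.
    """
    diffs = temporal_differences(scores)
    if not diffs:
        return 0.0
    drops = [-d for d in diffs if d < 0]
    return sum(drops) / len(diffs)
```

For example, the estimates [2.0, 3.0, 1.0, 1.0, -2.0] have differences
[1.0, -2.0, 0.0, -3.0], so the mean score drop is (2.0 + 3.0) / 4 = 1.25.
Averaged over many games, this gives a number to compare between versions.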

Temporal differences are ubiquitous in reinforcement learning. My point
is that the same measurements should work for all kinds of learning,
including hand tuning.

The ideas can be combined: Look at temporal differences per
intersection. Or if the program backs up information from a search, you
can look at temporal differences inside the search. More generally, if
it works by combining information in any way (which every program must),
then you should be able to decompose the factors and look at temporal
differences per factor. Suddenly, instead of getting one bit of
information per game, you're getting tons of details about how well the
program understands what's happening throughout.
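
The per-intersection version might look like this in Python. I am assuming
the program can report an ownership estimate in [-1, 1] for every point
after each move, stored as one flat list per move; that data layout and the
use of the absolute change as the measure are assumptions for illustration.

```python
def per_point_surprise(ownership_history):
    """Mean absolute temporal difference of the ownership estimate per point.

    ownership_history[t][i] is the estimated ownership of point i after
    move t, as a value in [-1, 1]. Returns one number per point.
    """
    if not ownership_history:
        return []
    n_points = len(ownership_history[0])
    if len(ownership_history) < 2:
        return [0.0] * n_points
    totals = [0.0] * n_points
    for prev, cur in zip(ownership_history, ownership_history[1:]):
        for i in range(n_points):
            totals[i] += abs(cur[i] - prev[i])
    steps = len(ownership_history) - 1
    return [total / steps for total in totals]
```

Points with a large value are the ones where the program kept changing its
mind, which tells you where on the board to look for misunderstandings.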

>Another idea is to try to predict moves in a set of (pro) games.
>Is the prediction rate well correlated with program strength?

No, very poorly correlated. I think that's well known.

  Jay

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
