Quoting Heikki Levanto <[EMAIL PROTECTED]>:
On Wed, Feb 07, 2007 at 04:42:01PM -0500, Don Dailey wrote:
In truth the only thing that matters is to increase your winning
percentage - not your score. There seems to be no point in tampering
with this.
I guess I must accept the wisdom of those who have tried these things.
Still, it hurts my intuition that it could be better for a program to
choose a line where it seems to win by 2 points, when another line seems
to end in a 100 point win. What if the opponent can improve his play
from what I expected, and gain an extra 3 points somewhere?
Maybe all this shows that we have not (yet?) understood all the
complications of the MC evaluation, and that more research is needed?
No, it is actually quite simple, although of course more research is needed for
MC, just not on this topic. Here I might repeat what others have already said,
but I will try to explain it the way I understand it.
The line that ends in a 100 point win is often only a win if the opponent makes
a mistake; otherwise you lose some points or even the game. The fallacy here (I
have been there too) is that you think of specific situations where all lines
end in victory. But in real games there are often many lines that win big but
are also risky. You are of course safe when all moves are valued at 1000
each. But before that point some moves might be evaluated at, for example, 998
because the MC eval knows that something may go wrong (a 7-ply deep tactical
defect exists). If you then add an average score of 2 to 1000 and an average
score of 10 to 998, the program makes a mistake. It is all about eliminating
uncertainty. The average score can contain a very large proportion of losses if
they are compensated by bigger wins.
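
To make the arithmetic concrete, here is a minimal Python sketch. The move
names, the dictionary layout and the playout margins are all made up for
illustration; only the 1000/998 and 2/10 figures come from the example above.
It is not taken from any real program.

```python
from statistics import mean

# Hypothetical per-move statistics mirroring the numbers in the text:
# win_value is on the 0..1000 scale, avg_margin is the mean point margin.
moves = {
    "safe": {"win_value": 1000, "avg_margin": 2},
    "greedy": {"win_value": 998, "avg_margin": 10},
}

def pick_by_win_value(stats):
    """Choose the move with the highest win evaluation only."""
    return max(stats, key=lambda m: stats[m]["win_value"])

def pick_by_win_plus_margin(stats):
    """Choose the move after adding the average margin to the win evaluation."""
    return max(stats, key=lambda m: stats[m]["win_value"] + stats[m]["avg_margin"])

def win_rate(margins):
    """Fraction of playouts that ended with a positive margin."""
    return sum(1 for m in margins if m > 0) / len(margins)

# Made-up playout margins: a high average can hide many losses.
safe_playouts = [2] * 10                  # always wins, by 2 points
risky_playouts = [100] * 6 + [-50] * 4    # big wins, but 4 losses in 10

print(pick_by_win_value(moves))           # the sure win
print(pick_by_win_plus_margin(moves))     # the risky move, since 1008 > 1002
print(mean(safe_playouts), win_rate(safe_playouts))
print(mean(risky_playouts), win_rate(risky_playouts))
```

With these assumed numbers the mixed evaluation prefers the move with the
7-ply defect, and the risky playouts average 40 points while losing 4 games
out of 10.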
Since the MC eval is based on random games, it will always give scores less
than the maximum of 1000 in won positions that can still be lost by greedy
moves. And there is no way to securely determine the point at which to start
playing "greedy". Not even when all moves are evaluated as sure wins: there can
be bugs in the program that inflate the win score of moves that lead to a
tactical disaster.
The solution is to improve the simulation part so that real sure wins are
evaluated as sure wins early and among these one can bias the selected moves
toward profitable moves. Valkyria does something like this, although I have
forgotten the exact details now.
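
The idea of biasing only among sure wins could be sketched like this in
Python. This is a guess at the general shape, not Valkyria's actual method; the
function name, the (win_rate, avg_margin) tuple layout and the threshold
parameter are all assumptions made for the example.

```python
def choose_move(stats, sure_win_threshold=1.0):
    """Pick a move from stats: move -> (win_rate, avg_margin).

    Hypothetical rule: if any move reaches the sure-win threshold, choose the
    one with the largest average margin among those. Otherwise fall back to
    the highest win rate and ignore the margin entirely.
    """
    sure = [m for m, (rate, _) in stats.items() if rate >= sure_win_threshold]
    if sure:
        return max(sure, key=lambda m: stats[m][1])
    return max(stats, key=lambda m: stats[m][0])

# Made-up evaluations: "c" wins big on average but is not a sure win.
endgame = {"a": (1.0, 2.0), "b": (1.0, 8.0), "c": (0.99, 40.0)}
middlegame = {"a": (0.9, 1.0), "c": (0.8, 50.0)}

print(choose_move(endgame))      # best margin among the sure wins
print(choose_move(middlegame))   # no sure win yet, so win rate decides
```

The margin never overrides the win rate here, so the rule is only as good as
the simulations behind the win rate: an inflated "sure win" still gets chosen.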
This problem, I think, becomes less important when the program becomes stronger
in other areas. The reason is that the simulation part plays a very bad endgame.
Thus a strong MC/UCT program realizes that more territory is also an insurance
against a bad endgame. I have no solid backing for this, but my impression is
that Valkyria used to win by 0.5 points almost always but now often wins by
much more.
-Magnus
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/