Christoph Birk wrote:
> On Tue, 4 Mar 2008, Magnus Persson wrote:
>> But here you are missing the point that close to 0% winning
>> probability means that it cannot win against random play. The
>> opponent  could lose only by killing his own groups.
>
> I don't know why you (and Don) keep bringing up the 0% against random
> play ...
> I am talking about a (typical) situation in the endgame
> where best play (as seen from the program) leads to a sure 0.5 pt loss.
> Many MC programs will make unreasonable attempts at winning by choosing
> a line that shows a possible win (10 pt) if the opponent makes a
> (stupid) mistake. Instead they should go for the (supposedly sure)
> 0.5 pt loss, because the opponent will much more likely make
> the 1pt mistake, and not the 10 pt mistake.
This is where the 0% against random play comes in: what you skeptically
call "supposedly sure" is based on the fact that tree search cannot find
a single line of play that wins.  So you and I can sit down and play
one of these games, and I will make random moves and still beat you.
And this is the kind of position you want to get into?

>
> The problem is that the likelihood of your opponent making a mistake
> is hard to determine by the UCT (MC) playouts. I guess one needs
> to use the meta information that it is more likely to make a small
> mistake than to make a big one.
When you get into opponent modeling,  you have to understand your
opponent, because usually opponent modeling involves playing weaker
moves in exchange for better practical winning chances.   

If that's really what you want, why not just use the territory
scoring method instead of the win/loss record for your MC player?  It
does everything you want it to do: it tries to win big, it prefers to
lose small rather than lose big, and it doesn't care how often it loses,
as long as it can win big enough times to make up for it.  You can easily
modify your current program by counting the result of a playout as
some fraction of a win, instead of a hard win or loss, based on the possible
win range.  You can do this in a linear way, or you can introduce a
bias to control the extent to which it acts like the programs we have now.
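A minimal sketch of that idea (the function name and the sigmoid mapping are just one possible choice, not from any particular program): each playout returns a value in [0, 1] derived from its final score margin, with a bias parameter that interpolates between nearly linear credit for the margin and the hard 0/1 win-loss scoring used now.

```python
import math

def playout_value(score_margin, bias=0.05):
    """Map a playout's final score margin (positive = we won) to [0, 1].

    A small bias gives almost linear credit for the size of the margin,
    so the program tries to win big and prefers small losses to big ones.
    A large bias makes the function approach a hard 0/1 step at zero,
    i.e. the plain win/loss scoring current MC programs use.
    """
    return 1.0 / (1.0 + math.exp(-bias * score_margin))
```

The returned value would then be accumulated in the UCT statistics exactly where a 0 or 1 result is accumulated today, so a 10-point loss drags a node's value down more than a 0.5-point loss does.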

- Don

>
> This is not specific to any particular opponent.
>
> Christoph
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>
