Christoph Birk wrote:
> On Tue, 4 Mar 2008, Magnus Persson wrote:
>> But here you are missing the point that close to 0% winning
>> probability means that it cannot win against random play. The
>> opponent could lose only by killing his own groups.
>
> I don't know why you (and Don) keep bringing up the 0% against random
> play ...
> I am talking about a (typical) situation in the endgame
> where best play (as seen from the program) leads to a sure 0.5 pt loss.
> Many MC programs will make unreasonable attempts of winning by choosing
> a line that shows a possible win (10 pt) if the opponent makes a
> (stupid) mistake. Instead they should go for the (supposedly sure)
> 0.5 pt loss, because the opponent will much more likely make
> the 1 pt mistake, and not the 10 pt mistake.

This is where the 0% against random play comes in: what you skeptically
call "supposedly sure" is based on the fact that tree search cannot find
a single line of play that wins. So you and I can sit down and play one
of these games and I will make random moves and still beat you. And this
is the kind of position you want to get into?

>
> The problem is that the likelihood of your opponent making a mistake
> is hard to determine by the UCT (MC) playouts. I guess one needs
> to use the meta information that it is more likely to make a small
> mistake than to make a big one.

When you get into opponent modeling, you have to understand your
opponent, because opponent modeling usually involves playing weaker
moves in exchange for better practical winning chances.

If that's really what you want, why not just use the territory scoring
method instead of the win/loss record for your MC player? It does
everything you want it to do. It tries to win big, it prefers to lose
small rather than lose big, and it doesn't care how often it loses as
long as it wins big often enough to make up for it.

You can easily modify your current program by treating the result of a
playout as some fraction of a win, based on the possible win range,
instead of as a plain win or loss. You can do this in a linear way, or
you can introduce a bias to control the extent to which it acts like the
programs we have now.

- Don

>
> This is not specific to any particular opponent.
>
> Christoph

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
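
The fractional-win idea above (score a playout as some fraction of a
win, either linearly or with a bias toward plain win/loss) could be
sketched roughly as follows. This is only an illustration under stated
assumptions, not code from any existing program: the names
playout_reward, max_margin and win_bias are hypothetical, and taking the
board area (81 on 9x9) as the "possible win range" is just one choice.

```python
def playout_reward(margin, max_margin, win_bias=0.0):
    """Map a playout's final score margin to a reward in [0, 1].

    margin     -- final score from the program's point of view
                  (positive = win), komi already included
    max_margin -- assumed largest possible margin, e.g. the board area
    win_bias   -- hypothetical knob: 0.0 = purely linear in the score,
                  1.0 = plain win/loss as in current MC programs
    """
    if max_margin <= 0:
        raise ValueError("max_margin must be positive")
    if not 0.0 <= win_bias <= 1.0:
        raise ValueError("win_bias must be in [0, 1]")

    # Linear part: -max_margin maps to 0.0, 0 to 0.5, +max_margin to 1.0.
    clipped = max(-max_margin, min(max_margin, margin))
    linear = 0.5 + 0.5 * clipped / max_margin

    # Win/loss part: what a standard MC player would back up.
    if margin > 0:
        win = 1.0
    elif margin < 0:
        win = 0.0
    else:
        win = 0.5  # tie; cannot occur with a fractional komi

    # Blend the two according to the bias.
    return win_bias * win + (1.0 - win_bias) * linear
```

With win_bias=0.0 and max_margin=81, a 0.5 pt loss scores just under
0.5 and a 10.5 pt win just over it, so the search prefers losing small
to losing big. With win_bias=1.0 the function reduces to the usual
win/loss record. Values in between trade one behaviour off against the
other.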