On Wed, Aug 19, 2009 at 9:39 AM, Magnus Persson <magnus.pers...@phmp.se> wrote:
> Don, what you write is certainly true for even games, but I think the
> problem is a real one in high handicap games with the computer as white. I
> use a hack to make Valkyria continue playing the opening in handicap games
> as white. It is forbidden to resign in the opening and early middle game
> because it would if it could.

In handicap games the situation is different. You have roughly even chances whether taking the handicap or giving it.

I think this illustrates that fundamentally this is an opponent modeling issue. And I really like the idea that someone had of throwing in occasional pass moves for the player who is presumed weaker.

There is an analogy in computer chess - it's called the null move heuristic. If it is white to move, you can measure the potential of black's position by playing a pass move for white (called the null move) and then doing a reduced depth search. Whatever score is returned can be considered a lower bound - since you as white skipped one of your moves.

With Go it is a little different. If you are trying to beat a much stronger player but you have been given a nice advantage due to handicap, then the playouts will see you easily winning the game and you will play these random looking moves. However, if the computer throws in some pass moves for itself in the playouts, it will play more focused - it will be challenged to find strategies that work in the presence of its own sloppy play. In other words the computer will stop this "anything works" attitude and it will focus on robust strategies that give it some room for error. It should be able to find these more robust strategies because it knows it is comfortably ahead.

I don't know if this will actually work, it's only at the idea stage as far as I know - but it's something that seems more consistent with the actual problem. Komi manipulation changes the goal, which is very dangerous, but this idea does not change the goal, just how it is achieved.
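To make the pass-in-playouts idea a bit more concrete, here is a rough sketch on a toy game rather than Go. Everything in it is invented for illustration: the point-claiming race, the names playout_wins and win_rate, and the numbers. It only tries to show the effect described above: with no self-passes both candidate moves win every playout ("anything works"), while with self-passes the move that leaves more room for error scores higher.

```python
import random

def playout_wins(margin, remaining, self_pass_prob, rng):
    """One random playout of a toy territory race; True if we finish ahead.

    `margin` is our lead in points, `remaining` the unclaimed points.
    The opponent moves first and always claims one point. On our turns we
    pass with probability `self_pass_prob` (the deliberate sloppy play).
    """
    our_turn = False
    while remaining > 0:
        if our_turn:
            if rng.random() >= self_pass_prob:  # no pass: claim a point
                margin += 1
                remaining -= 1
        else:
            margin -= 1
            remaining -= 1
        our_turn = not our_turn
    return margin > 0

def win_rate(lead, gain, remaining=10, self_pass_prob=0.0, n=2000, seed=0):
    """Estimated win rate after a candidate move worth `gain` points."""
    rng = random.Random(seed)
    wins = sum(
        playout_wins(lead + gain, remaining, self_pass_prob, rng)
        for _ in range(n)
    )
    return wins / n

if __name__ == "__main__":
    for p in (0.0, 0.3):
        slack = win_rate(lead=3, gain=0, self_pass_prob=p)
        solid = win_rate(lead=3, gain=2, self_pass_prob=p)
        print(f"self_pass_prob={p}: slack={slack:.3f} solid={solid:.3f}")
```

Whether anything like this carries over to real Go playouts is exactly the open question; the toy only shows why a saturated win rate stops separating moves and how self-passes can bring the separation back.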
> To rephrase your argument for even games, the problem situation should
> never occur because the losing player *should out of courtesy* resign long
> before the evaluation becomes so skewed.

That's not correct, because with handicap games the premise is different. My reasoning is based on the well known fact that you will not often get outplayed by significantly weaker opponents and you will not often outplay significantly stronger opponents. But this does not apply to handicap games because nobody was outplayed - you started from a game that is a dead win for one side.

In a handicap game, it's not only likely, it is CERTAIN that you will find yourself in a dead won game against a much stronger opponent. In even games this is going to be a rare occurrence.

> But this does not apply to h9 games on 19x19 for example. And if I am not
> mistaken strong heavy playouts evaluate such positions very
> pessimistically, and thus we have a problem to solve, which grows with
> increasing playing strength. Still, stronger programs will discriminate
> between bad and good moves even with extreme scores, so I think the
> dimensions of this problem are exaggerated.

Yes, it's a problem. And likewise with komi manipulation, the stronger the program is the more likely a small komi change will wildly change the score, from dead won to dead lost or vice versa.

Imagine a program so strong that it always plays random moves when losing, and when winning it randomly plays any move that does not lose. It should be obvious that in a winning position, it is going to play a winning move with certainty, but if you adjust komi to make it play "better", the position may now count as lost and it will play a random move - which could be a losing move. This thought experiment constitutes a kind of proof that the idea at its most fundamental level is wrong.

This can be salvaged by doing multiple searches with different komis and only playing moves they have in common.
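A minimal sketch of that salvage, assuming a hypothetical search(komi) callback that returns a win rate per candidate move. The names near_best and common_moves, the tolerance value, and the toy numbers are all made up; no real engine API is implied. A move is kept only if it is among the near-best moves at every komi tried.

```python
from typing import Callable, Dict, Iterable, Set

def near_best(win_rates: Dict[str, float], tolerance: float) -> Set[str]:
    """Moves whose win rate is within `tolerance` of the best one."""
    best = max(win_rates.values())
    return {move for move, rate in win_rates.items() if best - rate <= tolerance}

def common_moves(search: Callable[[float], Dict[str, float]],
                 komis: Iterable[float],
                 tolerance: float = 0.02) -> Set[str]:
    """Intersect the near-best move sets found at each komi value."""
    common = None
    for komi in komis:
        good = near_best(search(komi), tolerance)
        common = good if common is None else common & good
    return common if common is not None else set()

# Invented search results: at the real komi every move looks equally
# winning; at the shifted komi move "a" turns out to give points away.
TOY_RESULTS = {
    7.5: {"a": 0.97, "b": 0.97, "c": 0.96},
    17.5: {"a": 0.40, "b": 0.55, "c": 0.54},
}

def toy_search(komi: float) -> Dict[str, float]:
    return TOY_RESULTS[komi]

if __name__ == "__main__":
    print(sorted(common_moves(toy_search, [7.5, 17.5])))
```

If the intersection comes back empty, the safe fallback is presumably the result of the search at the real komi, since that is the only one aimed at the actual goal.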
I think this all gets complicated (and interesting) because we tend to think in two different ways about playing games: one way is all about correctness, finding the best move in the game theoretic sense, and the other is how to improve your practical winning chances in the face of fallible opposition (such as blowing smoke in his face.) So it's rather hard to make any kind of proof that something like this is "better" or "not better" - it all has to be empirical.

- Don

> -Magnus
>
> Quoting Don Dailey <dailey....@gmail.com>:
>
>> One must decide if the goal is to improve the program or to improve its
>> playing behavior when it's in a dead won or dead lost position.
>>
>> It's my belief that you probably cannot improve the playing strength
>> solely with komi manipulation, but at a slight decrease in playing
>> strength you can probably improve the behavior, as measured by a
>> willingness to fight for space that is technically not relevant to the
>> goal of winning the game. And only then if this is done carefully.
>> However I believe there are better ways, such as pre-ordering the moves.
>>
>> Even if this can prove to be a gain, you are really working very hard to
>> find something that only kicks in when the game is already decided - how
>> to play when the game is already won or already lost. But only the case
>> when the game is lost is this very interesting from the standpoint of
>> making the program stronger.
>>
>> And even this case is not THAT interesting, because if you are losing, on
>> average you are losing to stronger players. So you are working hard on
>> an algorithm to beat stronger players when you are in a dead lost game?
>> How much sense does that make?
>>
>> So the only realistic pay-off here is how to salvage lost games against
>> players that are relatively close in strength, since you can expect not
>> to be in this situation very often against really weak players.
>> So you are hoping to bamboozle players who are not weaker than you - in
>> situations where you have been bamboozled (since you are losing, you are
>> the one being outplayed.)
>>
>> That is why I believe that at best you are looking at only a very minor
>> improvement. If I were working on this problem I would be focused only
>> on the playing style, not the playing strength.
>>
>> If you want more than the most minor playing strength improvement out of
>> this algorithm, you have to start using it BEFORE the loss is clear, but
>> then you are no longer playing for the win when you lower your goals, you
>> are playing for the loss.
>>
>> - Don
>>
>> 2009/8/19 Stefan Kaitschick <stefan.kaitsch...@hamburg.de>
>>
>>> One last rumination on dynamic komi:
>>>
>>> The main objection against introducing dynamic komi is that it ignores
>>> the true goal of winning by half a point. The power of the win/loss
>>> step function as scoring function underscores the validity of this
>>> critique. And yet, the current behaviour of mc bots, when either
>>> leading or trailing by a large margin, resembles random play.
>>> The simple reason for this is that either every move is a win or every
>>> move is a loss.
>>> So from the perspective of securing a win, every move is as good as any
>>> other move.
>>> Humans know how to handle these situations. They try to catch up from
>>> behind, or try to play safely while defending enough of a winning margin.
>>> For a bot, there are some numerical clues when it is misbehaving.
>>> When the calculated win rate is either very high or low and many move
>>> candidates have almost identical win rates, the bot is in coin toss
>>> country.
>>> A simple rule would be this: define a minimum value that has to separate
>>> the win rate of the 2 best move candidates.
>>> Do a "normal" search without komi.
>>> If the minimum difference is not reached, do a new search with some
>>> komi, but only allow the moves within the minimum value range from the
>>> best candidate.
>>> Repeat this with progressively higher komi until the two best candidates
>>> are sufficiently separated. (Or until the win rate is in a defined
>>> middle region.)
>>> There can be some traps here: a group of moves can all accomplish the
>>> same critical goal.
>>> But I'm sure this can be handled. The main idea is to look for a less
>>> ambitious goal when the true goal cannot be reached.
>>> (Or a more ambitious goal when it is already satisfied.) By only
>>> allowing moves that are in a statistical tie in the 0 komi search,
>>> it can be assured that short term gain doesn't compromise the long term
>>> goal.
>>>
>>> Stefan
>>>
>>> _______________________________________________
>>> computer-go mailing list
>>> computer-go@computer-go.org
>>> http://www.computer-go.org/mailman/listinfo/computer-go/
>
> --
> Magnus Persson
> Berlin, Germany
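For reference, Stefan's quoted rule (normal search first, then progressively more komi, restricted to the moves tied in the 0 komi search) could be sketched roughly as below. This is only one reading of the proposal, not anyone's actual implementation. The search(extra_komi, allowed) callback, the function choose_move, all parameter values, and the toy win-rate model are assumptions made up for the example. Here extra_komi means extra points the side to move must win by, so it goes up when we are ahead and down when we are behind.

```python
import math

def choose_move(search, komi_step=10.0, min_gap=0.03,
                middle=(0.3, 0.7), max_rounds=10):
    """Pick a move, adding komi until the two best candidates separate.

    search(extra_komi, allowed) -> {move: win_rate}; allowed=None means
    all legal moves. Hypothetical interface, not a real engine API.
    """
    rates = search(0.0, None)  # the "normal" search without extra komi
    best = max(rates, key=rates.get)
    # Moves in a statistical tie with the best move at 0 extra komi;
    # later searches may only choose among these.
    tied = [m for m in rates if rates[best] - rates[m] < min_gap]
    step = komi_step if rates[best] > 0.5 else -komi_step
    extra_komi = 0.0
    for _ in range(max_rounds):
        ranked = sorted(rates.values(), reverse=True)
        separated = len(ranked) < 2 or ranked[0] - ranked[1] >= min_gap
        if separated or middle[0] <= ranked[0] <= middle[1]:
            break
        extra_komi += step
        rates = search(extra_komi, tied)
    return max(rates, key=rates.get)

# Toy stand-in for a Monte Carlo search: each move has an invented
# expected margin, mapped to a win rate by a logistic curve.
TOY_MARGINS = {"a": 30.0, "b": 25.0, "c": 5.0}
calls = []  # records (extra_komi, allowed) for each search performed

def toy_search(extra_komi, allowed):
    calls.append((extra_komi, allowed))
    moves = TOY_MARGINS if allowed is None else allowed
    return {m: 1.0 / (1.0 + math.exp(-(TOY_MARGINS[m] - extra_komi) / 3.0))
            for m in moves}

if __name__ == "__main__":
    print(choose_move(toy_search))
```

The trap Stefan mentions (several moves that all achieve the same critical goal) is not handled here; in that case the loop simply runs until max_rounds or the middle region is reached.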