On Sat, Jun 6, 2009 at 5:11 AM, Claus Reinke <claus.rei...@talk21.com> wrote:
> >The purpose of a handicap games is to allow a 50% chance of either
> >contestant winning. .. Programs do not care,
>
> Are you sure?-) I haven't got round to moving beyond a plain MC bot yet,
> where the effect is rather striking, but less naive bots also depend on
> win-rate for their move evaluations.
>
> The effect is that the bot is only "interested" in a narrow range of
> fairly balanced games: if it gets too far behind, or too far ahead, its
> play becomes fairly random. The reason being that small differences in
> score may not sufficiently affect the win/lose ratio to be
> distinguishable (it would take longer sequences of such better moves to
> change the win/lose outcome), so good and bad moves are getting lumped
> together in the evaluation.

I once argued, on these same grounds, that MCTS bots would not play
handicap games well. I got a fairly strong reaction from the group.
Many said it would make no difference, and that the really strong bots
would play just as hard even if they started from a technically lost
position (because of the handicap). I think the counter-argument was
that as long as the bot didn't believe it was completely lost, it would
play close to its full potential.

I still disagree. I feel that the statistics generated from the playouts
are going to be more accurate when the chances are nearly equal. It's
not that I think a strong bot would play horribly; I just think it would
play under the wrong assumptions, its performance would not match its
true strength, and the handicap system would be at least somewhat broken
with respect to it.

The way this would manifest itself is that a bot could not give the full
handicap indicated by its strength difference. For instance, if our bots
ever became 9 dan strong, they could not give 8 stones to a 1 dan
player, even though a human 9 dan would be expected to play even with a
1 dan at this handicap.
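To put rough numbers on the saturation effect described above, here is a
minimal sketch. It assumes, purely for illustration, that the final
margin of a playout is normally distributed around the current expected
margin. The function names, the 15-point spread and the 2-point move
improvement are all hypothetical and not taken from any actual bot.

```python
import math


def win_probability(mean_margin, stddev):
    """P(final margin > 0), assuming margins ~ Normal(mean_margin, stddev)."""
    return 0.5 * (1.0 + math.erf(mean_margin / (stddev * math.sqrt(2.0))))


def win_rate_gain(mean_margin, point_gain, stddev):
    """How much the win rate rises if a move improves the margin by point_gain."""
    better = win_probability(mean_margin + point_gain, stddev)
    return better - win_probability(mean_margin, stddev)


def playouts_to_detect(mean_margin, point_gain, stddev, z=2.0):
    """Rough playouts per move needed to separate the two win rates
    by z standard errors (binomial noise on each estimate)."""
    p1 = win_probability(mean_margin, stddev)
    p2 = win_probability(mean_margin + point_gain, stddev)
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return math.ceil(z * z * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    STDDEV = 15.0      # assumed spread of playout margins, in points
    POINT_GAIN = 2.0   # assumed value of the "better" move, in points
    for margin in (0.0, 20.0, 40.0):
        gain = win_rate_gain(margin, POINT_GAIN, STDDEV)
        needed = playouts_to_detect(margin, POINT_GAIN, STDDEV)
        print(f"margin {margin:5.1f}: win-rate gain {gain:.4f}, "
              f"playouts needed ~{needed}")
```

Under these assumptions, the same 2-point improvement moves the win rate
much less when the bot is 40 points ahead (or behind) than in an even
game, so many more playouts are needed to tell the better move from the
worse one.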
I admit that the effect of this could be too small to worry about. The
handicap system is imperfect anyway; it's almost a coincidence that it
works as well as it does.

> However, handicap games are rather different from non-handicap games,
> and bots don't care how a balance is achieved, so one might simply use
> numeric handicaps. Only that a large numeric handicap will have a bad
> effect on win-rate-based bots' evaluation abilities - the weaker bot,
> seeing a large numeric advantage, will play even more weakly. So one
> might have to phase in awareness of handicap gradually, adjusting it
> during the game (as the stronger bot gets ahead in the game, more of the
> handicap gets added to keep the evaluation and gameplay interesting).
>
> Which does sound rather complicated. But it reminds me of a suggestion
> I made a long time ago for human games(*), which might be easier to adopt
> in this context: when a player/bot is far enough ahead that even a double
> move by its opponent will at best catch up, the stronger player can simply
> pass. At the end of the game, any difference in the number of passes is
> counted as handicap, and recorded together with the plain score/result.
>
> It would allow stronger bots to keep weaker bots in play for longer,
> when the remainder of the game would otherwise be a continued
> sigh "oh, if its programmer had only implemented early resign..".

The solution is simply to play for the best total territory score. The
bots would have to be modified to play this way, and of course we know
it makes them weaker, but the game is never over until no further
progress is possible. With some creativity a bot could probably be
modified to play this way without becoming substantially weaker. I think
some bots already have enough knowledge imposed on them that they don't
revert to random-like play, even when it's clear who will win.
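The trade-off between the two objectives can be sketched as follows.
This is only an illustration: the move names and playout margins are
made up, and `best_moves` is a hypothetical helper, not how any
particular bot selects moves.

```python
from statistics import mean


def win_rate(margins):
    """Fraction of playouts won (margin > 0 counts as a win)."""
    return sum(1 for m in margins if m > 0) / len(margins)


def best_moves(candidates, evaluate):
    """Return every move tied for the best value under `evaluate`."""
    scores = {move: evaluate(margins) for move, margins in candidates.items()}
    top = max(scores.values())
    return [move for move, score in scores.items() if score == top]


# Hypothetical playout margins (in points) for a bot that is already ahead.
candidates = {
    "solid": [12, 9, 15, 11, 8],      # always wins, by a good margin
    "slack": [3, 1, 4, 2, 5],         # always wins, but barely
    "greedy": [40, 45, -5, 50, -10],  # huge wins, but sometimes loses
}

if __name__ == "__main__":
    print("by win rate:  ", best_moves(candidates, win_rate))
    print("by mean score:", best_moves(candidates, mean))
```

With these made-up numbers, the win-rate objective cannot tell "solid"
from "slack", which is the lumping-together effect Claus describes. The
mean-score objective does separate them, but it prefers "greedy", a move
that loses some playouts. That is one way to see why playing purely for
score tends to make a bot weaker.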
> It would also allow weaker bots and their authors to get more out of
> playing stronger bots (provided that authors actually study game
> records, not just result statistics) - instead of "that game was really
> lost, but my bot didn't know, so it just kept playing random moves"
> it would be "hmm, to make that game even close, our opponent
> had to pass 5 times, lets look at the pass moves and what went
> wrong to make them possible".
>
> Just another idea,
> Claus
>
> (*) the context was "teaching" games: being rather a beginner myself,
>     and trying to interest newcomers to play the game, I never felt
>     confident in guessing useful handicaps, let alone explaining the
>     different play required to make use of handicap stones. Not to
>     mention that beginner strengths are rather variable.
>
>     So what I did instead was to start out evenly, then deliberately
>     throw in less efficient moves when I thought I was getting too far
>     ahead (an entirely new source of blundering opportunities, btw;-).
>     This was often accompanied by trying to explain to my opponent,
>     within the limited range of my game knowledge, why I thought
>     that their moves weren't as efficient as mine, and what they
>     might have tried instead to achieve their aims.
>
>     The discussions were really useful and interesting (the target
>     audience at the time being academics with incurable affinity
>     for discussing and reasoning), but the slow-downs kept the
>     games going for longer (also: realistic for long stretches, and
>     highlighting opportunities for discussions in between, at the
>     slow-down points). The approach also seemed to take the
>     emphasize away from winning, towards learning/thinking/
>     having fun with exploring the game.
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/