A very insightful post. I enjoyed reading it, and I think it makes sense. It's clear that a lot of energy is wasted on playouts when 99% of them end with the same result.
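A rough way to put numbers on this is the Python sketch below. It is an illustration under stated assumptions, not a measurement from any engine. The final score margin of a playout is modelled as a normal (bell-shaped) distribution, the spread of 15 points and the 2-point difference between a good and a bad move are made-up values, and the function names are hypothetical.

```python
import math

def win_probability(mean_margin, komi, sigma):
    """P(margin > komi) when the final margin is modelled as Normal(mean_margin, sigma)."""
    z = (mean_margin - komi) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def playouts_to_separate(p_a, p_b, z=2.0):
    """Rough playouts per move so that the win-rate gap exceeds z standard errors."""
    gap = abs(p_a - p_b)
    variance = p_a * (1.0 - p_a) + p_b * (1.0 - p_b)
    return math.ceil(z * z * variance / (gap * gap))

def compare(lead, komi, sigma=15.0, edge=2.0):
    """Win rates of a good and a bad move, and the playouts needed to tell them apart.

    sigma (score spread) and edge (points gained by the good move) are
    illustrative assumptions, not values taken from a real program.
    """
    p_good = win_probability(lead + edge, komi, sigma)
    p_bad = win_probability(lead, komi, sigma)
    return p_good, p_bad, playouts_to_separate(p_good, p_bad)

cases = [
    ("even game, komi 0", 0.0, 0.0),
    ("40-point lead, komi 0", 40.0, 0.0),
    ("40-point lead, komi 40", 40.0, 40.0),
]
for label, lead, komi in cases:
    p_good, p_bad, n = compare(lead, komi)
    print(f"{label}: good={p_good:.4f} bad={p_bad:.4f} playouts~{n}")
```

In this toy model, a side that is far ahead sees both moves win almost every playout, so separating them takes over an order of magnitude more playouts than in the even game. Shifting komi to match the lead brings the figures back to the even-game case, which is the effect dynamic komi is aiming for.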
Don

On Mon, Jan 9, 2012 at 10:26 AM, Vlad Dumitrescu <[email protected]> wrote:

> Hi,
>
> On Mon, Jan 9, 2012 at 13:17, Don Dailey <[email protected]> wrote:
> > Summary:
> >
> > I believe a more correct scoring function won't be based on how much you
> > win by OR how often you win but will incorporate some other more relevant
> > concept and it will be dynamic. And it will not matter if the game is
> > a handicap game or otherwise because the scoring function will always be
> > relevant. The goal will be to maximize your winning chances but it
> > will incorporate something more sophisticated than just counting how
> > often you win or how much you win by.
>
> I hope I may interfere with something that Don's nice description
> revealed to me. It feels rather obvious, but since nobody stated it
> explicitly, maybe it's news for at least some people here.
>
> MCTS is maximizing the chances of winning. These chances are largest
> for a minimal score difference because this allows for making some
> errors. Winning by the largest possible score has rather small chances
> to happen because every move has to be perfect.
>
> The curve describing the probability of ending the game with a certain
> score is bell-shaped and MCTS explores the area beneath it, looking
> for winning moves. With handicap, the disadvantaged side is getting
> fewer samples explored, making it less likely to discover the really
> good moves. Dynamic komi shifts the bell left or right in order to
> equalize the sampling on both sides, but as mentioned it isn't dynamic
> enough (the curve changes after each move) and also is actually using
> a different shape for the curve than the real "handicap curve".
>
> In theory, I think that the solution for keeping the same level of
> play with handicap as without would be to make sure that the
> disadvantaged side gets just as many samples with or without handicap.
> That is, use more playouts when playing with handicap. In practice,
> this is probably prohibitive...
>
> I wonder if it might be possible to estimate the shape of this curve
> after each move and use that estimate to dynamically adjust the number
> of playouts. One might have to use higher precision calculations, too,
> so that the noise doesn't get too loud.
>
> Does this make any sense? Has anyone tried something like this?
>
> best regards,
> Vlad
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
