I like that idea - I'm not sure I've fully digested it's implications but it does deliver more information that could be put to good use.
Don On Mon, Jan 9, 2012 at 1:53 PM, Zach Wegner <[email protected]> wrote: > On Mon, Jan 9, 2012 at 7:26 AM, Vlad Dumitrescu <[email protected]> > wrote: > > Hi, > > > > On Mon, Jan 9, 2012 at 13:17, Don Dailey <[email protected]> wrote: > >> Summary: > >> > >> I believe a more correct scoring function won't be based on how much > you win > >> by OR how often you win but will incorporate some other more relevant > >> concept and it will be dynamic. And it will not matter if the game is > >> a handicap game or otherwise because the scoring function will always be > >> relevant. The goal will be to maximize your winning chances but it > >> will incorporate something more sophisticated that just counting how > often > >> you win or how much you win by. > > > > I hope I may interfere with something that Don's nice description > > revealed to me. It feels rather obvious, but since nobody stated it > > explicitly, maybe it's news for at least some people here. > > > > MCTS is maximizing the chances of winning. These chances are largest > > for a minimal score difference because this allows for making some > > errors. Winning by the largest possible score has rather small chances > > to happen because every move has to be perfect. > > > > The curve describing the probability of ending the game with a certain > > score is bell-shaped and MCTS explores the area beneath it, looking > > for winning moves. With handicap, the disadvantaged side is getting > > less samples explored, making it less likely to discover the really > > good moves. Dynamic komi shifts the bell left or right in order to > > equalize the sampling on both sides, but as mentioned it isn't dynamic > > enough (the curve changes after each move) and also is actually using > > a different shape for the curve than the real "handicap curve". > > > > In theory, I think that the solution for keeping the same level of > > play with handicap as without would be to make sure that the the > > disadvantaged side gets just as many samples with or without handicap. > > That is, use more playouts when playing with handicap. In practice, > > this is probably prohibitive... > > > > I wonder if it might be possible to estimate the shape of this curve > > after each move and use that estimate to dynamically adjust the number > > of playouts. One might have to use higher precision calculations, too, > > so that the noise doesn't get too loud. > > > > Does this make any sense? Has anyone tried something like this? > > > > best regards, > > Vlad > > _______________________________________________ > > Computer-go mailing list > > [email protected] > > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go > > This is similar to ideas I've had to solve this. Of course, I'm not a > go programmer, so it's really just idle curiosity for me when reading > this list. Perhaps an ideal scoring function is "highest median > score". You can do this in a slow/memory intensive way by storing a > histogram of final score in each node instead of wins/losses. > Wins/losses can then be computed by summing the histogram buckets > above/below the komi level. Computing the median is simply finding the > bucket where the sum crosses 50%. Other percentages could be used here > as well, though I'm not quite sure what effect this might have. > > This solves the problem of getting far less than one bit of > information per playout when the win rate is very high or low, and > also supersedes dynamic komi--if you have the full histogram at any > node, you can compute the exact win rate for any arbitrary komi. > > Using this in practice might be fairly difficult, and would probably > involve making the histogram "lossy" in some sense. I would be > interested if somebody here could run tests with this as a limit study > though--keep the number of nodes allocated and playouts per move > constant, and see the effect on strength. > > Zach > _______________________________________________ > Computer-go mailing list > [email protected] > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go >
_______________________________________________ Computer-go mailing list [email protected] http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
