I'd like to say that the problem comes from the fact that the model of the opponent used in the simulations is not accurate enough in the MCTS framework. So the solution would be to make the model more precise, but this makes little practical sense.
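The hidden komi proposed just below (70 points for a 7-stone handicap, shrinking toward the end of the game) can be sketched as follows. This is only an illustration under stated assumptions: the message says nothing about the shape of the decrease, so the linear decay, the 250-move game length, and the function name `hidden_komi` are all hypothetical, and the 10 points per stone simply restates the 7 stones = 70 points example.

```python
def hidden_komi(move_number, handicap_stones,
                points_per_stone=10.0, game_length=250):
    """Return a hidden komi (in points) for the current move.

    Assumptions, not taken from the original message:
      - the komi decays linearly with the move number,
      - a game lasts about `game_length` moves,
      - each handicap stone is worth `points_per_stone` points.
    Which side the komi is credited to inside the playouts is left
    to the engine.
    """
    initial = handicap_stones * points_per_stone
    # Fraction of the game still to be played, clamped at zero so the
    # komi never changes sign in an unusually long game.
    remaining = max(0.0, 1.0 - move_number / game_length)
    return initial * remaining


if __name__ == "__main__":
    for move in (0, 125, 250):
        print(move, hidden_komi(move, handicap_stones=7))
```

For a 7-stone game this gives 70 points at move 0, 35 points at move 125, and 0 points from move 250 onward. Any other monotonically decreasing schedule would fit the description equally well.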
What is komi or handicap? Since W is stronger than B, W is expected to gain some points in the future from any position in the game. A handicap can be thought of as those points at the initial position (assuming handicap stones can be converted to handicap points). Hence, handicap points could be used to correct the model of the opponent. For example, assuming a 7-stone handicap is equivalent to 70 points, we can use 70 for the hidden komi at the beginning and decrease it towards the end of the game. Does this make sense?

Hideki

Don Dailey: <5212e61a0908121411y3198e9d9m55441378fa01...@mail.gmail.com>:
>The problem with MCTS programs is that they like to consolidate. You set
>the komi and thereby give them a goal and they very quickly make moves which
>commit to that specific goal. Committing to less than you need to actually
>win will often involve sacrificing chances to win. Sometimes it won't,
>but you cannot have a scalable algorithm which is this arbitrary.
>
>However, if the handicap is too high, the program thinks every line is a
>loss and it plays randomly. That's why we even consider doing this.
>
>Dynamically changing komi could be of some benefit in that situation if
>there is no alternative reasonable strategy, but it does not address the
>real problem - which is what I call the "committal consolidation"
>problem. You are giving the program an arbitrary short term goal which
>may, or may not be compatible with the long term goal of winning the
>game. Whether it's compatible or not is based on your own credulity -
>not anything predictable or that you can scale. And as the base program
>gets stronger this aspect of the program becomes more and more of a wart.
>
>If this can be made to work in the short term, it should be considered a
>temporary hack which should be fixed as soon as possible.
>
>We have to think about this anyway sooner or later because if programs
>continue to develop and the predictive ability of the playouts and tree
>search gets several hundred ELO better, these programs may start to see
>more and more positions as either dead won or dead lost. I'm sure we
>will want some kind of robust mechanism for dealing with this which is
>better at estimating chances that the opponent will go wrong as opposed to
>doing something that is a random benefit or hindrance.
>
>- Don
>
>2009/8/12 terry mcintyre <terrymcint...@yahoo.com>
>
>> Ingo suggested something interesting - instead of changing the komi
>> according to the move number, or some other fixed schedule, it varies
>> according to the estimated winrate.
>>
>> It also, implicitly, depends on one's guess of the ability of the opponent.
>>
>> An interesting test would be to take an opponent known to be weaker, offer
>> it a handicap, and tweak the dynamic komi per Ingo's suggestion. At what
>> handicap does the ratio balance at 50:50? Can the number of handicap stones
>> be increased with such an adaptive algorithm?
>>
>> Even better, play against a stronger opponent; can one increase the win
>> rate versus strong opponents?
>>
>> The usual range of computer opponents is fairly narrow. None approach
>> high-dan levels on 19x19 boards - yet.
>>
>> Terry McIntyre <terrymcint...@yahoo.com>
>>
>> We hang the petty thieves and appoint the great ones to public office. --
>> Aesop
>> ------------------------------
>> *From:* Brian Sheppard <sheppar...@aol.com>
>> *To:* computer-go@computer-go.org
>> *Sent:* Wednesday, August 12, 2009 12:33:13 PM
>> *Subject:* [computer-go] Dynamic komi at high handicaps
>>
>> >The small samples is probably the least of the problems with this. Do you
>> >actually believe that you can play games against it and not be subjective
>> >in your observations or how you play against it?
>>
>> These are computer-vs-computer games.
>> Ingo is manually transferring moves between two computer opponents.
>>
>> The result does support Ingo's belief that dynamic Komi will help programs
>> play high handicap games. Due to small sample size it isn't very strong
>> evidence. But maybe it is enough to induce a programmer who actually plays
>> in such games to create a more exhaustive test.
>>
>> _______________________________________________
>> computer-go mailing list
>> computer-go@computer-go.org
>> http://www.computer-go.org/mailman/listinfo/computer-go/

--
g...@nue.ci.i.u-tokyo.ac.jp (Kato)