Brian, do you have any experiments showing what kind of impact it has? It sounds like you have tried training both with and without your ad hoc first-pass cutoff.
2017-12-01 15:29 GMT-06:00 Brian Sheppard via Computer-go <computer-go@computer-go.org>:
> I have concluded that AGZ's policy of resigning "lost" games early is
> somewhat significant. Not as significant as using residual networks, for
> sure, but you wouldn't want to go without these advantages.
>
> The benefit cited in the paper is speed. Certainly a factor. I see two
> other advantages.
>
> First is that training does not include the "fill in" portion of the game,
> where every move is low value. I see a specific effect on the move ordering
> system, since it is based on frequency. By eliminating training on
> fill-ins, the prioritization function will not be biased toward moves that
> are not relevant to strong play. (That is, there are a lot of fill-in
> moves, which are usually not best in the interesting portion of the game,
> but occur a lot if the game is played out to the end, and therefore the
> move prioritization system would predict them more often.) My ad hoc
> alternative is to not train on positions after the first pass in a game.
> (Note that this does not qualify as "zero knowledge", but that is OK with
> me since I am not trying to reproduce AGZ.)
>
> Second is the positional evaluation is not training on situations where
> everything is decided, so less of the NN capacity is devoted to situations
> in which nothing can be gained.
>
> As always, YMMV.
>
> Best,
> Brian
>
>
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
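
For concreteness, here is roughly how I read the "do not train on positions after the first pass" cutoff, as a minimal Python sketch. Everything in it is my own assumption for illustration and not taken from Brian's program: the name truncate_at_first_pass, the encoding of a pass as None, and the choice to drop the pass move itself along with everything after it.

```python
from typing import List, Optional, Tuple

# Hypothetical move encoding: (row, col) for a stone, None for a pass.
Move = Optional[Tuple[int, int]]


def truncate_at_first_pass(moves: List[Move]) -> List[Move]:
    """Return the moves played before the first pass in the game.

    Assumption: the pass and every later move are excluded from
    training. A game with no pass is returned unchanged.
    """
    kept: List[Move] = []
    for move in moves:
        if move is None:  # first pass reached: stop collecting
            break
        kept.append(move)
    return kept


if __name__ == "__main__":
    game = [(3, 3), (15, 15), None, (0, 0), None, None]
    print(truncate_at_first_pass(game))
```

Whether the position in which the first pass is played should itself be kept is a judgment call; the sketch drops it.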