Brian, do you have any experiments showing what kind of impact it has? It
sounds like you have tried both with and without your ad hoc first pass
approach?

2017-12-01 15:29 GMT-06:00 Brian Sheppard via Computer-go <
computer-go@computer-go.org>:

> I have concluded that AGZ's policy of resigning "lost" games early is
> somewhat significant. Not as significant as using residual networks, for
> sure, but you wouldn't want to go without either advantage.
>
> The benefit cited in the paper is speed. Certainly a factor. I see two
> other advantages.
>
> The first is that training does not include the "fill-in" portion of the
> game, where every move is low value. I see a specific effect on the move
> ordering system, since it is based on frequency. By eliminating training
> on fill-ins, the prioritization function will not be biased toward moves
> that are not relevant to strong play. (That is, fill-in moves are rarely
> best in the interesting portion of the game, but they occur frequently if
> the game is played out to the end, so the move prioritization system
> would predict them more often.) My ad hoc alternative is to not train on
> positions after the first pass in a game. (Note that this does not
> qualify as "zero knowledge", but that is OK with me since I am not trying
> to reproduce AGZ.)
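
A rough sketch of that first-pass cutoff follows. The move encoding and the PASS sentinel are assumptions made for illustration only, not Brian's actual data format:

```python
# Hypothetical sketch of "stop training at the first pass": drop every
# position from the first pass onward, so low-value fill-in moves never
# become training targets. Move encoding is invented for this example.

PASS = None  # assume a pass is recorded as None; any sentinel works


def truncate_at_first_pass(moves):
    """Return the prefix of a game's moves before the first pass.

    Positions after the first pass are mostly low-value fill-in moves,
    so dropping them keeps the policy targets focused on the
    interesting portion of the game.
    """
    for i, mv in enumerate(moves):
        if mv is PASS:
            return moves[:i]
    return moves


# Example: a short game where both players pass at the end.
game = [(3, 3), (15, 15), (3, 15), PASS, (15, 3), PASS]
training_moves = truncate_at_first_pass(game)
# Only the three pre-pass moves would be used as training examples.
```

The same idea applies to value targets: positions after the cutoff are simply excluded from the batch, rather than relabeled.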
>
> The second is that positional evaluation is not trained on situations
> where everything is already decided, so less of the NN's capacity is
> devoted to situations in which nothing can be gained.
>
> As always, YMMV.
>
> Best,
> Brian
>
>
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go