In AlphaGo, the zero plane of the policy network is used as the color
feature for the value network (Extended Data Table 2, page 31).
These networks share the same architecture so that the value
network can be initialized by the policy network before
training.
Hideki
Brian Lee:
>I've been wonder
I agree with you. It makes no sense. You'll take whatever linear
combinations you want and they'll all be zero.
Álvaro.
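
A minimal sketch of the point above, assuming a plain 3x3 convolution with zero padding and no bias term. The 5x5 board, the random kernel and the helper name conv2d_same are made up for illustration; they are not taken from any of the papers discussed.

```python
import random

def conv2d_same(plane, kernel):
    """3x3 cross-correlation with zero padding and no bias term."""
    n = len(plane)
    k = len(kernel) // 2
    out = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            total = 0.0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr, cc = r + dr, c + dc
                    # Cells outside the board contribute nothing (zero padding).
                    if 0 <= rr < n and 0 <= cc < n:
                        total += kernel[dr + k][dc + k] * plane[rr][cc]
            out[r][c] = total
    return out

random.seed(0)
N = 5  # hypothetical small board, for readability only
zeros = [[0.0] * N for _ in range(N)]
kernel = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]

zero_response = conv2d_same(zeros, kernel)
# Whatever the weights are, every output of the all-zero plane is zero.
all_zero = all(v == 0.0 for row in zero_response for v in row)
print(all_zero)
```

With a bias term the output would simply be that bias at every point, so it would still be the same constant everywhere on the board.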
On Tue, Jul 18, 2017 at 6:53 AM, Brian Lee wrote:
> I've been wondering about something I've seen in a few papers (AlphaGo's
> paper, Cazenave's resnet policy architecture),
Hi, my 2 cents:
I think it is more or less redundant for the border. AlphaGo has a plane
for black, white and empty. So a border point is definitely different
anyway, since it will have no features set in any plane, whereas every
point on the board has a 1 set in one of the three b/w/e planes.
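
A rough sketch of that argument, assuming zero padding at the edge. The 5x5 position and the helper name onboard_count are invented for illustration. It checks that the black, white and empty planes sum to an all-ones plane, and that a 3x3 window over such a plane sees fewer set cells near the edge than in the interior.

```python
def onboard_count(plane):
    """Sum each 3x3 neighbourhood, treating off-board cells as 0 (zero padding)."""
    n = len(plane)
    return [[sum(plane[rr][cc]
                 for rr in range(max(0, r - 1), min(n, r + 2))
                 for cc in range(max(0, c - 1), min(n, c + 2)))
             for c in range(n)]
            for r in range(n)]

N = 5  # hypothetical small board
# Made-up position: 'b' = black, 'w' = white, '.' = empty.
position = ["b....",
            ".w...",
            ".....",
            "...b.",
            "....w"]

black = [[int(ch == "b") for ch in row] for row in position]
white = [[int(ch == "w") for ch in row] for row in position]
empty = [[int(ch == ".") for ch in row] for row in position]

# Every on-board point has exactly one of the three planes set,
# so their sum is already a plane filled with 1.
occupancy = [[black[r][c] + white[r][c] + empty[r][c] for c in range(N)]
             for r in range(N)]
ones = [[1] * N for _ in range(N)]
planes_sum_to_ones = occupancy == ones

# With zero padding, a 3x3 window sees fewer 1s near the edge.
counts = onboard_count(occupancy)
corner, edge, interior = counts[0][0], counts[0][2], counts[2][2]
print(planes_sum_to_ones, corner, edge, interior)
```

In this sketch the edge information comes from the zero padding, so a separate ones plane adds nothing that the three b/w/e planes do not already provide together.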
It does, and for the exact same reason as a plane filled with 1.
You have a lot of biases inside your network, so whatever input you
give it, you can be sure it will be transformed, be it a plane full of 0
or a plane full of 1. As you said, it helps the network keep track of
the boundaries