-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

thanks a lot for sharing! I am trying a slightly different approach at the
moment:

I use a combined policy/value network (adding 3-5 layers with about
16 filters at the end of the policy network for the value network, to
avoid overfitting), and I use the game results as the value target. My
main problem is still overfitting!

Since your results look good, I think I will try your larger database to
get more games into training.

I will keep you posted.

Detlef

Am 04.03.2016 um 16:23 schrieb Hiroshi Yamashita:
> Hi,
> 
> I tried to make Value network.
> 
> "Policy network + Value network"  vs  "Policy network"
> 
> Winrate   Wins/Games   Playouts/move
> 70.7%     322 / 455     1000
> 76.6%     141 / 184    10000
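
To put the two winrates above in perspective, here is a minimal sketch of
the usual binomial standard error for a winrate. It assumes the games are
independent and ignores draws; the function name is made up for
illustration and is not taken from Aya.

```python
import math

def winrate_with_error(wins, games):
    """Return (winrate, standard_error) under a simple binomial model.

    Assumption: games are independent and every game is a win or a loss.
    """
    if games <= 0 or not 0 <= wins <= games:
        raise ValueError("need 0 <= wins <= games and games > 0")
    p = wins / games
    se = math.sqrt(p * (1.0 - p) / games)
    return p, se

# The two results quoted above.
for label, wins, games in [("1000 playouts/move", 322, 455),
                           ("10000 playouts/move", 141, 184)]:
    p, se = winrate_with_error(wins, games)
    print(f"{label}: winrate {p:.4f}, standard error {se:.4f}")
```

Under these assumptions the standard error is roughly 2 percentage points
for 455 games and roughly 3 for 184 games, so both results are clearly
above 50%, but the gap between the two playout settings is within the
noise.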
> 
> It seems that the more playouts are used, the more effective the
> Value network is, though there are not enough games to be sure. Search
> is similar to AlphaGo. The mixing parameter lambda is 0.5. Search is
> synchronous, using one GTX 980. With 10000 playouts/move, the Policy
> network is called 175 times and the Value network 786 times. The node
> expansion threshold is 33.
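
For readers who have not seen the mixing parameter before: in the AlphaGo
paper a leaf is evaluated as (1 - lambda) * value_net + lambda * rollout.
The sketch below assumes Aya combines the two in the same way, which the
mail only implies with "similar to AlphaGo". The function and argument
names are hypothetical.

```python
def mixed_leaf_value(value_net_output, rollout_result, lam=0.5):
    """Blend a value-network estimate with a rollout outcome.

    Both inputs are assumed to lie in [-1, +1] and to be from the same
    player's point of view. lam=0 uses only the value network, lam=1
    uses only the rollout; lam=0.5 is the setting quoted above.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return (1.0 - lam) * value_net_output + lam * rollout_result

# Example: value network says slightly ahead, the rollout was a win.
leaf_value = mixed_leaf_value(0.2, 1.0)
```

With lambda = 0.5 both evaluators get equal weight, so a single rollout
win or loss can move the leaf value by at most 0.5 in either direction.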
> 
> 
> Value network is 13 layers, 128 filters
> (5x5_128, 3x3_128 x10, 1x1_1, fully connected, tanh).
> Policy network is 12 layers, 256 filters
> (5x5_256, 3x3_256 x10, 3x3_1). Accuracy is 50.1%.
> 
> For the Value network, I collected 15804400 positions from 987775
> games. Games are from:
>   GoGoD,
>   Tygem 9d, 22477 games
>     http://baduk.sourceforge.net/TygemAmateur.7z
>   KGS 4d and over, 1450946 games
>     http://www.u-go.net/gamerecords-4d/ (excluding handicap games).
> I select 16 positions randomly from each game: each game is divided
> into 16 stages, and one position is selected from each stage. The 1st
> and 9th positions are rotated with the same symmetry. Then Aya searches
> each position with 500 playouts, using the Policy network, and the
> winrate (-1 to +1) is stored. Komi is 7.5. Aya with 500 playouts is
> around 2730 BayesElo on CGOS.
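
The sampling step described above could look roughly like this. It is a
sketch under my own assumptions: stages are contiguous and of (nearly)
equal length, and games shorter than 16 moves are rejected. The mail does
not say how Aya handles either point, and the function name is
hypothetical.

```python
import random

def sample_positions(num_moves, num_stages=16, rng=random):
    """Pick one move index from each of num_stages contiguous stages.

    Assumption: stages split the game as evenly as possible, and a game
    needs at least num_stages moves to be usable.
    """
    if num_moves < num_stages:
        raise ValueError("game is too short to split into stages")
    picks = []
    for stage in range(num_stages):
        start = stage * num_moves // num_stages
        end = (stage + 1) * num_moves // num_stages  # exclusive bound
        picks.append(rng.randrange(start, end))
    return picks

# Example: 16 move indices from a 200-move game, reproducible via a seed.
indices = sample_positions(200, rng=random.Random(0))
```

Taking exactly 16 positions per game matches the totals in the mail:
987775 games x 16 = 15804400 positions.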
> 
> I did some of this on Amazon EC2 g2.2xlarge, using 11 instances. It
> took 2 days and cost $54. Spot instances are reasonably priced. However,
> g2.2xlarge (GRID K520) is about 3x slower than a GTX 980: my Policy
> network (12L 256F) takes 5.37 ms on the GTX 980 and 15.0 ms on
> g2.2xlarge. Test and training losses are 0.00923 and 0.00778, so I
> think there is no big overfitting.
> 
> The Value network is effective, but Aya still has a fatal semeai
> weakness.
> 
> Regards, Hiroshi Yamashita
> 
> _______________________________________________ Computer-go mailing
> list Computer-go@computer-go.org 
> http://computer-go.org/mailman/listinfo/computer-go
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQIcBAEBAgAGBQJW2cD5AAoJEInWdHg+Znf4KXIP/2rrfEph3VNHkrf5B4H0DJXm
abSsFbqF453SjFOucjSGXv8Ecp90lCmwz41NWkQEpBLvedjl4atjMoBiCorjqhny
ZKeFUgY6tK0HWU2euxHH9reJ6HAsDrlYgMrJKqNySdAtPxNq2buMW1qIiFrAHCsL
wCsYlwVtz4EpViJcuSXoFufreTyfUJ7p8AxrhRtuC6ALZI1wUTm+xrwrCHPQ91Bg
AKx5N2xLO2c7rHCt9FsLhR1BmXgximzmYsD7Sge4mdYMwU5nrRhxgAvX1Uj8sP8Z
2YfF+/8YmFP/rc55LqqRGzjeUwpJaX8rv1eHxl+eaNoptP7PZcFchsC5motc6XNV
fjTwOhyaeEsPlPIDylJN5PNPn2hXc75MqVDHMnUn2J+VF2DdlerKMmZhqTd1VaIu
sHz1+DN7PNZIO4cO3AKi9ynmBEHB1pQaRH4nDWkL6hdI8Zv6ZgJEjRhXjnFWyJcI
PVmErcUI6Xn1xCXHEWhxSjKwuwil/RgdVfPgywfqhj1MiuTtkcrThpUmWcPCrLRk
fxsNddSKmJcFs4nCcK/M6oO/OiZ6mn7dO4xoCWnAvds3aW71tEupTuZhYjiWx9YH
KR5p4r7JIBNCSn1ZfonD3BKMKyBv7qIJ63ITSAdy0EH3aJPt4CVmZm2dsrE3ZhtW
wMqhp4Yf8ecTiapJOcol
=oJsM
-----END PGP SIGNATURE-----