Hi,

thanks a lot for sharing! I am trying a slightly different approach at
the moment:

I use a combined policy / value network (adding 3-5 layers with about
16 filters at the end of the policy network for the value network, to
avoid overfitting), and I use the results of the games as the value.

My main problem is still overfitting!

As your results seem good, I think I will try your bigger database to
get more games into training.

I will keep you posted.

Detlef

On 04.03.2016 at 16:23, Hiroshi Yamashita wrote:
> Hi,
>
> I tried to make a Value network.
>
> "Policy network + Value network" vs "Policy network"
> Winrate  Wins/Games
> 70.7%    322 / 455,  1000 playouts/move
> 76.6%    141 / 184, 10000 playouts/move
>
> It seems that the more playouts, the more effective the Value
> network is. The number of games is not enough, though. Search is
> similar to AlphaGo. Mixing parameter lambda is 0.5. Search is
> synchronous. Using one GTX 980. At 10000 playouts/move, the Policy
> network is called 175 times and the Value network is called 786
> times. Node expansion threshold is 33.
>
> Value network is 13 layers, 128 filters.
> (5x5_128, 3x3_128 x10, 1x1_1, fully connected, tanh)
> Policy network is 12 layers, 256 filters.
> (5x5_256, 3x3_256 x10, 3x3_1), Accuracy is 50.1%
>
> For the Value network, I collected 15804400 positions from 987775
> games. Games are from
> GoGoD,
> tygem 9d, 22477 games http://baduk.sourceforge.net/TygemAmateur.7z
> KGS 4d over, 1450946 games http://www.u-go.net/gamerecords-4d/
> (except handicap games).
> I select 16 positions randomly from one game: one game is divided
> into 16 game stages, and one position is selected from each. 1st and
> 9th position are rotated in same symmetry. Then Aya searches with
> 500 playouts, with the Policy network, and stores the winrate
> (-1 to +1). Komi is 7.5. This 500-playout search is around 2730
> BayesElo on CGOS.
>
> I did some of this on Amazon EC2 g2.2xlarge, 11 instances. It took
> 2 days and cost $54. Spot instances are reasonable. However,
> g2.2xlarge (GRID K520) is 3x slower than GTX 980. My Policy
> network (12L 256F) takes 5.37ms (GTX 980) and 15.0ms (g2.2xlarge).
> Test and Training loss are 0.00923 and 0.00778. I think there is no
> big overfitting.
>
> Value network is effective, but Aya still has a fatal semeai
> weakness.
>
> Regards,
> Hiroshi Yamashita
>
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
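
For readers who want to try the position sampling described in the quoted message (each game divided into 16 stages, one position picked at random from each), here is a minimal Python sketch. It is not Aya's code. The function name sample_stage_positions, the equal-width stage boundaries, and the handling of short games are all assumptions made for illustration, and the symmetry rotation mentioned for the 1st and 9th positions is not modeled.

```python
import random


def sample_stage_positions(num_moves, num_stages=16, rng=None):
    """Pick one move index from each of num_stages contiguous stages.

    Hypothetical helper: stages are assumed to be equal-width slices of
    the move list, which the original message does not specify.
    """
    rng = rng or random.Random()
    if num_moves < num_stages:
        # Assumption: games too short to fill every stage are skipped.
        raise ValueError("game is too short to fill every stage")
    picks = []
    for stage in range(num_stages):
        # Integer arithmetic keeps the stages contiguous and non-empty.
        start = stage * num_moves // num_stages
        end = (stage + 1) * num_moves // num_stages  # exclusive bound
        picks.append(rng.randrange(start, end))
    return picks


if __name__ == "__main__":
    # One sampled move index per stage for a 200-move game.
    print(sample_stage_positions(200, rng=random.Random(0)))
```

Sampling 16 positions per game matches the totals given in the message: 987775 games x 16 = 15804400 positions.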