Hi,

Good to start this discussion here. I have had this discussion a few times, and we (my discussion partner and I) were not sure against which test set the 57% was measured. If you train and test on a KGS 6d+ dataset, it seems reasonable to reach 57% (I reached >55% within two weeks, but then changed to GoGoD); testing against GoGoD, I do not get near that value either (just over 51% after three weeks on a GTX 970).
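To pin down what "measured against" means here: a rough sketch of top-1 move-prediction accuracy over a held-out test set. The model interface and the position encoding are placeholders, not my actual code; the point is just that the same network gives a different number depending on which games fill the test arrays.

    import numpy as np

    def top1_accuracy(model, positions, moves):
        """Fraction of test positions where the network's most probable
        move matches the move actually played in the game record.

        positions -- encoded board states, shape (N, planes, 19, 19)
        moves     -- indices of the played moves in [0, 361), shape (N,)
        model     -- anything with predict(positions) -> (N, 361) probabilities
        """
        probs = model.predict(positions)
        predicted = np.argmax(probs, axis=1)      # the network's first choice
        return float(np.mean(predicted == moves))

    # The KGS-vs-GoGoD question in one line: same network, two numbers.
    # acc_kgs   = top1_accuracy(net, kgs_positions, kgs_moves)
    # acc_gogod = top1_accuracy(net, gogod_positions, gogod_moves)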
By the way, both get roughly the same playing strength in my case. So, if somebody is sure it was measured against GoGoD, I think a number of other Go programmers will have to think again. I have heard of them reaching 51% (e.g. posts by Hiroshi on this list).

Detlef

On 23.08.2016 at 08:40, Robert Waite wrote:
> I had subscribed to this mailing list back with MoGo... and remember probably arguing that the game of Go wasn't going to be beaten for years and years. I am a little late to the game now, but was curious if anyone here has worked with supervised learning networks like in the AlphaGo paper.
>
> I have been training some networks along the lines of the AlphaGo paper and the DarkForest paper... and a couple of others... and am working with a single GTX 660. I know... laugh... but it's a fair benchmark and I'm being cheap for the moment.
>
> Breaking 50% accuracy is quite challenging. I have attempted many permutations of learning algorithms and can hit 40% accuracy in perhaps 4-12 hours, depending on the parameters. Some things I have tried: the default AlphaGo setup but with 128 filters, minibatch size 32 and learning rate 0.01; changing the optimizer from vanilla SGD to Adam or RMSProp; and changing the batching to match the DarkForest style (making sure that a minibatch contains samples from all game phases... for example beginning, middle and endgame). Pretty much everything seems to converge at a rate that will really stretch out. I am planning on picking a line and going with it for an extended training run, but was wondering if anyone has ever gotten close to the convergence rates implied by the DarkForest paper.
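A rough sketch of that DarkForest-style batching, as I understand it: keep per-phase pools and fill each minibatch evenly from them, so no update sees only opening positions. The three-way phase split and all names here are my guesses for illustration, not the paper's actual code.

    import random

    def phase_of(move_number):
        """Crude three-way split by move number: opening / middle / endgame."""
        return min(move_number // 80, 2)

    def phase_balanced_minibatch(pools, batch_size=32):
        """Draw a minibatch spread evenly over game phases.

        pools -- three lists of (position, played_move) pairs,
                 pre-split with phase_of(); each assumed large.
        """
        per_phase = batch_size // len(pools)
        batch = []
        for pool in pools:
            batch.extend(random.choices(pool, k=per_phase))  # with replacement
        while len(batch) < batch_size:                       # top up the remainder
            batch.append(random.choice(random.choice(pools)))
        random.shuffle(batch)  # avoid a fixed phase order within the batch
        return batch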
> For comparison: the Google team had 50 GPUs, spent 3 weeks, and processed 5440M state/action pairs. The FB team had 4 GPUs, spent 2 weeks, and processed 150M-200M state/action pairs. Both seemed to get to around 57% accuracy with their networks.
>
> I have also been testing them against GnuGo as a baseline... and find that GnuGo can be beaten rather easily with very little network training. My eye is on Pachi, but I think I have to break 50% accuracy to even worry about that.
>
> I have also played with the reinforcement learning phase... started with a learning rate of 0.01, which I think was too high. That takes quite a bit of time on my machine, so I didn't play with it too much yet.
>
> Anyway... does anyone have any tales of how long it took to break 50%? What is the magic bullet that will help me converge there quickly!
>
> Here is a long-view graph of various attempts:
>
> https://drive.google.com/file/d/0B0BbrXeL6VyCUFRkMlNPbzV2QTQ/view
>
> The red and blue lines are from another member who ran a minibatch of 32, learning rate 0.01, and 128 filters in the middle layers vs. 192. They had 4 K40 GPUs, I believe. They also used 40000 training pairs to 40000 validation pairs... so I imagine that is why they had such a spread. There is a jump in the accuracy, which was when the learning rate was decreased to 0.001, I believe.
>
> Closer shot:
>
> https://drive.google.com/file/d/0B0BbrXeL6VyCRVUxUFJaWVJBdEE/view
>
> Most stay between the lines... but looking at both graphs makes me wonder if any of the lines are approaching the convergence of DarkForest. My gut tells me they were onto something... and I am rather curious about the playing strength of the DarkForest SL network and the AG SL network.
>
> Also... a picture of the network's view of a position... this one was trained to 41% accuracy and played itself greedily:
>
> https://drive.google.com/file/d/0B0BbrXeL6VyCNkRmVDBIYldraWs/view
>
> Oh... and another thing: AG used KGS amateur data... FB and my networks have been trained on pro games only. At one point I tested the 41% network in the image (trained on pro data) and a 44% network trained on amateur KGS games against GnuGo... and the pro-data network soundly won... and the amateur network soundly lost... so I have stuck with pro since. Not sure if the end result is the same... and I am kind of glad the AG team used amateur data, as that removes the argument that it somehow learned Lee Sedol's style.
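On the jump in the red/blue curves when the learning rate went from 0.01 to 0.001: the usual trick is a plain step schedule. A sketch; the drop interval is a made-up placeholder, not a value from either paper.

    def lr_at(step, base_lr=0.01, factor=0.1, drop_every=200_000):
        """Step schedule: hold base_lr, then multiply by `factor` every
        `drop_every` minibatches. Each drop tends to show up as a visible
        jump in the accuracy curve, as in the graphs above."""
        return base_lr * factor ** (step // drop_every)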
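And for "played itself greedily": a sketch of what that usually means, namely taking the argmax of the policy output over legal moves, with no search and no sampling. Here encode and is_legal stand in for whatever your board code provides.

    import numpy as np

    def greedy_move(net, board, encode, is_legal):
        """Return the most probable legal move, or None to pass.

        net.predict is assumed to map one encoded position to a
        length-361 probability vector over board points.
        """
        probs = net.predict(encode(board))
        for move in np.argsort(probs)[::-1]:   # best-first over all points
            if is_legal(board, int(move)):
                return int(move)
        return None                            # nothing legal left: pass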