
Hi,

good to see this discussion started here. I have had it a few times
myself, and we (my discussion partner and I) were never sure which
test set the 57% was measured against.

If you train and test on a KGS 6d+ dataset, 57% seems reasonable to
reach (I reached >55% within 2 weeks, but then switched to GoGoD).
With a test set from GoGoD I do not get near that value either
(around 51% after 3 weeks on a GTX 970). By the way, both give
roughly the same playing strength in my case.
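
To be concrete about what that accuracy number means: top-1 move
prediction on a held-out test set. A minimal sketch of that
measurement, assuming a PyTorch policy net that outputs logits over
the 361 points and a test_loader of (planes, move_index) pairs (both
names are mine, the papers do not prescribe them):

    import torch

    def top1_accuracy(model, test_loader, device="cuda"):
        # Fraction of positions where the argmax of the policy
        # matches the move actually played (the 57% / 51% figures).
        model.eval()
        correct, total = 0, 0
        with torch.no_grad():
            for planes, move in test_loader:       # planes: (N, C, 19, 19)
                logits = model(planes.to(device))  # (N, 361)
                pred = logits.argmax(dim=1)
                correct += (pred == move.to(device)).sum().item()
                total += move.size(0)
        return correct / total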

So, if somebody is sure it was measured against GoGoD, I think a
number of other Go programmers will have to think again. I have heard
of them reaching 51% (e.g. posts by Hiroshi on this list).

Detlef

Am 23.08.2016 um 08:40 schrieb Robert Waite:
> I had subscribed to this mailing list back in the MoGo days... and
> remember probably arguing that the game of go wasn't going to be
> beaten for years and years. I am a little late to the game now, but
> I was curious if anyone here has worked with supervised learning
> networks like in the AlphaGo paper.
> 
> I have been training some networks along the lines of the AlphaGo
> paper and the DarkForest paper... and a couple of others... and am
> working with a single GTX 660. I know... laugh... but it's a fair
> benchmark and I'm being cheap for the moment.
> 
> Breaking 50% accuracy is quite challenging... I have attempted
> many permutations of learning algorithms... and can hit 40%
> accuracy in perhaps 4-12 hours... depending on the parameters set.
> Some things I have tried: the default AlphaGo setup but with 128
> filters, minibatch size 32 and a .01 learning rate; changing the
> optimizer from vanilla SGD to Adam or RMSProp; changing batching to
> match the DarkForest style (making sure that a minibatch contains
> samples from all game phases... for example opening, middle game
> and endgame). Pretty much everything seems to converge at a rate
> that will really stretch out. I am planning on picking a line and
> going with it for an extended training run, but was wondering if
> anyone has ever gotten close to the convergence rates implied by
> the DarkForest paper.
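
For reference, this is roughly the kind of setup described above, as
far as I understand it. A sketch only, assuming PyTorch, 48
AlphaGo-style input planes, and my own helper for the
DarkForest-style phase-balanced batching (neither paper publishes
code for that):

    import random
    import torch
    import torch.nn as nn

    class PolicyNet(nn.Module):
        # AlphaGo-style SL policy tower, narrowed to 128 filters.
        def __init__(self, planes=48, filters=128, mid_layers=11):
            super().__init__()
            layers = [nn.Conv2d(planes, filters, 5, padding=2), nn.ReLU()]
            for _ in range(mid_layers):
                layers += [nn.Conv2d(filters, filters, 3, padding=1), nn.ReLU()]
            layers += [nn.Conv2d(filters, 1, 1)]   # 1x1 conv down to one plane
            self.tower = nn.Sequential(*layers)

        def forward(self, x):                      # x: (N, planes, 19, 19)
            return self.tower(x).flatten(1)        # logits over the 361 points

    def phase_balanced_batch(buckets, batch_size=32):
        # DarkForest-style batching: draw evenly from opening /
        # middle-game / endgame buckets so every minibatch sees all
        # game phases.  buckets is a list of three lists of
        # (planes, move) pairs -- my own data layout, not theirs.
        # (Integer division rounds down: 32 // 3 gives 10 per bucket.)
        per_bucket = batch_size // len(buckets)
        samples = [s for b in buckets for s in random.sample(b, per_bucket)]
        planes = torch.stack([p for p, _ in samples])
        moves = torch.tensor([m for _, m in samples])
        return planes, moves

    model = PolicyNet()
    opt = torch.optim.SGD(model.parameters(), lr=0.01)   # the plain-SGD, 0.01 baseline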
> 
> For comparison... the Google team had 50 GPUs, spent 3 weeks... and
> processed 5440M state/action pairs. The FB team had 4 GPUs, spent 2
> weeks and processed 150M-200M state/action pairs. Both seemed to
> get to around 57% accuracy with their networks.
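
Normalised per GPU, those two data points are not so far apart;
back-of-the-envelope only, taking the figures above at face value:

    alphago    = 5440e6 / (50 * 21)   # ~5.2M pairs per GPU-day (5440M, 3 weeks, 50 GPUs)
    darkforest =  175e6 / (4 * 14)    # ~3.1M pairs per GPU-day (midpoint of 150M-200M, 2 weeks, 4 GPUs)

The part that stands out is rather that DarkForest reports ~57% after
roughly 30x fewer state/action pairs in total.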
> 
> I have also been testing them against GnuGo as a baseline... and
> find that GnuGo can be beaten rather easily with very little
> network training... my eye is on Pachi... but I think I have to
> break 50% accuracy to even worry about that.
> 
> I have also played with the reinforcement learning phase... started
> with a learning rate of .01... which I think was too high... that
> does take quite a bit of time on my machine... so I didn't play too
> much with it yet.
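
For the RL phase, the update itself is small compared to the
self-play cost. As I understand the AlphaGo-style RL step it is
essentially REINFORCE; a sketch, continuing the earlier snippet and
reusing its model/optimizer names:

    def reinforce_update(opt, log_probs, outcome):
        # REINFORCE: scale the log-probabilities of the moves played
        # in one self-play game by the result (+1 win / -1 loss) and
        # take a gradient step.  log_probs is the list of per-move
        # log-probability tensors collected during the game.
        loss = -outcome * torch.stack(log_probs).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()

With a reward of only +/-1 per game, a .01 step here does look on the
high side to me, which fits the observation above.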
> 
> Anyway... does anyone have any tales of how long it took to break
> 50%? What is the magic bullet that will help me converge there
> quickly?
> 
> Here is a long-view graph of various attempts:
> 
> https://drive.google.com/file/d/0B0BbrXeL6VyCUFRkMlNPbzV2QTQ/view
> 
> The red and blue lines are from another member who ran minibatches
> of 32, a .01 learning rate, and 128 filters in the middle layers
> vs. 192. They had 4 K40 GPUs, I believe. They also used 40000
> training pairs against 40000 validation pairs... so I imagine that
> is why they had such a spread. There is a jump in the accuracy,
> which was when the learning rate was decreased to .001, I believe.
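
That jump is what plain step decay looks like; the same thing in one
line, reusing the SGD optimizer from the sketch above (the step size
and factor are just for illustration, not taken from either paper):

    sched = torch.optim.lr_scheduler.StepLR(opt, step_size=200000, gamma=0.1)
    # call sched.step() once per minibatch; after 200k batches the lr drops 0.01 -> 0.001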
> 
> Closer shot:
> 
> https://drive.google.com/file/d/0B0BbrXeL6VyCRVUxUFJaWVJBdEE/view
> 
> Most stay between the lines... but looking at both graphs makes me
> wonder if any of the lines are approaching the convergence of
> DarkForest. My gut tells me they were onto something... and I am
> rather curious about the playing strength of the DarkForest SL
> network and the AG SL network.
> 
> Also... a picture of the network's view of a position... this one
> was trained to 41% accuracy and played against itself greedily.
> 
> https://drive.google.com/file/d/0B0BbrXeL6VyCNkRmVDBIYldraWs/view
> 
> Oh... and another thing... AG used KGS amateur data... FB and my
> networks have been trained on pro games only. At one point I tested
> the 41% network in the image (trained on pro data) and a 44%
> network trained on amateur KGS games against GnuGo... the pro-data
> network soundly won... and the amateur network soundly lost... so I
> have stuck with pro data since. Not sure if the end result is the
> same... and I am kind of glad the AG team used amateur data, as
> that removes the argument that it somehow learned Lee Sedol's style.
> 