Sorry to spam... this'll be the last one for now. My bet is that if anyone who trained on KGS data rescaled their curve onto my graph (accuracy vs. pairs processed), they would hit ~50% after around 25M pairs processed. That is what the graph in the DarkForest paper implies, which is why I was tearing my hair out trying to see how they converged so quickly on GoGoD (or whether the implementation I am using is horribly flawed). Playing strength suggests the network is working, so I doubt the implementation is completely out of whack.
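To be concrete about what I mean by rescaling someone's data onto my graph: here is a throwaway sketch of the conversion from accuracy-per-epoch to accuracy-per-pairs-processed. The dataset size is just the ~30M-position ballpark David mentions below; the symmetry factor, the example points, and the to_pairs_processed helper name are all placeholders of mine, not anything from the papers.

KGS_POSITIONS = 30_000_000   # ballpark KGS training-set size, in positions
SYMMETRY_FACTOR = 1          # set to 8 if each position is counted once per symmetry

def to_pairs_processed(curve):
    """Map [(epoch, accuracy), ...] onto [(pairs_processed, accuracy), ...]."""
    return [(epoch * KGS_POSITIONS * SYMMETRY_FACTOR, acc) for epoch, acc in curve]

# e.g. a net that reports 50% accuracy after 0.8 epochs would land near
# 24M pairs processed on my x-axis (numbers made up for illustration)
print(to_pairs_processed([(0.8, 0.50), (2.0, 0.54)]))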
On Wed, Aug 24, 2016 at 1:01 AM, Robert Waite <winstonwa...@gmail.com> wrote:
> And a guess for posterity... I am using what I believe to be a correct implementation of the SL network from AG, with 12 layers, 128 filters instead of 192 for the inner layers, 46 features instead of 48 (the 2 ladder features are missing), and all 8 symmetries on the input data. For the Adam runs I had to decrease the LR to .0001, but I think that makes sense: the Adam paper used batch sizes of 128 and an LR of .001, and I was using a minibatch of 16 at the time, so the LR generally needs to decrease because the gradients are less accurate. For example, a 256 minibatch with an LR of .05 in DarkForest is roughly proportional to a 16 minibatch with an LR of .003 (sketch appended below, after the quoted thread). In the end I've been staying mainly with vanilla SGD, because all of the papers seem to, and the training curves looked pretty bouncy to me.
>
> My guess is that if you run a 51%-accuracy GoGoD network, or perhaps even a 56% KGS network, and don't worry too much about features or filter counts, you could get to 3d on KGS vs. humans. With just a network evaluation, that is pretty dang impressive. I guess there might be weaknesses that humans could figure out against a network without search... but still, I'd be pretty happy.
>
> On Wed, Aug 24, 2016 at 12:30 AM, Robert Waite <winstonwa...@gmail.com> wrote:
>
>> @Detlef It is comforting to hear that GoGoD data seemed to converge towards 51% in your testing. When I ran KGS data it definitely converged more quickly, but I stopped those runs short. I think it all makes sense if figure 5 of the DarkForest paper shows convergence on KGS data. That isn't entirely clear, but looking at the paper now, they are comparing with Maddison et al., so it makes sense that they would show numbers for the same dataset.
>>
>> @GCP The three-move-prediction strength graphs looked shaky to me... it doesn't seem like a clear change in strength. For the ladder issue, I think MCTS plus a value or fast rollout network is how AG overcame weaknesses like that. The fast rollout network is actually the vaguest part to me. I have read some of the ancestor papers and can see that people in the field mostly know what is being described, but I don't know where to begin to get the pattern counts listed in the AG tables at the end of the paper.
>>
>> @David Have you matched your network against GnuGo? I think accuracy and loss are indicators of model health, but playing strength seems to be a different question. The AG paper only mentions beating Pachi at 100k rollouts with the RL network, not the SL one, at an 85% win rate. The DarkForest paper shows more win-rate data: the KGS-trained network won ~23% of games vs. Pachi 10k, but the GoGoD-trained one won ~59%. They also tacked on extended features and 3-step prediction, so who knows.
>>
>> I am actually feeling a million times better about 51% being the heavy zone for GoGoD data. It makes my graphs make more sense.
>>
>> Graphs now:
>>
>> https://drive.google.com/file/d/0B0BbrXeL6VyCZEJuMG5nVG9NYkU/view
>>
>> https://drive.google.com/file/d/0B0BbrXeL6VyCR3ZxaUVGNU5pVDQ/view
>>
>> Gonna keep going with the magenta and black lines... I figure I can get to 48 percent. I can run 10 million pairs in a day, so the graph width is one week. Lol... I'd be so happy if 57% isn't the expectation on GoGoD. 51% looks fine and approachable on my graphs.
>>
>> For the game-phase-batched data, the DarkForest paper explicitly calls out that they got stuck in poor minima without it. I figured that purely random sampling was fine, but you could definitely get some skews, like having no opening moves at all in a minibatch of size 16 as in AG.
>> Their paper didn't elaborate, but it did mention 16 threads. To generate training pairs I select one random game from all of the available SGF files and split the game into 16 sections. I am using threading too, so there is more to it than that, but basically 16 sets of 16 makes for a 256 minibatch like the DarkForest team's (rough sketch appended below, after the quoted thread).
>>
>> I think the only way to beat Zen or CrazyStone is to get the value network or the fast rollouts working with MCTS. Of course CrazyStone is evolving too... so maybe that's not the right goal.
>>
>> On Tue, Aug 23, 2016 at 11:17 PM, David Fotland <fotl...@smart-games.com> wrote:
>>
>>> I train using approximately the same training set as AlphaGo, but so far without the augmentation with rotations and reflections. My target is about 55.5%, since that's what AlphaGo got on their training set without reinforcement learning.
>>>
>>> I find I need 5x5 in the first layer, at least 12 layers, and at least 96 filters to get over 50%. My best net is 55.3%, 18 layers by 96 filters. I use simple SGD with a 64 minibatch, no momentum, and a 0.01 learning rate until it flattens out, then 0.001. I have two 980 Tis, and the best nets take about 5 days to train (about 20 epochs on about 30M positions). The last few percent is just trial and error. Sometimes making the net wider or deeper makes it weaker; perhaps that's just variation from one training run to another. I haven't tried training the same net more than once.
>>>
>>> David
>>>
>>> > -----Original Message-----
>>> > From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Gian-Carlo Pascutto
>>> > Sent: Tuesday, August 23, 2016 12:42 AM
>>> > To: computer-go@computer-go.org
>>> > Subject: Re: [Computer-go] Converging to 57%
>>> >
>>> > On 23-08-16 08:57, Detlef Schmicker wrote:
>>> >
>>> > > So, if somebody is sure it is measured against GoGoD, I think a number of other Go programmers have to think again. I heard them reaching 51% (e.g. posts by Hiroshi in this list).
>>> >
>>> > I trained a 128 x 14 network for Leela 0.7.0 and this gets 51.1% on GoGoD.
>>> >
>>> > Something I noticed from the papers is that the prediction percentage keeps going upward with more epochs, even if slowly, but still clearly up.
>>> >
>>> > In my experience my networks converge rather quickly (like >0.5% per epoch after the first), get stuck, get one more 0.5% gain if I lower the learning rate (by a factor of 5 or 10), and don't gain any more regardless of what I do thereafter.
>>> >
>>> > I do use momentum. IIRC I tested without momentum once and it was worse, and much slower.
>>> >
>>> > I did not find any improvement in playing strength from doing Facebook's 3-move prediction. Perhaps it needs much bigger networks than 128 x 12.
>>> >
>>> > Adding ladder features also isn't good enough to (consistently) keep the network from playing into them. (And once it has played the first move, you're totally SOL, because the resulting positions aren't in the training set and you'll get 99% confidence for continuing the losing ladder moves.)
>>> >
>>> > I'm currently doing a more systematic comparison of all methods (and GoGoD vs. KGS+GoGoD) on 128 x 12, and testing the resulting strength (rather than looking at prediction %). I'll post the results here if anything definite comes out of it.
>>> >
>>> > --
>>> > GCP
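A quick sketch of the learning-rate scaling I mentioned above (the .05/.003 and .001/.0001 numbers). This is just the linear back-of-the-envelope rule I have been using, LR proportional to minibatch size; it isn't something the AG or DarkForest papers prescribe, and scale_lr is a made-up helper name.

def scale_lr(base_lr, base_batch, new_batch):
    """Rough linear heuristic: scale the learning rate with minibatch size."""
    return base_lr * new_batch / base_batch

# DarkForest trains with a 256 minibatch at LR 0.05; shrinking to 16:
print(scale_lr(0.05, 256, 16))    # 0.003125 -> about the .003 I used

# The Adam paper uses a 128 batch at LR 0.001; shrinking to 16:
print(scale_lr(0.001, 128, 16))   # 0.000125 -> about the .0001 I dropped to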
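And roughly what I mean by the game-phase-batched sampling: 16 games, each split into 16 phase sections, one (position, move) pair drawn per section, giving a 256 minibatch with every phase of the game represented. This is only a sketch of my own setup (the DarkForest paper doesn't give code), and load_random_game is a stub that fakes the SGF parsing.

import random

NUM_GAMES = 16    # one game per worker thread in my setup
NUM_PHASES = 16   # split each game into 16 phase sections -> 16 x 16 = 256 pairs

def load_random_game(sgf_files):
    """Stub: real code would pick random.choice(sgf_files) and parse it into
    (position, move) pairs; here we just fake a game of random length."""
    n_moves = random.randint(50, 300)
    return [(f"position-{i}", f"move-{i}") for i in range(n_moves)]

def sample_minibatch(sgf_files):
    batch = []
    for _ in range(NUM_GAMES):
        pairs = load_random_game(sgf_files)
        section = max(1, len(pairs) // NUM_PHASES)
        for phase in range(NUM_PHASES):
            lo = phase * section
            hi = min(len(pairs), lo + section)
            if lo < len(pairs):
                batch.append(random.choice(pairs[lo:hi]))  # one pair per phase section
    random.shuffle(batch)  # don't feed the net the phases in order
    return batch           # ~256 pairs covering opening, middle game, and endgame

print(len(sample_minibatch(["dummy.sgf"])))  # -> 256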
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go