Re: [Computer-go] Mastering the Game of Go with Deep Neural Networks and Tree Search (value network)

Detlef Schmicker Thu, 04 Feb 2016 13:44:52 -0800

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> One possibility is that 0=loss, 1=win, and the number they are
> quoting is sqrt(average((prediction-outcome)^2)).
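
A quick check of this definition, as a minimal sketch: the helper name rmse and the four-game sample are made up for illustration, and the assumption is that a network that knows nothing predicts 0.5 for every position.

```python
import math

def rmse(predictions, outcomes):
    """sqrt(average((prediction - outcome)^2)), as in the quoted definition."""
    squared = [(p - o) ** 2 for p, o in zip(predictions, outcomes)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical opening position: outcomes are 1 (win) or 0 (loss),
# and the prediction is 0.5 for every game because nothing is known yet.
outcomes = [1, 0, 1, 0]
predictions = [0.5] * len(outcomes)

opening_rmse = rmse(predictions, outcomes)
print(opening_rmse)  # 0.5
```

Every error is exactly 0.5 here, so the result does not depend on how many of the sample games are wins.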



This makes perfect sense for figure 2; even the playouts seem reasonable.

But figure 2 is not consistent with the numbers in section 3, which would
be 0.234 (test set of the self-play database). The figure looks more like
0.3 - 0.35 or even higher...
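
To compare the section 3 numbers with the figure, here is a minimal sketch that converts a reported MSE into an RMSE on a 0..1 scale. Which target encoding the paper uses for each number is an assumption, so both readings are computed; the function name rmse_on_01_scale is hypothetical.

```python
import math

def rmse_on_01_scale(mse, targets_are_pm1):
    """Return the RMSE on a 0..1 winrate scale for a reported MSE.

    If the targets were -1/+1, mapping x -> (x + 1) / 2 halves every
    error, so the MSE is divided by 4 before taking the square root.
    """
    if targets_are_pm1:
        mse = mse / 4.0
    return math.sqrt(mse)

# MSE values quoted in this thread from section 3 of the paper.
reported = {"KGS test set": 0.37, "self-play test set": 0.234}

results = {}
for name, mse in reported.items():
    results[name] = (
        rmse_on_01_scale(mse, targets_are_pm1=True),   # assumption: -1/+1 targets
        rmse_on_01_scale(mse, targets_are_pm1=False),  # assumption: 0/1 targets
    )
    print(name, [round(v, 3) for v in results[name]])
```

Under these two assumptions, 0.234 maps to roughly 0.24 or 0.48, and 0.37 to roughly 0.30 or 0.61.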



On 04.02.2016 at 21:43, Álvaro Begué wrote:
> I just want to see how to get 0.5 for the initial position on the
> board with some definition.
> 
> One possibility is that 0=loss, 1=win, and the number they are
> quoting is sqrt(average((prediction-outcome)^2)).
> 
> 
> On Thu, Feb 4, 2016 at 3:40 PM, Hideki Kato
> <hideki_ka...@ybb.ne.jp> wrote:
> 
>> I think the error is defined as the difference between the output
>> of the value network and the average output of the simulations
>> done by the policy network (RL) at each position.
>> 
>> Hideki
>> 
>> Michael Markefka:
>> <CAJg7PAN9G2_htRs0mfKuFi82yef7gNFCsouE4ez4f37_pK= 
>> k...@mail.gmail.com>:
>>> That sounds like it'd be the MSE as classification error of the
>>> eventual result.
>>>
>>> I'm currently not able to look at the paper, but couldn't you
>>> use a softmax output layer with two nodes and take the probability
>>> distribution as winrate?
>>>
>>> On Thu, Feb 4, 2016 at 8:34 PM, Álvaro Begué
>>> <alvaro.be...@gmail.com> wrote:
>>>
>>>> I am not sure how exactly they define MSE. If you look at the
>>>> plot in figure 2b, the MSE at the very beginning of the game
>>>> (where you can't possibly know anything about the result) is
>>>> 0.50. That suggests it's something else than your [very
>>>> sensible] interpretation.
>>>>
>>>> Álvaro.
>>>>
>>>> On Thu, Feb 4, 2016 at 2:24 PM, Detlef Schmicker
>>>> <d...@physik.de> wrote:
>>>>>
>>>>>>>> Since all positions of all games in the dataset are used,
>>>>>>>> winrate should be distributed from 0% to 100%, or -1 to 1, not
>>>>>>>> 1. Then, the number 70% could be wrong. An MSE of 0.37 just means
>>>>>>>> the average error is about 0.6, I think.
>>>>>
>>>>> 0.6 in the range of -1 to 1,
>>>>>
>>>>> which means -1 (e.g. lost by B) games -> typical value -0.4
>>>>> and +1 games -> typical value +0.4 of the value network.
>>>>>
>>>>> If I rescale -1 to +1 to 0 - 100% (e.g. winrate for B), then I get
>>>>> about 30% for games lost by B and 70% for games won by B?
>>>>>
>>>>> Detlef
>>>>>
>>>>> On 04.02.2016 at 20:10, Hideki Kato wrote:
>>>>>>> Detlef Schmicker: <56b385ce.4080...@physik.de>:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am trying to reproduce the numbers from section 3: training the
>>>>>>> value network.
>>>>>>>
>>>>>>> On the test set of KGS games the MSE is 0.37. Is it correct that
>>>>>>> the results are represented as +1 and -1?
>>>>>>>
>>>>>>>> Looks correct.
>>>>>>>
>>>>>>> This means that in a typical board position you get a value of
>>>>>>> 1-sqrt(0.37) = 0.4  --> this would correspond to a win rate of 70%
>>>>>>> ?!
>>>>>>>
>>>>>>>> Since all positions of all games in the dataset are used,
>>>>>>>> winrate should be distributed from 0% to 100%, or -1 to 1, not
>>>>>>>> 1. Then, the number 70% could be wrong. An MSE of 0.37 just means
>>>>>>>> the average error is about 0.6, I think.
>>>>>>>
>>>>>>>> Hideki
>>>>>>>
>>>>>>> Is it really true that a typical KGS 6d+ position is judged with
>>>>>>> such a high win rate (even though it is overfitted, so the test
>>>>>>> set number is too bad!), or do I misinterpret the MSE calculation?!
>>>>>>>
>>>>>>> Any help would be great,
>>>>>>>
>>>>>>> Detlef
>>>>>>>
>>>>>>> On 27.01.2016 at 19:46, Aja Huang wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> We are very excited to announce that our Go program, AlphaGo,
>>>>>>>>>> has beaten a professional player for the first time. AlphaGo
>>>>>>>>>> beat the European champion Fan Hui by 5 games to 0. We hope
>>>>>>>>>> you enjoy our paper, published in Nature today. The paper and
>>>>>>>>>> all the games can be found here:
>>>>>>>>>>
>>>>>>>>>> http://www.deepmind.com/alpha-go.html
>>>>>>>>>>
>>>>>>>>>> AlphaGo will be competing in a match against Lee Sedol in
>>>>>>>>>> Seoul, this March, to see whether we finally have a Go
>>>>>>>>>> program that is stronger than any human!
>>>>>>>>>>
>>>>>>>>>> Aja
>>>>>>>>>>
>>>>>>>>>> PS I am very busy preparing AlphaGo for the match, so
>>>>>>>>>> apologies in advance if I cannot respond to all questions
>>>>>>>>>> about AlphaGo.
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Computer-go mailing list
>>>>>>>>>> Computer-go@computer-go.org
>>>>>>>>>> http://computer-go.org/mailman/listinfo/computer-go
>> -- Hideki Kato <mailto:hideki_ka...@ybb.ne.jp> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQIcBAEBAgAGBQJWs8ZGAAoJEInWdHg+Znf4D4sP/Rr7HPtRpE0rgzIjzSvI4NtM
EZMldUdsEyJ8u6C4o8cWUfHX7TChfgjUpDpJL/uvmAgiunvB3RXccT3DKLWAbo8G
t9QUsMgd791g4RkFsJ5ZZWJ/bGrchov9bXIcPO9QzJ1FJRrVuwMfH43SBnPItee7
Z9QH7FF6jgyBjFxeNChhF8FMOD55+uuu8/o3htMCAHBZ6Y4aMEdFQYQHmHdGUYHF
Vtgy++yRIP9V0BiiBqNCKxT41cK5kaEzbUYgIoLs0kHpxTzJd/WAiLxSHAPyYnLY
WL9/NU1/dW6/Ef7wbi8I68lDz+COfIaZ8KMH75Q4O90OIta+O7eBznNBEC3Ei5iH
3BvlfzPZ+fHZb6Yw7MrbVJFfPJzXRM0C/C9uHjDcdi6wZTpoEhWiYFKeGogRRSg3
2Y+xJrFh/p+akLjo70BcD48TwJIYdDdgFUfgj5vvyru3H9oZ/fJKLX6WPx1brCnj
RXtmH+k+G6Gi+WRACKEgtw59Rm5h7F/sQv3apqXFii8QnHcChNnsXcn/mCYBqlnM
W4e2fk6+HJbth0bLobAG4DaM+j9C/gde0ruUhTtYIap4iC5hkf8zrZTZzzVdsCcc
tBv8CFXif8cjAAQwYIhMt/VDMbIoPwwczCsJS6XXr7j7vzoKiiMCrSLZ8DF+IXEi
0nKF0PbVS4JPpajYpGkL
=tLXF
-----END PGP SIGNATURE-----
