I see the same dynamics that you do, Darren. The 400-game match always has some
probability of being won by the challenger. It is just much more likely if the
challenger is stronger than the champion.
I could be drawing the wrong inferences from incomplete information, but as
Darren pointed out, this paper does leave the impression that AlphaZero is not
as strong as the real AlphaGo Zero, in which case it would be clearer to say
so explicitly. Of course, the chess and shogi results are impressive
regardless.
>> One of the changes they made (bottom of p.3) was to continuously
>> update the neural net, rather than require a new network to beat
>> it 55% of the time to be used. (That struck me as strange at the
>> time, when reading the AlphaGoZero paper - why not just >50%?)
Gian wrote:
> I read that a
Requiring a margin of more than 55% is a defense against a random result: a
55% score in a 400-game match is 2 sigma above an even split.
But I like the AZ policy better, because it does not require arbitrary
parameters. It also improves more fluidly, by always drawing training examples
from the current probability distribution.
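For concreteness, here is the arithmetic behind that 2-sigma figure (a quick
back-of-the-envelope script; the 50% win probability is just the null
hypothesis that the two networks are equally strong):

    from math import sqrt

    # Null hypothesis: new and old networks are equally strong,
    # so each of the 400 games is a coin flip (p = 0.5).
    n, p = 400, 0.5
    mean = n * p                   # expected wins: 200
    sigma = sqrt(n * p * (1 - p))  # standard deviation: 10 games

    wins = 220                     # a 55% score of the 400 games
    print((wins - mean) / sigma)   # 2.0 -- exactly two sigma above the mean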
The chess result is 64-36: a 100-rating-point edge! I think the Stockfish
open-source project improved Stockfish by ~20 rating points in the last year.
Given the number of people and computers involved, Stockfish's annual effort
level seems comparable to the AZ effort.
Stockfish is really, really strong.
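As a sanity check on the 100-point claim, the standard logistic Elo model
turns a 64% score into roughly that edge (a sketch; it ignores how the 64-36
splits into wins and draws):

    from math import log10

    def elo_diff(score):
        # Logistic Elo model: expected score = 1 / (1 + 10**(-diff/400))
        return -400 * log10(1 / score - 1)

    print(elo_diff(0.64))  # ~99.95 -- about 100 rating points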
On Wed, Dec 06, 2017 at 09:57:42AM -0800, Darren Cook wrote:
> > Mastering Chess and Shogi by Self-Play with a General Reinforcement
> > Learning Algorithm
> > https://arxiv.org/pdf/1712.01815.pdf
>
> One of the changes they made (bottom of p.3) was to continuously update
> the neural net, rather than require a new network to beat it 55% of the
> time to be used.
On 6/12/2017 19:48, Xavier Combelle wrote:
> Another result is that chess is really drawish, the opposite of shogi
We sort-of knew that, but OTOH isn't that also because the resulting
engine strength was close to Stockfish, unlike in other games?
--
GCP
> The AlphaZero paper shows it out-performs AlphaGoZero, but they are
> comparing to the 20-block, 3-day version. Not the 40-block, 40-day
> version that was even stronger.
> As papers rarely show failures, can we take it to mean they couldn't
> out-perform their best go bot, do you think? ...
>
>
On 6/12/2017 18:57, Darren Cook wrote:
>> Mastering Chess and Shogi by Self-Play with a General Reinforcement
>> Learning Algorithm
>> https://arxiv.org/pdf/1712.01815.pdf
>
> One of the changes they made (bottom of p.3) was to continuously update
> the neural net, rather than require a new network to beat it 55% of the
> time to be used.
Another result is that chess is really drawish, the opposite of shogi.
On 06/12/2017 at 18:50, Richard Lorentz wrote:
> One chess result stood out for me, namely, just how much easier it was
> for AlphaZero to win with white (25 wins, 25 draws, 0 losses) rather
> than with black (3 wins, 47 draws, 0 losses).
"Joshua Shriver" asked:
> What about arimaa?
My personal impression: Arimaa should be rather easy for the
AlphaZero approach.
My questions:
* How well does the AlphaZero approach perform in non-zero-sum games
  (or in games with more than two players)?
* How well does the AlphaZero approach perf
> Mastering Chess and Shogi by Self-Play with a General Reinforcement
> Learning Algorithm
> https://arxiv.org/pdf/1712.01815.pdf
One of the changes they made (bottom of p.3) was to continuously update
the neural net, rather than require a new network to beat it 55% of the
time to be used. (That struck me as strange at the time, when reading the
AlphaGoZero paper - why not just >50%?)
One chess result stood out for me, namely, just how much easier it was
for AlphaZero to win with white (25 wins, 25 draws, 0 losses) rather
than with black (3 wins, 47 draws, 0 losses).
Maybe we should not give up on the idea of White to play and win in chess!
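Counting a draw as half a point makes the gap explicit: 75% with white versus
53% with black. A throwaway snippet, just to spell out the arithmetic:

    def score(wins, draws, games=50):
        # Standard scoring: win = 1 point, draw = 0.5
        return (wins + 0.5 * draws) / games

    print(score(25, 25))  # 0.75 with white
    print(score(3, 47))   # 0.53 with black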
On 12/06/2017 01:24 AM, Hiroshi Yamashita wrote:
What about arimaa?
On Wed, Dec 6, 2017 at 9:28 AM, "Ingo Althöfer" <3-hirn-ver...@gmx.de> wrote:
> It seems we are living in extremely
> heavy times ...
>
> I want to go to bed now and meditate for three days.
>
>> DeepMind makes strongest Chess and Shogi programs with AlphaGo Zero method.
>> Mastering Chess and Shogi by Self-Play with a General Reinforcement
>> Learning Algorithm
Hex:
https://arxiv.org/pdf/1705.08439.pdf
This is not on a 19x19 board, and it was not tested against the current
state of the art (MoHex 1.0 was the state of the art in its time, but is at
least several years old now, I think), but they do get several hundred Elo
points stronger than this old version.
2017-12-06 13:52 GMT+00:00 Gian-Carlo Pascutto:
> On 06-12-17 11:47, Aja Huang wrote:
> > All I can say is that first-play-urgency is not a significant
> > technical detail, and that's why we didn't specify it in the paper.
>
> I will have to disagree here. Of course, it's always possible I'm
> misunderstanding something, or I have a program bug that I'm
My hand-wavy argument succumbs to experimental data. And to a better
argument. :)
I stand corrected.
Thanks,
Álvaro.
On Wed, Dec 6, 2017 at 8:52 AM, Gian-Carlo Pascutto wrote:
> On 06-12-17 11:47, Aja Huang wrote:
> > All I can say is that first-play-urgency is not a significant
> > technical detail, and that's why we didn't specify it in the paper.
It seems we are living in extremely
heavy times ...
I want to go to bed now and meditate for three days.
> DeepMind makes strongest Chess and Shogi programs with AlphaGo Zero method.
> Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning
> Algorithm
> https://arxiv.org/pdf/1712.01815.pdf
On 06-12-17 11:47, Aja Huang wrote:
> All I can say is that first-play-urgency is not a significant
> technical detail, and that's why we didn't specify it in the paper.
I will have to disagree here. Of course, it's always possible I'm
misunderstanding something, or I have a program bug that I'm
Thanks for letting us know the situation, Aja. It must be hard for an
engineer not to be able to discuss the details of his work!
As for the first-play-urgency value, if we indulge in some reading between
the lines: it's possible to interpret the paper as saying
first-play-urgency is zero. After re
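To make the question concrete, here is a toy PUCT selection loop (my own
sketch, not DeepMind's code, and the c_puct value below is arbitrary); the
disputed first-play-urgency is simply the Q value plugged in for a move with
zero visits, with fpu=0.0 matching the reading above:

    import math

    def select_child(children, c_puct=1.5, fpu=0.0):
        # children: list of (prior, visits, total_value) tuples, one per move.
        # Q of an unvisited move is undefined; 'fpu' is what we substitute.
        parent_visits = sum(v for _, v, _ in children)
        best_index, best_score = -1, -float("inf")
        for i, (prior, visits, total_value) in enumerate(children):
            q = total_value / visits if visits > 0 else fpu
            u = c_puct * prior * math.sqrt(parent_visits) / (1 + visits)
            if q + u > best_score:
                best_index, best_score = i, q + u
        return best_index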
2017-12-06 9:23 GMT+00:00 Gian-Carlo Pascutto:
> On 03-12-17 17:57, Rémi Coulom wrote:
> > They have a Q(s,a) term in their node-selection formula, but they
> > don't tell what value they give to an action that has not yet been
> > visited. Maybe Aja can tell us.
>
> FWIW I already asked Aja this exact question a bit after the paper came
> out and he told me
Hi,
It appears AlphaZero surpasses AlphaGo Zero at Go, Stockfish at Chess, and
Elmo at Shogi in a few hours of self-play...
https://arxiv.org/pdf/1712.01815.pdf
Best,
Tristan.
Hi,
DeepMind makes strongest Chess and Shogi programs with AlphaGo Zero method.
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning
Algorithm
https://arxiv.org/pdf/1712.01815.pdf
AlphaZero(Chess) outperformed Stockfish after 4 hours, and
AlphaZero(Shogi) outperformed elmo after 2 hours.
https://arxiv.org/abs/1712.01815
Best
Magnus
On 03-12-17 17:57, Rémi Coulom wrote:
> They have a Q(s,a) term in their node-selection formula, but they
> don't tell what value they give to an action that has not yet been
> visited. Maybe Aja can tell us.
FWIW I already asked Aja this exact question a bit after the paper came
out and he told me