Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

Brian Sheppard via Computer-go Thu, 07 Dec 2017 05:16:12 -0800

The conversation on Stockfish's mailing list focused on how the match was 
imbalanced.

- AZ's TPU hardware was estimated at several times (7 times?) the computational 
power of Stockfish's.
- Stockfish's transposition table size (1 GB) was considered much too small for 
a 64 core machine.
- Stockfish's opening book was disabled, whereas AZ has, in effect, memorized a 
huge opening book.
- The match was against SF 8 (one year old) rather than the latest dev version.

To this I would add that the losses of Stockfish that I played through seemed 
to be largely self-similar, so it is possible that Stockfish has a relatively 
limited number of weaknesses that AZ does not, but the format of the match 
amplifies the issue.

So the attitude among the SF core is pretty competitive, which is great news 
for continued development.

My concern about many of these points of comparison is that they presume how AZ 
scales. In the absence of data, I would guess that AZ gains much less from 
hardware than SF. I am basing this guess on two known facts. First is that AZ 
did not lose a game, so the upper bound on its strength is perfection. Second, 
AZ is a knowledge intensive program, so it is counting on judgement to a larger 
degree.

But I could be wrong. Maybe AZ falls apart tactically without 80K positions 
per second. There is no data, so all WAGs are valid.

-----Original Message-----
From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of 
Gian-Carlo Pascutto
Sent: Thursday, December 7, 2017 4:13 AM
To: computer-go@computer-go.org
Subject: Re: [Computer-go] Mastering Chess and Shogi by Self-Play with a 
General Reinforcement Learning Algorithm

On 06-12-17 22:29, Brian Sheppard via Computer-go wrote:
> The chess result is 64-36: a 100 rating point edge! I think the
> Stockfish open source project improved Stockfish by ~20 rating points in
> the last year.

It's about 40-45 Elo FWIW.
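
For reference, the conversion from the 64-36 score to "a 100 rating point
edge" quoted above follows from the standard logistic Elo model. A minimal
sketch, assuming that model applies to the match; the function names are
illustrative and not taken from the paper or from any engine:

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    # Invert the logistic expectation: score = 1 / (1 + 10 ** (-diff / 400)).
    return 400.0 * math.log10(score / (1.0 - score))


def expected_score(elo_diff: float) -> float:
    """Expected score fraction for a player rated elo_diff points higher."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    # 64 points out of 100 games, as in the quoted match result.
    print(round(elo_diff_from_score(0.64)))  # about 100
```

The same helper shows how quickly a small rating edge shrinks the expected
score: expected_score(30) is only a little above 0.54.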

> AZ would dominate the current TCEC. 

I don't think you'll get to 80 knps with a regular 22 core machine or
whatever they use. Remember that AZ hardware is about 16 x 1080 Ti's.
You'll lose that (70 - 40 = 30 Elo) advantage very, very quickly.

IMHO this makes it all the more clear how silly it is that so much
attention is given to TCEC with its completely arbitrary hardware choice.

> The Stockfish team will have some self-examination going forward for
> sure. I wonder what they will decide to do.

Probably the same thing the Zen team did. Ignore a large part of the result
because people's actual computers - let alone mobile phones - can't run
a neural net at TPU speeds.

The question is if resizing the network makes the resulting program more
competitive, enough to overcome the speed difference. And, aha, in which
direction are you going to try to resize? Bigger or smaller?

-- 
GCP
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
