Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-31 Thread Ryan Hayward
> Hey Simon, > > I only now remembered: > > we actually experimented on the effect &

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-31 Thread Lucas, Simon M
Date: Wednesday, 30 March 2016 at 18:59 To: "computer-go@computer-go.org" Subject: Re: [Computer-go] Congratulations to AlphaGo (Statistical significa

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-30 Thread uurtamo .
Or, if it's lopsided far from 1/2, Wilson's is just as good, in my experience. On Mar 30, 2016 10:29 AM, "Olivier Teytaud" wrote: > don't use asymptotic normality with a sample size 5, use Fisher's exact > test > > the p-value for the rejection of > "P(alpha-Go wins a given game against Lee Sedol
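
For readers who want to try the interval mentioned above, here is a minimal sketch of the Wilson score interval applied to a 4-1 result. The function name `wilson_interval` and the choice of z = 1.96 (roughly a 95% interval) are assumptions made for illustration; the post itself gives no formula or code.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to roughly 95% coverage (an assumption here).
    """
    p_hat = wins / games
    denom = 1 + z * z / games
    centre = (p_hat + z * z / (2 * games)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / games + z * z / (4 * games * games))
        / denom
    )
    return centre - half_width, centre + half_width


# AlphaGo vs Lee Sedol: 4 wins in 5 games.
low, high = wilson_interval(4, 5)
print(f"win-probability interval: {low:.3f} .. {high:.3f}")
```

With only five games the interval is wide (roughly 0.38 to 0.96) and still contains 1/2, which is in line with the caution expressed elsewhere in the thread.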

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-30 Thread Olivier Teytaud
don't use asymptotic normality with a sample size 5, use Fisher's exact test the p-value for the rejection of "P(alpha-Go wins a given game against Lee Sedol)<.5" might be something like 3/16 (under the "independent coin" assumption!) this is not 0.05, but still quite an impressive result :-) wi

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-30 Thread Ryan Hayward
Hey Simon, I only now remembered: we actually experimented on the effect of making 1 blunder (random move instead of learned/searched move) in Go and Hex "Blunder Cost in Go and Hex" so this might be a starting point for your question of measuring player strength by measuring all move strengths

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-30 Thread Lucas, Simon M
In my original post I put a link to the relevant section of the MacKay book that shows exactly how to calculate the probability of superiority assuming the game outcome is modelled as a biased coin toss: http://www.inference.phy.cam.ac.uk/itila/ I was making the point that for this and for o
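
A minimal sketch of the "probability of superiority" calculation under the biased-coin model follows. It assumes a uniform prior on the win probability p, so that the posterior after w wins and l losses is Beta(w+1, l+1); the prior and the function name `prob_superior` are assumptions for illustration, not something stated in the post.

```python
from fractions import Fraction
from math import comb


def prob_superior(wins, losses):
    """P(p > 1/2 | data) with a uniform prior on p (an assumption).

    The posterior is Beta(a, b) with a = wins + 1, b = losses + 1.
    For integer a and b, the Beta CDF at x equals
    P(Binomial(a + b - 1, x) >= a), which is exact at x = 1/2.
    """
    a, b = wins + 1, losses + 1
    n = a + b - 1
    cdf_at_half = Fraction(sum(comb(n, k) for k in range(a, n + 1)), 2 ** n)
    return 1 - cdf_at_half


p = prob_superior(4, 1)
print(p, float(p))
```

For a 4-1 result this gives 57/64, about 0.89, under the stated prior. A different prior would give a different number.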

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-30 Thread djhbrown .
"do they have positive or negative correlation?" intriguing question, Petri. Intuitively, we might arbitrarily divide the human population into two groups; one which is discouraged by failure, and the other which takes the Lady MacBeth attitude of "screw your courage to the sticking point, and we

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-30 Thread Petri Pitkanen
Since there are only two possible outcomes it is pretty much normal. Actually binomial, which will converge to normal given enough samples. The only thing that can distort this is that consecutive games are not independent (which is probably the case, but do they have positive or negative correlation?) 2016-03

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-30 Thread Рождественский Дмитрий
I think the error here is that the game outcome is not a normally distributed random value. Dmitry 30.03.2016, 12:57, "djhbrown ." : > Simon wrote: "I was discussing the results with a colleague outside > of the Game AI area the other day when he raised > the question (which applies to nearly all

[Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-30 Thread djhbrown .
Simon wrote: "I was discussing the results with a colleague outside of the Game AI area the other day when he raised the question (which applies to nearly all sporting events, given the small sample size involved) of statistical significance - suggesting that on another week the result might have b

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-23 Thread Nick Wedd
On 22 March 2016 at 21:43, Darren Cook wrote: < snip > > > C'mon DeepMind, put that same version on KGS, set to only play 9p > players, with the same time controls, and let's get 40 games to give it > a proper ranking. (If 5 games against Lee Sedol are useful, 40 games > against a range of playe

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-23 Thread Robert Jasiek
On 23.03.2016 15:32, Petr Baudis wrote: these are beautiful posts. https://massgoblog.wordpress.com/2016/03/11/lee-sedols-strategy-and-alphagos-weakness/ Before you become too excited, also read my comments on the commentary: http://www.lifein19x19.com/forum/viewtopic.php?p=200539#p200539 --

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-23 Thread Petr Baudis
Thank you, these are beautiful posts. I very much enjoyed reading a writeup by a Go professional who also took the effort to understand the principles behind MCTS programs as well as develop a basic intuition of the gameplay artifacts, strengths and weaknesses of MCTS. It also nicely describes the

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Chun Sun
FYI. We have translated 3 posts by Li Zhe 6p into English. https://massgoblog.wordpress.com/2016/03/11/lee-sedols-strategy-and-alphagos-weakness/ https://massgoblog.wordpress.com/2016/03/11/game-2-a-nobody-could-have-done-a-better-job-than-lee-sedol/ https://massgoblog.wordpress.com/2016/03/15/bef

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Darren Cook
> ... > Pro players who are not familiar with MCTS bot behavior will not see this. I stand by this: >> If you want to argue that "their opinion" was wrong because they don't >> understand the game at the level AlphaGo was playing at, then you can't >> use their opinion in a positive way either.

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Ingo Althöfer
Hi Darren, "Darren Cook" > ... But, there were also numerous moves where > the 9-dan pros said, that in *their* opinion, the moves were weak/wrong. > E.g. wasting ko threats for no reason. Moves even a 1p would never make. > > If you want to argue that "their opinion" was wrong because they don't

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Ingo Althöfer
"Lucas, Simon M" > my point is that I *think* we can say more (for example > by not treating the outcome as a black-box event, > but by appreciating the skill of the individual moves) * Human professional players were full of praise for some of AlphaGo's moves, for instance move 37 in game 2.

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Darren Cook
> ... we witnessed hundreds of moves vetted by 9dan players, especially > Michael Redmond's, where each move was vetted. This is a promising approach. But, there were also numerous moves where the 9-dan pros said, that in *their* opinion, the moves were weak/wrong. E.g. wasting ko threats for no

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread uurtamo .
This is somewhat moot - if any moves had been significantly and obviously weak to any observers, the results wouldn't have been 4-1. I.e. One bad move out of 5 games would give roughly the same strength information as one loss out of 5 games; consider that the kibitzing was being done in real time

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Jim O'Flaherty
I think you are reinforcing Simon's original point; i.e. using a more fine-grained approach to statistically approximate AlphaGo's ELO, where fine-grained is the degree of vetting per move and/or a series of moves. That is a substantially larger sample size and each sample will have a pretty high degree of

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Thomas Wolf
Behalf Of Álvaro Begué Sent: 22 March 2016 17:21 To: computer-go Subject: Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)   A very simple-minded analysis is that, if the null hypothesis is that AlphaGo and Lee Sedol are equally strong, AlphaGo would do as well

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Jeffrey Greenberg
Given the minimal sample size, bothering over this question won't amount to much. I think the proper response is that no one thought we'd see this level of play at this point in our AI efforts and point to the fact that we witnessed hundreds of moves vetted by 9dan players, especially Michael Redmo

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Ryan Hayward
another interesting question is to judge the bot's strength by watching the facial gestures and body language of Lee Sedol with each move... On Tue, Mar 22, 2016 at 11:46 AM, Álvaro Begué wrote: > > > On Tue, Mar 22, 2016 at 1:40 PM, Nick Wedd wrote: > >> On 22 March 2016 at 17:20, Álvaro Begué

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Álvaro Begué
On Tue, Mar 22, 2016 at 1:40 PM, Nick Wedd wrote: > On 22 March 2016 at 17:20, Álvaro Begué wrote: > >> A very simple-minded analysis is that, if the null hypothesis is that >> AlphaGo and Lee Sedol are equally strong, AlphaGo would do as well as we >> observed or better 15.625% of the time. Tha

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Lucas, Simon M
games would go. From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Álvaro Begué Sent: 22 March 2016 17:21 To: computer-go Subject: Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results) A very simple-minded analysis is that, if the null

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Nick Wedd
On 22 March 2016 at 17:20, Álvaro Begué wrote: > A very simple-minded analysis is that, if the null hypothesis is that > AlphaGo and Lee Sedol are equally strong, AlphaGo would do as well as we > observed or better 15.625% of the time. That's a p-value that even social > scientists don't get exci

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Álvaro Begué
A very simple-minded analysis is that, if the null hypothesis is that AlphaGo and Lee Sedol are equally strong, AlphaGo would do as well as we observed or better 15.625% of the time. That's a p-value that even social scientists don't get excited about. :) Álvaro. On Tue, Mar 22, 2016 at 12:48 PM
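
The arithmetic under the fair-coin null hypothesis can be checked with a short sketch. The helper names `binom_pmf` and `tail_at_least` are made up for illustration, and games are assumed independent. Note that 15.625% is 5/32, the probability of exactly 4 wins in 5; "as well as observed or better" (4 or 5 wins) is 6/32 = 3/16 = 18.75%, the figure quoted elsewhere in the thread.

```python
from fractions import Fraction
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k wins in n independent games with win chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def tail_at_least(k, n, p):
    """Probability of k or more wins in n independent games."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))


half = Fraction(1, 2)
exactly_four = binom_pmf(4, 5, half)      # 5/32
four_or_more = tail_at_least(4, 5, half)  # 6/32 = 3/16

print(exactly_four, float(exactly_four))
print(four_or_more, float(four_or_more))
```

Either way, the one-sided p-value stays well above the conventional 0.05 threshold.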

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Jason House
Statistical significance requires a null hypothesis... I think it's probably easiest to ask the question of if I assume an ELO difference of x, how likely it's a 4-1 result? Turns out that 220 to 270 ELO has a 41% chance of that result. >= 10% is -50 to 670 ELO >= 1% is -250 to 1190 ELO My numbers
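
The question posed above (given an assumed Elo difference x, how likely is a 4-1 result?) can be sketched as follows. This assumes the standard logistic Elo expectation with a 400-point scale, independent games, and no distinction between the orders in which the single loss could occur; the message does not show the exact model behind its numbers, so treat this as one plausible reading. The function names are hypothetical.

```python
from math import comb


def elo_win_prob(diff):
    """Expected score for the stronger side under the usual logistic Elo model."""
    return 1 / (1 + 10 ** (-diff / 400))


def likelihood_of_result(diff, wins=4, games=5):
    """Probability of exactly `wins` wins in `games` independent games."""
    p = elo_win_prob(diff)
    return comb(games, wins) * p**wins * (1 - p) ** (games - wins)


for diff in (-250, -50, 0, 220, 240, 270, 670, 1190):
    print(f"{diff:+5d} Elo: {likelihood_of_result(diff):.3f}")
```

Under these assumptions the likelihood peaks at about 41% near a 240-point difference, where the per-game win probability is 0.8, which agrees with the 220 to 270 range quoted in the message.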

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread uurtamo .
> I'm not sure if we can say with certainty that AlphaGo is significantly > better Go player than Lee Sedol at this point. What we can say with > certainty is that AlphaGo is in the same ballpark and at least roughly > as strong as Lee Sedol. To me, that's enough to be really huge on its > own ac

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Lucas, Simon M
Subject: Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results) > I'm not sure if we can say with certainty that AlphaGo is significantly > better Go player than Lee Sedol at this point. What we can say with > certainty is that AlphaGo is in the same ballpar

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Petr Baudis
On Tue, Mar 22, 2016 at 04:00:41PM +, Lucas, Simon M wrote: > With AlphaGo winning 4 games to 1, from a simplistic > stats point of view (with the prior assumption of a fair > coin toss) you'd not be able to claim much statistical > significance, yet most (me included) believe that > AlphaGo i

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread uurtamo .
Simon, There's no argument better than evidence, and no evidence available to us other than *all* of the games that alphago has played publicly. Among two humans, a 4-1 result wouldn't indicate any more or less than this 4-1 result, but we'd already have very strong elo-type information about bot

Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

2016-03-22 Thread Lucas, Simon M
Hi all, I was discussing the results with a colleague outside of the Game AI area the other day when he raised the question (which applies to nearly all sporting events, given the small sample size involved) of statistical significance - suggesting that on another week the result might have been 4