I made an attempt similar to Alvaro's to predict final ownership. You can find the code here: https://github.com/jmgilmer/GoCNN/. It is trained to predict final ownership using about 15,000 professional games that were played out to the end (i.e., did not end in resignation). It reaches about 80.5% accuracy on a held-out test set, although the accuracy varies greatly depending on how far through the game the position is. I can't say how well it would work inside a Go-playing program. -Justin
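In case anyone wants to experiment with something similar without digging through the repo, here is a minimal sketch of the kind of network involved. This is not the actual GoCNN code; the input planes, layer sizes, and training details are my own assumptions.

# Minimal sketch (my own assumptions, not the GoCNN code): a small
# convolutional net that maps board feature planes to a per-intersection
# probability that Black owns the point at the end of the game.
import torch
import torch.nn as nn

class OwnershipNet(nn.Module):
    def __init__(self, in_planes=8, width=64, n_layers=5):
        super().__init__()
        layers = [nn.Conv2d(in_planes, width, 3, padding=1), nn.ReLU()]
        for _ in range(n_layers - 1):
            layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU()]
        layers += [nn.Conv2d(width, 1, 1)]   # one logit per intersection
        self.net = nn.Sequential(*layers)

    def forward(self, planes):               # planes: (batch, in_planes, 19, 19)
        return self.net(planes)              # logits: (batch, 1, 19, 19)

# Training target: the final ownership map of a finished (not resigned) game,
# 1.0 where Black owns the intersection, 0.0 where White does.
model = OwnershipNet()
loss_fn = nn.BCEWithLogitsLoss()
planes = torch.zeros(1, 8, 19, 19)           # dummy position features
target = torch.zeros(1, 1, 19, 19)           # dummy final ownership map
loss = loss_fn(model(planes), target)
loss.backward()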
On Tue, Feb 23, 2016 at 7:00 AM, <computer-go-requ...@computer-go.org> wrote:

> Today's Topics:
>
>    1. Re: Congratulations to Zen! (Robert Jasiek)
>    2. Move evaluation by expected value, as product of expected
>       winrate and expected points? (Michael Markefka)
>    3. Re: Move evaluation by expected value, as product of expected
>       winrate and expected points? (Álvaro Begué)
>    4. Re: Move evaluation by expected value, as product of expected
>       winrate and expected points? (Robert Jasiek)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 22 Feb 2016 19:13:20 +0100
> From: Robert Jasiek <jas...@snafu.de>
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] Congratulations to Zen!
>
> Aja, sorry to bother you with trivialities, but how does AlphaGo avoid
> power or network failures and similar incidents?
>
> --
> robert jasiek
>
> ------------------------------
>
> Message: 2
> Date: Tue, 23 Feb 2016 11:36:57 +0100
> From: Michael Markefka <michael.marke...@gmail.com>
> To: computer-go@computer-go.org
> Subject: [Computer-go] Move evaluation by expected value, as product of
>         expected winrate and expected points?
>
> Hello everyone,
>
> in the wake of AlphaGo using a DCNN to predict the expected winrate of a
> move, I've been wondering whether one could train a DCNN for expected
> territory or points successfully enough for it to be of some use (leaving
> the issue of wins by resignation for a more in-depth discussion), and
> whether winrate and expected territory (or points) always run in parallel
> or whether there are moments where they diverge.
>
> Computer Go programs play what are considered slack or slow moves when
> ahead, sometimes being too conservative and giving away too much of
> their potential advantage. If expected points and expected winrate
> diverge, this could be a way to make the programs play in a more
> natural way, even if there were no strength increase to be gained.
> Then again, there might be a parameter configuration that yields some
> advantage, and perhaps this configuration would need to be dynamic,
> favoring winrate the further the game progresses.
>
> As a general example of the idea, let's assume our program has generated
> the following candidate moves:
>
> #1: Winrate 55%, +5 expected final points
> #2: Winrate 53%, +15 expected final points
>
> Is the move with the higher winrate always better? Or would there be some
> benefit to choosing #2? Would this differ depending on how far along
> the game is?
>
> If we knew the winrate prediction to be perfect, then going by that
> alone would probably result in the best overall performance. But given
> some uncertainty there, expected value could be interesting.
>
> Any takers for some experiments?
>
> -Michael
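The blending itself is trivial; the interesting part is how to weight the two terms. Purely as an illustration of the idea (the weighting scheme and all constants below are made up, not a claim about the right formula):

# Illustration only: blend winrate and expected point margin into a single
# move score, shifting the weight toward winrate as the game progresses.
# All constants here are made up for the sake of the example.
def move_score(winrate, expected_points, move_number, max_moves=250):
    progress = min(move_number / max_moves, 1.0)    # 0 early, 1 late
    w = 0.5 + 0.5 * progress                        # weight on winrate
    # Squash the point margin onto a scale comparable to a winrate.
    points_term = expected_points / (abs(expected_points) + 10.0)
    return w * winrate + (1.0 - w) * points_term

# Michael's example, evaluated at move 100:
print(move_score(0.55, 5.0, 100))    # move #1: winrate 55%, +5 points
print(move_score(0.53, 15.0, 100))   # move #2: winrate 53%, +15 points

With these particular constants, #2 scores higher at move 100 while #1 takes over near the end of the game, which is exactly the kind of divergence being asked about.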
>
> ------------------------------
>
> Message: 3
> Date: Tue, 23 Feb 2016 06:44:04 -0500
> From: Álvaro Begué <alvaro.be...@gmail.com>
> To: computer-go <computer-go@computer-go.org>
> Subject: Re: [Computer-go] Move evaluation by expected value, as
>         product of expected winrate and expected points?
>
> I have experimented with a CNN that predicts ownership, but I found it to
> be too weak to be useful. The main difference between what Google did and
> what I did is in the dataset used for training: I had tens of thousands of
> games (I did several different experiments) and I used all the positions
> from each game (which is known to be problematic); they used 30M positions
> from independent games. I expect you can learn a lot about ownership and
> the expected number of points from a dataset like that. Unfortunately,
> generating such a dataset is infeasible with the resources most of us have.
>
> Here's an idea: Google could make the dataset publicly available for
> download, ideally with the final configurations of the board as well.
> There is a tradition of making interesting datasets for machine learning
> available, so I have some hope this may happen.
>
> The one experiment I would like to make along the lines of your post is to
> train a CNN to compute both the expected number of points and its standard
> deviation. If you assume the distribution of scores is well approximated by
> a normal distribution, maximizing winning probability can be achieved by
> maximizing (expected score) / (standard deviation of the score). I wonder
> if that results in stronger or more natural play than making a direct model
> for winning probability, because you get to learn more about each position.
>
> Álvaro.
>
> On Tue, Feb 23, 2016 at 5:36 AM, Michael Markefka <
> michael.marke...@gmail.com> wrote:
>
> > [snip -- Michael's message, quoted in full above]
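The relationship Álvaro describes is easy to make concrete. A minimal sketch, assuming the final score of a position is modelled as normal with mean mu and standard deviation sigma (the two candidate moves are made-up numbers):

# Sketch of the relationship Alvaro describes: if the final score is
# modelled as normal with mean mu and standard deviation sigma, then
# P(win) = P(score > 0) = Phi(mu / sigma), which is monotone in mu / sigma,
# so maximizing mu / sigma maximizes the modelled winning probability.
import math

def win_probability(mu, sigma):
    return 0.5 * (1.0 + math.erf(mu / (sigma * math.sqrt(2.0))))

# Two made-up candidate moves: a safe one and a high-variance one.
safe  = (5.0, 10.0)     # (mu, sigma): small lead, low uncertainty
risky = (15.0, 40.0)    # bigger lead, much higher uncertainty
print(win_probability(*safe), win_probability(*risky))
# The ratio mu / sigma ranks the moves the same way the probabilities do.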
>
> ------------------------------
>
> Message: 4
> Date: Tue, 23 Feb 2016 12:54:22 +0100
> From: Robert Jasiek <jas...@snafu.de>
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] Move evaluation by expected value, as
>         product of expected winrate and expected points?
>
> On 23.02.2016 11:36, Michael Markefka wrote:
> > whether one could train a DCNN for expected territory
>
> First, some definition of territory must be chosen or stated. Second,
> you must decide whether territory according to this definition can be
> determined by a neural net meaningfully at all. Third, if yes, do it.
>
> Note that there are very different definitions of territory. The most
> suitable definition for positional judgement (see Positional Judgement 1
> - Territory) is sophisticated and requires a combination of expert rules
> (specifying what to determine, and how to read in order to determine it)
> and reading.
>
> A weak definition could predict whether a particular intersection will
> be territory in the scoring position at the game's end. Such a prediction
> can be fast for MC or NN, and maybe it is good enough as a very rough
> approximation for programs. For humans, it is very bad because it neglects
> the different degrees of safety of (potential) territory and the strategic
> concepts of sacrifice and exchange.
>
> I have also suggested other definitions, but IMO they are less
> attractive for NN.
>
> --
> robert jasiek
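Robert's "weak definition" is essentially the training target the ownership nets discussed above learn. For completeness, a sketch of turning such a per-intersection prediction into a rough expected score (the komi value and the probability map are placeholders):

# Sketch: turn a per-intersection ownership map (probability that Black
# owns the point in the final scoring position) into a rough expected
# score. The komi value and the probability map are placeholders.
import numpy as np

def expected_score(ownership, komi=7.5):
    # ownership: (19, 19) array of P(Black owns the intersection)
    black_points = ownership.sum()
    white_points = (1.0 - ownership).sum()
    return black_points - white_points - komi   # positive means Black is ahead

ownership = np.full((19, 19), 0.5)   # maximally uncertain board
print(expected_score(ownership))     # -7.5: only the komi separates the players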
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go