Working on it :)

David

> -----Original Message-----
> From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of "Ingo Althöfer"
> Sent: Tuesday, February 23, 2016 7:56 AM
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] Move evalution by expected value, as product of expected winrate and expected points?
>
> My 1.5 cents:
>
> David Fotland has a nice score estimator in his (old) ManyFaces bot.
> The score estimator is still from the days before the Monte Carlo
> version.
>
> Perhaps David can improve on this estimator with the help of CNNs.
>
> Ingo.
>
>
> Sent: Tuesday, 23 February 2016 at 16:41
> From: "Justin Gilmer" <jmgil...@gmail.com>
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] Move evalution by expected value, as product of expected winrate and expected points?
>
> I made an attempt similar to Álvaro's to predict final ownership. You can
> find the code here: https://github.com/jmgilmer/GoCNN/. It's trained to
> predict final ownership on about 15,000 professional games that were
> played to the end (i.e., did not end in resignation). It gets about 80.5%
> accuracy on a held-out test set, although the accuracy varies greatly
> with how far through the game you are. Can't say how well it would
> work in a Go player.
>
> -Justin
>
> On Tue, Feb 23, 2016 at 7:00 AM, <computer-go-requ...@computer-go.org> wrote:
>
> Today's Topics:
>
>    1. Re: Congratulations to Zen! (Robert Jasiek)
>    2. Move evalution by expected value, as product of expected
>       winrate and expected points? (Michael Markefka)
>    3. Re: Move evalution by expected value, as product of expected
>       winrate and expected points? (Álvaro Begué)
>    4. Re: Move evalution by expected value, as product of expected
>       winrate and expected points? (Robert Jasiek)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 22 Feb 2016 19:13:20 +0100
> From: Robert Jasiek <jas...@snafu.de>
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] Congratulations to Zen!
> Message-ID: <56cb4fc0.4010...@snafu.de>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Aja, sorry to bother you with trivialities, but how does AlphaGo avoid
> power or network failures and such incidents?
>
> --
> robert jasiek
>
>
> ------------------------------
>
> Message: 2
> Date: Tue, 23 Feb 2016 11:36:57 +0100
> From: Michael Markefka <michael.marke...@gmail.com>
> To: computer-go@computer-go.org
> Subject: [Computer-go] Move evalution by expected value, as product of expected winrate and expected points?
> Message-ID: <CAJg7PAPU_gbHvNy3Cv+D-p238_hkqkv5pojxozjly4nsqas...@mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> Hello everyone,
>
> in the wake of AlphaGo using a DCNN to predict expected winrate of a
> move, I've been wondering whether one could train a DCNN for expected
> territory or points successfully enough to be of some use (leaving the
> issue of win by resignation for a more in-depth discussion). And,
> whether winrate and expected territory (or points) always run in
> parallel or whether there are diverging moments.
>
> Computer Go programs play what are considered slack or slow moves when
> ahead, sometimes being too conservative and giving away too much of
> their potential advantage. If expected points and expected winrate
> diverge, this could be a way to make the programs play in a more natural
> way, even if there were no strength increase to be gained.
> Then again there might be a parameter configuration that might yield
> some advantage and perhaps this configuration would need to be dynamic,
> favoring winrate the further the game progresses.
>
>
> As a general example for the idea, let's assume we have the following
> potential moves generated by our program:
>
> #1: Winrate 55%, +5 expected final points
> #2: Winrate 53%, +15 expected final points
>
> Is the move with higher winrate always better? Or would there be some
> benefit to choosing #2? Would this differ depending on how far along the
> game is?
>
> If we knew the winrate prediction to be perfect, then going by that
> alone would probably result in the best overall performance. But given
> some uncertainty there, expected value could be interesting.
>
>
> Any takers for some experiments?
>
>
> -Michael
>
>
> ------------------------------
>
> Message: 3
> Date: Tue, 23 Feb 2016 06:44:04 -0500
> From: Álvaro Begué <alvaro.be...@gmail.com>
> To: computer-go <computer-go@computer-go.org>
> Subject: Re: [Computer-go] Move evalution by expected value, as product of expected winrate and expected points?
> Message-ID: <CAF8dVMWLPQBhD-Q07YeLZwqV9M9JCW+_VbSRVp=evj9cn6w...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> I have experimented with a CNN that predicts ownership, but I found it
> to be too weak to be useful. The main difference between what Google did
> and what I did is in the dataset used for training: I had tens of
> thousands of games (I did several different experiments) and I used all
> the positions from each game (which is known to be problematic); they
> used 30M positions from independent games. I expect you can learn a lot
> about ownership and expected number of points from a dataset like that.
> Unfortunately, generating such a dataset is infeasible with the
> resources most of us have.
>
> Here's an idea: Google could make the dataset publicly available for
> download, ideally with the final configurations of the board as well.
> There is a tradition of making interesting datasets for machine learning
> available, so I have some hope this may happen.
>
> The one experiment I would like to make along the lines of your post is
> to train a CNN to compute both the expected number of points and its
> standard deviation.
> If you assume the distribution of scores is well
> approximated by a normal distribution, maximizing winning probability
> can be achieved by maximizing (expected score) / (standard deviation of
> the score). I wonder if that results in stronger or more natural play
> than making a direct model for winning probability, because you get to
> learn more about each position.
>
> Álvaro.
>
>
>
> On Tue, Feb 23, 2016 at 5:36 AM, Michael Markefka <michael.marke...@gmail.com> wrote:
>
> > Hello everyone,
> >
> > in the wake of AlphaGo using a DCNN to predict expected winrate of a
> > move, I've been wondering whether one could train a DCNN for expected
> > territory or points successfully enough to be of some use (leaving the
> > issue of win by resignation for a more in-depth discussion). And,
> > whether winrate and expected territory (or points) always run in
> > parallel or whether there are diverging moments.
> >
> > Computer Go programs play what are considered slack or slow moves when
> > ahead, sometimes being too conservative and giving away too much of
> > their potential advantage. If expected points and expected winrate
> > diverge, this could be a way to make the programs play in a more
> > natural way, even if there were no strength increase to be gained.
> > Then again there might be a parameter configuration that might yield
> > some advantage and perhaps this configuration would need to be
> > dynamic, favoring winrate the further the game progresses.
> >
> >
> > As a general example for the idea, let's assume we have the following
> > potential moves generated by our program:
> >
> > #1: Winrate 55%, +5 expected final points
> > #2: Winrate 53%, +15 expected final points
> >
> > Is the move with higher winrate always better? Or would there be some
> > benefit to choosing #2? Would this differ depending on how far along
> > the game is?
> >
> > If we knew the winrate prediction to be perfect, then going by that
> > alone would probably result in the best overall performance. But given
> > some uncertainty there, expected value could be interesting.
> >
> >
> > Any takers for some experiments?
> >
> >
> > -Michael
> > _______________________________________________
> > Computer-go mailing list
> > Computer-go@computer-go.org
> > http://computer-go.org/mailman/listinfo/computer-go
>
> ------------------------------
>
> Message: 4
> Date: Tue, 23 Feb 2016 12:54:22 +0100
> From: Robert Jasiek <jas...@snafu.de>
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] Move evalution by expected value, as product of expected winrate and expected points?
> Message-ID: <56cc486e.1030...@snafu.de>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> On 23.02.2016 11:36, Michael Markefka wrote:
> > whether one could train a DCNN for expected territory
>
> First, some definition of territory must be chosen or stated. Second,
> you must decide if territory according to this definition can be
> determined by a neural net meaningfully at all. Third, if yes, do it.
>
> Note that there are very different definitions of territory. The most
> suitable definition for positional judgement (see Positional Judgement 1
> - Territory) is sophisticated and requires a combination of expert rules
> (specifying what to determine, and how to read to determine it) and
> reading.
>
> A weak definition could predict whether a particular intersection will
> be territory in the scoring position at the end of the game. Such a
> definition can be fast for MC or NN, and may be good enough as a very
> rough approximation for programs. For humans, it is very bad because it
> neglects different degrees of safety of (potential) territory and the
> strategic concepts of sacrifice and exchange.
>
> I have also suggested other definitions, but IMO they are less
> attractive for NN.
>
> --
> robert jasiek
>
>
> ------------------------------
>
> End of Computer-go Digest, Vol 73, Issue 42
> *******************************************
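
For anyone who wants to try the normal-distribution idea from Álvaro's message quoted above, here is a minimal sketch in Python. It is not anyone's actual implementation. It assumes the final score margin of a move (from our side) is normally distributed and that a win means margin > 0. The function names are hypothetical, and the standard deviations (40 and 200 points) are made-up values, picked only so that the model roughly reproduces the 55% and 53% winrates of Michael's two example moves; the thread itself gives no standard deviations.

```python
import math


def win_probability(mean_margin: float, std_margin: float) -> float:
    """P(margin > 0) for margin ~ Normal(mean_margin, std_margin**2)."""
    if std_margin <= 0:
        raise ValueError("std_margin must be positive")
    z = mean_margin / std_margin
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def best_move(candidates):
    """Return the candidate with the largest mean/std ratio.

    Each candidate is a (name, mean_margin, std_margin) tuple. Because the
    normal CDF is increasing, this is also the candidate with the highest
    win probability under the normality assumption.
    """
    return max(candidates, key=lambda c: c[1] / c[2])


# Michael's two example moves; the standard deviations are ASSUMED values.
candidates = [
    ("#1", 5.0, 40.0),    # +5 expected points, assumed std of 40
    ("#2", 15.0, 200.0),  # +15 expected points, assumed std of 200
]

if __name__ == "__main__":
    for name, mean, std in candidates:
        print(name, "mean/std =", mean / std,
              "P(win) =", round(win_probability(mean, std), 4))
    print("chosen:", best_move(candidates)[0])
```

Under these assumed numbers the ratio favours #1 even though #2 has three times the expected points, which is the kind of divergence Michael asks about. A network that outputs both mean and standard deviation would supply the two inputs directly.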