[computer-go] Mogo Opening, Building Strategy ?

2008-11-30 Thread Denis fidaali


 Hi.

Are there any theoretical reasons for the Mogo opening book being built
from self-play, rather than by spending the time increasing the number of
simulations at the root and, after a while, keeping what seems to be the best move?

 Obviously one could object that increasing the number of simulations
would lead to running out of memory. But then it seems that there are ways
of pruning the less-explored branches. Another point is that
it may be far easier to massively parallelize self-play.

 So I really wonder which would be better, both for time efficiency
and for strength: spending a lot of computing power refining the tree,
or using self-play from a position to assess its value?
What was the reason behind the Mogo team's choice?


Re: [computer-go] Mogo Opening, Building Strategy ?

2008-11-30 Thread Olivier Teytaud
>
> Are there any theoretical reasons for the Mogo opening book being built
> from self-play, rather than by spending the time increasing the number of
> simulations at the root and, after a while, keeping what seems to be the
> best move?
>


There are practical reasons: our approach can be used with humans or other
programs as opponents as well;
we can benefit from games launched for purposes other than opening-building;
and we can easily parallelize
the algorithm on grids.

No other reason, at least for me - but these reasons are enough, I guess.
The alternate approach is nice, but it is difficult to use tens of years
of CPU time that way - whereas using the preemptable mode of grids, we
have access to huge computational power.

From a more technical point of view, I think that the idea of using the
results of games of strong versions of mogo is better for avoiding biases
in the MC evaluation. But it's only a conjecture.
Olivier

RE: [computer-go] Mogo Opening, Building Strategy ?

2008-11-30 Thread Denis fidaali

Thanks a lot for your quick answer.

By conjecture, I suppose you mean that
no experiments have yet been run
to assess this hypothesis?

I think Sylvain (and maybe everyone else) has tried
at some point to use a UCT bot as a way to
generate the simulations, then to use those high-level
simulations in another UCT decision tree (or with AMAF,
or first-move winning stats).
From what I recall, the results were disappointing.
Don Dailey also tried to build AMAF on top of a bot
that used AMAF for its decisions, with very little
expectation that this would lead to anything worthwhile.
I don't know how hard Sylvain tried at the time.

Yet you have the feeling that using high-level Mogo
games as simulations could lead
to interesting results. I have this feeling too.
For example, it is well known that Mogo-style (or Crazy Stone-style)
decisions lead to very poor understanding of seki
(and/or semeai?). Wouldn't using high-level games
as simulations lead to a better understanding of those
really nasty situations?

 Then, I guess that if nobody has ever run
any experiments to measure
the efficiency of growing the UCT tree
against using high-level simulations, there must be a reason...
Is it that it is known to consume too much time and resources?
Is it that knowing the result of this measurement would prove of little value?

If there is a point where high-level simulations really give a stronger
evaluation function, wouldn't it be good to know about it?


Re: [computer-go] Mogo Opening, Building Strategy ?

2008-11-30 Thread Olivier Teytaud
>
>
> By conjecture, I suppose you mean that
> no experiments have yet been run
> to assess this hypothesis?
>

Yes. The other reasons were sufficient :-)


> I think Sylvain (and maybe everyone else) has tried
> at some point to use a UCT bot as a way to
> generate the simulations, then to use those high-level
> simulations in another UCT decision tree (or with AMAF,
> or first-move winning stats).
> From what I recall, the results were disappointing.
>

At least, it has been clearly established that
"replacing the random player by a stronger player"
does not imply
"the Monte-Carlo program built on top of that player becomes stronger"
(even with a fixed number of simulations).

But it is also clearly established that building the opening book by self-play
clearly works, even though it is roughly the same idea. I guess the reason is
the difference in strength of the player - an MCTS (Monte-Carlo Tree Search -
I don't write UCT here as it is not UCT in mogo) built on top of a perfect
player should be a perfect player (this is formally obvious). So perhaps with
huge computational power, this approach (building MCTS on top of MCTS) is
consistent.


>
> For example, it is well known that Mogo-style (or Crazy Stone-style)
> decisions lead to very poor understanding of seki
> (and/or semeai?). Wouldn't using high-level games
> as simulations lead to a better understanding of those
> really nasty situations?
>

I hope so, at least for 9x9 and with really huge computational power.
(By the way, I'm afraid we have to patch semeai handling manually.)




>
> Is it that it is known to consume too much time and resources?
>

I think it is really difficult to do - you have to dump your results to
disk unless you have a machine available for a huge time without any
reboot; also, if you want a huge computational effort, you have to
parallelize it - I think this is really hard.

But perhaps trying mogo against a mogo built on top of a mogo
with e.g. 1 s/move would be fine... this is easy to organize. Well, it
requires some time, and time is always expensive :-)

Best regards,
Olivier

Re: [computer-go] Mogo Opening, Building Strategy ?

2008-11-30 Thread Don Dailey
On Sun, 2008-11-30 at 13:38 +0100, Olivier Teytaud wrote:
> But it is also clearly established that building the opening book by
> self-play clearly works, even though it is roughly the same idea. I guess
> the reason is the difference in strength of the player - an MCTS
> (Monte-Carlo Tree Search - I don't write UCT here as it is not UCT in
> mogo) built on top of a perfect player should be a perfect player (this
> is formally obvious). So perhaps with huge computational power, this
> approach (building MCTS on top of MCTS) is consistent.

I've always had this idea that the best way to build an opening book is
the best way to build a general playing engine.   You are trying to
solve exactly the same problem - what is the best move in this position?

- Don




Re: [computer-go] RAVE formula of David Silver (reposted)

2008-11-30 Thread Mark Boon
Indeed, the scaling question is very important. Even though I think I
have AMAF/RAVE working now, it's still not so clear-cut what it's
worth. With just 2,000 playouts I'm seeing an 88% win-rate against
plain old UCT tree-search without RAVE. At 10,000 playouts this
win-rate drops to 75%. With 50,000, to 69%. All of these results have a
margin of error of a few points, but the trend is obvious: plain UCT plays
weaker than UCT+RAVE, but it scales a little better. This doesn't
necessarily mean they converge. From the few data-points that I have,
it looks like UCT+RAVE might converge to a winning rate of 66%
against plain UCT search with playouts in the hundred-thousands or
millions. Is that about 100 ELO points? That in itself would be
justification enough to keep it. But there's a computational cost as
well. Plus, as soon as you start to introduce other move-selection
procedures, it may eat into the gain RAVE provides even further.
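
(As a quick sanity check under the standard logistic Elo model - which is
general rating arithmetic, nothing specific to these programs - a 66%
head-to-head score corresponds to roughly 115 Elo, so "about 100 ELO
points" is the right ballpark:)

import math

def elo_gap(win_rate):
    # Elo difference implied by a head-to-head score, logistic model.
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))

print(round(elo_gap(0.66)))  # ~115 Elo for a 66% win rate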


Anyhow, the way I have it set up now I can easily switch between using
AMAF information to compute RAVE or not. There are also still some
parameters to tune. So this is not the last word on it by far. It's
more like a good starting point. Also, even if it's not something to
use in a final playing engine, it's good to have a 'baseline' that
provides the best possible time/strength combination to run quick
tests against.


Is there actually a well-understood basis for the diminishing return  
of UCT+RAVE vs. UCT? I have given it a little thought, but it's not  
entirely obvious to me why UCT+RAVE wouldn't scale better than what  
I'm seeing.


I've also run into a few 'fluke' results. Winning streaks of a dozen
games in a row (or more) happen between equally strong programs. So
to be reasonably sure I'd like to get about 1,000 games. If you want
to make sure two implementations are equivalent, as in the case of the
ref-bots, I'd recommend 10,000 games.


If all I want to know is whether something is an improvement or not,
then I usually settle for fewer games. If after a (few) hundred games
I see a win-rate of 50% or less, I decide it's not an improvement (not
one worth anything anyway); if I see a winning rate of around 60% or
more, I keep it. Anything in between, I might decide to let it run a
bit longer. The improvements that I keep I run with longer thinking
times overnight to see if they scale. After all, the only real test
worth anything is under realistic playing circumstances.


Mark

On 29-nov-08, at 11:32, Don Dailey wrote:


On Sat, 2008-11-29 at 11:58 +0100, Denis fidaali wrote:


 From my own experience, an important testing case whenever trying
to get AMAF to work is the scaling study.



No truer words ever spoken.  This is one of the secrets to strong
programs: if they scale, they are probably soundly designed.  I do that
with Chess.  I find that some program changes scale up, particularly
sound algorithms that reduce the branching factor.  I have to run tests
pretty fast in order to get results I can interpret, but I also plot
the results visually with gnuplot.

As many here will recall, my own Fatman study vs Leela showed that
Leela scaled better with increasing depth than Fatman.  Nothing like a
graph to reveal this very clearly, although you can also look at the
numbers if you are careful.

It's important to point out that you will be completely misled if you
don't get enough samples.  It's very rare that 100 or 200 games are
enough to draw any conclusions (unless it's really lopsided.)  I
remember once thinking I had found a clear scalable improvement but
decided that it must run longer - but I was hopeful.  When the
improvement started to decline, I discovered that I had by accident
been running the same exact program against itself.

The point is that it is not uncommon to get really "lucky", and have an
equal program look substantially superior - for a while.

- Don



 My prototype was quite strong considering that it used only 1,000
light playouts (it scored 25-30% wins against GNU Go level 0), but it
seemed not to get much beyond that as the number of playouts grew...
(It also had a serious exponential-complexity problem, which I never
got around to investigating :) )

 I know that Zoe was about 2000 Elo with, I think, 50k simulations,
and... never got much better as the number of simulations increased.

Both prototypes were toying with AMAF, so I really think you need a bit
of a scalability study whenever trying to assess an engine employing it
(although it could very well be that the scalability trouble came from
some nasty bugs; both aforementioned prototypes were quite messy...).


From: [EMAIL PROTECTED]
Subject: Re: [computer-go] RAVE formula of David Silver (reposted)
Date: Sat, 29 Nov 2008 03:39:58 -0200
To: computer-go@computer-go.org
CC:


On 28-nov-08, at 17:28, [EMAIL PROTECTED] wrote:


I would be very interested to see the RAVE code from Valkyria. I'm
sure others woul

Re: [computer-go] Mogo Opening, Building Strategy ?

2008-11-30 Thread terry mcintyre
> From: Don Dailey <[EMAIL PROTECTED]>
> 
> I've always had this idea that the best way to build an opening book is
> the best way to build a general playing engine.   You are trying to
> solve exactly the same problem - what is the best move in this position?

When building an opening book, you have the advantage of not playing against
the clock. In fact, a good opening book (one which your game-playing engine
knows how to use) can shave time during the game itself.

"Knows how to use" can be subtle; many joseki depend on knowing the exact
line of play, and the move choices depend on knowing the exact (not
approximate) results of ladders and semeai. Against better players,
approximate answers tend toward disaster. A book move might work sometimes,
not others, and the program won't know the difference.

I think the "opening book" and the "general playing engine" solve similar
problems: what is the best move which can be discovered with finite
resources? The opening problem must satisfy an additional side constraint:
it must suggest moves which can be correctly exploited by the playing
engine, which may have less time and computational power available. A
sufficiently broad and deep book can make up for a lack of computational
resources during the game; such a book needs to know the best refutations
for each of many possible plays. Joseki and fuseki books for humans are
full of annotations: "move a is used when the ladder works; otherwise, b
is recommended; c is a mistake, which can be exploited by ..." A book for
computer programs might need similar annotations.

Some programs may need different books, depending on whether they have a fast 
or slow clock, one or many processors.


  


Re: [computer-go] RAVE formula of David Silver (reposted)

2008-11-30 Thread Jason House
On Nov 30, 2008, at 11:49 AM, Mark Boon <[EMAIL PROTECTED]> wrote:


> If all I want to know is whether something is an improvement or not,
> then I usually settle for fewer games. [...] Anything in between, I
> might decide to let it run a bit longer.


You've claimed to be non-statistical, so I'm hoping the following is
useful... You can compute the confidence that you made an improvement
as:

Phi(# of standard deviations),
where # of standard deviations =
(win rate - 0.5) * sqrt(#games) / 0.5

Phi (the standard normal CDF, expressible via erf) has no elementary
closed form, and in practice people use lookup tables to translate
between standard deviations and confidence levels. In practice, people
set a goal confidence and directly translate it to a number of standard
deviations (3.0 for 99.85%). This situation requires the one-tailed p
test.


After about 20 or 30 games, this approximation is accurate and can be  
used for early termination of your test.
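
A minimal sketch of that computation in Python (standard library only;
the function name is just an illustration of the formula above):

import math

def improvement_confidence(wins, games):
    # One-tailed confidence that the true win rate exceeds 0.5, using
    # the normal approximation to the binomial (fine for ~20+ games).
    p = wins / games
    z = (p - 0.5) * math.sqrt(games) / 0.5   # number of standard deviations
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(improvement_confidence(60, 100))  # ~0.977, i.e. about 2 sigma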








Re: [computer-go] RAVE formula of David Silver (reposted)

2008-11-30 Thread Don Dailey
On Sun, 2008-11-30 at 14:49 -0200, Mark Boon wrote:
> Indeed, the scaling question is very important. Even though I think I
> have AMAF/RAVE working now, it's still not so clear-cut what it's
> worth. With just 2,000 playouts I'm seeing an 88% win-rate against
> plain old UCT tree-search without RAVE. At 10,000 playouts this
> win-rate drops to 75%. With 50,000, to 69%. All of these results have a
> margin of error of a few points, but the trend is obvious: plain UCT
> plays weaker than UCT+RAVE, but it scales a little better.

Not necessarily.  Start with UCT and UCT+RAVE and find a level where
they are equivalent in strength.  Whatever point this is, the UCT
version will take much longer to search.   Now, test a version of each
with some constant factor more playouts (or time.)   For instance, test
a version of each that does 3X more work.   Which is stronger now?  That
is how you tell if it scales better or not.   You must start from the
same point ELO-wise.
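
That protocol is easy to express as a harness. In this sketch,
play_match is a hypothetical callable (e.g. a wrapper around a
twogtp-style match runner - an assumed interface, not an existing tool)
that plays a match and returns plain UCT's score against UCT+RAVE:

def scaling_study(play_match, rave_playouts, factor=3, games=1000):
    # play_match(uct_playouts, rave_playouts, games) -> plain UCT's
    # score in [0, 1] against UCT+RAVE (hypothetical helper).
    # Step 1: double plain UCT's playouts until the two engines start
    # from the same point Elo-wise.
    uct_playouts = rave_playouts
    while play_match(uct_playouts, rave_playouts, games) < 0.5:
        uct_playouts *= 2
    # Step 2: give both the same constant factor more work and re-match.
    # A score above 0.5 now means plain UCT scales better.
    return play_match(uct_playouts * factor, rave_playouts * factor, games)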

My idea is that if it doesn't scale,  there may be some fundamental
improvement still waiting to be found.In the worst case you can
gradually phase it out. 

Of course, as you say, they may converge, perhaps in some asymptotic way,
so that at any given level of effort the RAVE enhancement always helps,
but less and less so.

- Don


> This doesn't
> necessarily mean they converge. From the few data-points that I have,
> it looks like UCT+RAVE might converge to a winning rate of 66%
> against plain UCT search with playouts in the hundred-thousands or
> millions. Is that about 100 ELO points? That in itself would be
> justification enough to keep it. But there's a computational cost as
> well. Plus, as soon as you start to introduce other move-selection
> procedures, it may eat into the gain RAVE provides even further.
> 
> Anyhow, the way I have it set up now I can easily switch between using
> AMAF information to compute RAVE or not. There are also still some
> parameters to tune. So this is not the last word on it by far. It's
> more like a good starting point. Also, even if it's not something to
> use in a final playing engine, it's good to have a 'baseline' that
> provides the best possible time/strength combination to run quick
> tests against.
>
> Is there actually a well-understood basis for the diminishing return
> of UCT+RAVE vs. UCT? I have given it a little thought, but it's not
> entirely obvious to me why UCT+RAVE wouldn't scale better than what
> I'm seeing.
>
> I've also run into a few 'fluke' results. Winning streaks of a dozen
> games in a row (or more) happen between equally strong programs. So
> to be reasonably sure I'd like to get about 1,000 games. If you want
> to make sure two implementations are equivalent, as in the case of the
> ref-bots, I'd recommend 10,000 games.
>
> If all I want to know is whether something is an improvement or not,
> then I usually settle for fewer games. If after a (few) hundred games
> I see a win-rate of 50% or less, I decide it's not an improvement (not
> one worth anything anyway); if I see a winning rate of around 60% or
> more, I keep it. Anything in between, I might decide to let it run a
> bit longer. The improvements that I keep I run with longer thinking
> times overnight to see if they scale. After all, the only real test
> worth anything is under realistic playing circumstances.
> 
> Mark
> 
> On 29-nov-08, at 11:32, Don Dailey wrote:
> 
> > On Sat, 2008-11-29 at 11:58 +0100, Denis fidaali wrote:
> >>
> >>  From my own experience, an important testing case whenever trying
> >> to get AMAF to work is the scaling study.
> >>
> >
> > No truer words ever spoken.  This is one of the secrets to strong
> > programs: if they scale, they are probably soundly designed.  I do
> > that with Chess.  I find that some program changes scale up,
> > particularly sound algorithms that reduce the branching factor.  I
> > have to run tests pretty fast in order to get results I can
> > interpret, but I also plot the results visually with gnuplot.
> >
> > As many here will recall, my own Fatman study vs Leela showed that
> > Leela scaled better with increasing depth than Fatman.  Nothing like
> > a graph to reveal this very clearly, although you can also look at
> > the numbers if you are careful.
> >
> > It's important to point out that you will be completely misled if you
> > don't get enough samples.  It's very rare that 100 or 200 games are
> > enough to draw any conclusions (unless it's really lopsided.)  I
> > remember once thinking I had found a clear scalable improvement but
> > decided that it must run longer - but I was hopeful.  When the
> > improvement started to decline, I discovered that I had by accident
> > been running the same exact program against itself.
> >
> > The point is that it is not uncommon to get really "lucky", and have
> > an equal program look substantially superior - for a while.
> >

Re: [computer-go] Mogo Opening, Building Strategy ?

2008-11-30 Thread Don Dailey
It's true that building an opening book in an automated way can be done
off-line, which gives us more resources.   That's really the basis for
this thought that we are trying to solve the same problem.

As a thought experiment, imagine some day in the future, when
computers are a thousand times faster.  What would you do with that
extra "time" to make your program more scalable?   You might be able to
use your book-building algorithm, whatever it might be, to find moves
for you.  If there is a better way, then you would use it instead.  But
if there is a better way, why are you not using it for creating book
moves?

Of course the details of how to do it may not be exactly the same -
because we may be forced to simulate in-memory hash tables with disk
space, for instance, or to deal with other real-world constraints.
However, I think that at least in principle we should be doing the same
thing.

MCTS really feels to me like a superb book-building algorithm.
Computer chess books (at least the automated part) are built essentially
by taking millions of games from master play and picking out the ones
that seem to work best.   Those games are like playouts.   The moves
that score the best are played the most.   We have a kind of MCTS here.

- Don



On Sun, 2008-11-30 at 09:15 -0800, terry mcintyre wrote:
> > From: Don Dailey <[EMAIL PROTECTED]>
> > 
> > I've always had this idea that the best way to build an opening book is
> > the best way to build a general playing engine.   You are trying to
> > solve exactly the same problem - what is the best move in this position?
> 
> When building an opening book, you have the advantage of not playing
> against the clock. In fact, a good opening book (one which your
> game-playing engine knows how to use) can shave time during the game
> itself.
>
> "Knows how to use" can be subtle; many joseki depend on knowing the
> exact line of play, and the move choices depend on knowing the exact
> (not approximate) results of ladders and semeai. Against better
> players, approximate answers tend toward disaster. A book move might
> work sometimes, not others, and the program won't know the difference.
>
> I think the "opening book" and the "general playing engine" solve
> similar problems: what is the best move which can be discovered with
> finite resources? The opening problem must satisfy an additional side
> constraint: it must suggest moves which can be correctly exploited by
> the playing engine, which may have less time and computational power
> available. A sufficiently broad and deep book can make up for a lack
> of computational resources during the game; such a book needs to know
> the best refutations for each of many possible plays. Joseki and
> fuseki books for humans are full of annotations: "move a is used when
> the ladder works; otherwise, b is recommended; c is a mistake, which
> can be exploited by ..." A book for computer programs might need
> similar annotations.
>
> Some programs may need different books, depending on whether they have
> a fast or slow clock, one or many processors.

Re: [computer-go] Mogo Opening, Building Strategy ?

2008-11-30 Thread terry mcintyre


- Original Message 
> From: Don Dailey <[EMAIL PROTECTED]>
> 
> MCTS really feels to me like a superb book building algorithm.
> Computer Chess books (at least the automated part) are built essentially
> by taking millions of games from master play and picking out the ones
> that seem to work best.   Those games are like playouts.   The moves
> that score the best are played the most.   We have a kind of MCTS here.


Interesting - the book moves are not generated by random playouts, but by
professional (or highly skilled) players.

In the chess world, what is meant by "picking out the ones that seem to
work best"?

My impression is that computer go programs do not, at this stage in their
evolution, make good use of professional book moves; too much professional
knowledge is actually in the parts of the tree which are almost never
played in pro-pro games - the "how to beat up on mistakes" and "what not
to do when playing against a pro" parts of the tree.


  


Re: [computer-go] Mogo Opening, Building Strategy ?

2008-11-30 Thread Don Dailey
On Sun, 2008-11-30 at 14:33 -0800, terry mcintyre wrote:
> 
> - Original Message 
> > From: Don Dailey <[EMAIL PROTECTED]>
> > 
> > MCTS really feels to me like a superb book building algorithm.
> > Computer Chess books (at least the automated part) are built essentially
> > by taking millions of games from master play and picking out the ones
> > that seem to work best.   Those games are like playouts.   The moves
> > that score the best are played the most.   We have a kind of MCTS here.
> 
> 
> Interesting - the book moves are not generated by random playouts, but
> by professional (or highly skilled) players.

Many chess opening books are created by a statistical analysis of lots
of human games.   Some books are created by hand - very tediously.
Even the ones created with the aid of human games are sometimes modified
by hand, at least the top-notch books.

But in the context of this discussion, we note that books can be and are
created solely from databases of top-quality games.   You can get a
reasonable book that way.


> In the chess world, what is meant by "picking out the ones that seem
> to work best"?

What I mean is that you look at the statistics of the moves and base
your opening book on the moves that gave the best results.  You can also
go by the moves that are played the most - with the assumption that if
they are played a lot, they must be good.   It is typical to use a
combination of both - if it's played a lot and also scores well, use it.

I think some have tried minimax too.   It's possible that a move seems
to have great success in general, but not if it's responded to in a
certain way.   It could be that in recent months or years a refutation
has been found, and a move that used to work really well has been
found to be bad.
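
A toy sketch of that combined rule in Python (the thresholds, data
layout, and the 20-ply book depth are illustrative only, not taken from
any real book builder):

from collections import defaultdict

def build_book(games, min_count=50, min_score=0.55):
    # games: list of (moves, winner) pairs; moves is the move sequence,
    # winner is 0 or 1, with the side playing moves[0] being side 0.
    count = defaultdict(int)
    wins = defaultdict(int)
    for moves, winner in games:
        for ply, move in enumerate(moves[:20]):   # book the first 20 plies
            key = (tuple(moves[:ply]), move)      # position = move prefix
            count[key] += 1
            wins[key] += 1 if winner == ply % 2 else 0  # mover's wins
    # Keep a move only if it is both played a lot and scores well.
    return {key for key in count
            if count[key] >= min_count
            and wins[key] / count[key] >= min_score}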


> 
> My impression is that computer go programs do not, at this stage in
> their evolution, make good use of professional book moves; too much
> professional knowledge is actually in the parts of the tree which are
> almost never played in pro-pro games - the "how to beat up on mistakes"
> and "what not to do when playing against a pro" parts of the tree.

Even in chess, despite the awesome strength of the programs, human
knowledge of the openings still reigns supreme, although it's now the
case that computers are helping to build opening theory by finding new
moves - in chess these are called "theoretical novelties", and computers
have produced many of them, from what I understand.

- Don



Re: [computer-go] Mogo Opening, Building Strategy ?

2008-11-30 Thread Michael Williams

You can read about some such novelties found using Rybka here:  
http://www.rybkachess.com/index.php?auswahl=Rybka+3+book


Don Dailey wrote:
> [...] in chess these are called "theoretical novelties", and computers
> have produced many of them, from what I understand.


Re: [computer-go] RAVE formula of David Silver (reposted)

2008-11-30 Thread Mark Boon


On 30-nov-08, at 16:51, Jason House wrote:


> You've claimed to be non-statistical, so I'm hoping the following is
> useful... You can compute the confidence that you made an improvement
> as:
>
> Phi(# of standard deviations),
> where # of standard deviations =
> (win rate - 0.5) * sqrt(#games) / 0.5
>
> Phi (the standard normal CDF, expressible via erf) has no elementary
> closed form, and in practice people use lookup tables to translate
> between standard deviations and confidence levels. In practice, people
> set a goal confidence and directly translate it to a number of standard
> deviations (3.0 for 99.85%). This situation requires the one-tailed p
> test.
>
> After about 20 or 30 games, this approximation is accurate and can
> be used for early termination of your test.




Lately I use twogtp for my test runs. It computes the winning
percentage and puts a ± value after it in parentheses. Is that the
value of one standard deviation? (I had always assumed so.) Even
after 1,000 games it stays in the 1.5% neighbourhood.
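
(For what it's worth, one standard deviation of a win rate p measured
over N games is sqrt(p(1-p)/N), which at p = 0.5 and N = 1,000 is about
1.6% - consistent with the ±1.5% figure, if that is indeed what twogtp
reports:)

import math

def win_rate_sigma(p, n):
    # One standard deviation of a win rate p measured over n games.
    return math.sqrt(p * (1.0 - p) / n)

print(win_rate_sigma(0.5, 1000))  # ~0.0158, i.e. about +/- 1.6%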


Maybe the approximation is usually accurate after 20-30 games. But if
you perform tests often, you'll occasionally bump into that unlikely
event where what you thought was a big improvement turns out to be
no improvement at all. Or the other way around. Only when I see 20+
games with a zero winning percentage do I stop the test, assuming I
made a mistake.


Mark



[computer-go] Test set for reference bots

2008-11-30 Thread Mark Boon
I think I've seen some discussion a while ago about a set of tests
used to test playing engines. I suppose it would make sense to
compile such a set for the reference bots. Matching a win-rate and
game-length after 1,000,000 playouts is a first milestone that can be
verified quickly. A winning rate of something very close to 50% after
thousands of games takes hours, if not a whole day, to confirm. A set
of situations with their expected results can provide something in
between. Maybe the original set posted by Yamato can be used as a
starting point. But I'd also include some very simple situations that
check whether the engines properly implement things like ko, super-ko,
seki, end of game, etc.


So far I've been using unit tests for things like this. But those are
not very easy to transfer to other engine implementations.


I haven't used the GoGui regression-test application yet, but I'm
planning to look into it. The one thing I did notice is that the
problem diagram and the problem's solution are separated. Wouldn't it
have been easier to include the latter in the SGF? A simple format
like "#c? [result-set]", where c is the color to play, similar to the
format already used. Putting that as a comment in the first or last SGF
node should be easy enough to parse. Or you could even define a
special SGF property for this information.
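
Parsing such a comment would indeed be trivial; a sketch (the
"#c? [result-set]" convention here is the proposal above, not something
GoGui actually supports):

import re

# Matches e.g. "#B? [C3 D4]": the color to play, then the set of
# accepted answer moves in brackets.
PROBLEM_RE = re.compile(r"#([BW])\?\s*\[([^\]]*)\]")

def parse_problem_comment(comment):
    m = PROBLEM_RE.search(comment)
    if m is None:
        return None
    return m.group(1), set(m.group(2).split())

print(parse_problem_comment("#B? [C3 D4]"))  # ('B', {'C3', 'D4'})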


Maybe there's a well-thought-out reason why it is the way it is?

Mark
