Re: [computer-go] UCT vs MC

sylvain . gelly Mon, 01 Jan 2007 03:56:20 -0800

Hello and happy new year!


> I have some questions about your paper...
Whouah that's a lot of questions :). I'll try to answer well.


> I thought that the Monte Carlo evaluation of a position is done by making
> many random games from that position, and averaging the win/loss result. 
> So the evaluation of a position would be a number between 0 and 1.  I
> thought several thousand random games would be used for one evaluation.
No, we consider one evaluation as one simulation, so the evaluation function 
is a bernoulli random variable. Now you are mainly interested in its 
expectation, and it is why you can think making a lot of simulations and 
averaging, trying to approximate the expectation by the empirical average. 

> In your paper, say that each UCT leaf node is evaluated by exactly one
> random game (simulation), with a result of 0 or 1.  Is this true?
Yes it is true. We try having more that one simulation per leaf node, and:
-with the same number of nodes, the improvement is very small;
-with the same number of total simulations the level is much weaker.

This seems counter intuitive, but in fact it is not. At each simulation we add 
a node. So a node often visited will have a lot of descendants, so will 
average a lot of simulations. The key idea of UCT is that the value in a node 
is the average of its children's values weigthed by the frequency of visits 
for its children.

> I think 
> you say Mogo does 70,000 random games per move.  Does this mean that the
> UCT tree for a move has 70,000 nodes?  When you say 70,000 games per move,
> does that mean total game move made, or game per node evaluation?
That means per MoGo's move. So yes, UCT tree for a move has 70000 nodes. It is 
the total number of simulations.
I use the same count for the CGOS versions of MoGo. MoGo_xxx_10k uses 10000 
simulations for each move, or if you prefer, 10000 nodes in its tree. That 
means that if the CGOS game finished after 40 MoGo moves, then MoGo has 
computed 400 000 random simulations for the complete game. (There is also a 
5000 simulations per move version on CGOS).


> How many simulations (random games with patterns) does Mogo do per second?
On a P4 3.4Ghz, 4500 in 9x9, 1000 in 19x19.

> How do you back up values in the UCT tree?  There are values in the example
> tree, but I can't see how they are calculated.

As in the UCT algorithm. For each node for the root to the leaf of the current 
sequence you simply add the 0/1 result to a variable, and 1 to the count of 
the number of simulations.

> Your code says that the value is backed up by sum and negation (line 26,
> value := -value).  But I don't see any negative values in your sample tree,
> or values greater than one.  How do you actually back up values to the
> root?
Sorry, it is value := 1-value. Thank you for pointing out the mistake.


> on page 5 you say that UCB1-TUNED performs better and you use that formula.
> In the code for the algorithm, you use UCB (line 16).  Which is correct?

Since the beginning we used UCB1-TUNED and it performed better. Now with all 
other improvements, and with a fine parameters tuning the difference is very 
small. UCB1-TUNED has the advantage that it does not need a parameters tuning 
to performs well in Go (the famous sqrt(1/10) constant Remi Coulom posted in 
this list).

> In your paper you show win rates against GnuGo of about 50%, depending on
> the parameters.  The current Mogo beats GnuGo over 90%.  What changed?  Are
> you doing more simulations, or do you have more go knowledge in your
> patterns?  

The results near 50% was with the uniform random simulations. The 80% is with 
the improved simulations. In the current MoGo there are new improvements not 
yet published. Currently, against gnugo 3.6 level 8 with 70000 simulations, 
the result is 92.5%.
MoGo which plays on tournaments makes more than 70000 simulations (plays on a 
quadri processors), but gnugo is also stronger (3.7.10 with level > 8).

> Does Mogo have an opening book? 
Yes, it is an opening book generated using a "meta-UCT", that means using UCT 
algorithm from the empty position, and using MoGo playing against itself as 
the evaluation function. However the opening book is very small (3 ply in 
average, 5 ply maximum).

I hope that makes things clearer.

Sylvain

> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of
> > [EMAIL PROTECTED]
> > Sent: Sunday, December 31, 2006 11:50 AM
> > To: computer-go
> > Subject: Re: [computer-go] UCT vs MC
> >
> >
> > Hello,
> >
> > > It looks to me that the strength of the top programs, like Mogo, is
> > > mostly due to the new UCT search algorithm.
> >
> > It depends what you compare to.
> > If you compare UCT against no tree, this makes a lot of difference.
> > If you compare UCT to former Remi Coulom's tree search
> > algorithm, Remi can
> > answer better than me. He posted a message some months ago, and (if I
> > remember well) his experiments showed an improvement like
> > 5-10% against gnugo
> > comparing to his previous tree search algorithm (which was already an
> > improvement over other tree search algorithms).
> > If you compare uniform-MC/UCT against top programs in 9x9,
> > the difference is
> > huge. MoGo was using UCT since its first version on CGOS and
> > was ranked at
> > 1650. So with UCT and uniform MC you are very far from the
> > top. UCT alone is
> > far to be enough.
> >
> > > I wonder what would happen if we took a tradional strong
> >
> > computer-go
> >
> > > evaluation function and combined it with UCT?
> >
> > This should be an interesting experiment. My bet is that it
> > should not change
> > a lot the strength. May be this can be worst than alpha-beta.
> > Perhaps the
> > gnugo team already tried?
> > UCT has its advantages and drawbacks, and I think its
> > performance against
> > alpha-beta depends on the evaluation function. UCT is
> > efficient for MC
> > evaluation function. It is not obvious it is efficient with
> > "traditional"
> > evaluation function. I would be very happy to know the result.
> >
> > Sylvain
> >
> > _______________________________________________
> > computer-go mailing list
> > computer-go@computer-go.org
> > http://www.computer-go.org/mailman/listinfo/computer-go/
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Re: [computer-go] UCT vs MC

Reply via email to