Sylvain,

Yes, I meant UCT.  I have been working on converting my chess program to
use the UCI protocol, so it was on my mind!  (Incidentally, I think
UCI is a better protocol than GTP, but I don't want to start a war
here!)

I can't help but feel we are missing something.  With UCT we miss those
wonderful beta cutoffs that you get with straightforward alpha-beta
pruning - but with alpha-beta pruning you are still exploring a lot more
useless nodes.  It seems the two are just not very compatible.

Currently UCT does seem like the best choice by far.  

I did some experiments long ago with straight alpha-beta using
Monte Carlo as the evaluation function.  I discovered that the search
goes a long way toward reducing the variance errors of Monte Carlo.
In other words, you can get by with fewer simulations at the leaf nodes
of deeper trees.  When I say "get by" I mean that it's actually
advantageous to decrease the number of simulations in order to get more
depth, given the same amount of total CPU work.  It's a matter of
finding the correct balance:

   1.  You always improve by going deeper.
   2.  You always improve by doing more simulations at the same depth.
   
At some point you get serious diminishing returns from doing more
simulations, but you have to do a lot of them before the decline is
"serious."  However, there is also a point at which it is better to
just do an extra ply with far fewer simulations rather than increase
the simulations at the same depth.

My main discovery was simply my surprise at how soon you should go ahead
and do an extra ply of search, even at the sacrifice of simulations per
evaluation.
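For what it's worth, here is a minimal sketch of the kind of scheme I mean - plain negamax-style alpha-beta with Monte Carlo playouts standing in for the static evaluator, plus the budget arithmetic behind the depth/simulations trade-off.  The game interface (legal_moves, play, random_playout) is a placeholder, not code from any actual program, and playout results are assumed to be scored from the side-to-move's perspective:

```python
def mc_eval(state, num_playouts, random_playout):
    """Estimate a position by averaging random playouts (side-to-move view)."""
    return sum(random_playout(state) for _ in range(num_playouts)) / num_playouts

def alphabeta_mc(state, depth, alpha, beta, sims_per_leaf, game):
    """Ordinary negamax alpha-beta; Monte Carlo replaces the evaluator."""
    if depth == 0 or game.is_terminal(state):
        return mc_eval(state, sims_per_leaf, game.random_playout)
    for move in game.legal_moves(state):
        score = -alphabeta_mc(game.play(state, move), depth - 1,
                              -beta, -alpha, sims_per_leaf, game)
        if score >= beta:
            return beta          # beta cutoff: remaining moves are skipped
        alpha = max(alpha, score)
    return alpha

def sims_per_leaf(total_budget, branching, depth):
    """Fixed total playout budget: a depth-d search has roughly
    branching**d leaves, so deeper search means fewer playouts per leaf."""
    return max(1, total_budget // branching ** depth)
```

With a budget of 100,000 playouts and branching factor 10, for example, 3 ply leaves about 100 playouts per leaf while 5 ply leaves only one - the question is where along that curve the extra ply stops paying for itself.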

I was not able to study this effect with very deep searches of course.
I think 4 or 5 ply was the limit of my experiments since it takes
forever to play a 5 ply game even with only a few simulations.   I'm
trying to find my data - but I think it has been lost.  That's a shame
because I would like to actually report the numbers.

But UCT may be a rather extreme example of this phenomenon - better to
go deep even without a huge number of simulations.  Of course, with UCT
none of the evaluations are completely wasted; they all have an impact
on potential root choices.
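To make that last point concrete, here is a toy sketch of one UCT iteration - select down the tree, expand, play out once, and back the result up so every ancestor's statistics are updated.  The node fields, the small exploration constant (in the spirit of Sylvain's 1/10), and the game interface are all illustrative, not anyone's actual engine; results are scored from a fixed player's perspective for simplicity, where a real engine negates at alternating depths:

```python
import math
import random

class Node:
    def __init__(self):
        self.children = {}   # move -> Node
        self.visits = 0
        self.wins = 0.0

def uct_select(node, c=0.1):
    """Pick the child maximizing mean value plus the 'log n' exploration term."""
    log_n = math.log(node.visits)
    return max(node.children.items(),
               key=lambda mc: mc[1].wins / mc[1].visits
                              + c * math.sqrt(log_n / mc[1].visits))

def uct_iteration(root, game, state):
    """One simulation: descend, expand, play out, then back up the result."""
    path, node = [root], root
    while node.children and all(ch.visits for ch in node.children.values()):
        move, node = uct_select(node)
        state = game.play(state, move)
        path.append(node)
    for move in game.legal_moves(state):
        node.children.setdefault(move, Node())
    unvisited = [(m, ch) for m, ch in node.children.items() if ch.visits == 0]
    if unvisited:
        move, child = random.choice(unvisited)
        state = game.play(state, move)
        path.append(child)
    result = game.random_playout(state)   # a single Monte Carlo playout
    for n in path:                        # no playout is wasted: the result
        n.visits += 1                     # flows into every node on the path,
        n.wins += result                  # all the way up to the root
```

The backup loop is the point: unlike a leaf evaluation discarded by a beta cutoff, every playout here refines the statistics that drive the next root choice.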


- Don


On Thu, 2006-11-23 at 16:51 +0100, [EMAIL PROTECTED] wrote:
> Hello Don,
> 
> > The improvement over a given opponent should be measured in Elo points,
> > not win percentage, unless you do the extra math.  I cannot quite tell if
> > you were considering that or not - if so, then ignore this.  Going from
> > 50% wins to 60% is a modest improvement, but going from 80% to 90%
> > is a MAJOR improvement.
> Yes, I totally agree, and I am sorry I did not emphasize that more. It is 
> also a reason why you need many more games to see a statistically 
> significant improvement when you are close to 90% (even though the 
> variance is lower), because each 1% is very important.
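[Aside: under the usual logistic Elo model the gap implied by an expected score p is -400*log10(1/p - 1), so the arithmetic behind those two jumps looks roughly like this:]

```python
import math

def elo_gap(p):
    """Rating gap implied by expected score p (logistic Elo model)."""
    return -400 * math.log10(1 / p - 1)

for lo, hi in [(0.50, 0.60), (0.80, 0.90)]:
    print(f"{lo:.0%} -> {hi:.0%}: +{elo_gap(hi) - elo_gap(lo):.0f} Elo")
# 50% -> 60%: +70 Elo
# 80% -> 90%: +141 Elo
```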
> 
> > UCI is completely scalable, but may not be optimally scalable.  It may
> > turn out that there are other ways to handle the best-first search.  My
> > fear with UCI-type methods is that they might become progressively less
> > efficient as you get much faster.  It is probably a matter of finding
> > just the right formula for allocating work.
> > Or it might turn out that some form of alpha-beta search using
> > Monte Carlo as an evaluator is the right thing given enough power -
> > which suggests some kind of hybrid approach.  UCI may be the best way
> > to handle nodes closer to the leaves, and could perhaps be viewed as a
> > kind of quiescence search for a standard alpha-beta engine.  I'm just
> > taking a wild guess here.
> 
> I assume by "UCI" you mean UCT. 
> I think with a lot of computation power, when the tree is very deep, alpha 
> beta would work better than UCT. For example, if you have enough 
> computation power to explore the whole minimax tree (which can be the case 
> at the end of the game), alpha beta will give the result faster than UCT, 
> at least if you don't make it more exploratory (with the "famous" 1/10 
> constant in the formula, UCT does more exploitation). This is because, 
> before going deeper on a move whose reward from the MC simulations is bad, 
> you have to wait until the "log n" term saves you.
> So if we had a lot more power, or at the end of the game, alpha beta or a 
> variant would be a better choice. Still, at the end of the game, UCT with 
> MC is already very good. 
> Perhaps a hybrid approach would be good. However, I don't think it is 
> where we must put the highest priority. On 19x19 we are far from having 
> enough power, so other solutions have to be found.
> Perhaps for playing almost perfectly a game like 7x7, hybrid approaches 
> might be great.
> 
> Sylvain
> 
> 
> > On Thu, 2006-11-23 at 13:24 +0100, [EMAIL PROTECTED] wrote:
> > > > I think they will play very strong.  So far all my tests indicate
> > > > nice scaling, but I admit I have not tried a proper experiment for a
> > > > long time since I do not have any extra hardware.  Perhaps the MoGo
> > > > team could do something, but the problem is that MoGo is so strong it
> > > > would beat most programs 100% of the time with modest increases in
> > > > computation time on 9x9.
> > >
> > > What we can say from experiments is that the scaling with time is very
> > > good with few simulations, but becomes less interesting with a lot of
> > > simulations. With the same settings (not the best, but the ones for
> > > which we have the most results), against gnugo at level 0 (s/m =
> > > simulations/move): 3000 s/m: 35%, 10000 s/m: 60%, 70000 s/m: 90%.
> > > Against gnugo at level 8 (the default) it gives respectively 50% and
> > > 80% for 10k s/m and 70k s/m. MoGo on cgos plays with something like
> > > 300k s/m, but I don't think it is much better than with 70k s/m. Quick
> > > experiments showed that the improvement was only a few % against gnugo.
> > > However, I saw that the improvement is larger against MC-based programs
> > > (the classical non-transitivity of results), and against itself it is
> > > huge.
> > > I also saw that after each improvement, the number of simulations
> > > mattered less than before, so the scaling is less impressive.
> > > Perhaps it comes from the fact that the opening moves are now those
> > > where MoGo loses most of its games and, as Magnus said, the number of
> > > simulations is not so important in the opening. We did not investigate
> > > that.
> > >
> > > Sylvain
> > >
> > > _______________________________________________
> > > computer-go mailing list
> > > computer-go@computer-go.org
> > > http://www.computer-go.org/mailman/listinfo/computer-go/
> 

