Re: [computer-go] Slightly improved MC algorithm

Heikki Levanto Wed, 28 Feb 2007 00:01:46 -0800

On Tue, Feb 27, 2007 at 05:13:12PM -0500, Don Dailey wrote:
> This is very similar to what AnchorMan on CGOS does.  At the end of 
> each random simulation I keep the same statistics on each point of
> the board and I use it to improve the move selection algorithm.  I
> call this special board an "ownership map" for obvious reasons.  I
> just divide each value by the number of simulations run.
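
A minimal sketch of the ownership map described in the quote, assuming
each finished simulation is reduced to a flat list of per-point owners
(+1 black, -1 white, 0 neutral). The function name and the encoding are
illustrative assumptions, not AnchorMan's actual code.

```python
from typing import List


def ownership_map(final_boards: List[List[int]]) -> List[float]:
    """Average per-point ownership over a set of finished simulations.

    Each board is a flat list with +1 (black owns the point),
    -1 (white owns it) or 0 (neutral). This encoding is an assumption.
    """
    if not final_boards:
        raise ValueError("need at least one simulation")
    totals = [0] * len(final_boards[0])
    for board in final_boards:
        for i, owner in enumerate(board):
            totals[i] += owner
    # Divide each accumulated value by the number of simulations run.
    return [t / len(final_boards) for t in totals]
```

A value near +1 or -1 means the point almost always ends up with the
same side; a value near 0 means it is unsettled.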


"Ownership map" is a good term!

> It's not clear from your email what you do with this information
> however.  AnchorMan uses the ownership map  to give the final 
> move choice a little bias.   In my testing the use of an ownership
> map improved the program significantly but it was tricky getting
> it right.

I use it to calculate the final value of the series of simulations.
Instead of just counting wins and losses, I count which move ends up
with the greatest ownership. This is a bit like tracking the final
score of the simulation instead of just who won, which is said not to
work so well. But I add some non-linearity in the way I weigh
unsettled points. This seems to help a lot (200 Elo).
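
The post does not say which non-linearity is used, so the following is
only a sketch under an assumed one: cubing each ownership value, which
leaves settled points (+1 or -1) unchanged and shrinks unsettled ones.
The names weigh, position_value and best_move are hypothetical.

```python
from typing import Dict, List


def weigh(ownership: float) -> float:
    """Assumed non-linearity: cubing keeps the sign, leaves +1/-1
    untouched and shrinks the contribution of unsettled points."""
    return ownership ** 3


def position_value(ownership_after_move: List[float]) -> float:
    """Sum of weighted ownership over all points, from our side's view."""
    return sum(weigh(o) for o in ownership_after_move)


def best_move(maps: Dict[str, List[float]]) -> str:
    """Pick the candidate move whose ownership map scores highest."""
    return max(maps, key=lambda m: position_value(maps[m]))
```

With the toy maps {"a": [0.5, 0.5], "b": [1.0, -0.2]} a plain sum would
prefer "a" (1.0 against 0.8), while the cubed weighting prefers "b",
because one settled point counts for more than two unsettled ones.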

I also use it to veto silly endgame moves, when some parts of the board
are 100% owned by one side. I veto two kinds of moves:
  - Moves into 100% enemy territory
  - Moves into 100% own territory that are not on enemy liberties
This seems to shorten the game by quite a few moves, since there is no
need to split my own territory into one-point eyes, nor to try every
possible move inside enemy territory.

This I do at the move-selection stage, not in the lightweight
simulations.
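
The two veto rules can be sketched as a filter over candidate moves.
This assumes ownership is stored per point from the mover's point of
view (+1.0 means 100% own, -1.0 means 100% enemy) and that the set of
points that are liberties of enemy stones is already known; both
representations are assumptions made for the example.

```python
from typing import Dict, List, Set


def allowed_moves(candidates: List[int],
                  ownership: Dict[int, float],
                  enemy_liberties: Set[int]) -> List[int]:
    """Drop moves that the ownership map marks as pointless."""
    kept = []
    for point in candidates:
        own = ownership[point]
        if own <= -1.0:
            continue  # rule 1: move into 100% enemy territory
        if own >= 1.0 and point not in enemy_liberties:
            continue  # rule 2: move into 100% own territory, no liberty taken
        kept.append(point)
    return kept
```

For example, with ownership {0: -1.0, 1: 1.0, 2: 1.0, 3: 0.3} and
enemy_liberties {2}, only points 2 and 3 survive the filter.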
  
> For reference, AnchorMan does 5,000 simulations per move and is
> rated exactly 1500.0.    

That is better than what I can do. I will have to include more tricks.

>       I bias edge moves, self-atari, captures and moves into 
>       heavily owned territory, but these biases are carefully
>       applied based on the ownership map.  For example I do not
>       bias ALL self-atari moves, just ones that cross a certain
>       threshold of enemy ownership.  The bias is not a veto, it's
>       a gentle encouragement or discouragement to making certain
>       moves under certain situations.

This sounds like stuff I will have to try.

>   2.  I use a hybrid form of all-moves-as-first and others have
>       reported no improvement.   The behavior I get is that it
>       plays much stronger at low simulations and in extensive
>       testing I could not find a high enough level for the version
>       that does not do all-as-first to make it play as strongly.
>       So for me there is no reason whatsoever to stop using 
>       all-as-first, even at ridiculous levels. 

I have never really understood the idea behind all-as-first. It seems to
fly in the face of the common-sense idea that the order of moves is
important. But perhaps I have got it wrong in my mind. I will have to
study more.
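
For reference, plain all-moves-as-first bookkeeping can be sketched as
below. It is not the hybrid form mentioned in the quote, whose details
are not given. Every point our side played first during a simulation is
credited with the result, as if that move had been played first. The
(color, point) encoding and the function name are assumptions.

```python
from typing import Dict, List, Tuple


def amaf_update(stats: Dict[int, List[int]],
                playout: List[Tuple[str, int]],
                we_won: bool,
                me: str = "B") -> None:
    """Update stats[point] = [wins, visits] from one finished playout.

    playout is the ordered list of (color, point) moves. A point only
    counts for the side that occupied it first in this playout.
    """
    seen = set()
    for color, point in playout:
        if point in seen:
            continue  # replayed after a capture: ignore later plays
        seen.add(point)
        if color != me:
            continue  # the opponent got there first
        entry = stats.setdefault(point, [0, 0])
        entry[0] += 1 if we_won else 0
        entry[1] += 1
```

The order of moves is deliberately thrown away, which is what makes it
look wrong at first sight; the gain is that one simulation yields
statistics for many candidate moves instead of just one.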

> AnchorMan even beats Lazarus which is about 1800 strength at fast
> levels.  For instance Lazarus cannot even begin to compete with 
> AnchorMan if it's playing as fast as AnchorMan on CGOS.  
> Unfortunately, this type of algorithm is not scalable.  AnchorMan
> will never get much better (I think it gets close to 1600 if it's given
> something like 200,000 simulations.)

This is good news to me - I had set a goal to beat AnchorMan "at its own
game", pure MC. I can afford to do more simulations if I need to, and
will probably give different moves different numbers of simulations to
make it more effective (try 1000 sims for every move, prune the worst 25%
and repeat until all sims are used or only one move is left).
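
The prune-and-repeat plan in parentheses can be sketched as follows.
The simulate callback (returns 1 for a win and 0 for a loss after
playing the given move) is a hypothetical stand-in for a real playout,
and stopping as soon as a full round no longer fits in the budget is
one possible reading of "until all sims are used".

```python
from typing import Callable, Dict, Hashable, List


def prune_search(moves: List[Hashable],
                 simulate: Callable[[Hashable], int],
                 budget: int,
                 batch: int = 1000,
                 prune_frac: float = 0.25) -> Hashable:
    """Give every surviving move `batch` more simulations, drop the worst
    quarter, and repeat until the budget is spent or one move is left."""
    wins: Dict[Hashable, int] = {m: 0 for m in moves}
    sims: Dict[Hashable, int] = {m: 0 for m in moves}
    alive = list(moves)
    used = 0

    def rate(m: Hashable) -> float:
        return wins[m] / sims[m] if sims[m] else 0.0

    while len(alive) > 1 and used + batch * len(alive) <= budget:
        for m in alive:
            for _ in range(batch):
                wins[m] += simulate(m)
            sims[m] += batch
        used += batch * len(alive)
        alive.sort(key=rate, reverse=True)           # best moves first
        n_drop = max(1, int(len(alive) * prune_frac))  # always drop at least one
        alive = alive[:len(alive) - n_drop]
    return max(alive, key=rate)
```

Moves that survive longer receive more simulations in total, so the
effort is concentrated on the candidates that are hardest to tell apart.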

> There are some interesting enhancements that can push this a little
> farther without getting into the more complex UCT stuff.   I can't
> remember who did it, but someone produced something over 1600 doing
> basic monte carlo but simulating a 2 ply search 

That sounds interesting. It should get rid of the worst self-ataris and
other silly moves MC is so happy about.
 
I may do that as an experiment before I start on "proper" search, like
UCT.
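
The quote gives no details of the 2-ply approach, so this is only one
possible reading: estimate a win rate for every (move, reply) pair and
score each move by its worst reply. The win_rate and replies callbacks
are hypothetical; in practice win_rate would run a batch of simulations.

```python
from typing import Callable, Hashable, Iterable, Optional, Tuple


def two_ply_choice(moves: Iterable[Hashable],
                   replies: Callable[[Hashable], Iterable[Hashable]],
                   win_rate: Callable[[Hashable, Optional[Hashable]], float]
                   ) -> Tuple[Optional[Hashable], float]:
    """Pick the move whose worst-case reply still gives the best win rate."""
    best_move, best_value = None, -1.0
    for m in moves:
        rs = list(replies(m))
        if rs:
            # Assume the opponent picks the reply that is worst for us.
            value = min(win_rate(m, r) for r in rs)
        else:
            value = win_rate(m, None)  # no legal reply to consider
        if value > best_value:
            best_move, best_value = m, value
    return best_move, best_value
```

A self-atari that looks fine on average gets a low score here, because
the single capturing reply drags its minimum down.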


Interesting stuff, all the same


   - Heikki

-- 
Heikki Levanto   "In Murphy We Turst"     heikki (at) lsd (dot) dk

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
