Re: [computer-go] Way MC plays

Jacques Basaldúa Sat, 01 Mar 2008 06:05:38 -0800

Don Dailey wrote:

I personally have serious doubts about knowledge extraction from human
games, but I hope you have success.    I think you can get more from
computer games of strong players even though the level is weaker. Here
is why I say that:

1.  A strong computer still plays a lot of good moves - so the delta
between human and computer games is not as high as you think.

2. A certain consistency in computer games that humans don't possess.

3.  You have access to the internals, such as a score that quantifies
moves.


My idea combines offline knowledge with MC methods. So there is a lot
of online computer generated knowledge. I don't pretend to make a dumb
savant program, just playing the only move it finds in a joseki database.

But if the program considers the joseki stored in
(state, action[i-1]) -> action[i] records, it will immediately start
searching a tree whose first move at all depths is the joseki sequence.
If the joseki depends on a ladder, it will find that ladder much earlier,
and if it is not good it will hopefully play something else. At blitz
time settings, playing the wrong joseki is bad but not terrible. That
would be about the level of a human dan making a mistake in a blitz game,
still better than current programs.
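As a rough illustration of the (state, action[i-1]) -> action[i] records, here is a minimal Python sketch in which stored joseki continuations are simply tried first by the search. The key layout (a hash of the local state plus the previous move), the `Move` tuple and the function name `order_moves` are assumptions made for this example, not the actual data structures of the program.

```python
from typing import Dict, List, Tuple

Move = Tuple[int, int]          # (column, row); hypothetical representation
JosekiKey = Tuple[int, Move]    # (hash of the local state, previous move); assumed layout


def order_moves(legal: List[Move], key: JosekiKey,
                joseki: Dict[JosekiKey, List[Move]]) -> List[Move]:
    """Return the legal moves with any stored joseki continuations first.

    The search still sees every legal move, so a joseki that fails
    (for example because of a ladder) can be refuted and replaced.
    """
    suggested = [m for m in joseki.get(key, []) if m in legal]
    rest = [m for m in legal if m not in suggested]
    return suggested + rest
```

Because the joseki move is only ordered first and not forced, the tree search keeps the final say on whether it is played.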

In this context, human knowledge produces a priori values for UCT. (The
exact formula of the MC tree search is not yet decided.) There is enough
computer generated knowledge in the search. When the end of the game
is still 200+ moves ahead, the search alone does not find enough
difference between good and not so good moves.
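Since the exact formula is not decided, the following is only one common way to feed a priori values into UCT: treat the prior as a number of virtual playouts that real results gradually outweigh. The function name `uct_score` and the parameters `prior_mean`, `prior_weight` and `c` are assumptions for this sketch.

```python
import math


def uct_score(wins: float, visits: int, parent_visits: int,
              prior_mean: float, prior_weight: float = 10.0,
              c: float = 1.0) -> float:
    """UCB1-style score where the prior counts as `prior_weight` virtual visits.

    prior_mean is the a priori win rate taken from offline knowledge;
    prior_weight and c are tuning constants (assumed values).
    """
    n = visits + prior_weight
    mean = (wins + prior_mean * prior_weight) / n
    exploration = c * math.sqrt(math.log(parent_visits + 1) / n)
    return mean + exploration
```

With no real visits the score equals the prior, so early in the game the knowledge decides the order in which moves are explored; after many playouts the measured win rate dominates.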


The knowledge that I think makes sense to extract is:

1. (state, last move) -> (next moves list) records (joseki or places
to play when the last move is somewhere else.)

2. urgency of shapes

3. distribution of statistics (e.g. proportion of times tenuki is played
at move n) that helps make playouts more humanlike while still fast.
I finally abandoned full board Bradley-Terry scored random playouts. I
draw a random number to decide if tenuki should be played; if true, I
invent a random move and use it as if it were the previous move. Else,
I play a Bradley-Terry scored random move in the 40 neighbors of the
last move. That is almost immediate in my hologram board system because
I store the mask. On other implementations it may not be so good.

About urgency of shapes:

The urgency of a BT adjusted 40 neighbors shape hits 40% in a 10 M+ sample
with about 160 K patterns. (Overlearning should not be very important
given how few degrees of freedom there are.) It hits 66% with the first
5 moves.

Hits of move  1 =  0.402048 acc =  0.402048
Hits of move  2 =  0.117102 acc =  0.519151
Hits of move  3 =  0.065450 acc =  0.584600
Hits of move  4 =  0.045592 acc =  0.630193
Hits of move  5 =  0.035190 acc =  0.665382

That sounds like great news. But there is bad news, of course: it doesn't
go any further. To reach 80% it needs 14 moves:


Hits of move 14 =  0.006206 acc =  0.801095

And to reach 90% it needs 96 moves!

Hits of move 96 =  0.001323 acc =  0.900994
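For readers who want to reproduce this kind of table on their own data, here is a minimal sketch of how the "Hits of move k" and "acc" columns can be computed. It assumes that for every sampled position you already have the 1-based rank the shape model gave to the move actually played; the function name `hit_table` is made up for the example.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def hit_table(ranks: Sequence[int], max_rank: int) -> List[Tuple[int, float, float]]:
    """Return (rank, hit rate, accumulated hit rate) rows for ranks 1..max_rank.

    ranks[i] is the 1-based position of the move actually played in the
    model's ordering of the candidate moves for sample i.
    """
    total = len(ranks)
    counts = [0] * max_rank
    for r in ranks:
        if 1 <= r <= max_rank:
            counts[r - 1] += 1
    hits = [c / total for c in counts]
    return list(zip(range(1, max_rank + 1), hits, accumulate(hits)))


# Print in the same layout as the figures quoted in this message.
for rank, hit, acc in hit_table([1, 1, 2, 5], max_rank=3):
    print(f"Hits of move {rank:2d} = {hit:9.6f} acc = {acc:9.6f}")
```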

Shape is a factor that explains perhaps 2/3 of the moves (shape alone about
40%, but shape "in the right place" a little more than 66%). Not less and
_not more_ than that. It is naive to think that the 12th move in terms of
shape is better than the 50th. But imagine there are 250 legal moves; the
248th is a really bad move.

Can shape make a difference? I don't know. If it's slow, surely not. If it
is almost free in terms of performance, as in my board system, I hope it does.



Jacques.


_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
