Here is the summary of my proposal.

Please read carefully before trashing it.


Leon.

Go is not much different from chess - 0.1

    For a long time it was impossible to correctly evaluate move values in the
game of Go.

    With applied statistics there is an elegant solution. With the help of
statistics and semi-random play you first normalize the winrates into [0.2,0.8]
using a properly sized window that all possible outcomes fall into; move
values are then easily calculated from the winrates.

    When searching from a terminal node (as in a game tree) created with
known techniques you can estimate the point value, the standard deviation, and
other ...

    The basic idea is to create a window [a,b] a little larger than the range
of possible outcomes, so that a < result < b. For each candidate move (you can
use expert knowledge to select them), you semi-randomly play enough games to be
statistically significant. In each playout, when you reach a final state you
count the game and get a result r. Pick a random number x between a and b:
if x > r it is a win, if x < r it is a loss. If the playouts are done with
confidence, a better engine's winrate will be close to best play (a random
engine gives the average). With a properly sized window the winrate always
stays in [0.2,0.8] or similar; the noise is smallest in the middle of that
range.
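A minimal sketch of this counting scheme, assuming a playout function that returns the final score r of one semi-random game (the function names and window values are illustrative, not part of the proposal):

```python
import random

def windowed_winrate(playout, a, b, n=1000):
    """Estimate the winrate of a position under the windowed scheme:
    each playout result r is compared against a random threshold x
    drawn uniformly from the window [a, b]; x > r counts as a win."""
    wins = 0
    for _ in range(n):
        r = playout()              # final score of one semi-random game
        x = random.uniform(a, b)
        if x > r:
            wins += 1
    return wins / n

def estimated_result(w, a, b):
    """Map a winrate back to an expected score: r = b - w * (b - a)."""
    return b - w * (b - a)
```

With a constant playout result r the winrate converges to (b-r)/(b-a), so estimated_result recovers r.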

    From the window and the winrate you can calculate the expected best-play
result from that node. From the standard deviation you can see whether the
results vary. From statistics about removable/unremovable strings you can see
the strength of strings. In that light you decide either to explore further or
to apply the result upstream.
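One way to sketch the explore-or-apply decision, assuming we already collected a list of playout results at the node (the threshold value is an arbitrary illustration, not from the proposal):

```python
import statistics

def node_summary(results):
    """Mean and sample standard deviation of playout results at a node."""
    return statistics.mean(results), statistics.stdev(results)

def should_expand(results, stdev_threshold=5.0):
    """If results vary a lot, explore the node further;
    otherwise apply the mean result upstream."""
    _, sd = node_summary(results)
    return sd > stdev_threshold
```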

    Creating the tree is a matter either of brute force or of expert
knowledge. Depth is limited by memory and speed. Pruning the tree and adding
nodes are well-known techniques from chess.

    How it works:

    In the endgame, with only one possible outcome r and window [a,b], you get
winrate w = (b-r)/(b-a). Which translates to:
        result = b - w * (b-a) = r, which is the correct value
    When the possible outcomes o lie in [A,B], the winrate falls between
        [ wl = (b-B)/(b-a), wh = (b-A)/(b-a) ]
A better engine plays closer to best play, so its winrate lands near wl (best
play gives the highest result B, hence the lowest winrate under this
convention). The winrate translates back to results
        r = b - w * (b-a) in [A,B], with w kept in [0.2,0.8] or similar
    With a small deviation the expected result r is close to best play.
Otherwise you can always expand the node and repeat the process.
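A quick numeric check of the two cases above (the window and outcome values are made up for illustration):

```python
def winrate(r, a, b):
    """Winrate of a fixed outcome r inside window [a, b]: w = (b-r)/(b-a)."""
    return (b - r) / (b - a)

def result_from_winrate(w, a, b):
    """Invert the mapping: r = b - w * (b - a)."""
    return b - w * (b - a)

# Single endgame outcome r = 4 in window [-10, 10]:
w = winrate(4, -10, 10)      # (10 - 4) / 20 = 0.3
assert result_from_winrate(w, -10, 10) == 4

# Outcomes in [A, B] = [-2, 6] map to winrates [wl, wh]:
wl = winrate(6, -10, 10)     # best play B = 6 gives the lowest winrate, 0.2
wh = winrate(-2, -10, 10)    # worst outcome A = -2 gives the highest, 0.6
assert wl < wh               # both stay inside [0.2, 0.8]
```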


    In the endgame you can expand the tree down to the final nodes, where the
winrate outcome is constant, and just apply min/max.
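A minimal min/max sketch over such a fully expanded tree, assuming leaves carry exact endgame values (the tree shape below is invented for illustration):

```python
def minimax(node, maximizing=True):
    """Exact min/max over an expanded endgame tree.
    A node is either a leaf value (a number) or a list of child nodes."""
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Root to move picks the branch whose worst reply is best:
# max(min(3, 5), min(2, 9)) = max(3, 2) = 3
value = minimax([[3, 5], [2, 9]])
```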

    In the beginning there is a lot of noise, but with a big window, expert
knowledge, and a good playout engine you can expect to play non-losing moves.

    In the middle game, with good fighting skills (a good engine, good
statistical evaluation), you can look deeper into the tree, and with good
pruning and a good move proposer you can define a style of play, looking for
big standard deviations and such.

    When you are winning by many points you can choose safe moves (based on
expert knowledge).

    When you are losing you can play the best endgame, or you can try a trick.

    You can start with a basic static evaluator, but as you build the tree and
your confidence grows you can update the board evaluation.


Leon Matoh.
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go