Re: [computer-go] Hybrid theory

Jason House Fri, 01 Feb 2008 13:28:35 -0800

On Feb 1, 2008 4:18 PM, terry mcintyre <[EMAIL PROTECTED]> wrote:

> UCT is based on a theory of a multi-armed bandit, with uncertain knowledge
> about which "arms" would be most productive. Is it possible to graft various
> sources of knowledge into a sort of meta-bandit algorithm?



IIRC, MoGo already does some of this.  They use RAVE (worth up to 3000
sims?) and they also use some TD to get an initial heuristic worth about 50
sims.
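
To make "worth about 50 sims" concrete, here is a minimal sketch of one way
a heuristic can be folded into an arm's statistics: seed the arm as if it had
already been played some number of times with a given average result. This
is an illustration only, not MoGo's actual code; the names Arm, prior_value,
prior_weight and ucb1 are hypothetical, and the exploration constant is an
arbitrary choice.

```python
import math


class Arm:
    """Win/visit statistics for one move (one bandit arm) in a UCT node."""

    def __init__(self, prior_value=0.5, prior_weight=50):
        # Assumption: the heuristic is injected as "virtual" playouts, i.e.
        # the arm starts as if prior_weight sims had already been run with
        # an average result of prior_value.
        self.visits = prior_weight
        self.wins = prior_value * prior_weight

    def update(self, result):
        """Record one real playout result (1.0 = win, 0.0 = loss)."""
        self.visits += 1
        self.wins += result

    def mean(self):
        # An arm with no real or virtual visits has no estimate yet;
        # fall back to an even score.
        return self.wins / self.visits if self.visits else 0.5


def ucb1(arm, total_visits, c=1.4):
    """UCB1 score: mean value plus an exploration bonus."""
    if arm.visits == 0:
        return math.inf  # unvisited arms are tried first
    return arm.mean() + c * math.sqrt(math.log(total_visits) / arm.visits)
```

With this setup a heuristic "worth 50 sims" dominates early on and is
gradually outvoted as real playouts accumulate: after 50 real results the
prior and the evidence carry equal weight.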




> As to fusing top-level knowledge with random playouts, I love the idea,
> and am trying to imagine  how to implement it. One idea is this: certain
> moves in certain situations might trigger a forced reply. Playing one of a
> pair of miai, for instance, should result in a high probability of the
> matching move played as a response - especially if failure to do so would
> result in killing a group.
> Analysis of a group could conclude that life depends on certain external
> liberties, or the ability to play one of two alternate moves, yadda yadda;
> those threats then trigger appropriate automatic responses with high
> probability.


Remember that the arms at leaves correspond to Monte Carlo playouts.
There's no real restriction on forcing other knowledge into those playouts.
I too have imagined miai pairs.  MoGo plays out local exchanges based on
patterns before going to a more random sim.  CrazyStone uses move
probability heuristics to select moves (some responses are high probability
and others are much less).
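
As a rough sketch of what that could look like inside a playout, the function
below answers a move that has a known miai partner with high probability and
otherwise samples a move in proportion to heuristic weights. Everything here
is hypothetical (the function name, the miai table, the weights table, and
the 0.9 default); it is not how MoGo or CrazyStone are implemented, and a
real playout would also need legality and eye-filling checks.

```python
import random


def choose_playout_move(legal_moves, last_move, miai, weights, rng,
                        reply_prob=0.9):
    """Pick the next playout move, biased by simple Go knowledge.

    legal_moves: list of candidate moves (any hashable, e.g. "D4")
    last_move:   the opponent's previous move, or None
    miai:        dict mapping a move to its paired reply (assumed to come
                 from some group analysis that is not shown here)
    weights:     dict of per-move heuristic weights; missing moves get 1.0
    rng:         a random.Random instance
    """
    # Forced reply: if the opponent just took one half of a miai pair,
    # take the other half with high probability.
    reply = miai.get(last_move)
    if reply in legal_moves and rng.random() < reply_prob:
        return reply

    # Otherwise sample in proportion to the heuristic weights, so some
    # responses are likely and others are rare.
    move_weights = [weights.get(move, 1.0) for move in legal_moves]
    return rng.choices(legal_moves, weights=move_weights, k=1)[0]
```

For example, with miai = {"C3": "C5", "C5": "C3"} and last_move == "C3",
the playout answers at "C5" about nine times out of ten as long as "C5" is
still legal, and falls back to the weighted choice the rest of the time.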

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
