> I suspect that for very long time controls we would be better off
> turning UCT (with, say, 10K playouts) into an evaluation function and
> then using alpha-beta on top of it.
That is an interesting idea.  Usually, when we have to resort to things
like this, it means that we need a new way of thinking about the problem.
It will probably turn out at some point that there is some unifying
abstraction that we are missing.  When it's discovered we will slap our
heads and wonder why we didn't think of it sooner.

Perhaps we should have been using alpha-beta all along.  Alpha-beta with
hash tables starts to look more like a best-first search, because the hash
tables keep the tree in memory.  Old-fashioned alpha-beta (without
transposition tables) is rather different.

MTD(f) looks more like what we are doing.  Instead of iterating in the
traditional way, it may do a 7-ply search several times, bringing back
information each time.  So perhaps there is an interesting formulation
of tree search that will essentially be doing what you are suggesting.
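
To make the "same search several times, bringing back information each
time" idea concrete, here is a rough sketch of MTD(f) driving an alpha-beta
that remembers bounds between passes.  Everything in it is an assumption
for illustration: the toy tree, the names alpha_beta_with_memory and mtdf,
and the use of the path from the root as the table key (a real program
would key on a position hash so that transpositions share an entry).

```python
import math

# Hypothetical toy tree: inner nodes are lists, leaves are integer scores
# from the point of view of the maximizing player at the root.
TREE = [[3, 5], [6, [9, 1]], [2, 4]]

def alpha_beta_with_memory(node, alpha, beta, maximizing, table, path=()):
    """Fail-soft alpha-beta that keeps (lower, upper) bounds per node in table."""
    if isinstance(node, int):
        return node
    lower, upper = table.get(path, (-math.inf, math.inf))
    if lower >= beta:          # stored bound already causes a cutoff
        return lower
    if upper <= alpha:
        return upper
    alpha, beta = max(alpha, lower), min(beta, upper)
    if maximizing:
        best, a = -math.inf, alpha
        for i, child in enumerate(node):
            best = max(best, alpha_beta_with_memory(
                child, a, beta, False, table, path + (i,)))
            a = max(a, best)
            if best >= beta:
                break
    else:
        best, b = math.inf, beta
        for i, child in enumerate(node):
            best = min(best, alpha_beta_with_memory(
                child, alpha, b, True, table, path + (i,)))
            b = min(b, best)
            if best <= alpha:
                break
    if best <= alpha:          # failed low: best is an upper bound
        table[path] = (lower, best)
    elif best >= beta:         # failed high: best is a lower bound
        table[path] = (best, upper)
    else:                      # inside the window: exact value
        table[path] = (best, best)
    return best

def mtdf(root, first_guess, table):
    """Repeat zero-window searches until the bounds on the root value meet."""
    g, lower, upper, passes = first_guess, -math.inf, math.inf, 0
    while lower < upper:
        beta = g + 1 if g == lower else g
        g = alpha_beta_with_memory(root, beta - 1, beta, True, table)
        if g < beta:
            upper = g
        else:
            lower = g
        passes += 1
    return g, passes

value, passes = mtdf(TREE, first_guess=0, table={})
print(value, passes)
```

Each pass covers the same tree, but the table lets it skip whatever
earlier passes already settled, which is the sense in which the tree lives
in memory.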

The whole business is rather ugly in my opinion, despite the fact that it
works so well.  All we are doing is controlling the play-outs.  We
are starting at the current position and playing to the end of the game
a bunch of times.  The only reason we do the "play-outs" is due to
memory (and perhaps performance) constraints.  So we have this
artificial wall between play-outs and tree.  In fact, most of the
improvements in MC have been attempts to make the play-outs act more
like the tree!
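
For what it's worth, the suggestion quoted at the top could be sketched
roughly like this: a depth-limited alpha-beta whose leaves are scored by
random play-outs.  This is only a toy under stated assumptions: the game is
Nim (take 1-3 stones, taking the last stone wins) rather than Go, the leaf
evaluation is plain random play-outs rather than UCT, and the names
playout_eval, alpha_beta and best_move and the default of 100 play-outs
are made up for the example.

```python
import random

MOVES = (1, 2, 3)  # toy Nim: take 1-3 stones; whoever takes the last stone wins

def playout_eval(stones, playouts, rng):
    """Estimate the value for the side to move, in [-1, 1], by random play-outs."""
    total = 0
    for _ in range(playouts):
        remaining, sign = stones, 1   # sign is +1 while the original side is to move
        while remaining > 0:
            remaining -= rng.choice([m for m in MOVES if m <= remaining])
            sign = -sign
        total += -sign                # the player who just moved took the last stone
    return total / playouts

def alpha_beta(stones, depth, alpha, beta, playouts, rng):
    """Negamax alpha-beta; nodes at the depth limit are scored by play-outs."""
    if stones == 0:
        return -1.0                   # opponent took the last stone: side to move lost
    if depth == 0:
        return playout_eval(stones, playouts, rng)
    best = -float("inf")
    for m in MOVES:
        if m > stones:
            break
        score = -alpha_beta(stones - m, depth - 1, -beta, -alpha, playouts, rng)
        best = max(best, score)
        alpha = max(alpha, best)
        if alpha >= beta:
            break
    return best

def best_move(stones, depth, playouts=100, seed=0):
    """Pick the root move with the best alpha-beta score."""
    rng = random.Random(seed)
    choice, alpha = None, -float("inf")
    for m in MOVES:
        if m > stones:
            break
        score = -alpha_beta(stones - m, depth - 1, -float("inf"), -alpha,
                            playouts, rng)
        if score > alpha:             # strict: a pruned child only gives a bound
            choice, alpha = m, score
    return choice

# Shallow search from 21 stones: the leaves are play-out estimates, not exact.
print(best_move(21, depth=3, playouts=200))
```

When the depth is large enough to reach the end of the game, the play-outs
are never called and the search is exact; with a shallow depth, the
play-outs stand in for the evaluation function.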

- Don


>
> Álvaro.
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/