On Fri, 2007-04-06 at 12:43 -0400, [EMAIL PROTECTED] wrote:
> Alpha/Beta cutoffs only make sense when calling the evaluation
> function twice on the exact same position can be guaranteed to provide
> the exact same value. This is obviously not the case for MC
> evaluation, hence the success of UCT.

I don't know if any of this is true.  You can apply alpha beta
cutoffs whether the evaluation function is deterministic or not.
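
A minimal sketch of that point, not anyone's actual engine: the
cutoff test in alpha beta only compares numbers, so it runs the same
way whether those numbers are exact or noisy. The tree, the leaf
values and the helper names (alphabeta, noisy_eval) are assumptions
made up for illustration.

```python
import random

def alphabeta(node, alpha, beta, maximizing, evaluate):
    """Plain alpha-beta over a nested-list tree; leaves are scored by evaluate()."""
    if not isinstance(node, list):       # leaf: ask the evaluator
        return evaluate(node)
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, evaluate))
            alpha = max(alpha, best)
            if alpha >= beta:            # cutoff: remaining children are skipped
                break
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, evaluate))
        beta = min(beta, best)
        if alpha >= beta:                # cutoff
            break
    return best

def noisy_eval(rng, noise=0.05):
    """Hypothetical non-deterministic evaluator: true leaf value plus uniform noise."""
    def evaluate(leaf):
        return leaf + rng.uniform(-noise, noise)
    return evaluate

# Toy tree: the root maximizes, its three children minimize.
tree = [[0.3, 0.9], [0.1, 0.8], [0.6, 0.7]]

rng = random.Random(0)
exact = alphabeta(tree, float("-inf"), float("inf"), True, lambda leaf: leaf)
noisy = alphabeta(tree, float("-inf"), float("inf"), True, noisy_eval(rng))
print(exact, noisy)
```

With a noisy evaluator the returned value, and in general the set of
branches that get cut, can change from run to run, but nothing in the
algorithm breaks.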

UCT calls the "random play-out" only once on any given position
and there is no reason in principle that it couldn't be a 
deterministic evaluation function instead.  I have a theory
that the search is more robust with some randomness, since a
deterministic evaluation function repeats its misconceptions
every time it is called.

It's all in how you choose to view things - but I see the
two search techniques as being more similar than different.   
Of course you can choose to emphasize the differences and 
view them as more different if you want to.

Here is how they are similar:

  1.  Both use global search.
  2.  Both use full width search.
  3.  Both use an evaluation function.


UCT uses a "different" search than alpha beta, but both are
essentially mini-max search.

Someone might argue that UCT is NOT full width, but it is.  
Chrilly used the terminology "soft-pruning" which means a
pruning decision is not taken "forever" (to use his definition.)
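
As a rough illustration of "soft-pruning", here is a sketch using
UCB1, the selection rule usually associated with UCT. The two child
values (0.7 and 0.3) and the names ucb1 and simulate are assumptions
for the example. The weaker child is visited less and less often, but
its visit count keeps growing, so it is never cut off for good.

```python
import math

def ucb1(mean, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average result plus an exploration bonus."""
    if visits == 0:
        return float("inf")              # unvisited children are tried first
    return mean + c * math.sqrt(math.log(parent_visits) / visits)

def simulate(values, iterations):
    """Repeatedly pick the child with the best UCB1 score.

    values[i] is the (fixed, made-up) result child i returns on every visit.
    Returns the visit count of each child.
    """
    visits = [0] * len(values)
    totals = [0.0] * len(values)
    for n in range(1, iterations + 1):
        scores = [
            ucb1(totals[i] / visits[i] if visits[i] else 0.0, visits[i], n)
            for i in range(len(values))
        ]
        pick = scores.index(max(scores))
        visits[pick] += 1
        totals[pick] += values[pick]
    return visits

short_run = simulate([0.7, 0.3], 1000)
long_run = simulate([0.7, 0.3], 10000)
print(short_run, long_run)
```

Comparing the two runs shows the second child still collecting visits
in the longer run, only at a much slower rate than the first.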

To me, a truly selective program cuts off a line forever.  Otherwise
it is brute force (or full width).  Of course this is my definition,
and possibly not the accepted definition.

When Chrilly says alpha beta he doesn't mean just classical
alpha beta - modern alpha beta is full of speculative cutoffs,
but like UCT they "are not taken forever."

It's not relevant whether Chrilly's program happens to be weaker
or stronger than Crazy Stone at the moment - because there is a lot
of black art in making everything work, whether it's alpha beta or
UCT.   Note that the UCT programs are not all the same - they vary
widely in strength.

I have this idea that perhaps a good evaluation function could
replace the play-out portion of the UCT programs.  The evaluation
function would return a value between 0 and 1 and would be an
estimate of the odds of winning.

It all comes down to whether the inherent randomness is critical
to the success of UCT or not.   If it isn't,  then a play-out is
nothing more than just an evaluation function.
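
One way to picture this is a UCT loop where the leaf evaluator is a
plug-in. The sketch below is only that, a sketch under stated
assumptions: the game is a toy take-away game (take 1 to 3 stones,
whoever takes the last stone wins), and Node, moves, playout_eval,
static_eval and uct_search are hypothetical names, not taken from any
real program. Both evaluators return a number between 0 and 1 for the
side to move, and the search itself does not care which one it gets.

```python
import math
import random

class Node:
    def __init__(self, stones):
        self.stones = stones    # stones left; the side to move takes 1, 2 or 3
        self.children = {}      # move -> Node
        self.visits = 0
        self.value = 0.0        # summed results for the player who moved into this node

def moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def playout_eval(stones, rng):
    """Random play-out: 1.0 if the side to move wins this play-out, else 0.0."""
    turn = 0                    # 0 = side to move at the evaluated position
    while stones > 0:
        stones -= rng.choice(moves(stones))
        if stones == 0:
            return 1.0 if turn == 0 else 0.0
        turn ^= 1
    return 0.0                  # no stones left: the side to move has already lost

def static_eval(stones):
    """Deterministic stand-in: estimated winning chance for the side to move."""
    return 0.0 if stones % 4 == 0 else 1.0

def uct_search(root_stones, evaluate, iterations=5000, c=1.4):
    root = Node(root_stones)
    for _ in range(iterations):
        node, path = root, [root]
        # Selection and expansion: descend until a new child is added or the game ends.
        while node.stones > 0:
            untried = [m for m in moves(node.stones) if m not in node.children]
            if untried:
                child = Node(node.stones - untried[0])
                node.children[untried[0]] = child
                node = child
                path.append(node)
                break
            parent = node
            node = max(
                parent.children.values(),
                key=lambda ch: ch.value / ch.visits
                + c * math.sqrt(math.log(parent.visits) / ch.visits),
            )
            path.append(node)
        # Evaluation: called once on the leaf, from the view of the side to move there.
        result = 1.0 - evaluate(node.stones)   # flip to the player who moved into the leaf
        # Backup: alternate the viewpoint at every level on the way to the root.
        for n in reversed(path):
            n.visits += 1
            n.value += result
            result = 1.0 - result
    # Choose the most visited move at the root.
    return max(root.children, key=lambda m: root.children[m].visits)

rng = random.Random(0)
move_playout = uct_search(5, lambda s: playout_eval(s, rng))
move_static = uct_search(5, static_eval)
print(move_playout, move_static)
```

Swapping the evaluator is a one-argument change here. Whether a
deterministic evaluator works as well in a real Go program is exactly
the open question raised above; this toy example cannot settle it.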

- Don

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
