I copied this from another post someone made:
Here is a summary of how it works:
- Use probability of winning as score, not territory
- Use the average outcome as position value
- Select the move that maximizes v + sqrt((2*log(t))/(10*n))
v is the value of the move (average outcome, betw
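For readers who want to try the selection rule quoted above, here is a minimal Python sketch. It assumes that t is the total number of simulations at the current position (the usual UCB1 reading; the summary is cut off before it defines t) and that untried moves are explored first. The names ucb_score, select_move and stats are made up for illustration and are not taken from any program discussed in this thread.

```python
import math

def ucb_score(v, n, t, scale=10.0):
    """Score from the summary: v + sqrt((2*log(t)) / (scale*n)).

    v: average outcome of the move, n: simulations of this move,
    t: assumed here to be the total simulations at the parent position.
    """
    if n == 0:
        # Assumption: a move that was never tried gets explored first.
        return math.inf
    return v + math.sqrt((2.0 * math.log(t)) / (scale * n))

def select_move(stats):
    """stats maps move -> (wins, n). Returns the move with the highest score."""
    t = sum(n for _, n in stats.values())

    def score(move):
        wins, n = stats[move]
        v = wins / n if n else 0.0
        return ucb_score(v, n, t)

    return max(stats, key=score)

if __name__ == "__main__":
    # "b" has a lower average than "a" but far fewer simulations,
    # so the exploration term makes it the next move to try.
    print(select_move({"a": (6, 10), "b": (1, 2)}))
```

With the constant 10 in the denominator the exploration term is smaller than in plain UCB1, so the search leans more on the average outcome v.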
Hi, Don
Don Dailey wrote:
> v + sqrt((2*log(t))/(10*n)) ..
> .. n the number of simulations of this move
1. Does that mean the number in any branch?
Do you store an array with the number of times
each move is played, no matter in what branch?
2. Do you have some explanation for this expression
On Tue, 2007-01-16 at 20:23 +0100, Łukasz Lew wrote:
> There is a good argument why 100 is ok.
> When you have about 50 children, waiting 100 playouts before you
> start attaching them costs only 2 playouts per child, so
> I guess an even higher threshold should be OK.
And I haven't bot
There is a good argument why 100 is ok.
When you have about 50 children, waiting 100 playouts before you
start attaching them costs only 2 playouts per child, so
I guess an even higher threshold should be OK.
Lukasz
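The arithmetic in this argument can be checked with a short Python sketch. It is only one reading of "waiting 100 playouts before start of attaching them": a node counts playouts and creates its children once the count reaches a threshold. Node, expand_threshold, record_playout and per_child_loss are hypothetical names, not code from any program in this thread.

```python
def per_child_loss(threshold, num_children):
    """Playouts spent before expansion, spread evenly over the children."""
    return threshold / num_children

class Node:
    def __init__(self, expand_threshold=100):
        self.visits = 0
        self.children = None  # children are not attached yet
        self.expand_threshold = expand_threshold

    def record_playout(self, legal_moves):
        """Count one playout; attach children once the threshold is reached."""
        self.visits += 1
        if self.children is None and self.visits >= self.expand_threshold:
            self.children = {m: Node(self.expand_threshold) for m in legal_moves}

if __name__ == "__main__":
    print(per_child_loss(100, 50))  # the figure from the message: 2 per child
    node = Node()
    for _ in range(100):
        node.record_playout(range(50))
    print(len(node.children))
```

Until the threshold is reached the node holds only a counter, which is where the memory saving would come from in this reading.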
On 1/16/07, Don Dailey <[EMAIL PROTECTED]> wrote:
I have been doing a lot of experiments with the scalability and
memory usage of UCT. I'm using the exact scheme that was
described like this in a previous posting by someone:
Here is a summary of how it works:
- Use probability of winning as score, not territory
- Use the average outcome a