To bias RAVE by distance, you need to store credit in floating point. 

    RAVEDenominator[move] += weight;

    double biased_credit;

    if (credit == 1)
        biased_credit = 1 - d * w; // ranges over [0.5, 1]
    else if (credit == 0)
        biased_credit = 0 + d * w; // ranges over [0, 0.5]

    RAVENumerator[move] += biased_credit * weight;

Note that you have to prevent “overflow”. For example, if credit is 1, then 
biased_credit should not be smaller than 0.5.
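
A minimal sketch of how the update and the "overflow" guard might fit together. It is not Erica's actual code. The array size NUM_MOVES, the function names, and the choice to clamp the shift d * w to [0, 0.5] are assumptions made for illustration; d and w are taken to be the distance and the per-distance weight from the thesis.

```c
#include <assert.h>

#define NUM_MOVES 361  /* hypothetical: one slot per point on a 19x19 board */

static double RAVENumerator[NUM_MOVES];
static double RAVEDenominator[NUM_MOVES];

/* Pull a 0/1 simulation outcome toward 0.5 by d * w.
 * The shift is clamped to [0, 0.5] so a win never scores below 0.5
 * and a loss never scores above 0.5 (the "overflow" guard). */
static double biased_credit(int credit, double d, double w)
{
    assert(credit == 0 || credit == 1);

    double shift = d * w;
    if (shift < 0.0)
        shift = 0.0;
    if (shift > 0.5)
        shift = 0.5;

    return credit == 1 ? 1.0 - shift : 0.0 + shift;
}

/* Weighted-average RAVE update with the distance-biased credit. */
static void update_rave(int move, int credit, double d, double w, double weight)
{
    RAVEDenominator[move] += weight;
    RAVENumerator[move] += biased_credit(credit, d, w) * weight;
}
```

With d = 2 and w = 0.125, a win is credited 0.75 and a loss 0.25. Once d * w reaches 0.5 or more, both outcomes are credited 0.5, so a very distant move carries no information about the result.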

Aja


From: Brian Sheppard 
Sent: Friday, August 12, 2011 4:37 PM
To: 'Aja' ; [email protected] 
Subject: RE: [Computer-go] Aja's PhD thesis: Question about RAVE+Distance 
heuristic

I want to be sure that I understand how distance affects Erica's RAVE heuristic.

The generic code for updating a weighted average would look something like this:

     RAVEDenominator[move] += weight;
     RAVENumerator[move] += credit * weight;

In standard RAVE, credit is a Win or Loss (1 or 0) and weight = 1.

In your thesis, the description says "If the simulation outcome is 1, then the 
updated outcome is 1-d*w; if the simulation outcome is 0 then the updated 
outcome is 0+d*w"

How would I understand this in terms of credit and weight in the weighted 
average code above?

Thanks,
Brian



--------------------------------------------------------------------------------
From: [email protected] [mailto:[email protected]] 
On Behalf Of Aja
Sent: Wednesday, July 27, 2011 8:27 AM
To: [email protected]
Subject: [Computer-go] Aja's PhD thesis


Dear all,

If you are interested, my PhD thesis, entitled "New Heuristics for Monte Carlo 
Tree Search Applied to the Game of Go", can be found in the following link.

http://www.grappa.univ-lille3.fr/~coulom/Aja_PhD_Thesis.pdf

Due to some personal reasons, I am sorry to announce that the sharing of 
Erica's binary is indefinitely postponed.

Best regards,
Aja
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
