To bias RAVE by distance, you need to store credit in floating point.
RAVEDenominator[move] += weight;
double biased_credit;
if (credit == 1)
biased_credit = 1 - d * w; // ranges over [0.5, 1]
else if (credit == 0)
biased_credit = 0 + d * w; // ranges over [0, 0.5]
RAVENumerator[move] += biased_credit * weight;
Note that you have to prevent “overflow”, i.e. d * w must not exceed 0.5. For
example, if credit is 1, then biased_credit should not be smaller than 0.5;
likewise, if credit is 0, it should not be larger than 0.5.
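For readers who want something compilable, the update above might be sketched as follows. This is only an illustration, not Erica's actual code: the RaveStats struct and the function names BiasedCredit and UpdateRave are hypothetical, d and w are taken as plain doubles whose product is the bias, and the clamp to [0, 0.5] is one assumed way of preventing the overflow. It uses std::clamp, so it assumes C++17.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical per-move RAVE accumulators (names are illustrative only).
struct RaveStats {
    std::vector<double> numerator;
    std::vector<double> denominator;
    explicit RaveStats(std::size_t moves)
        : numerator(moves, 0.0), denominator(moves, 0.0) {}
};

// Map a simulation outcome (1 = win, 0 = loss) to a distance-biased credit.
// Assumption: d * w is clamped to [0, 0.5], so a win never scores below 0.5
// and a loss never scores above 0.5.
double BiasedCredit(int outcome, double d, double w) {
    const double bias = std::clamp(d * w, 0.0, 0.5);
    return outcome == 1 ? 1.0 - bias : 0.0 + bias;
}

// Weighted-average update: the denominator accumulates the weight,
// the numerator accumulates the biased credit times the weight.
void UpdateRave(RaveStats& stats, std::size_t move, int outcome,
                double d, double w, double weight) {
    stats.denominator[move] += weight;
    stats.numerator[move] += BiasedCredit(outcome, d, w) * weight;
}
```

The RAVE value of a move is then numerator[move] / denominator[move], provided the denominator is nonzero.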
Aja
From: Brian Sheppard
Sent: Friday, August 12, 2011 4:37 PM
To: 'Aja' ; [email protected]
Subject: RE: [Computer-go] Aja's PhD thesis: Question about RAVE+Distance
heuristic
I want to be sure that I understand how distance affects Erica's RAVE heuristic.
The generic code for updating a weighted average would look something like this:
RAVEDenominator[move] += weight;
RAVENumerator[move] += credit * weight;
In standard RAVE, credit is a Win or Loss (1 or 0) and weight = 1.
In your thesis, the description says "If the simulation outcome is 1, then the
updated outcome is 1-d*w; if the simulation outcome is 0 then the updated
outcome is 0+d*w"
How would I understand this in terms of credit and weight in the weighted
average code above?
Thanks,
Brian
--------------------------------------------------------------------------------
From: [email protected] [mailto:[email protected]]
On Behalf Of Aja
Sent: Wednesday, July 27, 2011 8:27 AM
To: [email protected]
Subject: [Computer-go] Aja's PhD thesis
Dear all,
If you are interested, my PhD thesis, entitled "New Heuristics for Monte Carlo
Tree Search Applied to the Game of Go", can be found in the following link.
http://www.grappa.univ-lille3.fr/~coulom/Aja_PhD_Thesis.pdf
Due to some personal reasons, I am sorry to announce that the sharing of
Erica's binary is indefinitely postponed.
Best regards,
Aja