On Thu, Jun 26, 2008 at 11:20 AM, <[EMAIL PROTECTED]> wrote: > You can use a windowed average where the window is a fixed fraction (say > the last third) of the total times the move was made. I have often used an > IIR filter and have never yet been able to prove that it actually helped. If > I could write a decent Kalman filter, I would give that a try. > > - Dave Hillis
Ooh, interesting idea! I couldn't keep myself from working it out for the simplest of case of considering a single move in isolation with well-behaved noise. At the end I show that this approach boils down to simple averaging of all samples when the drift is assumed to be zero. I also give equations for updates that could be dropped into existing code. Assuming a model of: sample(n+1) = win_rate(n) + sample_noise win_rate(n+1) = win_rate(n) + drift sample = result of one monte carlo sim, 0 or 1 For simplicity, assume sample_noise and drift are stationary random processes Most literature on UCT I see assumes variance(sample_noise) = 0.25 For the equations below, I'm going to use the following short hand: n = sample number s = sample(n+1) q = variance(sample_noise) (q for quantization) w = (estimated) win rate d = variance(drift) Other variables internal to the Kalman Filter: P = estimation variance K = gain That yields the following updates following each sample P = P + d K = P / (P + q) w = w + (s-w)*K P = (1-K)*P If we assume d=0, and assume P has the form q/n before the sample, this becomes: w = w + (s-w)/n = (nw+s)/(n+1) = average of all samples P = q/(n+1)
_______________________________________________ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/