You can use a windowed average where the window is a fixed fraction (say the last third) of the total times the move was made. I have often used an IIR filter and have never yet been able to prove that it actually helped. If I could write a decent Kalman filter, I would give that a try.
- Dave Hillis -----Original Message----- From: Peter Drake <[EMAIL PROTECTED]> To: computer-go <computer-go@computer-go.org> Sent: Thu, 26 Jun 2008 11:06 am Subject: Re: [computer-go] UCB/UCT and moving targets On Jun 26, 2008, at 1:35 AM, Magnus Persson wrote:? ? > Quoting Peter Drake <[EMAIL PROTECTED]>:? >? >> UCB (and hence UCT) would treat the following sequences of wins >> (1) and? >> losses (0) the same:? >>? >> 01010101010101010101010101010101? >> 00000000000000001111111111111111? >> 11111111111111110000000000000000? >? > I have two comments. Isn't the problem here that UCT will not > search the > second sequence at all because it is so bad initially?? ? Well, UCT never really discards a branch, just samples it less and less often. It would eventually go back to sequence 2 and discover that it had become good.? ? > So will there ever be a situation like this? UCT will more likely > continue > to sample the first and third sequenc settling for the > first if their are > no change again. And when it finally discovers > that sequene 2 is good it > will quickly choose it.? >? > The second thing is that in practice you will rarely get any clear > patterns > like this so how would you be able to detect any recency?? ? I'm thinking of a situation where move A looks good when we're assuming a random response from the opponent, but once the child node starts using UCB, it is discovered that the opponent has a very good response, so now runs through A start losing. Later, runs through A start winning again because we discover a good counterresponse...? ? > One simple trick is to always replay a move in the tree if it won > the last > time the position was visited, and only use UCT for > positions where the > last played moved lost. Should'nt this work > like a dream if sequence 2 > truly goes to 100% winrate directly > after the wirst win is scored?? ? Probably, but we rarely get such pure victory branches. I have a number of schemes in mind, though.? ? Has anyone tried this empirically?? ? Peter Drake? http://www.lclark.edu/~drake/? ? ? _______________________________________________? computer-go mailing list? [EMAIL PROTECTED] http://www.computer-go.org/mailman/listinfo/computer-go/?
_______________________________________________ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/