In a paper published a while ago, Remi Coulom showed that the results of 64 MC trials (i.e., purely random playouts, no tree) were a useful predictor of move quality.
In particular, Remi counted how often each point ended up in the possession of the side to move. He then measured the probability of being the best move as a function of the frequency of possession. Remi found that a move was most likely to be best when the possession frequency was around 1/3, with decreasing probabilities elsewhere.

I have been trying to extract more information from each trial, since it seems to me that we are discarding useful information when we use only the result of a trial. So I tried to implement Remi's idea in a UCT program. This is very different from Remi's situation, in which the MC trials are done before the predictor is used in a tree search. Here, we will have a tree search going on concurrently with collecting data about point ownership.

My implementation used the first N trials of each UCT node to collect point ownership information. After the first M trials, it would use that information to bias the RAVE statistics. That is, in the selectBest routine I had an expression like this:

    for all moves {
        // Get the observed RAVE values:
        nRAVE = RAVETrials[move];
        wRAVE = RAVEWins[move];

        // Dynamically adjust according to point ownership:
        if (trialCount < M) {
            ; // Do nothing.
        } else if (Ownership[move] < 0.125) {
            nRAVE += ownershipTrialsParams[0];
            wRAVE += ownershipWinsParams[0];
        } else if (Ownership[move] < 0.250) {
            nRAVE += ownershipTrialsParams[1];
            wRAVE += ownershipWinsParams[1];
        } else if (Ownership[move] < 0.375) {
            nRAVE += ownershipTrialsParams[2];
            wRAVE += ownershipWinsParams[2];
        } else if (Ownership[move] < 0.500) {
            nRAVE += ownershipTrialsParams[3];
            wRAVE += ownershipWinsParams[3];
        } else if (Ownership[move] < 0.625) {
            nRAVE += ownershipTrialsParams[4];
            wRAVE += ownershipWinsParams[4];
        } else if (Ownership[move] < 0.750) {
            nRAVE += ownershipTrialsParams[5];
            wRAVE += ownershipWinsParams[5];
        } else if (Ownership[move] < 0.875) {
            nRAVE += ownershipTrialsParams[6];
            wRAVE += ownershipWinsParams[6];
        } else {
            nRAVE += ownershipTrialsParams[7];
            wRAVE += ownershipWinsParams[7];
        }

        // Now use nRAVE and wRAVE to order the moves for expansion....
    }

The bottom line is that the result was negative. In the test period, Pebbles won 69% (724 out of 1039) of CGOS games when not using this feature and at most 59% when using this feature. I tried a few parameter settings. Far from exhaustive, but mostly in line with Remi's paper. The best parameter settings showed 59% (110 out of 184, which is 2.4 standard deviations lower).

But maybe you can learn from my mistakes and figure out how to make it work. I have no idea why this implementation doesn't work. Maybe RAVE already does a good job of determining where to play, so ownership information is redundant. Maybe different parameter settings would work. Maybe it is just overhead (but I doubt that; the overhead wouldn't account for such a significant drop).

Anyway, if you try something like this, please let me know how it works out. Or if you have other ideas about how to extract more information from trials.

Best,
Brian
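
For anyone who wants to experiment with the idea: the if/else chain in selectBest amounts to a lookup into eight equal-width ownership buckets, so it can be written as a table lookup. What follows is only a sketch in C, assuming ownership is a frequency in [0, 1]. The names ownershipBucket, applyOwnershipBias, and RaveStats are hypothetical and are not taken from Pebbles; the parameter arrays stand in for ownershipTrialsParams and ownershipWinsParams, whose actual values are not given here.

```c
#include <assert.h>

#define OWNERSHIP_BUCKETS 8

/* Map an ownership frequency in [0, 1] to a bucket index 0..7.
   Bucket edges are multiples of 0.125, the same thresholds as the
   if/else chain; a frequency of exactly 1.0 is clamped into bucket 7. */
int ownershipBucket(double ownership)
{
    int bucket = (int)(ownership * OWNERSHIP_BUCKETS);
    if (bucket < 0) bucket = 0;
    if (bucket >= OWNERSHIP_BUCKETS) bucket = OWNERSHIP_BUCKETS - 1;
    return bucket;
}

/* Hypothetical container for one move's RAVE counts. */
typedef struct {
    double trials;  /* nRAVE */
    double wins;    /* wRAVE */
} RaveStats;

/* Return the RAVE counts with the ownership prior added.
   Before M trials have been played through the node, the observed
   counts are returned unchanged ("do nothing"). */
RaveStats applyOwnershipBias(RaveStats observed, int trialCount, int M,
                             double ownership,
                             const double trialsParams[OWNERSHIP_BUCKETS],
                             const double winsParams[OWNERSHIP_BUCKETS])
{
    assert(ownership >= 0.0 && ownership <= 1.0);
    if (trialCount >= M) {
        int b = ownershipBucket(ownership);
        observed.trials += trialsParams[b];
        observed.wins += winsParams[b];
    }
    return observed;
}
```

With this layout, an ownership frequency near 1/3 (where Remi found moves most likely to be best) falls in bucket 2, which covers [0.250, 0.375). Changing the number of buckets then only requires changing OWNERSHIP_BUCKETS and the two parameter arrays.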