The "Last Good Reply" approach is similar (although not identical) to this. We (Orego) got an improvement from it. Some others have, some haven't.
https://webdisk.lclark.edu/drake/publications/baier-drake-ieee-2010.pdf

On Fri, Mar 29, 2013 at 10:40 AM, Alexander Kozlovsky <[email protected]> wrote:

> Hi!
>
> I know that RAVE data typically used during tree traversing.
> But is it possible to use it during random playout, in order to
> increase playout quality?
>
> On the first sight it seems as dangerous idea, because
> RAVE statistics are incrementally gathered from the same
> playouts, and this can lead to problematic positive feedback
> loop, as in saying "The rich get richer and the poor get poorer".
> That is, random initial fluctuation can get stronger with time
> and statistics become skewed, because good moves which
> receive unfortunate initial RAVE data will be ignored
> in future random playout.
>
> But what if we see move selection during random playout
> as a typical multiarm bandit problem? Then the algorithm
> of next playout move selection can be the next:
>
> 1) select several (say, 4) valid candidate moves for the playout.
>
> 2) choose the next move using multiarm bandit formula.
>    We can do this, because for each candidate move we
>    know (a) number of rave wins for this move, (b) number
>    of playouts with this move, (c) total number of playouts
>    (all of this numbers are tied to current UCT node)
>
> I think, this should add exploration element to next move
> selection and prevent skewing of RAVE statistics.
> I suspect using RAVE data can improve playout strength
> significantly.
>
> Has anybody trying something like this, or it is just crazy idea?
>
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>

--
Peter Drake
https://sites.google.com/a/lclark.edu/drake/
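
For readers who want to experiment with the two-step selection proposed in the quoted message, here is a minimal sketch in Python. It is not Orego's code and not the Last Good Reply policy. The quoted message does not say which bandit formula to use, so UCB1 is assumed here. The names `ucb1_score`, `choose_playout_move` and `rave_stats`, the exploration constant `c`, and the rule of trying unvisited moves first are all hypothetical choices made for illustration.

```python
import math
import random


def ucb1_score(rave_wins, move_playouts, total_playouts, c=1.0):
    """UCB1 value of one candidate move (assumed formula, see lead-in)."""
    if move_playouts == 0:
        # Assumption: a move with no RAVE data yet is always tried first.
        return math.inf
    mean = rave_wins / move_playouts
    # max(..., 1) guards against log(0) when the node has no playouts yet.
    bonus = c * math.sqrt(math.log(max(total_playouts, 1)) / move_playouts)
    return mean + bonus


def choose_playout_move(legal_moves, rave_stats, total_playouts,
                        k=4, c=1.0, rng=random):
    """Pick the next playout move.

    legal_moves    -- list of valid moves in the current playout position
    rave_stats     -- dict: move -> (rave_wins, move_playouts), taken from
                      the UCT node the playout started from
    total_playouts -- total number of playouts through that node
    """
    # Step 1: sample up to k valid candidate moves uniformly at random.
    candidates = rng.sample(legal_moves, min(k, len(legal_moves)))

    # Step 2: choose the candidate with the highest bandit score.
    def score(move):
        wins, visits = rave_stats.get(move, (0, 0))
        return ucb1_score(wins, visits, total_playouts, c)

    return max(candidates, key=score)
```

Calling `choose_playout_move(moves, stats, n)` once per playout step gives the behaviour described in the quote. Setting `c=0` removes the exploration term, which is the purely greedy case the quoted message worries about.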
