The idea sounds pretty much like PoolRave, proposed in "Biasing Monte-Carlo Simulations through RAVE Values" by Rimmel et al.

-- 
Francois van Niekerk
Email: [email protected] | Twitter: @francoisvn
Cell: +2784 0350 214 | Website: http://leafcloud.com

On 29 March 2013 19:46, Peter Drake <[email protected]> wrote:
> The "Last Good Reply" approach is similar (although not identical) to this.
> We (Orego) got an improvement from it. Some others have, some haven't.
>
> https://webdisk.lclark.edu/drake/publications/baier-drake-ieee-2010.pdf
>
>
> On Fri, Mar 29, 2013 at 10:40 AM, Alexander Kozlovsky
> <[email protected]> wrote:
>>
>> Hi!
>>
>> I know that RAVE data is typically used during tree traversal.
>> But is it possible to use it during random playouts too, in order to
>> increase playout quality?
>>
>> At first sight it seems like a dangerous idea, because
>> RAVE statistics are incrementally gathered from the same
>> playouts, and this can lead to a problematic positive feedback
>> loop, as in the saying "The rich get richer and the poor get poorer".
>> That is, a random initial fluctuation can get stronger with time
>> and the statistics become skewed, because good moves which
>> receive unfortunate initial RAVE data will be ignored
>> in future random playouts.
>>
>> But what if we treat move selection during a random playout
>> as a typical multi-armed bandit problem? Then the algorithm
>> for selecting the next playout move could be as follows:
>>
>> 1) Select several (say, 4) valid candidate moves for the playout.
>>
>> 2) Choose the next move using a multi-armed bandit formula.
>> We can do this because for each candidate move we
>> know (a) the number of RAVE wins for this move, (b) the number
>> of playouts with this move, and (c) the total number of playouts
>> (all of these numbers are tied to the current UCT node).
>>
>> I think this should add an exploration element to next-move
>> selection and prevent skewing of the RAVE statistics.
>> I suspect using RAVE data can improve playout strength
>> significantly.
>>
>> Has anybody tried something like this, or is it just a crazy idea?
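
For concreteness, here is a minimal sketch of the two steps quoted above. The post does not name a particular bandit formula, so this assumes a UCB1-style rule; the function name, the dict-based layout of the per-node RAVE counts, and the `num_candidates` and `exploration` parameters are all hypothetical and not taken from Orego, PoolRave, or any other engine.

```python
import math
import random


def pick_playout_move(legal_moves, rave_wins, rave_visits, node_playouts,
                      num_candidates=4, exploration=1.0, rng=random):
    """Pick a playout move with a UCB1-style rule over a few sampled candidates.

    rave_wins / rave_visits: dicts mapping move -> RAVE counts stored at the
    current UCT node (assumed layout). node_playouts: total playouts through
    that node.
    """
    moves = list(legal_moves)
    if not moves:
        return None

    # Step 1: sample a handful of valid candidate moves.
    candidates = rng.sample(moves, min(num_candidates, len(moves)))

    # Step 2: score each candidate as mean RAVE win rate plus exploration bonus.
    log_total = math.log(max(node_playouts, 1))

    def ucb(move):
        visits = rave_visits.get(move, 0)
        if visits == 0:
            return math.inf  # moves with no RAVE data yet are tried first
        mean = rave_wins.get(move, 0) / visits
        return mean + exploration * math.sqrt(log_total / visits)

    return max(candidates, key=ucb)
```

The exploration bonus is what is meant to counter the "rich get richer" loop: a move with little or unlucky RAVE data still gets picked now and then. Whether that actually helps playing strength would have to be measured.
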
>
> --
> Peter Drake
> https://sites.google.com/a/lclark.edu/drake/

_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
