On Tue, Jul 23, 2013 at 8:50 AM, Hideki Kato <[email protected]> wrote:
> Thanks Lukasz,
>
> For introducing such an interesting paper.
>
> I have a question, though. The second algorithm in Figures 1, 2 and 3
> is termed UCB2 but is apparently called MOSS in Sections 5 (and 1). Do
> you know which algorithm is actually used in the numerical
> experiments?
>

I don't know, but you might mail the author.

>
> BTW, I guess for MC Go programs, possibly the least "risky" algorithm
> would be the best in practice, wouldn't it?
>

I won't speculate. Only experiments can tell.

>
> Hideki
>
> Łukasz Lew: <capxt8e4pmwmvkiituyhhpbvavgeupgqlnnodyjoamfgo0uo...@mail.gmail.com>:
> >KL-UCB algorithm
> >http://arxiv.org/pdf/1102.2490v4.pdf
> >
> >"Thus, KL-UCB is optimal for Bernoulli distributions and strictly
> >dominates α-UCB for any bounded reward distributions."
> >http://www.princeton.edu/~sbubeck/SurveyBCB12.pdf (page 18)
> --
> Hideki Kato <mailto:[email protected]>
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>

--
Łukasz
