Re: [Computer-go] FYI KL-UCB

Hideki Kato Tue, 23 Jul 2013 01:39:51 -0700

Hi,

That's an error :).  Olivier Cappe (one of the authors) replied very 
quickly and gave me another link to the correct version, 
http://jmlr.org/proceedings/papers/v19/garivier11a/garivier11a.pdf .  
Also note that there is a typo (misplaced inf sign) in Eq. (1) and 
(2).
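
For anyone who wants to experiment before reading the paper, here is a minimal sketch of the KL-UCB index for Bernoulli (win/loss) rewards as I understand it: the index of an arm is the largest q >= mean such that pulls * kl(mean, q) <= log(t) + c * log(log(t)). The function names, the bisection tolerance and the default c = 0 are my own assumptions for illustration, not code from the authors.

```python
import math


def bernoulli_kl(p, q, eps=1e-15):
    """KL divergence kl(p, q) between Bernoulli(p) and Bernoulli(q)."""
    # Clamp away from 0 and 1 so the logarithms stay finite.
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def kl_ucb_index(mean, pulls, t, c=0.0, tol=1e-6):
    """Upper confidence bound for an arm with empirical mean `mean`.

    Returns (approximately) the largest q in [mean, 1] such that
    pulls * kl(mean, q) <= log(t) + c * log(log(t)).
    `c`, `tol` and the handling of unvisited arms are illustrative choices.
    """
    if pulls == 0:
        return 1.0  # assumption: unvisited arms get the maximal index
    budget = math.log(t)
    if c > 0 and t > 1:
        budget += c * math.log(math.log(t))
    # kl(mean, q) is increasing in q for q >= mean, so bisection works.
    lo, hi = mean, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if pulls * bernoulli_kl(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo
```

By Pinsker's inequality, kl(p, q) >= 2 (p - q)^2, so with c = 0 this index never exceeds mean + sqrt(log(t) / (2 * pulls)), which gives some intuition for why it can be tighter than a Hoeffding-style bound.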


Hideki

Łukasz Lew: 
<CAPXT8E4ODjD07Qwci+eOuZ-Eozthjpcf2XM=wgMPF-a=re0...@mail.gmail.com>:
>On Tue, Jul 23, 2013 at 8:50 AM, Hideki Kato <[email protected]> wrote:
>
>> Thanks, Lukasz, for introducing such an interesting paper.
>>
>> I have a question, though.  The second algorithm in Figures 1, 2 and 3
>> is labeled UCB2 but is apparently called MOSS in Sections 5 and 1.  Do
>> you know which algorithm was actually used in the numerical
>> experiments?
>>
>
>I don't know, but you might mail the author.
>
>
>>
>> BTW, I guess that for MC Go programs the least "risky" algorithm would
>> be the best in practice, wouldn't it?
>>
>
>I won't speculate. Only experiments can tell.
>
>
>>
>> Hideki
>>
>> Łukasz Lew: <
>> capxt8e4pmwmvkiituyhhpbvavgeupgqlnnodyjoamfgo0uo...@mail.gmail.com>:
>> >KL-UCB algorithm
>> >http://arxiv.org/pdf/1102.2490v4.pdf
>> >
>> >"Thus, KL-UCB is optimal for Bernoulli distributions and strictly
>> >dominates α-UCB for any bounded reward distributions."
>> >http://www.princeton.edu/~sbubeck/SurveyBCB12.pdf (page 18)
>> --
>> Hideki Kato <mailto:[email protected]>
>> _______________________________________________
>> Computer-go mailing list
>> [email protected]
>> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>>
-- 
Hideki Kato <mailto:[email protected]>
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
