[Computer-go] Multi-armed bandit problem theory

Petr Baudis Wed, 26 Oct 2011 02:24:09 -0700

  Hi!

  Does anyone have a good source for understanding the theory behind
the multi-armed bandit problem, i.e. the proof behind the exponential
arm play bounds etc.? My only source so far is Auer et al., 2002:
Finite-time Analysis of the Multiarmed Bandit Problem - but I suspect
its description of the original bound is incomplete and/or simplified
with some implicit assumptions (e.g. in the case of the optimal arm, the
bound would involve division by zero?).
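
  For reference, a sketch of the bounds in question as they are usually
stated. The notation is assumed here, not taken from this message:
T_j(n) is the number of plays of arm j during the first n plays, p_j is
the reward distribution of arm j, p^* that of the optimal arm, D is the
Kullback-Leibler divergence, and \Delta_j = \mu^* - \mu_j is the gap in
expected reward. Both statements are normally restricted to suboptimal
arms, for which D > 0 and \Delta_j > 0, so under that reading the
division by zero would not arise:

```latex
% Lai & Robbins-style asymptotic bound, for suboptimal arms j only
\mathbb{E}[T_j(n)] \le \left( \frac{1}{D(p_j \,\|\, p^*)} + o(1) \right) \ln n,
\qquad o(1) \to 0 \text{ as } n \to \infty

% UCB1 finite-time bound (Auer et al., 2002), rewards in [0,1],
% again for suboptimal arms j only
\mathbb{E}[T_j(n)] \le \frac{8 \ln n}{\Delta_j^2} + 1 + \frac{\pi^2}{3}
```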


  Everyone refers to Lai & Robbins, 1985 and Agrawal, 1995, but I'm
unable to find these papers anywhere (my university JSTOR subscription
somehow magically doesn't seem to cover Agrawal, 1995). I'm hoping
that maybe I could grasp the details if I read those, does anyone have
a copy?

  Thanks,

-- 
                                Petr "Pasky" Baudis
We live on an island surrounded by a sea of ignorance. As our island
of knowledge grows, so does the shore of our ignorance. -- J. A. Wheeler
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
