Interesting looking paper: "On correlation and budget constraints in
model-based bandit optimization with application to automatic machine
learning", Hoffman, Shahriari, de Freitas,  AISTATS 2014

I can't say I've entirely understood it yet, but I *think* that:
- it targets the scenario where there are many more arms than we can
try, i.e. trying an arm is quite expensive.
  - I think this sounds like the situation somewhere around depth 2 to
4 of an MCTS tree, where we only have time to try a couple of
different moves?
- it uses Gaussian Processes to model correlations between the arms
   - this means that they don't actually have to try all the arms
   - though I haven't figured out yet where they are getting this
correlation information from, if they haven't tried the arms...
- in their experiments, their UGap compares favorably with UCBE, which
is a variant of UCB (and UCB is the basis of UCT, right?)
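
To make the "where does the correlation come from" question concrete,
here is a minimal sketch of how a GP can share information between
arms in general. In a GP the prior covariance comes from a kernel
evaluated on some description of each arm, so it exists before any arm
is pulled. I am assuming each arm has a feature vector and an RBF
kernel; I have not checked that this is what the paper does, and the
names `rbf_kernel` and `posterior_after_one_pull` are made up for
illustration.

```python
import math

def rbf_kernel(x, y, length_scale=1.0):
    """Prior covariance between two arms, computed from their features."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

def posterior_after_one_pull(arms, pulled, reward, noise_var=0.1):
    """Zero-mean GP posterior (mean, variance) per arm after a single pull."""
    x_p = arms[pulled]
    denom = rbf_kernel(x_p, x_p) + noise_var
    result = []
    for x in arms:
        k = rbf_kernel(x, x_p)              # covariance with the pulled arm
        mean = k / denom * reward           # similar arms inherit the reward
        var = rbf_kernel(x, x) - k * k / denom  # and lose uncertainty
        result.append((mean, var))
    return result

# Assumed toy features: arm 1 is close to arm 0, arm 2 is far away.
arms = [(0.0,), (0.1,), (3.0,)]
post = posterior_after_one_pull(arms, pulled=0, reward=1.0)
for features, (mean, var) in zip(arms, post):
    print(features, round(mean, 3), round(var, 3))
```

Only arm 0 is pulled, yet the estimate for the nearby arm 1 moves
almost as much, while the distant arm 2 stays close to the prior.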

From the point of view of UCT, I'm thinking:
- this might be a more principled approach to deciding what to do when
there are still few children of an MCTS node
    - my understanding is that currently we are using 'magic numbers',
like 'if there are fewer than 30 children, then always explore, don't
use UCT'?
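
For reference, the kind of rule I mean could be sketched like this.
It is a rough illustration under my own assumptions, not code from any
particular engine: `choose_action`, the dict layout of a child and the
`min_children` default are all hypothetical, and the bonus term is the
standard UCB1 one.

```python
import math

def ucb1(mean_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average reward plus an exploration bonus."""
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)

def choose_action(children, untried_moves, min_children=30):
    """Return ("expand", move) or ("descend", child).

    children: list of dicts with "wins" and "visits" (visits >= 1).
    min_children: the 'magic number' below which we always explore.
    """
    if untried_moves and len(children) < min_children:
        return ("expand", untried_moves[0])
    if not children:
        raise ValueError("no children and no untried moves (terminal node)")
    parent_visits = sum(ch["visits"] for ch in children)
    best = max(
        children,
        key=lambda ch: ucb1(ch["wins"] / ch["visits"], ch["visits"], parent_visits),
    )
    return ("descend", best)

children = [{"wins": 6, "visits": 10}, {"wins": 1, "visits": 2}]
print(choose_action(children, ["d4"]))  # still below the threshold
print(choose_action(children, []))      # nothing left to expand, use UCB1
```

The hard switch at `min_children` is exactly the part that a
model-based approach might replace with something less arbitrary.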

My own interest in this line of thought initially comes from a
slightly different angle than replacing UCT.  It seems to me that
there are lots of hyper-parameters we need to optimize or choose
between, lots of "magic numbers", like the "< 30" magic number for
applying UCT, but also things like:
- what is the value of adding RAVE?
- what is the value of different numbers of playouts?
- what if I vary the maximum playout depth?
- etc ...

Searching through all these hyper-parameters by hand seems to me to be
tedious, time-consuming and unprincipled, so I'm looking for a better
way, which has taken me through a bunch of papers such as:
- "Algorithms for Hyper-Parameter Optimization", Bergstra, Bardenet,
Bengio, Kégl, NIPS 2011
- "Random Search for Hyper-Parameter Optimization", Bergstra, Bengio,
JMLR 2012
- "Hyperopt: A Python Library for Optimizing the Hyperparameters of
Machine Learning Algorithms", Bergstra, Yamins, Cox, SciPy 2013
- (and now) "On correlation and budget constraints in model-based
bandit optimization with application to automatic machine learning",
Hoffman, Shahriari, de Freitas, AISTATS 2014
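
As a baseline for what these papers try to improve on, plain random
search over the magic numbers above might look like the sketch below.
Everything in it is an assumption for illustration: the parameter
names, their ranges, and especially `fake_win_rate`, which stands in
for actually playing games against a reference opponent.

```python
import random

# Hypothetical search space; names and values are illustrative only.
SPACE = {
    "use_rave": [True, False],
    "num_playouts": [100, 1000, 10000],
    "max_playout_depth": [50, 100, 200],
    "uct_min_children": [10, 30, 100],
}

def sample_config(rng):
    """Draw one configuration uniformly at random from SPACE."""
    return {name: rng.choice(values) for name, values in SPACE.items()}

def random_search(evaluate, n_trials, seed=0):
    """Evaluate n_trials random configs and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

def fake_win_rate(config):
    """Placeholder objective; a real one would play many games."""
    bonus = 0.05 if config["use_rave"] else 0.0
    penalty = abs(config["uct_min_children"] - 30) / 1000.0
    return 0.5 + bonus - penalty

best, score = random_search(fake_win_rate, n_trials=20)
print(best, score)
```

Each trial here is independent of the others, so nothing learned from
one configuration helps pick the next. That is the gap the model-based
methods in the papers above are trying to close.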
