On 4 Jan. 2012, at 16:22, Olivier Teytaud wrote:
>
>> BTW, I imagine that "CLOP" could be any fully automated parameter tuning
>> solution. That is, nothing here is really specific to CLOP. It just happens
>> that CLOP is the first fully automated parameter tuning system that I have
>> made to work.
>
>
> Hi Brian and others;
> I'm not sure that CLOP could be any fully automated parameter tuning solution.
> The point is that there are roughly two families of algorithms:
> 1) those in which exploration is roughly equal to recommendation: we test
> parameters close to our current approximation of the optimum;
> 2) those in which exploration is really different from recommendation: we
> test parameters possibly far from the optimum, because the variance is higher
> and we get more information there.
> I assume that CLOP is in category 2.

I don't think CLOP fits in either of these categories. CLOP tries to balance
bias and variance optimally by sampling over an area that is as small as
possible (to avoid regression bias, because the function to be optimized might
not be quadratic or symmetric), but large enough to be confident that values
on the border are inferior to the optimal win rate (otherwise we would have no
idea where the maximum is).
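
To make that concrete, here is a rough one-dimensional sketch in Python. It is
only a toy illustration, not the actual CLOP code: the fixed margin, the grid,
the gradient-ascent fit and the function names are simplifications made up for
this example (the real algorithm uses weighted quadratic logistic regression
and a confidence-based criterion rather than a hard threshold):

    import numpy as np

    def fit_quadratic_logistic(x, wins, iters=500, lr=0.5):
        # Fit p(win | x) = sigmoid(a + b*x + c*x^2) to binary game results
        # by plain gradient ascent on the log-likelihood (1-D only).
        X = np.stack([np.ones_like(x), x, x * x], axis=1)
        theta = np.zeros(3)
        for _ in range(iters):
            p = 1.0 / (1.0 + np.exp(-X @ theta))
            theta += lr * X.T @ (wins - p) / len(x)
        return theta

    def choose_next_sample(theta, lo, hi, margin=0.05, n_grid=1000):
        # Sample uniformly over the region where the fitted win rate is
        # within 'margin' of the fitted maximum: small enough to limit
        # regression bias, large enough that the border is clearly worse
        # than the optimum.
        grid = np.linspace(lo, hi, n_grid)
        p = 1.0 / (1.0 + np.exp(-(theta[0] + theta[1] * grid + theta[2] * grid ** 2)))
        region = grid[p >= p.max() - margin]
        return np.random.choice(region)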

The big deal with CLOP, compared to methods such as UCT or CEM, is not about
sampling far or near. The sample distribution of CLOP is very similar to the
sample distribution of Gaussian CEM. The big deal with CLOP is that it is an
order of magnitude faster when the function to be optimized is smooth (bounded
second-order derivative, say). That is because, unlike population-based
methods, it is not a comparison-based algorithm. It seems that comparison-based
approaches cannot achieve expected simple regret better than N^{-1/2}, whereas
CLOP can achieve N^{-2/3} asymptotic expected simple regret.
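
To give a feel for what the difference between those two exponents means in
terms of games, a back-of-the-envelope computation (the constant in front of
the rate is set to 1 purely for illustration):

    # Games needed so that regret ~ N^{-alpha} drops below a target value.
    for target in (0.01, 0.001):
        n_comparison = target ** -2.0   # N^{-1/2} rate: N = target^{-2}
        n_clop = target ** -1.5         # N^{-2/3} rate: N = target^{-3/2}
        print(target, int(n_comparison), int(n_clop))
    # target=0.01 -> 10000 vs 1000 games; target=0.001 -> 1000000 vs ~32000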

The function to be optimized does not have to be symmetric. I know optimizing
symmetric functions is very popular in some of the theoretical literature on
stochastic optimization, but I don't expect these results to have much
practical value, because functions to be optimized in practice are never
symmetric around the maximum. Lack of symmetry really changes the nature of
the problem.
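
A tiny numerical example (made up for this message) of what asymmetry does:
fit a symmetric quadratic to an asymmetric function near its maximum at 0, and
the estimated maximum is biased unless the sampling window shrinks. This is
exactly the regression bias mentioned above:

    import numpy as np

    def fitted_argmax(width, n=10001):
        # Least-squares quadratic fit to f(x) = -x^2 + 0.3*x^3 on [-width, width];
        # the true local maximum is at x = 0.
        x = np.linspace(-width, width, n)
        y = -x ** 2 + 0.3 * x ** 3
        c2, c1, _ = np.polyfit(x, y, 2)
        return -c1 / (2 * c2)

    for width in (1.0, 0.5, 0.1):
        print(width, fitted_argmax(width))   # bias is roughly 0.09 * width^2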

The really fundamental result about the asymptotic speed of convergence is
this paper:
@article{Chen-1988a,
  author  = "Hung Chen",
  title   = "Lower Rate of Convergence for Locating a Maximum of a Function",
  journal = "The Annals of Statistics",
  volume  = 16,
  number  = 3,
  pages   = "1330--1334",
  year    = 1988
}
In short: a bounded d-th derivative implies a lower bound of order N^{-d/(d+1)}
on the asymptotic expected simple regret. This is a universal result: it
applies to _any_ algorithm. I must admit I don't really understand the theorem,
but I intuitively agree with it.
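
Plugging a few values of d into that bound (trivial arithmetic, but it shows
where the 2/3 above comes from, and that the exponent approaches 1 as the
function gets smoother):

    from fractions import Fraction
    # Regret-rate exponent d/(d+1): d = 2 (bounded second derivative) gives 2/3.
    for d in range(1, 6):
        print(d, Fraction(d, d + 1))
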
Rémi