> delta_komi = 10^(K * (number_of_empty_points / 400 - 1)),
> where K is 1 if winning and is 2 if losing.  Also, if expected
> winning rate is between 45% and 65%, Komi is unmodified.

There's one thing I don't like at all there: you could get a positive
evaluation when losing, and hence play conservatively when you should
not. That could especially be true near the end, when the situation
has drifted from bad to less bad. I think that if this happens you're
doomed... The other way around would probably be painful too.

After all, the aim of tinkering with komi is to keep the computer from
playing nonsensical moves, but it should still know whether it must
fight or calm down.

Here is a suggestion: choose a new komi at each move (not a drift, though
it would probably be a good idea to *upper bound the komi change*,
maybe by ten times the delta_komi you use now).
To choose it, penalize being away from true komi
proportionally to |komi_aya - true_komi|. Call lambda the
proportionality constant.
Penalize the winning rate through (winning_rate - .5)^2.
The total loss function is then:
loss = lambda * |komi_aya - true_komi| + (winning_rate - .5)^2


Suppose the change in winning rate is proportional to the change in komi,
that is, take as hypothesis:
winning_rate(komi_aya) =
    winning_rate(true_komi) + C * (komi_aya - true_komi)

REMARK: C is negative !
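
To make the loss and the linear hypothesis concrete, here is a minimal
Python sketch. The function names, the komi_offset argument (komi_aya
minus true komi) and the brute-force search are only my illustration,
not anything taken from Aya; the search is there to look at the shape of
the loss, not as the proposed update rule.

```python
def predicted_winning_rate(rate_at_true_komi, c, komi_offset):
    """Linear hypothesis: the rate moves by c per point of komi offset (c < 0)."""
    return rate_at_true_komi + c * komi_offset


def loss(komi_offset, rate_at_true_komi, c, lam):
    """lambda * |komi_aya - true_komi| + (winning_rate - .5)^2."""
    rate = predicted_winning_rate(rate_at_true_komi, c, komi_offset)
    return lam * abs(komi_offset) + (rate - 0.5) ** 2


def best_offset_by_search(rate_at_true_komi, c, lam, max_offset=30):
    """Try every integer offset in [-max_offset, max_offset]; keep the lowest loss."""
    candidates = range(-max_offset, max_offset + 1)
    return min(candidates, key=lambda d: loss(d, rate_at_true_komi, c, lam))
```

With a rate close to .5 the lambda term dominates and the search stays
at an offset of 0, which is the "do not move komi near .5" behaviour.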

Take the change in komi that minimizes the loss (a real number), and
truncate it to get an integer (truncation also means you have a tendency
to keep delta_komi near 0).

If you knew beforehand the winning rates for each komi, this would
ensure you generally have a winning rate on the right side of .5, while
it would not move komi if winning rate is somewhat near .5. 

Solving the degree two equation yields:
change_in_winning_rate_by_point_komi_change =

new_komi_aya =
    truncation[((.5 - old_winning_rate) +
    C * old_komi_aya - lambda / (2 * C)) / C]^{+}

if the estimated winning rate with true komi is more than .5, and

new_komi_aya =
    truncation[((.5 - old_winning_rate) +
    C * old_komi_aya + lambda / (2 * C)) / C]^{-}
                     ^                         ^
if the estimated winning rate with true komi is less than .5.
Here [x]^{+} is x when x is positive and 0 otherwise, and [x]^{-} is x
when x is negative and 0 otherwise.


The estimated winning rate with true komi is given by
old_winning_rate - C * old_komi_aya,
with komi_aya counted as the difference from true komi.
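
Here is a rough Python sketch of the update, under the linear
hypothesis. Setting the derivative of the loss to zero gives a komi
offset of (.5 - rate_at_true_komi) / C - lambda / (2 * C^2) on the
winning side and (.5 - rate_at_true_komi) / C + lambda / (2 * C^2) on
the losing side, and that is what the code computes. The function name,
the argument names and the max_step cap are hypothetical; komi is again
counted as an offset from true komi.

```python
import math


def new_komi_offset(old_rate, old_offset, c, lam, max_step=None):
    """Return the truncated komi offset (komi_aya - true_komi) minimising the loss.

    old_rate   -- winning rate measured with the komi used so far
    old_offset -- that komi minus true komi
    c          -- winning-rate change per point of komi (must be negative)
    lam        -- weight of the |komi_aya - true_komi| penalty
    max_step   -- optional cap on the change per move (hypothetical parameter)
    """
    if c >= 0:
        raise ValueError("c must be negative")
    # Estimated winning rate with true komi.
    rate_at_true_komi = old_rate - c * old_offset
    # Offset that would bring the rate to exactly 50% if lambda were 0.
    unpenalised = (0.5 - rate_at_true_komi) / c
    # The lambda term pulls the optimum back towards true komi.
    shrink = lam / (2 * c * c)
    if rate_at_true_komi > 0.5:
        real = max(unpenalised - shrink, 0.0)  # positive part
    else:
        real = min(unpenalised + shrink, 0.0)  # negative part
    offset = math.trunc(real)  # truncation rounds towards 0
    if max_step is not None:
        low, high = old_offset - max_step, old_offset + max_step
        offset = max(low, min(high, offset))
    return offset
```

For instance, new_komi_offset(0.6, 10, -0.02, 0.001) first recovers a
rate of about .8 with true komi and then asks for an offset a little
short of the 15 points that would give exactly 50%.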

You could take 
lambda = K / (1 - number_of_empty_points / 400)
if you want to be nearer true komi during yose. Less human-like, but
less error-prone !
Now that's probably not necessary since the constant C is probably much
bigger when entering yose.


The problem with this formula is finding a good C, since this
constant changes during the game.
Here is an idea using data from the game itself: look at the last time
the komi was changed and take:
C =
(winning_rate_after_last_change - winning_rate_before_last_change) /
(komi_aya_after_last_change - komi_aya_before_last_change)
[* f(number_empty_points_last_change, number_empty_points_now) if you
want to refine]

Now this estimate might be unstable, especially since it cannot tell
whether it was the evolution of the situation that made the winning rate
change, or the change in komi. I'm sure you can find ways to stabilize
it. The easiest I can think of would be a formula like:
new_C = .3 * C_formula_above + .7 * old_C
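
A small Python sketch of that estimate and its smoothing, in case it
helps. The names are hypothetical, the optional f(...) refinement is
left out, and keeping the old C when the raw slope comes out
non-negative is an extra guard of mine, not part of the formula.

```python
def estimate_c(rate_before, rate_after, komi_before, komi_after):
    """Slope of winning rate against komi across the last komi change."""
    if komi_after == komi_before:
        raise ValueError("komi did not change, so the slope is undefined")
    return (rate_after - rate_before) / (komi_after - komi_before)


def update_c(old_c, rate_before, rate_after, komi_before, komi_after,
             weight=0.3):
    """Blend the raw slope into the running estimate of C."""
    raw_c = estimate_c(rate_before, rate_after, komi_before, komi_after)
    if raw_c >= 0:
        # Assumption of this sketch: C should be negative, so a
        # non-negative slope is treated as noise and ignored.
        return old_c
    # new_C = .3 * C_formula_above + .7 * old_C when weight is .3
    return weight * raw_c + (1 - weight) * old_c
```

A smaller weight reacts more slowly to a single noisy measurement, at
the price of lagging behind when C really does grow towards yose.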


END OF THE DELIRIUM 
Jonas
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
