Hideki Kato wrote:
> [EMAIL PROTECTED]: <[EMAIL PROTECTED]>:
>   
>>> delta_komi = 10^(K * (number_of_empty_points / 400 - 1)),
>>> where K is 1 if winning and is 2 if losing.  Also, if expected
>>> winning rate is between 45% and 65%, Komi is unmodified.
>>>       
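A minimal sketch of the quoted rule, for readers who want to see the
step sizes it produces.  The function names and the direction of the
adjustment are assumptions: komi is taken here as the number of points
the engine has to overcome, so it is raised when the engine is ahead
and lowered when it is behind.  The original message does not say
which way the step is applied.

```python
def delta_komi(empty_points: int, winning: bool) -> float:
    """Step size from the quoted formula: 10^(K * (empty_points / 400 - 1))."""
    k = 1 if winning else 2  # K is 1 when winning, 2 when losing
    return 10 ** (k * (empty_points / 400 - 1))


def adjust_komi(komi: float, win_rate: float, empty_points: int) -> float:
    """Leave komi alone inside the 45%-65% band, otherwise move it one step.

    Assumption: komi counts against the engine, so a winning engine gets
    a larger komi and a losing engine a smaller one.
    """
    if 0.45 <= win_rate <= 0.65:
        return komi
    winning = win_rate > 0.65
    step = delta_komi(empty_points, winning)
    return komi + step if winning else komi - step


if __name__ == "__main__":
    for empty in (300, 200, 100):
        print(empty, delta_komi(empty, True), delta_komi(empty, False))
```

The exponent is negative for any real board (at most 361 empty points
on 19x19), so each step is smaller than one point and shrinks as the
board fills.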
>> There's one thing I don't like at all, there: you could get positive
>> evaluation when losing, and hence play conservatively when you should
>> not. That could especially be true near the end, when the situation
>> drifted from bad to less bad. I think that if this happens you're
>> doomed... The  other way around would probably be painful too.
>>     
>
>  From my observation, mc chooses good moves if and only if the winning 
> rate is near 50%.  Once it starts losing, it plays bad moves.  Surely 
> it's an illusion but it helps to prevent them.
>
>   
I don't see that,  but then again I am not a very strong player
myself.   What I notice is that it plays very "normal"  until it's
pretty obvious that it's losing,  not just when it varies slightly from
50% but when it doesn't vary much from zero.   However, it does play
more desperately once it varies significantly from 50% but certainly not
"meaninglessly."  

I don't like using the words "good" and "bad" when describing the
quality of the moves because I try to use terminology that's more
descriptive (although I fail miserably many times.)    In a lost
position how do you distinguish one move from another when they all
lose?     It sounds funny to me when you say (in so many words) that
once the program is losing it starts playing "bad moves."  

Since this is a subjective quality can we use a subjective term such as
"normal" to describe moves that are cosmetically appealing to us?    
And perhaps "ugly" to describe moves that are not?

My feeling is that in lost positions,  the only thing we are trying to
accomplish is to make the moves more cosmetically appealing (normal) and
at best improve the program's chances of winning against weak players.   
After all, if the program is in bad shape,   then to be completely
realistic it's probably going to lose to the player that put it in this
bad shape.   Unless of course the program is being upset by a much
weaker player which can occasionally happen too.    If a program is
quite sure that it is losing, we can't reasonably expect the program
that is beating it to be unaware of this.   

It's also a bad mistake in my opinion to try to coerce it into playing
moves that are "normal" when an increasing amount of "desperation" is
indeed called for.    I have presented anecdotes before about how chess
players have won games based on not playing as if things are normal when
they are losing, but instead suddenly playing differently which usually
consists of violating general principles and "normal" play.

Again, I feel that this effect of moves that are not normal kicks in
mostly when the position is very close to 0 or 1.     So what we are
looking for is AT BEST a very minor improvement and we are wasting a lot
of energy on this.    If the goal is to make the moves more cosmetically
appealing I can respect that more - that is realistic and probably even
easy to accomplish (and then the goal is to do it without weakening the
program too much.)

Using this to cover over some other weakness is also being considered,
such as nakade, where the program doesn't understand the actual end of
the game and thinks it has lost by 2 or 3 stones when in fact it
has a win.    Aside from the fact that this is a fairly rare
occurrence,  I believe it should be addressed directly,  not with a
superficial treatment of the symptoms. 

So if you can make it win slightly more lost games by playing as if
nothing is wrong,  then more power to you.   It doesn't seem reasonable 
to me that you should be able to do this by feeding the program false
information.   You are effectively saying, "you are losing, be happy
with that."    

By the way, if this is to work (for instance for cosmetic reasons) I
don't think you can apply this gradually or based on previous
information.   What if you are losing and the opponent plays a
blunder?   After all, this is what has to happen since the program is
losing.    You have to apply this based on information learned from the
current move you are searching.  You can't gradually fold it in as the
game progresses and expect anything useful.

>> After all, the aim of tinkering with komi is to avoid that the computer
>> plays nonsensical moves, but it should know whether he must fight or
>> calm down.
>>     
>
> Agree.  So, it's important _when_ to adjust komi or apply my method.  My 
> objective is to keep winning rate around 50%, which yields good moves.
>   
First of all, you won't keep the rate at 50% no matter what you do.   At
some point the programs are able to completely resolve the position and
this happens surprisingly early in many cases with good programs.     
If it's actually winning,  then if you deduct a komi to convince it that
it is losing, you greatly increase the chances that it really will lose.    If
you increase the komi to make it "try harder" to win a won game,   it
won't start playing meaningful moves and you risk losing.  

You see, the problem is that once the score is significantly extreme in
either direction, there is not much you can do anyway;  a single komi
point will change it suddenly to the OTHER extreme.   But this is really
where most of the action is,  so you have a catch-22.
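
The cliff can be illustrated with a toy calculation.  This is only a
sketch under an assumed distribution of playout results: once a
position is resolved, nearly all playouts end at the same margin, so
the winning rate as a function of komi is close to a step.  The margins
below are invented for the illustration and do not come from any
program.

```python
def win_rate(final_margins, komi):
    """Fraction of playouts won; a playout is a win if its margin exceeds komi."""
    wins = sum(1 for margin in final_margins if margin > komi)
    return wins / len(final_margins)


# Hypothetical resolved position: 97 of 100 playouts end with the
# engine ahead by 3 points, the other 3 end in a large loss.
margins = [3] * 97 + [-10] * 3

if __name__ == "__main__":
    for komi in (1.5, 2.5, 3.5, 4.5):
        print(komi, win_rate(margins, komi))
```

Moving komi from 2.5 to 3.5 takes the winning rate from 0.97 to 0.0 in
this example, with nothing in between to aim for.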

Go ahead, try this experiment:  When the program is winning by over
95%,  see what happens when you tell it to "go for more" and see if it
makes it win even more games.

I honestly believe you are barking up the wrong tree if you are looking
for program strength improvement. 

- Don


>   
>> Here is a suggestion: give a new komi at each move (not a drift, though
>> it would probably be a good idea that you *upper bound the komi change*,
>> maybe by ten times the delta_komi you use now).
>> For choosing it, punish the fact of being away from true komi
>> proportionally to | komi_aya - true_komi |. Call lambda the
>> proportionality constant.
>> Punish the winning rate through (winning_rate - .5)^2.
>> Total loss function is then:
>> loss = lambda * |komi_aya - true_komi| + (winning_rate - .5)^2
>>
>> Suppose the change in winning rate is proportional to change in komi,
>> that is take as hypothesis:
>> winning_rate(komi_aya) = winning_rate(normal_komi) + C (komi_aya -
>> normal_komi) 
>>
>> REMARK: C is negative !
>>
>> Take the change in komi that minimizes the punishment (a real), and
>> truncate it to get an integer (truncation also means you have a tendency
>> to have delta_komi near 0).
>>
>> If you knew beforehand the winning rates for each komi, this would
>> ensure you generally have a winning rate on the right side of .5, while
>> it would not move komi if winning rate is somewhat near .5. 
>>
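
A minimal sketch of the minimisation described in the quoted text,
taking the loss and the linear hypothesis exactly as stated.  The names
best_komi, rate_at_true_komi and lam are hypothetical.  The closed form
in the code comes from differentiating the loss separately for positive
and negative komi shifts; it is worked out here for illustration and
should be checked before being relied on.

```python
def loss(komi, true_komi, rate_at_true_komi, c, lam):
    """lam * |komi - true_komi| + (winning_rate - 0.5)^2 under the linear model."""
    rate = rate_at_true_komi + c * (komi - true_komi)
    return lam * abs(komi - true_komi) + (rate - 0.5) ** 2


def best_komi(true_komi, rate_at_true_komi, c, lam):
    """Real-valued minimiser of loss(), with the shift truncated to an integer.

    c is the assumed change in winning rate per point of komi (negative).
    """
    gap = (0.5 - rate_at_true_komi) / c   # shift that would give exactly 50%
    shrink = lam / (2 * c * c)            # pull back toward true komi
    if gap > shrink:
        shift = gap - shrink
    elif gap < -shrink:
        shift = gap + shrink
    else:
        shift = 0.0                       # close enough to 50%: leave komi alone
    return true_komi + int(shift)         # int() truncates toward zero


if __name__ == "__main__":
    for rate in (0.80, 0.55, 0.20):
        print(rate, best_komi(6.5, rate, c=-0.05, lam=0.012))
```

With these illustrative constants, a winning rate of 0.55 leaves komi
at 6.5, while 0.80 and 0.20 move it by three points in opposite
directions.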
>> Solving the degree two equation yields:
>> change_in_winning_rate_by_point_komi_change =
>>
>> new_komi_aya =
>>    truncation[(.5 - old_winning_rate) +
>>    C * old_komi_aya - lambda / (2 * C)]^{+}
>>
>> if the estimated winning rate with true komi is more than .5, and
>>
>> new_komi_aya =
>>    truncation[(.5 - old_winning_rate) +
>>    C * old_komi_aya + lambda / (2 * C)]^{-}
>>                     ^                    ^
>> if the estimated winning rate with true komi is less than .5.
>>
>>
>> The estimated winning rate with true komi is given by
>> winning_rate - C old_komi_aya.
>>
>> You could take 
>> lambda = K / (1 - number_of_empty_points / 400)
>> if you want to be nearer true komi during yose. Less human-like, but
>> less error-prone !
>> Now that's probably not necessary since the constant C is probably much
>> bigger when entering yose.
>>
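
A small sketch that only evaluates the quoted expression for lambda, so
its behaviour over a game can be inspected.  The function name is
hypothetical.  Evaluated literally, the value is largest on an empty
board and approaches K as the board fills; the sketch takes no position
on whether that matches the stated intent.

```python
def komi_penalty(k: float, empty_points: int) -> float:
    """lambda = K / (1 - number_of_empty_points / 400), as quoted."""
    if empty_points >= 400:
        raise ValueError("formula is undefined for 400 or more empty points")
    return k / (1 - empty_points / 400)


if __name__ == "__main__":
    for empty in (361, 200, 40, 0):
        print(empty, komi_penalty(1.0, empty))
```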
>>
>> The problem here in this formula is to find a good C, since this
>> constant changes during the game.
>> Here is an idea from the data in the game: use the last time the komi
>> was changed and take:
>> C = 
>> (winning_rate_after_last_change - winning_rate_before_last_change) /
>> (komi_aya_after_last_change - komi_aya_before_last_change)
>> [* f(number_empty_points_last_change, number_empty_points_now) if you want 
>> to refine]
>>
>> Now this estimation might be unstable, especially since it cannot tell if
>> that's the evolution of the situation that made the winning rate change,
>> or the change in komi. I'm sure you can find ways to stabilize it. The
>> easiest I can think of would be a formula like:
>> new_C = .3 * C_formula_above + .7 old_C
>>
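
The slope estimate and the smoothing step quoted above can be sketched
as follows.  The function names are hypothetical, and the optional
factor f(...) mentioned in the quoted text is left out.

```python
def estimate_c(rate_before, rate_after, komi_before, komi_after):
    """Slope of winning rate against komi across the last komi change."""
    if komi_after == komi_before:
        raise ValueError("komi did not change, slope is undefined")
    return (rate_after - rate_before) / (komi_after - komi_before)


def smooth_c(old_c, new_estimate, weight=0.3):
    """new_C = .3 * C_formula_above + .7 * old_C (exponential smoothing)."""
    return weight * new_estimate + (1 - weight) * old_c


if __name__ == "__main__":
    # Illustrative numbers only: komi raised by 2, winning rate fell by 0.10.
    raw = estimate_c(0.80, 0.70, 6.5, 8.5)
    print(raw, smooth_c(-0.04, raw))
```

As the quoted text warns, the raw estimate cannot separate the effect
of the komi change from the effect of the moves played in between,
which is why it is only blended into the running value.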
>>
>> END OF THE DELIRIUM
>>     
>
> :) 
>
> Thanks.  I will try it later as I'm now rewriting my code radically
> which will take so long.
>
> # One question: where does _aya_ come from, or what does it stand for?  If my guess is 
> correct, you are confusing Hiroshi, author of Aya, with me, Hideki, 
> author of GGMC :).  I'm sorry if I'm wrong.
>
> -Hideki
> --
> [EMAIL PROTECTED] (Kato)
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>
>   