>
>>
>> Heavier playouts have been shown to be far superior, but just placing
>> stronger moves in the playouts does not seem to be the right formula.
>> My guess is that if you place arbitrary knowledge into the playouts,
>> you create terrible imbalances.  Perhaps the secret (I'm entering
>> uncharted territory here) is that it's OK to create imbalance if it's
>> the type the tree is easily able to correct.  This is my working
>> theory; let's call it "maximum information gain."  The early Mogo
>> paper showed that it's good to look at defending against an
>> opponent's atari.  Defending an atari tends to be either one of the
>> better or one of the worst moves on the board.  It's a horrible move
>> if the group cannot be defended.  HOWEVER, what you can say is that
>> Mogo's playout strategy pushes the search right away in the direction
>> of "finding out for sure," so the playouts create a hypothesis that
>> is either quickly refuted or quickly confirmed.  I really think you
>> want to push towards positions that "work out" what is really going
>> on, even if you have to insert bias into the playouts to get this
>> (which you probably do).
>>
>
> I don't know about this. 
I don't know about it either - I'm still trying to figure it out.  
Let's say this is just a theory of mine for now.
> I'm pretty sure MoGo checks if the stone can make at least two
> liberties (ladder problem) in which case it can still be horrible but
> very seldomly worse than random. 
I don't know what Mogo does today, but a successful early version of
Mogo checked for a random move that defends: "defends" in the temporary
sense that it either creates two liberties directly, as you say, or
captures one of the surrounding groups.

But it doesn't "defend" absolutely of course.   This means you might
continue to defend a group that is certain to die eventually via a
ladder.      The game is virtually over if you continue to defend a long
ladder and fail. 

About whether it's "worse than random" or not:  I don't think that's a
good litmus test.  You are sure to have terrible playouts if one side
plays "better than random" and the other side doesn't.  As long as one
side, or one type of move, is emphasized without a corresponding
balance, the playouts are going to be heavily biased and consistently
give the wrong answer.

Just for illustration purposes,  pretend that we find some really
wonderful heuristic that can find the very best move but for some odd
reason it only works for WHITE.    We could have a playout that plays
half of the moves perfectly,  far better than random.     The obvious
problem with such an algorithm is that it would almost always show white
winning the game!    It would be virtually worthless in the playouts!
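To make the illustration concrete, here is a toy simulation (a made-up
scalar game, not Go, and all the names are mine): players alternately
add one of three candidate values to a running score, and White wins if
the final total is positive.  Handing White a "perfect" move picker
while Black stays random turns a dead-even game into a near-certain
White win:

```python
import random

def playout(white_perfect: bool, plies: int = 20) -> bool:
    """One playout of a toy game: each ply the mover sees 3 candidate
    values in [-1, 1] and adds one to the score.  White moves on even
    plies and wins if the final score is positive."""
    score = 0.0
    for ply in range(plies):
        options = [random.uniform(-1.0, 1.0) for _ in range(3)]
        if white_perfect and ply % 2 == 0:
            score += max(options)            # White always picks the best option
        else:
            score += random.choice(options)  # everyone else plays randomly
    return score > 0

def win_rate(white_perfect: bool, n: int = 20000) -> float:
    """Fraction of n playouts that White wins."""
    return sum(playout(white_perfect) for _ in range(n)) / n
```

With both sides random the estimate hovers near 50%, which is the fair
answer for this game; with the one-sided heuristic the same even game
is reported as an overwhelming White win, so playouts that play "better
than random" return worse information.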

Evidently,   the way we add knowledge to playouts can produce a similar
type of effect,  putting some kind of bias into the playouts that makes
it return win/loss information that is not as useful as even random
playouts in some cases.      I can only guess that any knowledge that
makes the playouts play much better than random, and yet makes the total
program play worse, must have some kind of systematic bias that gives
some types of positions much better chances than they deserve.  


> It seems to me that in principle whenever you have a 'pattern' that
> has a higher success-rate than a random move over a large number of
> games, then it should always be preferred over the random choice. 
Intuition fails us in this case, I think.  Your goal is not to find the
best move; it is to "fairly" decide who has the better chances.  It's
probably no good if your pattern set just happens to favor one side or
the other in a specific position.  I'm not sure why YOUR patterns are
failing, because I would assume an automated procedure like yours would
be fair and give even overall coverage compared to carefully
hand-crafted patterns (which would surely be biased).

I think Mogo used 8 or 12 patterns  that were carefully chosen to cover
really broad principles and not highly specific cases.     

> I think it's the lack of randomness that's the problem. The large
> number of playouts tend to converge to a mistake if it's
> deterministic. It doesn't even have to be totally deterministic I think.
I think you are very likely correct.  The more patterns you have, the
more deterministic your program is.  Your playouts should not always
win (for one side) in a fairly even position just because of small
changes in the position.  Perhaps this is a general principle that
could be tested somehow?
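One crude way to test it, sketched here under made-up assumptions (a
toy scalar game with fixed move options, nothing Go-specific): run many
playouts with varying amounts of randomness and look at the spread of
results.  A fully deterministic policy returns the identical answer
every time, so additional playouts carry no information; mixing in even
a little randomness restores a spread the tree can learn from.

```python
import random

# Three fixed candidate move values available at every ply of the toy game.
OPTIONS = [0.1, -0.5, 0.1]

def playout(eps: float, plies: int = 20) -> bool:
    """With probability eps the mover picks a random option, otherwise
    the deterministic 'pattern' choice (always OPTIONS[0]).  White wins
    if the summed score ends up positive."""
    score = 0.0
    for _ in range(plies):
        if random.random() < eps:
            score += random.choice(OPTIONS)  # random move
        else:
            score += OPTIONS[0]              # deterministic pattern move
    return score > 0

def win_rate(eps: float, n: int = 5000) -> float:
    """Fraction of n playouts won by White for a given randomness level."""
    return sum(playout(eps) for _ in range(n)) / n
```

With eps = 0.0 every playout is identical, so the estimate is exactly
0% or 100% no matter how many playouts you run; with eps = 0.5 the
estimate lands somewhere in between, and that spread is the information
the tree actually uses.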


>
>>>
>>> The second observation I think may have been caused by not enough
>>> randomness. But that means I first have to find an answer to how much
>>> randomness I need to put into the patterns. I'm first looking at this
>>> question with some hand-crafted patterns to get a better handle on
>>> this issue.
There is at least one program I have heard of that tries to play moves
with probability directly proportional to their "value" somehow.  In
other words, the worst move on the board has some chance, if very
small, of being played in the playouts.  You would not play the worst
move on the board if you knew it had zero chance of being best, but if
you thought it had a 1% chance you might want to play it 1% of the time.
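A minimal sketch of that selection scheme (the function name and the
example values are hypothetical; in a real program the values would
come from whatever heuristic or pattern set it uses): each candidate
move is chosen with probability proportional to its estimated value, so
even the worst move retains a small chance of being played.

```python
import random

def pick_move(move_values: dict) -> str:
    """Choose a move with probability proportional to its non-negative
    estimated value, e.g. a heuristic's guess at P(move is best)."""
    moves = list(move_values)
    weights = [move_values[m] for m in moves]
    return random.choices(moves, weights=weights, k=1)[0]
```

For example, with values {'A': 0.99, 'B': 0.01}, move 'B' is still
played about 1% of the time, which keeps the playouts from becoming
deterministic while still favoring the stronger move.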

- Don


>> Let us know.  The whole issue is pretty interesting.  I think
>> randomness is required only to counteract systematic biases, because
>> obviously if your playouts played perfectly there would be no need
>> for randomness.  And yet better-than-random playouts can also lead to
>> worse play in general.
>>
>>
>> - Don
>
>
>
> _______________________________________________
> computer-go mailing list
> computer-go@computer-go.org
> http://www.computer-go.org/mailman/listinfo/computer-go/
>