I highly doubt that learning refutations to pertinent bad shape patterns is
much more exponential or combinatorially huge than learning good shape
already is, if you could somehow be appropriately selective about it. There
are enough commonalities that it should "compress" very well into what the
neural net learns.

To draw an analogy with humans: people seem to have absolutely no issue
learning refutations to lots of bad shapes right alongside good shapes, and
don't seem to find it orders of magnitude harder to apply their
pattern-recognition abilities to that task. Indeed, people learn bad-shape
refutations all the time by performing "training updates" based on their
own reading about what works or not every time they read things out in a
game (almost certainly a thing the human brain does to maximize the use of
data), as well as from the results of their own mistakes, and they also
learn refutations of non-working tactics all the time if they do things
like tsumego. But the distribution of bad moves that people devote their
attention to learning to refute is extremely nonuniform - e.g. nobody
studies what to do if the opponent plays the 1-2 point against your 4-4
point - and I bet that's important too.

I'm also highly curious if there's anyone who has experimented with or
found any ways to mitigate this issue!

----------

If you want an example of this actually mattering, here's one where Leela
makes a big mistake in a game, which I think is due to this kind of issue.
Leela is white.

The last several moves to reach the screenshotted position were:
16. wG6 (capture stone in ladder)
17. bQ13 (shoulder hit + ladder breaker)
18. wR13 (push + restore ladder)
19. bQ12 (extend + break ladder again)
20. wR12 (Leela's mistake, does not restore ladder)

You can see why Leela makes the mistake at move 20. On move 21 it pretty
much doesn't consider escaping at H5. The policy net puts it at 1.03%,
which I guess is just a little too low for it to get any simulations here
even with a lot of thinking time. So Leela thinks black is behind,
considering moves with 44% and 48% overall win rates. However, if you play
H5 on the board, Leela almost instantly changes its mind to 54% for black.
So not seeing the H5 escape on move 21 leads it to blunder on move 20. If
you go back to move 20 and force Leela to capture the ladder, it's happy
with the position for white; so it's not as if Leela dislikes the ladder
capture, it's just that it wrongly thinks R12 is slightly better because it
doesn't see the escape.
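A back-of-the-envelope way to see how a 1% prior starves a move of
simulations: in a PUCT-style selection rule (Leela uses a variant of this),
an unvisited move is only picked once its exploration term outgrows the
best rival's score. Here is a toy sketch; all the numbers and the
pessimistic value for the unvisited move are invented for illustration,
not taken from Leela's actual search parameters:

```python
import math

def sims_until_first_visit(prior, rival_prior, rival_q, c_puct=1.5,
                           max_sims=100_000):
    """Toy two-move PUCT model: a 'rival' move whose value is already
    known, and a low-prior move whose value is unknown (treated as 0
    until visited, i.e. a pessimistic first-play urgency).  Returns how
    many simulations pass before the low-prior move is selected once."""
    n_rival = 0
    for t in range(1, max_sims + 1):
        sqrt_total = math.sqrt(n_rival + 1)
        # score(a) = Q(a) + c_puct * P(a) * sqrt(total) / (1 + N(a))
        u_low = c_puct * prior * sqrt_total              # Q == 0, N == 0
        u_rival = rival_q + c_puct * rival_prior * sqrt_total / (1 + n_rival)
        if u_low > u_rival:
            return t
        n_rival += 1
    return max_sims

# Invented numbers loosely matching the position above: the escape has a
# 1.03% prior, while the rival move reads as roughly 48% to win.
print(sims_until_first_visit(prior=0.0103, rival_prior=0.45, rival_q=0.48))
```

In this toy model, bumping the prior from 1% to 5% gets the move its first
simulation far sooner, which is the sense in which a mis-calibrated policy
prior turns into a search blind spot.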

To some degree this maybe means Leela is insufficiently explorative in
cases like this, but still: why does the policy net not put H5 at more than
1.03%? After all, the chance that a good player will see the ladder works
here and try escaping in this position is vastly higher than 1%. I would
speculate that this is because of biases in the training data. If Leela is
trained on high-amateur and pro games, then one would expect that in nearly
all training games, conditional on white playing a move like R12 and
ignoring the ladder escape, the ladder was not too important, and therefore
Black usually did not escape. By contrast, when the ladder escape is
important, White in such games will capture at H5 instead of playing R12
and not allow the escape in the first place. So there is rarely a training
example from which the policy net could learn that the ladder escape is a
likely move for black here.


On Mon, Apr 17, 2017 at 7:04 AM, Jim O'Flaherty <jim.oflaherty...@gmail.com>
wrote:

> It seems chasing down good moves for bad shapes would be an explosion of
> "exception cases", like combinatorially huge. So, while you would be saving
> some branching in the search space, you would be ballooning up the number
> of patterns for which to scan by orders of magnitude.
>
> Wouldn't it be preferable to just have the AI continue to make the better
> move emergently and generally from probabilities around win placements as
> opposed to what appears to be a focus on one type of local optima?
>
>
> On Mon, Apr 17, 2017 at 5:07 AM, lemonsqueeze <lemonsque...@free.fr>
> wrote:
>
>> Hi,
>>
>> I'm sure the topic must have come up before, but I can't seem to find it
>> right now; I'd appreciate it if someone can point me in the right direction.
>>
>> I'm looking into MM, LFR and similar cpu-based pattern approaches for
>> generating priors, and was wondering about basic bad shape:
>>
>> Let's say we use 6d games for training. The system becomes pretty good at
>> predicting 6d moves by learning patterns associated with the kind of moves
>> 6d players make.
>>
>> However, it seems it doesn't learn to punish basic mistakes effectively
>> (say double ataris, shapes with obvious defects ...) because they almost
>> never show up in 6d games =) They do show up in the search tree though, and
>> without a good answer, search might take a long time to realize these moves
>> don't work.
>>
>> Maybe I missed some later paper / development, but basically: wouldn't it
>> make sense to also train on good answers to bad moves?
>> (Maybe harvesting them from the search tree or something like that.)
>>
>> I'm thinking about basic situations like this which patterns should be
>> able to recognize:
>>
>>        A B C D E F G H J K L M N O P Q R S T
>>      +---------------------------------------+
>>   19 | . . . . . . . . . . . . . . . . . . . |
>>   18 | . . . O O . O . . . . . . . . . . . . |
>>   17 | . . X . X O . X O . . . . . . . . . . |
>>   16 | . . X . X O . O O . . . . . . X . . . |
>>   15 | . . . . . X X X X . . . . . . . . . . |
>>   14 | . . X . . . . . . . . . . . . . . . . |
>>   13 | . O . . . . . . . . . . . . X . O . . |
>>   12 | . . O . . . . . . . . . . . . . O . . |
>>   11 | . . O X . . . . . . . . . . . X . . . |
>>   10 | . . O X . . . . . . . . . . . X O . . |
>>    9 | . . O X . . . X . . . X O . . X O . . |
>>    8 | . . O X . . . O X X X O X)X X O O . . |
>>    7 | . O X . . . . . O O X O . X O O . . . |
>>    6 | O X X . X X X O . . O . . . X O X . . |
>>    5 | . O O . . O X . . . . . O . . . . . . |
>>    4 | . X O O O O X . . . O . . O . O . . . |
>>    3 | . X X X X O . . X . X X . . . . . . . |
>>    2 | . . . . X O . . . . . . O . . . . . . |
>>    1 | . . . . . . . . . . . . . . . . . . . |
>>      +---------------------------------------+
>>
>> Patterns probably never see this during training and miss W L9.
>> For example:
>>
>> In Remi's CrazyPatterns.exe L9 comes in 4th position:
>>    [ M10 N10 O6 L9 ...
>>
>> With Pachi's large patterns it's 8th:
>>    [ G8  M10 G9  O17 N10 O6  J4  L9  ...
>>
>> Cheers,
>> Matt
>>
>> ----
>>
>> MM: Computing Elo Ratings of Move Patterns in the Game of Go
>>    https://www.remi-coulom.fr/Amsterdam2007/
>>
>> LFR: Move Prediction in Go – Modelling Feature Interactions Using
>>       Latent Factors
>>    https://www.ismll.uni-hildesheim.de/pub/pdfs/wistuba_et_al_KI_2013.pdf
>>
>>
>> _______________________________________________
>> Computer-go mailing list
>> Computer-go@computer-go.org
>> http://computer-go.org/mailman/listinfo/computer-go
>>