I highly doubt that learning refutations to the pertinent bad shape patterns is much more combinatorially explosive than learning good shape already is, provided you can somehow be appropriately selective about it. There are enough commonalities that it should "compress" very well into what the neural net learns.

To draw an analogy, humans seem to have absolutely no issue learning refutations to lots of bad shapes right alongside good shapes, and they don't seem to find it orders of magnitude harder to apply their pattern recognition abilities to that task. Indeed, people learn bad shape refutations all the time: from performing "training updates" based on their own reading about what works or not every time they read things out in a game (almost certainly a thing the human brain "does" to maximize the use of data), from the results of their own mistakes, and from refuting non-working tactics if they do things like tsumego. But the distribution of bad moves that people devote their attention to learning to refute is extremely nonuniform - e.g. nobody studies what to do if the opponent plays the 1-2 point against your 4-4 point - and I bet that's important too.

I'm also highly curious whether anyone has experimented with or found any ways to mitigate this issue!

----------

If you want an example of this actually mattering, here's an example where Leela makes a big mistake in a game that I think is due to this kind of issue. Leela is white. The last several moves to reach the screenshotted position were:

16. wG6 (capture stone in ladder)
17. bQ13 (shoulder hit + ladder breaker)
18. wR13 (push + restore ladder)
19. bQ12 (extend + break ladder again)
20. wR12 (Leela's mistake, does not restore ladder)

You can see why Leela makes the mistake at move 20. On move 21 it pretty much doesn't consider escaping at H5. The policy net puts H5 at 1.03%, which I guess is just a little too low for the move to get any simulations, even with a lot of thinking time. So Leela thinks black is behind, considering only moves with 44% and 48% overall win rates. However, if you play H5 on the board, Leela almost instantly changes its mind to 54% for black. So not seeing the H5 escape on move 21 leads it to blunder on move 20.

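As a rough illustration of how a 1.03% prior can starve a move of simulations, here is a minimal sketch. It assumes an AlphaGo-style PUCT selection rule, which may well differ from what Leela actually does, and it treats each move's evaluation as a fixed number with no tree below the root. The win rates and the H5 prior are the figures quoted above; the move labels "A" and "B", their priors, `c_puct`, and `fpu` are made up for the example.

```python
import math


def run_root_search(priors, values, n_sims, c_puct=1.0, fpu=0.0):
    """Simulate PUCT selection at a single root node.

    Assumptions (not taken from Leela's source): an AlphaGo-style score
    Q + c_puct * P * sqrt(N + 1) / (1 + n), a fixed evaluation per move
    (no noise, no tree below the root), and unvisited moves valued at fpu.
    Returns (visit counts, simulation index of each move's first visit).
    """
    visits = {move: 0 for move in priors}
    first_visit = {}
    for sim in range(1, n_sims + 1):
        total = sum(visits.values())

        def score(move):
            # Value term: fixed evaluation once visited, fpu before that.
            q = values[move] if visits[move] > 0 else fpu
            # Exploration term: scales with the policy prior.
            u = c_puct * priors[move] * math.sqrt(total + 1) / (1 + visits[move])
            return q + u

        best = max(priors, key=score)
        visits[best] += 1
        first_visit.setdefault(best, sim)
    return visits, first_visit


# Win rates are from Black's point of view. "A" and "B" and their priors
# are hypothetical stand-ins for the two moves Leela was considering.
VALUES = {"A": 0.44, "B": 0.48, "H5": 0.54}
PRIORS = {"A": 0.55, "B": 0.40, "H5": 0.0103}

if __name__ == "__main__":
    for h5_prior in (0.0103, 0.05):
        priors = dict(PRIORS, H5=h5_prior)
        visits, first = run_root_search(priors, VALUES, n_sims=10000)
        print(h5_prior, visits, "first H5 visit:", first.get("H5"))
```

Under these assumptions H5 cannot be selected before simulation 2,172, because its exploration bonus 0.0103 * sqrt(N) first has to exceed the 0.48 value of the best alternative. Once it is finally tried, its 0.54 evaluation makes it attractive immediately. A search with different constants, noisy evaluations, or a different first-play value will give different numbers, but the dependence on the prior is the same.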
If you go back to move 20 and force Leela to capture the ladder, it's happy with the position for white, so it's not as if Leela dislikes the ladder capture; it just wrongly thinks R12 is slightly better because it doesn't see the escape.

To some degree this maybe means Leela is insufficiently explorative in cases like this, but still, why does the policy net not put H5 above 1.03%? After all, it's vastly more likely than 1% that a good player will see that the ladder works here and try escaping in this position.

I would speculate that this is because of biases in training. If Leela is trained on high amateur and pro games, then one would expect that in nearly all training games, conditional on white playing a move like R12 and ignoring the ladder escape, the ladder is not too important and therefore Black should usually not escape. By contrast, when the ladder escape is important, White will capture at H5 instead of playing R12 and not allow the escape in the first place. So there is rarely a training example that allows the policy net to learn that the ladder escape is likely for black to play here.

On Mon, Apr 17, 2017 at 7:04 AM, Jim O'Flaherty <jim.oflaherty...@gmail.com> wrote:

> It seems chasing down good moves for bad shapes would be an explosion of
> "exception cases", like combinatorially huge. So, while you would be saving
> some branching in the search space, you would be ballooning up the number
> of patterns for which to scan by orders of magnitude.
>
> Wouldn't it be preferable to just have the AI continue to make the better
> move emergently and generally from probabilities around win placements as
> opposed to what appears to be a focus on one type of local optima?
>
>
> On Mon, Apr 17, 2017 at 5:07 AM, lemonsqueeze <lemonsque...@free.fr>
> wrote:
>
>> Hi,
>>
>> I'm sure the topic must have come up before but i can't seem to find it
>> right now, i'd appreciate if someone can point me in the right direction.
>>
>> I'm looking into MM, LFR and similar cpu-based pattern approaches for
>> generating priors, and was wondering about basic bad shape:
>>
>> Let's say we use 6d games for training. The system becomes pretty good at
>> predicting 6d moves by learning patterns associated with the kind of moves
>> 6d players make.
>>
>> However it seems it doesn't learn to punish basic mistakes effectively
>> (say double ataris, shapes with obvious defects ...) because they almost
>> never show up in 6d games =) They show up in the search tree though and
>> without good answer search might take a long time to realize these moves
>> don't work.
>>
>> Maybe I missed some later paper / development but basically,
>> Wouldn't it make sense to also train on good answers to bad moves ?
>> (maybe harvesting them from the search tree or something like that)
>>
>> I'm thinking about basic situations like this which patterns should be
>> able to recognize:
>>
>>      A B C D E F G H J K L M N O P Q R S T
>>    +---------------------------------------+
>> 19 | . . . . . . . . . . . . . . . . . . . |
>> 18 | . . . O O . O . . . . . . . . . . . . |
>> 17 | . . X . X O . X O . . . . . . . . . . |
>> 16 | . . X . X O . O O . . . . . . X . . . |
>> 15 | . . . . . X X X X . . . . . . . . . . |
>> 14 | . . X . . . . . . . . . . . . . . . . |
>> 13 | . O . . . . . . . . . . . . X . O . . |
>> 12 | . . O . . . . . . . . . . . . . O . . |
>> 11 | . . O X . . . . . . . . . . . X . . . |
>> 10 | . . O X . . . . . . . . . . . X O . . |
>>  9 | . . O X . . . X . . . X O . . X O . . |
>>  8 | . . O X . . . O X X X O X)X X O O . . |
>>  7 | . O X . . . . . O O X O . X O O . . . |
>>  6 | O X X . X X X O . . O . . . X O X . . |
>>  5 | . O O . . O X . . . . . O . . . . . . |
>>  4 | . X O O O O X . . . O . . O . O . . . |
>>  3 | . X X X X O . . X . X X . . . . . . . |
>>  2 | . . . . X O . . . . . . O . . . . . . |
>>  1 | . . . . . . . . . . . . . . . . . . . |
>>    +---------------------------------------+
>>
>> Patterns probably never see this during training and miss W L9,
>> For example :
>>
>> In Remi's CrazyPatterns.exe L9 comes in 4th position:
>>   [ M10 N10 O6 L9 ...
>>
>> With Pachi's large patterns it's 8th:
>>   [ G8 M10 G9 O17 N10 O6 J4 L9 ...
>>
>> Cheers,
>> Matt
>>
>> ----
>>
>> MM: Computing Elo Ratings of Move Patterns in the Game of Go
>>   https://www.remi-coulom.fr/Amsterdam2007/
>>
>> LFR: Move Prediction in Go – Modelling Feature Interactions Using
>> Latent Factors
>>   https://www.ismll.uni-hildesheim.de/pub/pdfs/wistuba_et_al_KI_2013.pdf
>>
>>
>> _______________________________________________
>> Computer-go mailing list
>> Computer-go@computer-go.org
>> http://computer-go.org/mailman/listinfo/computer-go
>>