Hi! Here comes a lengthy post about some thoughts on the "wall" problem.

I hope my approach to these problems is the correct one, because if it is, then maybe my stubborn efforts in coding playout patterns by hand might pay off in the future. Note that the ideas here are more my hopes than my knowledge of the subject matter.

The playouts of Valkyria are really heavy and fix bias problems simply by playing better in the playouts.

But it is hard work. Basically, for each position where it plays tactically poorly, there are some tactical improvements that can be made. And as soon as the bad moves are pruned or forced moves are played correctly, it plays much, much better in that position. Unfortunately, these improvements only have an impact on a small number of special positions. So there is a lot of hard work.

Anyway, Valkyria slowly gets stronger, although it also gets slower and slower. On the current hardware (P4 3 GHz), Valkyria3.3.7 does only 1000 playouts per second on 9x9, but it has lately played really well against Mogo on CGOS 9x9, although I guess that is an old version. I also noticed that the rating of this Mogo has been going up over the last day. Has it been running on weaker hardware recently? If so, it would be nice to know.

But for the sake of discussion I can add that for some positions it is true that search is extremely weak at fixing the bias, and the best example is seki. If the playouts do not properly prune the suicidal moves in the seki, the search will seek out ways to reduce the probability of playing the suicidal moves before the opponent does. This behavior can look really funny, because in the search both colors try to avoid playing out the seki for as long as possible.
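To make the seki point concrete, here is a hypothetical Python sketch of the kind of pruning a playout policy would need. The `fills_seki_liberty` flag is invented for illustration; in a real engine it would come from actual board analysis, which is the hard part.

```python
import random

def choose_playout_move(candidates, rng):
    """Pick a random playout move, never filling a liberty in a settled seki.

    Each candidate is a dict with a "point" and an optional precomputed
    "fills_seki_liberty" flag (a stand-in for real seki detection).
    """
    legal = [m for m in candidates if not m.get("fills_seki_liberty", False)]
    if not legal:  # only seki-violating moves remain: pass instead of dying
        return "pass"
    return rng.choice(legal)["point"]

rng = random.Random(1)
candidates = [
    {"point": "C3", "fills_seki_liberty": True},  # suicidal inside the seki
    {"point": "E5"},
    {"point": "G2"},
]
move = choose_playout_move(candidates, rng)  # never "C3"
```

Without the pruning, a uniform policy would eventually fill its own seki liberty, and the search inherits exactly the funny avoidance behavior described above.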

Maybe future strong Go programs will require super-heavy playouts that incrementally update the tactical strength of stones during the playouts and play correct tactical attacks and defenses when the stones are threatened.

Or maybe moderately heavy playouts are sufficient, if one can identify exactly what must be done. For example, one such "must" in my opinion is to prune violations of seki. Playing correctly in semeai probably also belongs in this category, but semeai is such a huge and complicated thing that one does not need to play perfectly in all situations in the playouts. But I believe correct play with 3 liberties or fewer is really important. At least that is my goal with Valkyria.
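As an illustration of why races with few liberties are tractable, here is a minimal sketch (my own simplification, not Valkyria's code) of the textbook rule for a plain capturing race with no shared liberties and no ko: the side to move wins if it has at least as many liberties as the opponent.

```python
def semeai_winner(my_libs, opp_libs, my_turn=True):
    """Rough outcome of a plain capturing race (no shared liberties, no ko).

    The side to move needs only as many liberties as the opponent,
    because it fills one opposing liberty per turn and moves first.
    """
    if my_turn:
        return "me" if my_libs >= opp_libs else "opponent"
    return "me" if my_libs > opp_libs else "opponent"

print(semeai_winner(3, 3))  # to move with equal liberties
```

A playout policy that at least gets these small, clean races right avoids handing the search a systematically wrong evaluation of every nearby position.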

Valkyria uses AMAF, and it works well most of the time, but sometimes a prior bias towards strong tactical moves has to be added in the tree search part. Valkyria is a little odd because the tree bias comes from the same code as is used in the playouts, so I often add code for tactical patterns to help the tree search rather than the playouts.
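A hedged sketch of what AMAF statistics seeded with a tactical prior might look like. The class, the fixed `beta`, and the prior weights are all invented for illustration; a real implementation (e.g. RAVE) shrinks the AMAF weight as direct visits accumulate.

```python
class MoveStats:
    def __init__(self, prior_wins=0.0, prior_visits=0.0):
        # Tactical prior: virtual wins/visits seeded before any playouts,
        # biasing the tree search toward moves the pattern code likes.
        self.wins = prior_wins
        self.visits = prior_visits
        self.amaf_wins = prior_wins
        self.amaf_visits = prior_visits

    def update(self, won):
        # Direct statistics: the move was actually chosen at this node.
        self.wins += won
        self.visits += 1

    def update_amaf(self, won):
        # AMAF: credit the move whenever it appears anywhere in a playout.
        self.amaf_wins += won
        self.amaf_visits += 1

    def value(self, beta=0.5):
        # Blend direct and AMAF estimates; beta is fixed here for simplicity.
        direct = self.wins / self.visits if self.visits else 0.5
        amaf = self.amaf_wins / self.amaf_visits if self.amaf_visits else 0.5
        return (1 - beta) * direct + beta * amaf

s = MoveStats(prior_wins=3, prior_visits=4)  # pattern code likes this move
s.update(1)
s.update_amaf(1)
```

The appeal of sharing code between playouts and priors is that one tactical improvement pays off twice: better simulations and better-initialized tree nodes.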

A final comment. I am not sure whether Valkyria will become stronger as a function of the work I put into it. But I am sure that a program using Valkyria's approach could be much more efficient and free from bugs, and thus much stronger.

Best
Magnus

Quoting Olivier Teytaud <olivier.teyt...@lri.fr>:


But, while that may be the case, perhaps we can say that they are
hitting a wall in their observable playing strength against non-MCTS
players (such as humans) at higher levels. In [2] I touched upon how the
nature of the game changes at higher levels, and how scaling results
obtained between weaker players may not apply at those higher levels. I
was talking about pure random playouts in that article, but the
systematic bias Olivier mentions can lead to the same problems as no
bias at all...


I completely agree with this.
It's like a Monte-Carlo evaluation (in a non-MCTS framework).
Parallelization reduces the variance, but not the bias. You still get
improvements, and you can believe that parallelization brings a lot, but in
fact it does not.
We spent a lot of energy trying to remove the bias by various statistical
tricks (combining MCTS with tactical search, or with reweighting according to
macroscopic information about simulations, such as the size of captures), but
nothing works yet in the case of Go (for some other games we have positive
results, but as I'm not the main author I won't give too much information on
this - I hope we'll find a similar solution for Go, but for the moment, in
spite of many trials, it does not work, and I'm not far from being tired of
trying plenty of different implementations of these ideas :-) ).
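[The variance-versus-bias point can be illustrated with a toy Monte Carlo experiment; this is my own example, not from Olivier's work. More simulations tighten the estimate, but around the wrong value.]

```python
import random

def biased_playout(rng):
    # The true win rate is 0.5, but the "playout policy" is biased by +0.1.
    return rng.random() < 0.6

def estimate(n, rng):
    # Averaging n playouts, as running more parallel simulations would.
    return sum(biased_playout(rng) for _ in range(n)) / n

rng = random.Random(42)
small = estimate(100, rng)       # noisy estimate
large = estimate(100_000, rng)   # far less noisy, but still centered on 0.6
```

Here `large` has a much smaller standard error than `small`, yet both converge to the biased value 0.6 rather than the true 0.5, which is exactly why adding hardware cannot substitute for fixing the playout policy.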

Best regards,
Olivier

--
Magnus Persson
Berlin, Germany
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/