Hi! Here comes a lengthy post about some thoughts on the "wall" problem.
I hope my approach to these problems is the correct one, because if
it is, then maybe my stubborn efforts in coding playout patterns by
hand might pay off in the future. Note that the ideas here are more my
hopes than my knowledge about the subject matter.
The playouts of Valkyria are really heavy and fix bias problems simply
by playing better in the playouts.
But it is hard work. Basically for each position where it plays
tactically poorly there are some tactical improvements that can be
done. And as soon as the bad moves are pruned or forced moves are
played correctly it plays much, much better in that position.
Unfortunately, these improvements only have an impact on a small number
of special positions. So there is a lot of hard work.
Anyway, Valkyria slowly gets stronger, although it also gets slower and
slower. On the current hardware (P4 3 GHz) Valkyria3.3.7 does only 1000
playouts per second on 9x9, but lately it played really well against
the Mogo on CGOS9x9, although I guess that is an old version. I also
noticed that the rating of this Mogo has been going up the last day.
Has it been running on weaker hardware recently? If so, it would be
nice to know.
But for the sake of discussion I can add that for some positions it is
true that search is extremely weak at solving the bias, and the best
example is seki. If the playouts do not prune the suicidal moves in
the seki properly, the search will seek out ways to reduce the
probability of playing the suicidal moves before the opponent does.
This behavior can look really funny, because in the search both colors
try to avoid playing out the seki as long as possible.
Maybe the future strong Go programs require super-heavy playouts that
actually incrementally update the tactical strength of stones during
playouts and play correct tactical attacks and defenses when the stones
are threatened.
Or maybe moderately heavy playouts are sufficient, if one can exactly
identify what must be done. For example one such "must" thing in my
opinion is to prune violations of seki. Also in this category is
probably playing correctly in semeai, but semeai is such a huge and
complicated thing that one does not need to play perfectly in all
situations in the playouts. But I believe correct play with 3
liberties or less is really important. At least that is my goal with
Valkyria.
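As a rough illustration of the "3 liberties or less" criterion, here is
a minimal Python sketch that flood-fills a group and counts its
liberties. The board encoding ("." empty, "X" and "O" stones) and the
names group_and_liberties and needs_tactical_care are assumptions made
for this example only; this is not Valkyria's code, and a real playout
engine would track liberties incrementally rather than recount them.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]


def group_and_liberties(board: List[str], start: Point) -> Tuple[Set[Point], Set[Point]]:
    """Flood-fill the group containing `start`; return (stones, liberties)."""
    rows, cols = len(board), len(board[0])
    color = board[start[0]][start[1]]
    if color == ".":
        raise ValueError("start must be a stone")
    stones: Set[Point] = {start}
    liberties: Set[Point] = set()
    stack = [start]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue  # off the board
            if board[nr][nc] == ".":
                liberties.add((nr, nc))
            elif board[nr][nc] == color and (nr, nc) not in stones:
                stones.add((nr, nc))
                stack.append((nr, nc))
    return stones, liberties


def needs_tactical_care(board: List[str], start: Point, threshold: int = 3) -> bool:
    """True if the group has `threshold` liberties or fewer (hypothetical rule)."""
    _, liberties = group_and_liberties(board, start)
    return len(liberties) <= threshold


# Small hypothetical position: the X group on the top edge is short of liberties.
BOARD = [
    ".XO..",
    ".XO..",
    ".O...",
    "...X.",
    ".....",
]
```

Calling needs_tactical_care(BOARD, (0, 1)) flags the two-stone X group
on the edge, while the lone X stone at (3, 3) is left to the ordinary
playout policy.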
Valkyria uses AMAF and it works well most of the time, but sometimes
a prior bias towards strong tactical moves has to be applied in the tree
search part. Valkyria is a little odd because the tree bias comes from
the same code as is used in the playouts, so often I add code for
tactical patterns to help the tree search rather than the playouts.
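One common way to express such a prior bias in a tree node is to seed
the move statistics with "virtual" playouts. The Python sketch below
shows only that idea; whether Valkyria does it this way is not stated
here, AMAF is left out, and the names MoveStats, with_prior and update
as well as the numbers are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class MoveStats:
    wins: float = 0.0
    visits: float = 0.0

    def value(self) -> float:
        # Unvisited moves get a neutral 0.5 (an arbitrary choice for this sketch).
        return self.wins / self.visits if self.visits else 0.5


def with_prior(prior_winrate: float, prior_weight: float) -> MoveStats:
    """Seed a move with `prior_weight` virtual playouts at `prior_winrate`."""
    return MoveStats(wins=prior_winrate * prior_weight, visits=prior_weight)


def update(stats: MoveStats, won: bool) -> None:
    """Record the result of one real playout through this move."""
    stats.visits += 1
    stats.wins += 1.0 if won else 0.0


# A move matched by a (hypothetical) tactical pattern starts out favored...
tactical = with_prior(prior_winrate=0.8, prior_weight=10)
# ...but real playout results gradually outweigh the prior.
for _ in range(10):
    update(tactical, won=False)
```

The prior weight controls how many real playouts it takes before the
search can overrule the tactical pattern.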
A final comment. I am not sure if Valkyria will become stronger as a
function of the work I put into it. But I am also sure that a program
using Valkyria's approach could be much more efficient and free from
bugs and thus much stronger.
Best
Magnus
Quoting Olivier Teytaud <olivier.teyt...@lri.fr>:
But, while that may be the case, perhaps we can say that they are
hitting a wall in their observable playing strength against non-MCTS
players (such as humans) at higher levels. In [2] I touched upon how the
nature of the game changes at higher levels, and how scaling results
obtained between weaker players may not apply at those higher levels. I
was talking about pure random playouts in that article, but the
systematic bias Olivier mentions can lead to the same problems as no
bias at all...
I completely agree with this.
It's like a Monte-Carlo evaluation (in a non-MCTS framework).
Parallelization reduces the variance, but not the bias. You still get
improvements, and you can believe that parallelization brings a lot,
but in fact it does not.
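The variance-versus-bias point can be illustrated with a toy
Monte-Carlo estimate. In the Python sketch below the "true" win
probability is 0.5 but the playout policy wins 60% of the time; both
numbers, and the names estimate and spread_and_error, are made up for
this example. More playouts shrink the spread of the estimate, while
its distance from the true value stays near 0.1.

```python
import random
import statistics
from typing import Tuple


def estimate(n_playouts: int, biased_p: float, rng: random.Random) -> float:
    """Average result of n_playouts biased coin flips (stand-in for playouts)."""
    return sum(rng.random() < biased_p for _ in range(n_playouts)) / n_playouts


def spread_and_error(n_playouts: int, true_p: float = 0.5, biased_p: float = 0.6,
                     repeats: int = 100, seed: int = 0) -> Tuple[float, float]:
    """Return (std. deviation of the estimates, |mean estimate - true value|)."""
    rng = random.Random(seed)
    estimates = [estimate(n_playouts, biased_p, rng) for _ in range(repeats)]
    return statistics.pstdev(estimates), abs(statistics.mean(estimates) - true_p)


few_spread, few_error = spread_and_error(n_playouts=100)
many_spread, many_error = spread_and_error(n_playouts=6400)
print(f"100 playouts:  spread={few_spread:.4f}  error={few_error:.4f}")
print(f"6400 playouts: spread={many_spread:.4f}  error={many_error:.4f}")
```

Running 64 times as many playouts makes the estimate far more stable,
but it is a stable estimate of the wrong value.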
We spent a lot of energy trying to remove bias by various statistical
tricks (combining MCTS with tactical search or with reweighting
according to macroscopic information on simulations such as the size
of captures) but nothing works yet in the case of Go (for some other
games we have positive results, but as I'm not the main author I won't
give too much information on this - I hope we'll find a similar
solution for Go, but for the moment, in spite of many trials, it does
not work, and I'm not far from being tired of trying plenty of
different implementations of these ideas :-) ).
Best regards,
Olivier
--
Magnus Persson
Berlin, Germany