Hello all,
We just presented our paper describing MoGo's improvements at ICML,
and we thought we would pass on some of the feedback and corrections
we have received.
(http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf)
I have the feeling that the paper is important, but it is completely
obfuscated by the strange reinforcement learning notation and jargon. Can
anyone explain it in Go-programming terms?
Is the RLGO evaluation function used for evaluation, or just for selecting
the best move (by doing a 1-ply search)?
Can anyone explain to me why it is necessary to obfuscate things at all? Why
is a move an action and not just a move, a game an episode and not a game?
Is it less scientific if coders like myself can understand it?
Donald Knuth pointed out in his paper on Alpha-Beta that this simple
algorithm was not understood for a long time because of the inappropriate
mathematical notation. For recursive functions, (pseudo-)code is much
better suited than mathematical notation. Actually, it is
pseudo-mathematical notation.
Why is this inappropriate notation still used?
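
To illustrate the point about code versus notation, here is a minimal
sketch of Alpha-Beta in negamax form. It is not taken from Knuth's paper.
The tree encoding (nested lists, with numeric leaves scored from the
viewpoint of the side to move at that leaf) and the function name are
assumptions made only for this example.

```python
import math

def alphabeta(node, alpha=-math.inf, beta=math.inf):
    """Negamax Alpha-Beta over a tree of nested lists (hypothetical encoding).

    A leaf is a number: the score for the side to move at that leaf.
    An inner node is a list of child nodes.
    """
    if isinstance(node, (int, float)):
        return node
    best = -math.inf
    for child in node:
        # The child's score is from the opponent's viewpoint, so negate it
        # and swap/negate the search window.
        score = -alphabeta(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: the opponent will never allow this line
    return best

# Two moves for us, two replies each: max(min(3, 5), min(2, 9))
print(alphabeta([[3, 5], [2, 9]]))  # → 3
```

In the second subtree the leaf 9 is never visited: after seeing the 2, the
search already knows this move is worse than the first one.
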
Just for fun, I have built a simple backgammon engine. I think it does what
the paper proposes for the Monte-Carlo part: it uses a simple evaluation
function to select the next move in the rollout, aka Monte-Carlo simulation.
The engine does not build up a UCT tree. It uses UCT only at the root. The
rollout always starts at the first ply.
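
A minimal sketch of this scheme, on a toy dice race instead of real
backgammon (one die, no hitting, no blocking). This is not the actual
engine. The game, the pip-count evaluation, and all names and parameters
(TRACK, evaluate, choose_root_move, c, ...) are assumptions made up for
illustration. The rollout policy is a deterministic 1-ply search with the
evaluation function; the die is the only source of randomness. UCB1 is
applied at the root only, and no tree is built.

```python
import math
import random

TRACK = 12  # hypothetical track length; a checker at TRACK is home

# A state is (checkers of player 0, checkers of player 1, player to move).

def legal_moves(state):
    """Indices of the mover's checkers that are not yet home."""
    return [i for i, pos in enumerate(state[state[2]]) if pos < TRACK]

def play(state, move, die):
    """Advance one checker of the mover by the die and pass the turn."""
    mover = state[2]
    side = list(state[mover])
    side[move] = min(TRACK, side[move] + die)
    sides = [state[0], state[1]]
    sides[mover] = tuple(side)
    return (sides[0], sides[1], 1 - mover)

def winner(state):
    """The player with all checkers home, or None if the game is not over."""
    for p in (0, 1):
        if all(pos == TRACK for pos in state[p]):
            return p
    return None

def evaluate(state, player):
    """Toy evaluation: fewer pips left to run is better."""
    return -sum(TRACK - pos for pos in state[player])

def greedy_move(state, die):
    """Deterministic 1-ply search: the move with the best evaluation."""
    mover = state[2]
    return max(legal_moves(state),
               key=lambda m: evaluate(play(state, m, die), mover))

def rollout(state, rng):
    """Play the game out with the 1-ply policy; only the die is random."""
    while winner(state) is None:
        die = rng.randint(1, 6)
        state = play(state, greedy_move(state, die), die)
    return winner(state)

def choose_root_move(state, die, simulations=500, c=1.4, seed=0):
    """UCB1 over the root moves only; each simulation is one rollout."""
    rng = random.Random(seed)
    mover = state[2]
    moves = legal_moves(state)
    visits = {m: 0 for m in moves}
    wins = {m: 0 for m in moves}
    for n in range(1, simulations + 1):
        def ucb(m):
            if visits[m] == 0:
                return math.inf  # try every root move at least once
            mean = wins[m] / visits[m]
            return mean + c * math.sqrt(math.log(n) / visits[m])
        m = max(moves, key=ucb)
        visits[m] += 1
        # The rollout starts right after the root move (the first ply).
        if rollout(play(state, m, die), rng) == mover:
            wins[m] += 1
    return max(moves, key=lambda m: visits[m])  # most-visited root move

start = ((0, 0), (0, 0), 0)
print(choose_root_move(start, die=3))
```

Replacing the dictionaries at the root by a tree of such statistics would
turn this into full UCT.
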
On its own, the 1-ply engine does not have the slightest chance against a
sophisticated backgammon program. But the simple-minded UCT version is
already a serious opponent.
By building up a UCT tree one could probably reach top backgammon level (but
the effort would not pay off; the backgammon market is saturated).
For a given position and dice roll, the simple engine behaves
deterministically. But the roll of the dice generates sufficient randomness.
Chrilly
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/