On Tue, Nov 3, 2009 at 6:43 AM, Willemien <wilem...@googlemail.com> wrote:
> I disagree with the point that MCTS is a neural network.
>
> In my opinion (and I may be completely off target) one of the essences
> of neural networks is that the program changes/learns from the games
> it has played.

I think that you are right; the learning associated with artificial neural nets
is likely to be from games the program has played, or from games it has
observed.  In some literature, this is called "supervised learning", or
"learning with a teacher", wherein the training data is not at all random,
but comprises loads of very specific examples of actual behavior on the
board.  (Of course, deciding what features of those actual behaviors are
to be measured, or encoded into a pattern, is a very tricky problem.)
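To make the "learning with a teacher" idea concrete, here is a minimal sketch of supervised training: a simple perceptron nudged toward the teacher's answers over labeled examples. The feature vectors and labels are invented for illustration; a real go program would extract pattern features from recorded games.

```python
# Hypothetical sketch of supervised learning: the training data is
# explicit (features, teacher-label) pairs, not random trials.

def train_perceptron(examples, epochs=20, lr=0.1):
    """Learn weights from labeled (features, label) examples."""
    n = len(examples[0][0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        for x, label in examples:
            # Predict 1 if the weighted sum is positive, else 0.
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = label - pred
            # Adjust the weights toward the teacher's answer.
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Toy training set: each row is (made-up board features, teacher label).
examples = [
    ([1, 0, 1], 1),
    ([0, 1, 0], 0),
    ([1, 1, 1], 1),
    ([0, 0, 0], 0),
]
w, b = train_perceptron(examples)
```

The point is only that every training step sees a specific, curated input/output pair — exactly what a reinforcement learner never gets.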

> MCTS doesn't have that result; the improvement is only "in-game".
> The program doesn't learn not to make the same mistake again; with
> MCTS the mistake is hopefully avoided.

MCTS seems to be (I'm sure someone will correct me if I'm wrong)
"reinforcement learning" which differs from "supervised learning"
in that "correct" input/output pairs are never explicitly shown to the
classifier.

Just tons of nearly-random trials, yes?

According to http://en.wikipedia.org/wiki/Reinforcement_learning, "there
is a focus on on-line performance, which involves finding a balance between
exploration (of uncharted territory) and exploitation (of current knowledge).
The exploration vs. exploitation trade-off in reinforcement learning has been
mostly studied through the multi-armed bandit problem."
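The bandit trade-off the quote mentions is easy to show in miniature. Below is a sketch of the UCB1 rule (the same rule UCT applies at each node of the game tree): pull the arm with the best observed win rate plus an exploration bonus that grows for under-tried arms. The arm payoffs are invented for the example.

```python
import math
import random

def ucb1_pick(wins, visits, total):
    """Pick the arm maximizing win rate + exploration bonus (UCB1)."""
    best, best_score = 0, float("-inf")
    for arm in range(len(visits)):
        if visits[arm] == 0:
            return arm  # try every arm once before comparing scores
        score = (wins[arm] / visits[arm]
                 + math.sqrt(2 * math.log(total) / visits[arm]))
        if score > best_score:
            best, best_score = arm, score
    return best

random.seed(1)
payoff = [0.2, 0.5, 0.8]   # hidden win probability of each arm (made up)
wins = [0, 0, 0]
visits = [0, 0, 0]
for t in range(1, 2001):
    arm = ucb1_pick(wins, visits, t)
    visits[arm] += 1
    wins[arm] += random.random() < payoff[arm]

# After 2000 trials, the best arm (index 2) has attracted most of the
# pulls, yet the weaker arms were still sampled occasionally.
```

No "correct" arm is ever announced to the learner; it finds the best one purely from its own reward statistics.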

So, the two methods are both "classifiers" of "data points", in some sense.

The fundamental difference between them, IMHO, for the purposes of
go-programming, is that artificial neural nets may learn from data that is
harvested (and preserved, at least for the duration of the training), while
Monte-Carlo methods learn from (mostly random) unsupervised trial-and-error.

Further, the tuning of the weights in an artificial neural net may be performed
off-line, with data from thousands of games, while MCTS performs its work
on-line, with a branch of the game-tree that begins only from the
current position.
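That "on-line, from the current position" character can be sketched with flat Monte-Carlo move selection for a toy subtraction game (take 1-3 sticks; whoever takes the last stick wins) — a deliberately simplified stand-in for the tree search, invented for illustration. Note that nothing is stored between positions: the statistics are rebuilt from random trials every time a move is needed.

```python
import random

def random_playout(sticks):
    """Play the rest of the game uniformly at random.
    Returns True if the player about to move from `sticks` wins."""
    to_move_wins = True
    while sticks > 0:
        sticks -= random.randint(1, min(3, sticks))
        if sticks == 0:
            return to_move_wins  # this player took the last stick
        to_move_wins = not to_move_wins
    return False  # unreachable for sticks > 0

def choose_move(sticks, trials=3000):
    """Estimate each legal move by random playouts; keep the best."""
    best_move, best_rate = None, -1.0
    for take in range(1, min(3, sticks) + 1):
        wins = sum(
            # After our move the opponent moves; we win when they lose.
            not random_playout(sticks - take) if sticks - take > 0 else True
            for _ in range(trials)
        )
        if wins / trials > best_rate:
            best_move, best_rate = take, wins / trials
    return best_move

random.seed(0)
# From 5 sticks, taking 1 (leaving a multiple of 4) is the winning move,
# and it also scores best under random playouts.
best = choose_move(5)
```

Throw the position away and all that "knowledge" is gone — the opposite of a trained net's preserved weights.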

Are both methods "classifiers"?  Sure.  But neural net training is done through
observation/measurement/feature-extraction (offline, from thousands of games).
That's a much different critter, at the core, from the environment/action/reward
scenario of MCTS (online, from the current position).

-- 
All computer programs are identical.  After all, it's just ones and zeroes,
and a paper tape of arbitrary length.
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
