On Tue, Nov 3, 2009 at 6:43 AM, Willemien <wilem...@googlemail.com> wrote:
> I disagree with the point that MCTS is a neural network.
>
> In my opinion (and I may be completely off target) one of the essences
> of neural networks is that the program changes/learns from the games
> it has played.
I think you are right; the learning associated with artificial neural nets typically comes from games the program has played, or from games it has observed. In some literature this is called "supervised learning", or "learning with a teacher": the training data is not at all random, but comprises loads of very specific examples of actual behavior on the board. (Of course, deciding which features of those actual behaviors are to be measured, or encoded into a pattern, is a very tricky problem.)

> MCTS doesn't have that result, the improvement is only "in-game".
> The program doesn't learn not to make the same mistake any more;
> with MCTS the mistake is (hopefully) avoided.

MCTS seems to be (I'm sure someone will correct me if I'm wrong) "reinforcement learning", which differs from "supervised learning" in that "correct" input/output pairs are never explicitly shown to the classifier. Just tons of nearly-random trials, yes? According to http://en.wikipedia.org/wiki/Reinforcement_learning , "there is a focus on on-line performance, which involves finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge). The exploration vs. exploitation trade-off in reinforcement learning has been mostly studied through the multi-armed bandit problem."

So the two methods are both "classifiers" of "data points", in some sense. The fundamental difference between them, IMHO, for the purposes of go programming, is that artificial neural nets may learn from data that is harvested (and preserved, at least for the duration of the training), while Monte-Carlo methods learn from (mostly random) unsupervised trial and error. Further, the tuning of the weights in an artificial neural net may be performed off-line, with data from thousands of games, while MCTS performs its work on-line, with a branch of the game tree that begins only from the current position.

Are both methods "classifiers"? Sure.
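As an aside, the exploration/exploitation balance quoted above can be sketched in a few lines. Here is a minimal UCB1 bandit simulation (UCB1 being the selection rule that UCT-style MCTS applies at each tree node); the arm payout probabilities are made up for illustration, standing in for the chance that a playout from a candidate move ends in a win:

```python
import math
import random

def ucb1(total_plays, arm_plays, arm_wins, c=math.sqrt(2)):
    """UCB1 score: average reward (exploitation) plus a bonus that
    grows for under-sampled arms (exploration)."""
    if arm_plays == 0:
        return float('inf')  # force every arm to be tried at least once
    return arm_wins / arm_plays + c * math.sqrt(math.log(total_plays) / arm_plays)

def run_bandit(payouts, trials=10000, seed=42):
    """Repeatedly pull the arm with the best UCB1 score; return play counts."""
    rng = random.Random(seed)
    plays = [0] * len(payouts)
    wins = [0] * len(payouts)
    for t in range(1, trials + 1):
        arm = max(range(len(payouts)), key=lambda a: ucb1(t, plays[a], wins[a]))
        plays[arm] += 1
        if rng.random() < payouts[arm]:  # simulated (hypothetical) reward
            wins[arm] += 1
    return plays

# Hypothetical win probabilities for three "moves".
plays = run_bandit([0.3, 0.5, 0.7])
# The best arm (0.7) should end up with the bulk of the trials, while the
# weaker arms still get sampled occasionally -- the trade-off in a nutshell.
```

Note that nothing survives between runs: the statistics are rebuilt from scratch each time, which is exactly the "in-game only" improvement Willemien describes.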
But neural net training is done through observation/measurement/feature-extraction (off-line, from thousands of games). That's a much different critter, at the core, from the environment/action/reward scenario of MCTS (on-line, from the current position).

--
All computer programs are identical. After all, it's just ones and zeroes, and a paper tape of arbitrary length.

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/