The articles I've read so far about AlphaGo mention both MCTS and RL/Q-learning. MCTS (and certainly UCT) keeps win statistics at each node and propagates them back up the tree, which in and of itself would seem to constitute a form of RL. So how does it make sense to have both? It seems redundant to me. Any thoughts on that?
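To make concrete what "keeps statistics on wins and propagates that information up the tree" refers to, here is a minimal sketch. It assumes plain UCT with the UCB1 selection rule and is not a description of AlphaGo's actual search. The names `Node`, `ucb1`, and `backpropagate` are hypothetical, chosen only for this illustration.

```python
import math


class Node:
    """One position in the search tree (hypothetical minimal structure)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.visits = 0    # number of playouts that passed through this node
        self.wins = 0.0    # wins for the player who moved INTO this node


def ucb1(child, c=math.sqrt(2)):
    """UCB1 score used to pick which child to descend into."""
    if child.visits == 0:
        return math.inf  # always try unvisited children first
    exploit = child.wins / child.visits
    explore = c * math.sqrt(math.log(child.parent.visits) / child.visits)
    return exploit + explore


def backpropagate(leaf, result):
    """Push one playout result from the leaf back up to the root.

    `result` is 1.0 for a win and 0.0 for a loss, seen from the player
    who made the move leading to `leaf`. It is flipped at every level
    because the players alternate.
    """
    node = leaf
    while node is not None:
        node.visits += 1
        node.wins += result
        result = 1.0 - result
        node = node.parent


# Usage: one winning playout through a single child of the root.
root = Node()
child = Node(parent=root)
root.children.append(child)
backpropagate(child, 1.0)
```

In this sketch each node's counts describe only that one position in the current tree.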
- [Computer-go] AlphaGo MCTS & Reinforcement Learning? Greg Schmidt