Hi,

I strongly believe that adding rollouts makes Zero stronger. They removed rollouts just to be able to say "no human knowledge".
#Though the number of past moves (16) was tuned by humans :).

Hideki

Petr Baudis: <20171116154309.tfq5ix2hzwzci...@machine.or.cz>:
> Hi,
>
> when explaining AlphaGo Zero to a machine learning audience yesterday
> (https://docs.google.com/presentation/d/1VIueYgFciGr9pxiGmoQyUQ088Ca4ouvEFDPoWpRO4oQ/view)
> it occurred to me that using MCTS in this setup is actually such
> a kludge!
>
> Originally, we used MCTS because with the repeated simulations,
> we would be improving the accuracy of the arm reward estimates. MCTS
> policies assume stationary distributions, which is violated every time
> we expand the tree, but it's an okay tradeoff if all you feed into the
> tree are rewards in the form of just Bernoulli trials. Moreover, you
> could argue evaluations are somewhat monotonic with increasing node
> depths as you are basically just fixing a growing prefix of the MC
> simulation.
>
> But now, we expand the nodes literally all the time, breaking the
> stationarity possibly in drastic ways. There are no reevaluations that
> would improve your estimate. The input isn't binary but an estimate in
> a continuous space. Suddenly the Multi-armed Bandit analogy loses a lot
> of ground.
>
> Therefore, can't we take the next step, and do away with MCTS? Is
> there a theoretical viewpoint from which it still makes sense as the best
> policy improvement operator?
>
> What would you say is the current state-of-art game tree search for
> chess? That's a very unfamiliar world for me, to be honest all I really
> know is MCTS...
>
> --
> Petr Baudis, Rossum
> Run before you walk! Fly before you crawl! Keep moving forward!
> If we fail, I'd rather fail really hugely. -- Moist von Lipwig

--
Hideki Kato <mailto:hideki_ka...@ybb.ne.jp>
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go