As far as I know, the state of the art in chess is some flavor of alpha-beta (if I read the Stockfish source correctly), so basically they prove that their current estimation is the best one up to a certain depth.
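For anyone who, like Petr, has only ever lived in MCTS land, here is a minimal negamax sketch of what "proving the estimation up to a certain depth" means. It is a toy illustration, not Stockfish (which adds iterative deepening, transposition tables, null-move pruning, late-move reductions and much more), and the position API it assumes (legal_moves/make/unmake/evaluate) is just a placeholder:

    # Toy negamax with alpha-beta pruning -- a sketch, not a real engine.
    # The position object is assumed to expose legal_moves(), make(move),
    # unmake(move), is_terminal() and a static evaluate() returning a score
    # from the side to move's point of view; these names are placeholders.

    INF = 10**9

    def alphabeta(position, depth, alpha=-INF, beta=INF):
        """Return the minimax value of `position` searched to `depth` plies.
        Pruning never changes the value proved at the root; it only skips
        branches that cannot affect it."""
        if depth == 0 or position.is_terminal():
            return position.evaluate()
        best = -INF
        for move in position.legal_moves():
            position.make(move)
            score = -alphabeta(position, depth - 1, -beta, -alpha)
            position.unmake(move)
            best = max(best, score)
            alpha = max(alpha, score)
            if alpha >= beta:   # the opponent already has a better option: prune
                break
        return best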
MCTS has the benefit of allowing variable-depth search depending on how good the evaluation is. I believe that, given the way AlphaGo and AlphaGo Zero work, it is far superior to alpha-beta. (A rough sketch of the selection loop I have in mind is below the quoted mail.)

On 16/11/2017 at 16:43, Petr Baudis wrote:
> Hi,
>
> when explaining AlphaGo Zero to a machine learning audience yesterday
> (https://docs.google.com/presentation/d/1VIueYgFciGr9pxiGmoQyUQ088Ca4ouvEFDPoWpRO4oQ/view)
> it occurred to me that using MCTS in this setup is actually such
> a kludge!
>
> Originally, we used MCTS because with the repeated simulations,
> we would be improving the accuracy of the arm reward estimates. MCTS
> policies assume stationary distributions, which is violated every time
> we expand the tree, but it's an okay tradeoff if all you feed into the
> tree are rewards in the form of just Bernoulli trials. Moreover, you
> could argue evaluations are somewhat monotonic with increasing node
> depths as you are basically just fixing a growing prefix of the MC
> simulation.
>
> But now, we expand the nodes literally all the time, breaking the
> stationarity possibly in drastic ways. There are no reevaluations that
> would improve your estimate. The input isn't binary but an estimate in
> a continuous space. Suddenly the Multi-armed Bandit analogy loses a lot
> of ground.
>
> Therefore, can't we take the next step, and do away with MCTS? Is
> there a theoretical viewpoint from which it still makes sense as the best
> policy improvement operator?
>
> What would you say is the current state-of-art game tree search for
> chess? That's a very unfamiliar world for me, to be honest all I really
> know is MCTS...
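To make concrete what "expanding a node on every simulation and feeding in a continuous value estimate" looks like, here is a rough sketch of an AlphaGo Zero-style PUCT loop. The network interface (net_evaluate returning move priors and a value), the state API and the C_PUCT constant are all assumptions for illustration, not DeepMind's actual implementation:

    import math

    # Rough sketch of an AlphaGo Zero-style PUCT search loop, only to make
    # the discussion concrete. net_evaluate(state) -> (priors, value) and the
    # state API (copy, play, is_terminal, terminal_value) are placeholders,
    # and C_PUCT is an illustrative constant.

    C_PUCT = 1.5

    class Node:
        def __init__(self, prior):
            self.prior = prior      # P(s, a) from the policy head
            self.visits = 0         # N(s, a)
            self.value_sum = 0.0    # W(s, a); Q(s, a) = W / N
            self.children = {}      # move -> Node

        def q(self):
            return self.value_sum / self.visits if self.visits else 0.0

    def select_child(node):
        """Pick the child maximizing Q + U (the PUCT rule)."""
        total = sum(c.visits for c in node.children.values())
        def score(child):
            u = C_PUCT * child.prior * math.sqrt(total + 1) / (1 + child.visits)
            return child.q() + u
        return max(node.children.items(), key=lambda kv: score(kv[1]))

    def search(root_state, root, net_evaluate, n_sims=800):
        for _ in range(n_sims):
            node, state, path = root, root_state.copy(), [root]
            # Descend until we reach a leaf (an unexpanded node).
            while node.children:
                move, node = select_child(node)
                state.play(move)
                path.append(node)
            # Expand the leaf immediately and back up the value-net estimate:
            # a continuous number, not the result of a Bernoulli rollout.
            if state.is_terminal():
                value = state.terminal_value()
            else:
                priors, value = net_evaluate(state)
                for move, p in priors.items():
                    node.children[move] = Node(p)
            for n in reversed(path):
                n.visits += 1
                n.value_sum += value
                value = -value      # flip perspective between the two players
        return root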