Matthew Lai's paper "Giraffe: Using Deep Reinforcement Learning to Play
Chess" has a chapter called "Probabilistic Search". That should be directly
applicable?

https://arxiv.org/abs/1509.01549

- Andy


2017-11-16 9:43 GMT-06:00 Petr Baudis <pa...@ucw.cz>:

>   Hi,
>
>   when explaining AlphaGo Zero to a machine learning audience yesterday
>
>         (https://docs.google.com/presentation/d/1VIueYgFciGr9pxiGmoQyUQ088Ca4ouvEFDPoWpRO4oQ/view)
>
> it occurred to me that using MCTS in this setup is actually such
> a kludge!
>
>   Originally, we used MCTS because with the repeated simulations,
> we would be improving the accuracy of the arm reward estimates.  MCTS
> policies assume stationary distributions, an assumption that is violated
> every time we expand the tree, but it's an okay tradeoff if all you feed
> into the tree are rewards in the form of just Bernoulli trials.  Moreover, you
> could argue evaluations are somewhat monotonic with increasing node
> depths as you are basically just fixing a growing prefix of the MC
> simulation.
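
To make the bandit picture in the paragraph above concrete, here is a
minimal sketch of UCB1 over Bernoulli arms. It only illustrates how
repeated trials tighten the per-arm reward estimates; it is not taken
from any engine. The function name `ucb1`, the arm probabilities and
the pull count are assumptions made up for the illustration.

```python
import math
import random


def ucb1(true_probs, n_pulls, seed=0):
    """Run UCB1 on Bernoulli arms; return (pull counts, estimated means).

    true_probs are hypothetical win probabilities, one per arm.
    n_pulls must be at least len(true_probs) so every arm is tried once.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms
    wins = [0] * n_arms

    for t in range(1, n_pulls + 1):
        if t <= n_arms:
            # Initialisation: pull each arm once.
            arm = t - 1
        else:
            # Pick the arm with the highest upper confidence bound:
            # empirical mean plus an exploration bonus that shrinks
            # as the arm accumulates pulls.
            arm = max(
                range(n_arms),
                key=lambda i: wins[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # The reward is a single Bernoulli trial (win or loss).
        reward = 1 if rng.random() < true_probs[arm] else 0
        counts[arm] += 1
        wins[arm] += reward

    means = [w / c for w, c in zip(wins, counts)]
    return counts, means


if __name__ == "__main__":
    counts, means = ucb1([0.3, 0.5, 0.7], 5000)
    print("pulls per arm:", counts)
    print("estimated means:", [round(m, 3) for m in means])
```

Each arm's estimate here is an average of binary outcomes drawn from a
fixed distribution, so pulling the same arm again genuinely reduces its
error. That is the stationarity assumption the argument relies on.
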
>
>   But now, we expand the nodes literally all the time, breaking the
> stationarity possibly in drastic ways.  There are no reevaluations that
> would improve your estimate.  The input isn't binary but an estimate in
> a continuous space.  Suddenly the Multi-armed Bandit analogy loses a lot
> of ground.
>
>   Therefore, can't we take the next step, and do away with MCTS?  Is
> there a theoretical viewpoint from which it still makes sense as the best
> policy improvement operator?
>
>   What would you say is the current state-of-the-art game tree search
> for chess?  That's a very unfamiliar world for me; to be honest, all I
> really know is MCTS...
>
> --
>                                         Petr Baudis, Rossum
>         Run before you walk! Fly before you crawl! Keep moving forward!
>         If we fail, I'd rather fail really hugely.  -- Moist von Lipwig
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go