After much effort, I think I understand most of the Gelly & Silver
paper [1]. I hope this post helps others, and that people will
correct any errors I've made.
First, some basic definitions of notation:
* In general, "Q" is an estimated winning rate, used in three ways:
1. As an estimat
"Reinforcement Learning: A Survey" is available on CiteSeer.
/Dan Andersson
Original message
Date: 2007-Oct-12 02:18
To: "computer-go"
Subject: [computer-go] Combining online and offline knowledge in UCT
Does anyone have a good reference for reading the notation in the
Gelly/Silver paper "Combining online and offline knowledge in UCT"?
computer-go mailing list