David Silver wrote:
> Very interesting paper!
> I have one question. The assumption in your paper is that increasing
> the performance of the simulation player will increase the performance
> of Monte-Carlo methods that use that simulation player. However, we
> found in MoGo that this is not necessarily the case! Do you think
> there is some property of your learning algorithm that makes it
> particularly suitable for Monte-Carlo methods?
> Thanks!
> Dave
Maximizing the likelihood does not optimize the performance of the
simulation player. For instance, I am sure that making it greedier
would make it a stronger player. I have the feeling that maximizing the
likelihood produces a good balance between playing good moves and being
random. It would be worth measuring the strength of the MC player with
more or less greedy versions of the random player to check this.
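
As a rough illustration of what "more or less greedy" could mean, here is
a minimal sketch. It assumes each candidate move has a positive strength
and is picked with probability proportional to that strength. The
`greediness` exponent, the function names and the example strengths are
hypothetical, made up for this sketch, not taken from the paper or from
any program. With greediness 1.0 the strengths are used unchanged, larger
values concentrate the choice on the strongest moves, and 0.0 gives a
uniformly random player.

```python
import random


def move_probabilities(strengths, greediness=1.0):
    """Return selection probabilities proportional to strength ** greediness.

    `strengths` are positive per-move weights (assumed to come from some
    fitted model); `greediness` is a hypothetical tuning knob.
    """
    if not strengths or any(s <= 0 for s in strengths):
        raise ValueError("strengths must be a non-empty list of positive numbers")
    weights = [s ** greediness for s in strengths]
    total = sum(weights)
    return [w / total for w in weights]


def sample_move(strengths, greediness=1.0, rng=random):
    """Sample the index of one move according to move_probabilities()."""
    probs = move_probabilities(strengths, greediness)
    return rng.choices(range(len(strengths)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Made-up strengths for four candidate moves.
    strengths = [8.0, 2.0, 1.0, 1.0]
    for g in (0.5, 1.0, 2.0):
        print(g, [round(p, 3) for p in move_probabilities(strengths, g)])
```

The experiment would then amount to playing the full MC program with
several values of `greediness` in the playouts and comparing win rates.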
Rémi
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/