When I read about Facebook's DCNN-based go program, I remembered another paper I'd come across on arXiv, namely "How (not) to train your generative model: scheduled sampling, likelihood, adversary?" by Ferenc Huszar (http://arxiv.org/pdf/1511.05101.pdf).
A lot of that paper went over my head (I am a "half-studied scoundrel", as we say in Norway), but I think I sort of got his speculation at the end, and it made a lot of sense to me. He argues that which side you approach the K-L divergence from matters, so to speak, for what kind of errors you get when the model falls short, and that when you're generating, as opposed to predicting, the goal should be to minimize the K-L divergence the "other" way around.

When you're using a DCNN in a go program, you are really doing generation, not prediction, right? You want to generate a good move. A model that generates "flashy" moves that LOOK really strong, but could potentially be very bad, would be a good predictor, but a bad generator. The ideal probability distribution is the distribution of moves a pro would make. But to the degree your model falls short, you want to minimize the chance of making a wildly "un-pro" move, rather than maximizing the chance of making a "pro" move. Since these are probability distributions, those two things are not the same unless your model is perfect (right?).

If my understanding is correct (and it's quite possible I'm way off course, I'm an amateur! sorry for wasting your time if so!), then rather than training a move predictor, they should use the adversarial methods which are also in the wind now to train a generative model. -- Harald Korneliussen
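To make the direction-of-KL point concrete, here is a small toy sketch (my own made-up numbers, not from Huszar's paper or any actual go program): a hypothetical "pro" distribution over four candidate moves, and two imperfect models, one of which puts noticeable mass on a move the pro almost never plays. The forward divergence KL(pro||model) is what maximum-likelihood move prediction effectively minimizes; the reverse divergence KL(model||pro) is the direction that punishes generating un-pro moves.

```python
import numpy as np

def kl(p, q):
    """KL divergence KL(p || q) = sum_x p(x) * log(p(x) / q(x)), in nats."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

# Hypothetical pro distribution over four candidate moves A-D;
# move D is a near-blunder the pro almost never plays.
pro = [0.50, 0.30, 0.1999, 0.0001]

# Two imperfect models: one stays close to the pro everywhere,
# one "flashy" model puts 20% of its mass on move D.
model_safe   = [0.45, 0.35, 0.19, 0.01]
model_flashy = [0.40, 0.25, 0.15, 0.20]

for name, q in [("safe", model_safe), ("flashy", model_flashy)]:
    print(f"{name:6s}  forward KL(pro||model) = {kl(pro, q):.3f}   "
          f"reverse KL(model||pro) = {kl(q, pro):.3f}")

# The flashy model's mass on the near-blunder move barely registers in the
# forward KL (the pro hardly ever plays it, so it carries almost no weight
# there), but it blows up the reverse KL -- matching the intuition that a
# generator should above all avoid wildly un-pro moves.
```

Running this, the two models look fairly comparable under the forward KL, while the flashy model is dramatically worse under the reverse KL, which is the behaviour the argument above relies on.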