I am also interested in unsupervised learning. When this first came up, I
found this paper: http://arxiv.org/pdf/1312.5602v1.pdf, where the authors
taught a deep convolutional network to play games from the Atari console.
They used what they call "Deep Reinforcement Learning", a variant of
Q-learning, and the methodology is described in detail.  There's even an
online demo of this technique for an easier problem here:
http://cs.stanford.edu/people/karpathy/convnetjs/demo/rldemo.html
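
For anyone who hasn't seen Q-learning before, the core is the update
Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). Below is a minimal
tabular sketch in Python. The 5-state corridor environment, the function
names and the hyperparameters are my own made-up assumptions for
illustration, not anything from the paper, which replaces the table with a
convolutional network trained on screen pixels.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # illustrative values only

def step(state, action):
    """Toy environment: reward 1.0 only on reaching the rightmost state."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability EPSILON, or when the two values tie.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: no future value once the episode ends.
            target = reward if done else reward + GAMMA * max(q[nxt])
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
    return q

def greedy_policy(q):
    """Best action (-1 or +1) for each non-terminal state."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]
```

Calling greedy_policy(train()) gives the learned action for states 0 to 3;
in this toy setup the agent should learn to walk right toward the goal.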

It's impressive that the same network architecture, with no game-specific
tuning, learned to play seven games from just the raw screen pixels and the
game score.  It's also interesting that these two teams are in different
parts of Google. I assume they are aware of each other's work, but maybe Aja
can confirm.



On Mon, Mar 16, 2015 at 11:51 AM, hughperkins2 <hughperki...@gmail.com>
wrote:

> > The important thing is that the games don't have to be played perfectly:
> > They just need to be significantly better than your current model, so you
> > can tweak the model to learn from them.
>
> That's an important insight. I hadn't thought of that.
>
> Maybe this could be combined with some concept of "forgetting", e.g. weight
> decay, so the net gradually unlearns some of the original, more naive,
> associations?
>
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>
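
On the "forgetting" idea quoted above: weight decay does pull any weight
that the training signal no longer reinforces back toward zero. Here is a
minimal sketch in plain Python, assuming a single scalar weight and plain
SGD with an L2 penalty. The function names and the learning-rate and decay
values are hypothetical, chosen only for illustration.

```python
def sgd_step(weight, grad, lr=0.1, weight_decay=0.01):
    """One SGD step with L2 weight decay.

    The decay term shrinks the weight toward zero even when the
    gradient from the data is zero.
    """
    return weight - lr * (grad + weight_decay * weight)

def decay_without_signal(weight, steps, lr=0.1, weight_decay=0.01):
    """Apply repeated steps with no data gradient, so only decay acts."""
    for _ in range(steps):
        weight = sgd_step(weight, grad=0.0, lr=lr, weight_decay=weight_decay)
    return weight
```

With these values each step multiplies an unreinforced weight by
(1 - lr * weight_decay) = 0.999, so an old association fades geometrically
unless the newer, stronger games keep supporting it.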
