I imagine implementation determines whether transferred knowledge is
helpful. It's like asking whether forgetting is a problem -- it often is,
but evidently not for AlphaGo Zero.

One crude way to encourage stability is to include an explicit or implicit
age parameter that forces the program to perform smaller modifications to
its state during later stages. If the parameters you copy from problem A to
problem B also include that age parameter, so the network acts old even
though it is faced with a new problem, then its initial exploration may be
inefficient. For an MCTS-based example: if an MCTS node is initialized
with a 10877-6771 win/loss record based on evaluations under slightly
different game rules, then with a naive implementation, even if the
program discovers the right refutation under the new rules right away, it
would still need to revisit that node thousands of times before
convincing itself the node is now probably a losing position.
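A minimal sketch of the naive implementation I have in mind (hypothetical
code, not anything from the AlphaGo papers): the node keeps the
transferred win/loss record and updates it by plain averaging, so you can
count how many losing evaluations it takes before the mean finally dips
below 0.5.

```python
# Hypothetical naive MCTS node: value is a plain running average of
# win/loss results, with a record transferred from the old rules.
class Node:
    def __init__(self, wins, visits):
        self.wins = wins        # transferred from problem A
        self.visits = visits

    def value(self):
        return self.wins / self.visits

    def backup(self, result):   # result: 1 = win, 0 = loss
        self.wins += result
        self.visits += 1

# Transfer the 10877-6771 record.
node = Node(wins=10877, visits=10877 + 6771)

# Under the new rules the position is actually lost, so every fresh
# evaluation returns 0. Count the visits wasted before the node's
# mean value finally drops below 0.5.
n = 0
while node.value() >= 0.5:
    node.backup(0)
    n += 1
print(n)  # 4107 losing visits before the node even looks losing
```

With those particular numbers the node needs over four thousand
consecutive losses before its average crosses 0.5, which is why a
visit-count-weighted average is such a slow way to unlearn.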

But unlearning bad plans in a reasonable time frame is already a feature
you need from a good learning algorithm. Even AlphaGo almost fell into trap
states; from their paper, it appears that it stuck with 1-1 as an opening
move for much longer than you would expect from a program that was
probably already much better than 40 kyu. Even if it's unrealistic for Go
specifically, you
could imagine some other game where after days of analysis, the program
suddenly discovers a reliable trick that adds one point for white to every
single game. The effect would be the same as your komi change -- a mature
network now needs to adapt to a general shift in the final score. So the
task of adapting to handle similar games may be similar to the task of
adapting to analysis reversals within a single game, and improvements to
one could lead to improvements to the other.



On Fri, Nov 24, 2017 at 7:54 AM, Stephan K <stephan.ku...@gmail.com> wrote:

> 2017-11-21 23:27 UTC+01:00, "Ingo Althöfer" <3-hirn-ver...@gmx.de>:
> > My understanding is that the AlphaGo hardware is standing
> > somewhere in London, idle and waitung for new action...
> >
> > Ingo.
>
> The announcement at
> https://deepmind.com/blog/applying-machine-learning-mammography/ seems
> to disagree:
>
> "Our partners in this project wanted researchers at both DeepMind and
> Google involved in this research so that the project could take
> advantage of the AI expertise in both teams, as well as Google’s
> supercomputing infrastructure - widely regarded as one of the best in
> the world, and the same global infrastructure that powered DeepMind’s
> victory over the world champion at the ancient game of Go."
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>