My impression is that each feature gets a single weight in CrazyStone. The 
team-of-features aspect arises because a single point can match several 
patterns, so you need a model to assign credit when tuning. The paper that I 
remember used fixed receptive fields to define the patterns. (E.g., from 3x3 
through 5x10, or some such.) The easiest way to match those is to use a hash 
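
As a rough illustration of hash-based matching over a fixed receptive field, the sketch below hashes the 3x3 window around a point and looks up one weight per pattern. The sentence above is cut off, so the Zobrist-style hashing, the board encoding, and the names `pattern_hash`, `pattern_weight` and `weights` are all assumptions made for illustration, not details from the CrazyStone paper.

```python
import random

EMPTY, BLACK, WHITE, EDGE = 0, 1, 2, 3  # EDGE marks off-board cells
SIZE = 5  # small board, for illustration only

rng = random.Random(0)
# One random 64-bit key per (window cell, cell state): Zobrist-style hashing.
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
KEYS = [[rng.getrandbits(64) for _ in range(4)] for _ in OFFSETS]

def cell(board, r, c):
    """Return the state at (r, c), or EDGE if the point is off the board."""
    if 0 <= r < SIZE and 0 <= c < SIZE:
        return board[r][c]
    return EDGE

def pattern_hash(board, r, c):
    """XOR the keys of the nine cells in the 3x3 window centred on (r, c)."""
    h = 0
    for i, (dr, dc) in enumerate(OFFSETS):
        h ^= KEYS[i][cell(board, r + dr, c + dc)]
    return h

board = [[EMPTY] * SIZE for _ in range(SIZE)]
board[1][1] = BLACK
board[1][2] = WHITE

# One weight per pattern, keyed by hash. The value 2.5 is a made-up example.
weights = {pattern_hash(board, 2, 2): 2.5}

def pattern_weight(board, r, c):
    """Look up the pattern's weight; unseen patterns get a neutral 1.0."""
    return weights.get(pattern_hash(board, r, c), 1.0)
```

Because the hash depends only on the contents of the window, the same local shape anywhere on the board maps to the same weight. Larger receptive fields would only need a longer `OFFSETS` list.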


My impression is that both NN and DT are capable of asymptotically learning the 
entire game. (Also true if you use fixed receptive fields.) They should be 
equally powerful, though they differ in terms of the degree of understanding 
required by the programmer.


IMO, MCTS should always be the "outermost loop" in the system. MCTS provides 
asymptotic optimality guarantees under remarkably general conditions.
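
To make the "outermost loop" idea concrete, here is a minimal UCT sketch on a toy game (Nim: take 1 to 3 stones, whoever takes the last stone wins). The game, the constants, and all names are assumptions chosen to keep the example short. A learned model (NN or DT) would plug in where the random rollout or the expansion order is, while the search loop stays outside.

```python
import math
import random

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left; the player to move takes 1-3
        self.parent = parent
        self.move = move          # move that led to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node

def ucb_child(node, c=1.4):
    """Select the child maximising the UCB1 score."""
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )

def rollout(stones, rng):
    """Play random moves; return True if the player to move at the start wins."""
    start_player_to_move = True
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return start_player_to_move  # whoever just took the last stone wins
        start_player_to_move = not start_player_to_move
    return False  # no stones left: the player to move has already lost

def uct_search(root_stones, iterations=3000, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = ucb_child(node)
        # 2. Expansion: add one untried move, if any.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation from the new node (a model could replace this).
        to_move_wins = rollout(node.stones, rng)
        # 4. Backpropagation, flipping the point of view at every level.
        result = 0.0 if to_move_wins else 1.0
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    # Play the most visited root move.
    return max(root.children, key=lambda ch: ch.visits).move
```

With 5 stones the winning move is to take 1 and leave a multiple of 4, and the search finds it even though the rollouts are purely random.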




From: Computer-go On Behalf Of 
René van de Veerdonk
Sent: Monday, December 15, 2014 11:47 PM
To: computer-go
Subject: Re: [Computer-go] Teaching Deep Convolutional Neural Networks to Play 


Correct me if I am wrong, but I believe that the CrazyStone approach of 
team-of-features can be cast in terms of a shallow neural network. The inputs 
are matched patterns on the board and other local information on atari, 
previous moves, ko situation, and such. Remi alluded to as much on this list 
sometime after his paper got published.
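
A small sketch of that equivalence, assuming the team-of-features model takes the generalized Bradley-Terry form (a move's strength is the product of the strengths of its matched features). Taking logs turns the product into a weighted sum, so the same probabilities come out of a single linear layer over binary inputs followed by a softmax. The feature names, move names and gamma values below are invented for illustration.

```python
import math

# Hypothetical feature strengths (gammas); names and values are illustrative.
gamma = {
    "pattern_A": 3.0,
    "atari_escape": 5.0,
    "near_prev_move": 2.0,
    "pattern_B": 0.5,
}

# Candidate moves and the features each one matches (its "team").
moves = {
    "D4": ["pattern_A", "near_prev_move"],
    "Q16": ["atari_escape"],
    "K10": ["pattern_B"],
}

def bradley_terry(moves, gamma):
    """P(move) = product of its feature gammas / sum of products over all moves."""
    strength = {m: math.prod(gamma[f] for f in fs) for m, fs in moves.items()}
    total = sum(strength.values())
    return {m: s / total for m, s in strength.items()}

def shallow_net(moves, gamma):
    """Same model as one linear layer on binary inputs followed by softmax."""
    w = {f: math.log(g) for f, g in gamma.items()}  # weights = log gammas
    logits = {m: sum(w[f] for f in fs) for m, fs in moves.items()}
    z = sum(math.exp(v) for v in logits.values())
    return {m: math.exp(v) / z for m, v in logits.items()}

p_bt = bradley_terry(moves, gamma)
p_nn = shallow_net(moves, gamma)
```

Both functions return the same distribution up to rounding, which is the sense in which the two formulations describe one model with no hidden layer.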


Without having studied the Deep Learning papers in detail, it seems that these 
are the types of "smart features" that could be learned by a Deep Neural Net in 
the first few layers if the input is restricted to just the raw board, but 
could equally well be provided as domain specific features in order to improve 
computational efficiency (and perhaps enforce correctness).


These approaches may not be all that far apart, other than the depth of the net 
and the domain specific knowledge used directly. Remi recently mentioned that 
the patterns in more recent versions of CrazyStone also number in the 
millions. I think the prediction rates for these two approaches are also pretty 
close. Compare the Deep Learning result to the other recent study of a German 
group quoted in the Deep Learning paper.


The bigger questions to me are related to engine architecture. Are you going to 
use this as an input to a search? Or are you going to use this directly to 
play? If the former, it had better be reasonably fast. The latter approach can 
be far slower, but requires the predictions to be of much higher quality. And 
the biggest question, how can you make these two approaches interact 




On Mon, Dec 15, 2014 at 8:00 PM, Brian Sheppard wrote:

>Is it really such a burden?


Well, I have to place my bets on some things and not on others.


It seems to me that the costs of a NN must be higher than those of a system 
based on decision trees. The convolutional NN has a very large parameter space 
if my reading of the paper is correct. Specifically, it can represent all 
patterns translated and rotated and matched against all points in parallel.
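
For a sense of scale, the arithmetic for one convolutional layer can be sketched as below. The layer sizes are hypothetical placeholders and are not taken from the paper. The count assumes zero padding so that every one of the 19x19 points gets a full window, which makes the multiply-add figure an upper bound.

```python
def conv_layer_cost(board=19, kernel=5, in_ch=8, out_ch=64):
    """Return (parameters, multiply-adds) for one zero-padded conv layer.

    All default sizes are made-up examples, not the paper's architecture.
    """
    # Filters are shared across the board: one kernel x kernel x in_ch
    # filter per output channel, plus one bias per output channel.
    params = kernel * kernel * in_ch * out_ch + out_ch
    # Every filter is applied at every board point.
    macs = board * board * kernel * kernel * in_ch * out_ch
    return params, macs

params, macs = conv_layer_cost()
print(f"parameters: {params}, multiply-adds per position: {macs}")
```

Under these assumptions the parameter count does not depend on the board size, while the work per evaluated position grows with the number of points, which is where the CPU cost comes from.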


To me, that seems like a good way to mimic the visual cortex, but an 
inefficient way to match patterns on a Go board.


So my bet is on decision trees. The published research on NN will help me to 
understand the opportunities much better, and I have every expectation that the 
performance of decision trees should be >= NN in every way. E.g., faster, more 
accurate, easier and faster to tune. 


I recognize that my approach is full of challenges. E.g., a NN would 
automatically infer "soft" qualities such as "wall" and "influence" that would 
have to be provided to a DT as inputs. No free lunch, but again, this is about 
betting that one technology is (overall) more suitable than another.




From: Computer-go On Behalf Of 
Stefan Kaitschick
Sent: Monday, December 15, 2014 6:37 PM
Subject: Re: [Computer-go] Teaching Deep Convolutional Neural Networks to Play 



Finally, I am not a fan of NN in the MCTS architecture. The NN architecture 
imposes a high CPU burden (e.g., compared to decision trees), and this study 
didn't produce such a breakthrough in accuracy that I would give away 


 Is it really such a burden? Supporting the move generator with the NN result 
high up in the decision tree can't be that expensive.
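
One way to read this suggestion: call the expensive predictor only for nodes that have collected enough visits, and cache the answer. The sketch below assumes a hypothetical `model` callable and visit threshold; neither comes from the paper or from any particular engine.

```python
class LazyPrior:
    """Call an expensive move predictor only for nodes visited often enough."""

    def __init__(self, model, threshold=40):
        self.model = model          # hypothetical callable: position -> priors
        self.threshold = threshold  # hypothetical visit threshold
        self.cache = {}
        self.calls = 0              # number of expensive evaluations so far

    def get(self, position, visits):
        """Return cached priors, or None while the node is still rarely visited."""
        if position in self.cache:
            return self.cache[position]
        if visits < self.threshold:
            return None  # caller falls back to cheap heuristics
        self.calls += 1
        self.cache[position] = self.model(position)
        return self.cache[position]

# Stand-in for a slow network: returns fixed, made-up move priors.
def fake_model(position):
    return {"D4": 0.6, "Q16": 0.4}

lazy = LazyPrior(fake_model, threshold=3)
for visits in range(1, 10):
    lazy.get("root", visits)
```

Only nodes near the root reach the threshold in practice, so the number of expensive calls stays small relative to the number of playouts.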

Computer-go mailing list
