> I tried chain pooling too, and it was too slow. It made the network about twice as slow in TensorFlow (using tf.unsorted_segment_sum or max). I'd rather have twice as many layers.
tf.unsorted_segment_max didn't exist in the first public release of TensorFlow, so I requested it just for this purpose (https://github.com/tensorflow/tensorflow/issues/549). Too bad it's too slow to be useful. Thanks for sharing some details of what you have learned so far!

Álvaro.

On Thu, Mar 1, 2018 at 5:48 AM, Rémi Coulom <remi.cou...@free.fr> wrote:

> Hi David,
>
> Thanks for sharing your experiments. It is very interesting.
>
> I tried chain pooling too, and it was too slow. It made the network about twice as slow in TensorFlow (using tf.unsorted_segment_sum or max). I'd rather have twice as many layers.
>
> I never tried dilated convolutions. That sounds interesting.
>
> The value network of AQ has an interesting architecture. It does not go directly from 19x19 to a scalar, but works like image-recognition networks, with 2x2 pooling until it reaches 1x1. I have not tried it yet, but it feels like a good idea.
>
> Rémi
>
> ----- Original Message -----
> From: "David Wu" <lightvec...@gmail.com>
> To: computer-go@computer-go.org
> Sent: Wednesday, 28 February 2018 20:04:11
> Subject: Re: [Computer-go] Crazy Stone is back
>
> It's not even just liberties and semeai, it's also eyes. Consider for example a large dragon that has miai for two eyes in distant locations, and the opponent then takes one of them: you'd like the policy net to now suggest the other eye-making move far away. And you'd also like the value net to distinguish the situations where the whole group has two eyes, even when they are distant, from the ones where it doesn't.
>
> I've been doing experiments with somewhat smaller neural nets (roughly 4-7 residual blocks = 8-14 layers), without sticking to an idealized "zero" approach. I've only experimented with policy nets so far, but presumably much of this should transfer to a value net's understanding too.
>
> 1. One thing I tried was chain pooling, which was neat but ultimately didn't seem promising:
> https://github.com/lightvector/GoNN#chain-pooling
> It solves all of these problems when the strings are solidly connected. It also helps when the strings are long but not quite solidly connected; the information still propagates faster than without it. But if there are lots of little strings forming a group, diagonal connections, bamboo joints, etc., then of course it won't help. And chain pooling is computationally costly, at least in TensorFlow, and it might have negative effects on the rest of the neural net that I don't understand.
>
> 2. A new thing I've been trying recently that actually does seem moderately promising is dilated convolutions, although I'm still early in testing. They also help increase the speed of information propagation, they don't require solidly connected strings, and they are reasonably cheap.
>
> In particular: my residual blocks have 192 channels, so I tried taking several of the later residual blocks in the neural net and making 64 of the channels of the first convolution in each block use dilated convolutions (leaving 128 channels of regular convolutions), with dilation factors of 2 or 3. Intuitively, the idea is that earlier blocks can learn to compute 2x2 or 3x3 connectivity patterns, and the dilated convolutions in later residual blocks can then use that to propagate information several spaces at a time across connected groups or dragons.
> So far, indications are that this works. When I looked at it in various board positions, it helped in a variety of capturing-race and large-dragon-two-eye-miai situations, correctly suggesting moves that the net without dilated convolutions would fail to find because the move was too far away. Also, dilated convolutions seem pretty cheap: they only slightly increase the computational cost of the net.
>
> So far, I've found that it doesn't significantly improve the overall loss function, presumably because there are now 128 channels of ordinary convolutions instead of 192, so in return for being better at long-distance interactions, the neural net has gotten worse at some local tactics. But it also hasn't gotten worse the way it would if I simply dropped the number of channels from 192 to 128 without adding any new channels, so the dilated convolutions are being "used" for real work.
>
> I'd be curious to hear if anyone else has tried dilated convolutions and what results they got. If there's anything to do other than just add more layers, I think they're the most promising thing I know of.
>
> On Wed, Feb 28, 2018 at 12:34 PM, Rémi Coulom <remi.cou...@free.fr> wrote:
>
> > 192 and 256 are the numbers of channels. They are fully connected, so the number of 3x3 filters is 192^2 and 256^2.
> >
> > Having liberty counts and string size as input helps, but it solves only a small part of the problem. You can't read a semeai from just the liberty-count information.
> >
> > I tried to be clever and find ways to propagate information along strings in the network. But all the techniques I tried make the network much slower. Adding more layers is simple and works.
> >
> > Rémi
> >
> > ----- Original Message -----
> > From: "Darren Cook" <dar...@dcook.org>
> > To: computer-go@computer-go.org
> > Sent: Wednesday, 28 February 2018 16:43:10
> > Subject: Re: [Computer-go] Crazy Stone is back
> >
> > > Weights_31_3200 is 20 layers of 192, 3200 board evaluations per move (no random playout). But it still has difficulties with very long strings. My next network will be 40 layers of 256, like Master.
> >
> > "Long strings" here means solidly connected stones?
> >
> > The 192 vs. 256 is the number of 3x3 convolution filters?
> >
> > Has anyone been doing experiments with, say, 5x5 filters (and fewer layers), and/or putting more raw information in (e.g. liberty counts, which make the long-string problem go away, if I've understood correctly what that is)?
> >
> > Darren
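P.S. In case it is useful to anyone following along, here is a rough sketch of what the chain pooling Rémi and David describe above could look like with tf.unsorted_segment_max (TF 1.x API). This is only my guess at an implementation, not the code either of them actually ran; in particular, the chain_ids input and the max_chains bound are my assumptions, with chain indices computed outside the net and fed in as an extra input plane.

import tensorflow as tf

def chain_pool_max(features, chain_ids, max_chains):
    # features:  [batch, 19, 19, channels] activations from a conv layer.
    # chain_ids: [batch, 19, 19] int32; all points of a chain share one id,
    #            ids are unique across the whole batch, and empty points each
    #            get their own id so they only pool with themselves.
    # max_chains: static upper bound on the number of distinct ids.
    channels = features.get_shape().as_list()[-1]
    flat_feats = tf.reshape(features, [-1, channels])    # [batch*361, C]
    flat_ids = tf.reshape(chain_ids, [-1])               # [batch*361]
    # Take the max over all points belonging to the same chain.
    pooled = tf.unsorted_segment_max(flat_feats, flat_ids, max_chains)
    # Broadcast each chain's maximum back to every point of that chain.
    return tf.reshape(tf.gather(pooled, flat_ids), tf.shape(features))

The irregular segment/gather traffic is presumably what makes this so much slower than a plain 3x3 convolution, which would fit the roughly 2x slowdown reported above.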
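Similarly, here is a sketch of the kind of mixed regular/dilated residual block David describes, with 64 of the 192 channels of the first convolution dilated. The pre-activation ordering, the batch norm placement, and the tf.layers API are assumptions of mine, not necessarily what his net actually uses.

import tensorflow as tf

def mixed_dilation_block(x, dilation_rate=2, training=True):
    # x: [batch, 19, 19, 192] trunk activations.
    h = tf.layers.batch_normalization(x, training=training)
    h = tf.nn.relu(h)
    # First convolution of the block: 128 ordinary 3x3 channels plus
    # 64 dilated 3x3 channels (dilation factor 2 or 3).
    regular = tf.layers.conv2d(h, filters=128, kernel_size=3, padding="same")
    dilated = tf.layers.conv2d(h, filters=64, kernel_size=3, padding="same",
                               dilation_rate=dilation_rate)
    h = tf.concat([regular, dilated], axis=-1)           # back to 192 channels
    h = tf.layers.batch_normalization(h, training=training)
    h = tf.nn.relu(h)
    # Second convolution of the block is all regular 3x3.
    h = tf.layers.conv2d(h, filters=192, kernel_size=3, padding="same")
    return x + h

With dilation factor 2, a single 3x3 kernel already sees 5 points across, so a few of these blocks late in the net should let information jump along a connected dragon much faster than ordinary 3x3 convolutions would.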
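And for the AQ-style value head Rémi mentions, something along these lines: conv plus 2x2 pooling repeated until the board shrinks from 19x19 to 1x1 (19 -> 10 -> 5 -> 3 -> 2 -> 1 with "same" padding). The channel counts, layer count, and final tanh here are placeholders of mine, not AQ's actual architecture.

import tensorflow as tf

def pooled_value_head(x):
    # x: [batch, 19, 19, C] feature maps from the shared trunk.
    h = x
    for filters in [64, 64, 64, 64, 64]:
        h = tf.layers.conv2d(h, filters, kernel_size=3, padding="same",
                             activation=tf.nn.relu)
        h = tf.layers.max_pooling2d(h, pool_size=2, strides=2, padding="same")
    # h is now [batch, 1, 1, 64]; reduce it to a single scalar per position.
    h = tf.layers.conv2d(h, filters=1, kernel_size=1)
    return tf.tanh(tf.reshape(h, [-1]))                  # [batch], in (-1, 1)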
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go