To be clear, what I was talking about was building an opening book as
part of the game-generation process that produces training data for
the neural network. This makes sure you don't generate the same game
over and over again.
A few more things about my Spanish checkers experiment from a few
years ago:
For checkers, I used a naive implementation of UCT as my opening book
(the "playout" being the actual game where the engine is thinking). So
towards the end of the opening book there is always a position where
it will try a random move, but in the long run good opening moves will
be explored more often.
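For concreteness, here is a minimal sketch of what such a UCT-style opening-book node can look like (this is an illustration, not my original checkers code; the "playout" backing up a result is the complete game the engine plays from the position where the book runs out):

import math

class BookNode:
    def __init__(self, moves):
        self.stats = {m: [0, 0.0] for m in moves}   # move -> [visits, total score]

    def select(self, exploration=1.4):
        total_visits = sum(v for v, _ in self.stats.values()) + 1
        def ucb(item):
            visits, score = item[1]
            if visits == 0:
                return float('inf')                  # untried moves get picked first
            return score / visits + exploration * math.sqrt(math.log(total_visits) / visits)
        return max(self.stats.items(), key=ucb)[0]

    def update(self, move, result):                  # result: 1 win, 0.5 draw, 0 loss
        self.stats[move][0] += 1
        self.stats[move][1] += result

Near the frontier of the book there is always an untried move, which is where the variety comes from; deeper in the book the good moves accumulate visits.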
From before AlphaGo was announced, I thought the way forward was
generating games that play to the bitter end maximizing score, and
then using the final ownership as something to predict. I am very glad
that someone has had the time to put this idea (and many others!) into
practice. Congratulations!
...or you could just not get your knickers in a twist over somebody's
pronoun selection. I am here for the discussions about computer go,
not gender politics.
On Thu, Jun 21, 2018 at 6:24 PM, Mario Xerxes Castelán Castro
wrote:
> “He” is the generic singular pronoun in English. If anybody feels
I don't think ko fights have anything to do with this. John Tromp told
me that ladders are PSPACE-complete: https://tromp.github.io/lad.ps
Álvaro.
On Mon, Jun 18, 2018 at 2:58 PM, uurtamo wrote:
> FWIW, first-capture go (i.e. winner is first one to make a capture) should
> not be PSPACE-complete
Sorry, I haven't been paying enough attention lately to know what
"alpha-beta rollouts" means precisely. Can you either describe them or give
me a reference?
Thanks,
Álvaro.
On Tue, Mar 6, 2018 at 1:49 PM, Dan wrote:
> I did a quick test with my MCTS chess engine with two different
> implementations
> I tried chain pooling too, and it was too slow. It made the network about
twice as slow in TensorFlow (using tf.unsorted_segment_sum or max). I'd
rather have twice as many layers.
tf.unsorted_segment_max didn't exist in the first public release of
TensorFlow, so I requested it just for this purpose.
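For reference, a minimal sketch of the chain-pooling operation being discussed, written with the TF 2.x name tf.math.unsorted_segment_max (shapes and the chain-id tensor are illustrative assumptions, not anyone's actual code):

import tensorflow as tf

# Each board point carries the id of the chain it belongs to; features are
# max-pooled over each chain and then broadcast back to every point of that chain.
features  = tf.random.normal([19 * 19, 64])                                    # per-point features
chain_ids = tf.random.uniform([19 * 19], minval=0, maxval=50, dtype=tf.int32)  # chain id per point

pooled = tf.math.unsorted_segment_max(features, chain_ids, num_segments=50)
chain_pooled = tf.gather(pooled, chain_ids)   # per-point copy of its chain's maximum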
Chrilly Donninger's quote was probably mostly true in the 90s, but it's now
obsolete. That intellectual protectionism was motivated by the potential
economic profit of having a strong engine. It probably slowed down computer
chess for decades, until the advent of strong open-source programs.
Parado
My hand-wavy argument succumbs to experimental data. And to a better
argument. :)
I stand corrected.
Thanks,
Álvaro.
On Wed, Dec 6, 2017 at 8:52 AM, Gian-Carlo Pascutto wrote:
> On 06-12-17 11:47, Aja Huang wrote:
> > All I can say is that first-play-urgency is not a significant
> > technica
I have a personal definition of art that works pretty well: Pretentious
entertainment. Emphasis on “pretentious”.
On a more serious note, I don’t care if anything I produce is art or not,
and neither should you. If you enjoy what you are doing, keep it up!
Álvaro.
On Tuesday, December 5, 2017, "
ay tests with different settings do not show a big
> impact, but I am changing other variables at the same time.
>
> - Andy
>
>
>
> 2017-12-03 14:30 GMT-06:00 Álvaro Begué :
>
>> The text in the appendix has the answer, in a paragraph titled "Expand
>> and evaluate".
> Monte Carlo tree search in AlphaGo Zero. a Each simulation traverses the tree by
> selecting the edge with maximum action-value Q, plus an upper confidence
> bound U that depends on a stored prior probability P and visit count N for
> that edge (which is incremented once traversed). b The leaf node is
I am not sure where in the paper you think they use Q(s,a) for a node s
that hasn't been expanded yet. Q(s,a) is a property of an edge of the
graph. At a leaf they only use the `value' output of the neural network.
If this doesn't match your understanding of the paper, please point to the
specific passage.
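A minimal sketch of the selection rule the quoted caption describes (my reading of it, not DeepMind's code; c_puct and the edge fields are assumptions):

import math

def select_edge(edges, c_puct=1.5):
    """edges: one dict per legal move, with keys 'Q', 'P' (prior), 'N' (visit count)."""
    total_visits = sum(e['N'] for e in edges)
    def puct(e):
        u = c_puct * e['P'] * math.sqrt(total_visits) / (1 + e['N'])
        return e['Q'] + u                 # action value plus the exploration bonus
    return max(edges, key=puct)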
The term you are looking for is "transfer learning":
https://en.wikipedia.org/wiki/Transfer_learning
On Tue, Nov 21, 2017 at 5:27 PM, "Ingo Althöfer" <3-hirn-ver...@gmx.de>
wrote:
> Hi Erik,
>
> > No need for AlphaGo hardware to find out; any
> > toy problem will suffice to explore different
> >
It's a model written using the Keras neural network library:
https://en.wikipedia.org/wiki/Keras
On Fri, Nov 10, 2017 at 7:09 AM, Xavier Combelle
wrote:
> You make me really curious,
> what is a Keras model ?
>
> On 10/11/2017 at 01:47, Petr Baudis wrote:
> > Hi,
> >
> > I got first *some
Your understanding matches mine. My guess is that they had a temperature
parameter in the code that would allow for things like slowly transitioning
from random sampling to deterministically picking the maximum, but they
ended up using only those particular values.
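A minimal sketch of such a temperature parameter applied to root visit counts (my guess at the mechanism, not their code):

import numpy as np

def sample_move(visit_counts, tau=1.0):
    # tau = 1 samples proportionally to visit counts; tau -> 0 approaches
    # deterministically picking the most-visited move.
    counts = np.asarray(visit_counts, dtype=float)
    if tau < 1e-3:
        return int(counts.argmax())
    probs = counts ** (1.0 / tau)
    probs /= probs.sum()
    return int(np.random.choice(len(counts), p=probs))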
Álvaro.
On Tue, Nov 7, 2017
tial analysis (A term I just made up) to
> determine the utility function, which is why having a continuous log of all
> the input streams is necessary.
>
> On Oct 30, 2017, 3:45 PM -0700, Álvaro Begué ,
> wrote:
>
> In your hypothetical scenario, if the car can give you as muc
In your hypothetical scenario, if the car can give you as much debugging
information as you suggest (100% tree is there, 95% child is there), you
can actually figure out what's happening. The only other piece of
information you need is the configured utility values for the possible
outcomes.
Say t
There are ways to do it, but it might be messy. However, the vast majority
of the computational effort will be in playing games to generate a training
database, and that part is trivial to distribute. Testing if the new
version is better than the old version is also very easy to distribute.
Álvaro
No, those are too few games for that.
On Mon, Oct 23, 2017 at 8:05 AM, Jim O'Flaherty
wrote:
> Couldn't they be useful as part of a set of training data for newly
> trained engines and networks?
>
> On Oct 23, 2017 2:34 AM, "Petri Pitkanen"
> wrote:
>
>> They are free to use in any attribution
I suggest scaling down the problem until some experience is gained.
You don't need the full-fledged 40-block network to get started. You can
probably get away with using only 20 blocks and maybe 128 features (down
from 256). That should save you about a factor of 8, plus you can use larger
mini-batches.
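The factor of 8 is just the usual scaling estimate (per-layer cost roughly quadratic in the number of channels, linear in depth):

blocks_ratio  = 40 / 20            # half as many residual blocks
channel_ratio = (256 / 128) ** 2   # each layer about 4x cheaper
print(blocks_ratio * channel_ratio)   # 8.0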
When I did something like this for Spanish checkers (training a neural
network to be the evaluation function in an alpha-beta search, without any
human knowledge), I solved the problem of adding game variety by using UCT
for the opening moves. That means that I kept a tree structure with the
opening moves.
Yes, residual networks are awesome! I learned about them at ICML 2016 (
http://kaiminghe.com/icml16tutorial/index.html). Kaiming He's exposition
was fantastically clear. I used them in my own attempts at training neural
networks for move prediction. It's fairly easy to train something with 20
layers.
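A minimal Keras sketch of one such residual block (my own illustration, with an assumed channel count of 192):

from tensorflow.keras import layers

def residual_block(x, channels=192):
    # Two 3x3 convolutions with batch norm, plus the identity skip connection.
    y = layers.Conv2D(channels, 3, padding='same', use_bias=False)(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(channels, 3, padding='same', use_bias=False)(y)
    y = layers.BatchNormalization()(y)
    return layers.ReLU()(layers.Add()([x, y]))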
Yes, it seems really odd that they didn't add a plane of all ones. The
"heads" have weights that depend on the location of the board, but all the
other layers can't tell the difference between a lonely stone at (1,1) and
one at (3,3).
In my own experiments (trying to predict human moves) I found t
This is a quick check of my understanding of the network architecture.
Let's count the number of parameters in the model:
* convolutional block: (17*9+1)*256 + 2*256
  [ 17 = number of input channels,
    9 = number of weights in a 3x3 convolution window,
    1 = bias (I am not sure this is needed if you are going
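Spelling the count out (the convolutional-block line follows the formula above; the residual-block line is my own extension of the same breakdown, so treat it as an assumption):

in_ch, ch, k = 17, 256, 3
conv_block = (in_ch * k * k + 1) * ch + 2 * ch      # 3x3 conv + bias, plus 2 batch-norm params per channel
res_block  = 2 * ((ch * k * k + 1) * ch + 2 * ch)   # two such conv layers per residual block
print(conv_block, res_block)                        # 39936 and 1181184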
It might be a mistake, but on page 30 the paper has a formula for Elo that
is off by a factor of ln(10) = 2.3026 with respect to the standard
formula, which means their Elo differences might be inflated. But I suspect
they just meant to write "10^" instead of "exp" in the paper, and they
proba
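A quick way to see the ln(10) factor (a sketch comparing the standard Elo formula with the exp version the paper appears to print):

import math

def elo_standard(diff): return 1 / (1 + 10 ** (-diff / 400))    # standard formula
def elo_exp(diff):      return 1 / (1 + math.exp(-diff / 400))  # formula as printed

p = elo_standard(200)                    # ~0.76 for a 200-point gap
diff_exp = 400 * math.log(p / (1 - p))   # gap the exp version needs for the same probability
print(diff_exp / 200)                    # ~2.3026 = ln(10)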
A link to the paper (from the blog post):
https://deepmind.com/documents/119/agz_unformatted_nature.pdf
Enjoy!
Álvaro.
On Wed, Oct 18, 2017 at 2:29 PM, Richard Lorentz
wrote:
> Wow! That's very exciting. I'm glad they didn't completely shelve the
> project as they implied they might do after t
When TensorFlow was first released I used it to implement a CNN for move
prediction and evaluation, and I requested the addition of a function to
implement chain pooling: https://github.com/tensorflow/tensorflow/issues/549
It's now implemented here:
https://www.tensorflow.org/api_docs/cc/class/ten
Eventually exploring the entire tree is what I would call "mathematically
sound", meaning that given enough time the algorithm is guaranteed to play
optimally. I would reserve "brute force" for algorithms that simply search
every possible variant exhaustively, like John Tromp's connect 4 program
Fhourstones.
It was already a stretch when people said that Deep Blue was a brute-force
searcher. If we apply it to AlphaGo as well, the term just means nothing.
Full-width and brute-force are most definitely not the same thing.
Álvaro.
On Sun, Aug 6, 2017 at 2:20 PM, Brian Sheppard via Computer-go <
comput
No, it is not possible to solve go on a 19x19 board. The closest we have is
5x5, I believe. We have a pretty good idea what optimal play looks like on
7x7. The difficulty of finding optimal play on large boards is
unfathomable.
Álvaro.
On Sun, Aug 6, 2017 at 10:06 AM Cai Gengyang wrote:
> Is A
I agree with you. It makes no sense. You'll take whatever linear
combinations you want and they'll all be zero.
Álvaro.
On Tue, Jul 18, 2017 at 6:53 AM, Brian Lee
wrote:
> I've been wondering about something I've seen in a few papers (AlphaGo's
> paper, Cazenave's resnet policy architecture),
On Tue, May 23, 2017 at 4:51 AM, Hideki Kato wrote:
> (3) CNN cannot learn exclusive-or function due to the ReLU
> activation function, instead of traditional sigmoid (tangent
> hyperbolic). CNN is good at approximating continuous (analog)
> functions but not Boolean (digital) ones.
>
Oh, not this again.
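For the record, two ReLU units combined linearly already represent XOR exactly, so the quoted claim doesn't hold; a tiny check:

import numpy as np

relu = lambda x: np.maximum(x, 0)
for a in (0, 1):
    for b in (0, 1):
        print(a, b, relu(a + b) - 2 * relu(a + b - 1))   # prints a XOR b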
AlphaGo as white won by 0.5 points.
On Tue, May 23, 2017 at 3:00 AM, Jim O'Flaherty
wrote:
> The announcer didn't have her mic on, so I couldn't hear the final score
> announced...
>
> So, what was the final score after the counting of AlphaGo-vs-Ke Jie Game
> #1?
>
>
ve, like self-atari, filling eye, breaking seki,
> if those moves do not change the result.
> So it is maybe not good at ownership map.
>
> Thanks,
> Hiroshi Yamashita
>
>
> - Original Message - From: "Álvaro Begué"
> To: "computer-go"
> Sent
Dear Yamashita,
This is a great resource and I am very thankful that you made it available
to all of us.
One minor issue: It looks like "remove all dead stones" doesn't always
work. An example is 206_19_0114_2k_r16_add300_7/20170312_0018_05484.sgf,
where a black stone at P18 is left on the board.
For identifying points that look like eyes, it's useful to have a 16-bit
value at each position of the board that contains the colors of the 8
neighbors (2 bits per neighbor, with an encoding like 00=empty, 01=black,
10=white, 11=outside). You can maintain this incrementally when a point on
the board changes.
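A minimal sketch of computing that 16-bit code (my own illustration; a real engine would update it incrementally rather than recompute it):

EMPTY, BLACK, WHITE, OUTSIDE = 0, 1, 2, 3
N = 19

def neighbor_code(stones, x, y):
    # stones: dict mapping occupied (x, y) points to BLACK or WHITE.
    code = 0
    for dx, dy in ((-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < N and 0 <= ny < N:
            c = stones.get((nx, ny), EMPTY)
        else:
            c = OUTSIDE
        code = (code << 2) | c
    return code

When a stone is placed or removed, only the codes of its eight neighbors change, so the incremental update touches at most eight entries.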
Thank you, Gian-Carlo. I couldn't have said it better.
Álvaro.
On Wed, Mar 22, 2017 at 7:07 AM, Gian-Carlo Pascutto wrote:
> On 22-03-17 09:41, Darren Cook wrote:
> >> The issue with Japanese rules is easily solved by refusing to play
> >> under ridiculous rules. Yes, I do have strong opinion
I was thinking the same thing. You can easily equip the value network with
several outputs, corresponding to several settings of komi, then train as
usual.
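A minimal Keras sketch of the idea (the input planes, the trunk, and the set of komi values are all assumptions):

from tensorflow.keras import layers, Model, Input

KOMIS = [5.5, 6.5, 7.5]                      # one output per komi setting
x = Input(shape=(19, 19, 17))
h = layers.Conv2D(64, 3, padding='same', activation='relu')(x)
h = layers.GlobalAveragePooling2D()(h)
values = layers.Dense(len(KOMIS), activation='sigmoid', name='value_per_komi')(h)
model = Model(x, values)
# Each self-play game trains only the output that matches the komi it was played with.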
The issue with Japanese rules is easily solved by refusing to play under
ridiculous rules. Yes, I do have strong opinions. :)
Álvaro.
On T
Oh, you are using a value net? How did you train it? I don't see anything
about it in the bitbucket repository...
Álvaro.
P.S.- Sorry about the thread hijacking, everyone.
On Sat, Mar 4, 2017 at 4:29 AM, Detlef Schmicker wrote:
> I looked into this too:
>
> oakfoam would not benefit a lot fro
I should point out that Reinforcement Learning is a relatively unimportant
part of AlphaGo, according to the paper. They only used it to turn the
move-prediction network into a stronger player (presumably increasing the
weights of the layer before SoftMax would do most of the job, by making the
high-probability moves even more likely).
Thanks, Rémi!
-- Forwarded message --
From: Rémi Coulom
Date: Sun, Feb 12, 2017 at 4:24 AM
Subject: Playout policy optimization
To: Álvaro Begué
Hi Alvaro,
I cannot post to the list any more. Please forward this message to the list
if you can.
Your idea is simulation balancing.
Hi,
I remember an old paper by Rémi Coulom ("Computing Elo Ratings of Move
Patterns in the Game of Go") where he computed "gammas" (exponentials of
scores that you could feed to a softmax) for different move features, which
he fit to best explain the move probabilities from real games.
Similarly,
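The gist of that model, as a sketch (illustrative feature names and gamma values, not Rémi's numbers):

import numpy as np

gammas = {'atari': 2.5, 'extension': 1.8, 'first_line': 0.6}
moves = [['atari', 'first_line'], ['extension'], []]           # features of each candidate move
scores = np.array([np.prod([gammas[f] for f in fs]) for fs in moves])
probs = scores / scores.sum()   # product of gammas, i.e. a softmax over summed log-gammas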
If you like video commentary, Haylee has five game reviews, starting with
this one: https://www.youtube.com/watch?v=b_24iaUMRFs&t=1109s
You may also enjoy this lecture (probably best for kyu players):
https://www.youtube.com/watch?v=v8Eh41m7gVA (you may want to skip to around
9:00).
Enjoy,
Álvaro
If you are killed by an AI-driven car, the manufacturer will use the case
to improve the algorithm and make sure that this type of death never
happens again. Unfortunately a death by a drunk driver doesn't seem to
teach anyone anything and will keep happening as long as people need to
drive and alc
Would you be willing to make the executable of mfgo1998 available so we can
run it locally? Or even better, something with the same engine but which
speaks GTP?
Álvaro.
On Wed, Jan 4, 2017 at 10:32 PM, "Ingo Althöfer" <3-hirn-ver...@gmx.de>
wrote:
> Like in every year, the reminder on my Han
On Sun, Dec 11, 2016 at 4:50 PM, Rémi Coulom wrote:
> It makes the policy stronger because it makes it more deterministic. The
> greedy policy is way stronger than the probability distribution.
>
I suspected this is what it was mainly about. Did you run any experiments
to see if that explains th
Start by computing a "normal" amount of time to spend, using the kinds of
rules described by others in this thread.
Since you are using MCTS, you may want to experiment with spending more
time if the move with the best score is not the one that has been explored
the most, since that probably signals that the search has not settled yet.
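A sketch of that check (illustrative field names):

def should_extend(root_children):
    # Keep searching when the most-visited child and the best-scoring child disagree.
    most_visited = max(root_children, key=lambda c: c.visits)
    best_scoring = max(root_children, key=lambda c: c.value)
    return most_visited is not best_scoring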
One could use a curve to map the MC winning rate to an actual winning
probability. It would take only thousands of games to learn such a curve
(as opposed to the 30 million games used to train the value network in
AlphaGo).
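One simple way to learn such a curve from finished games (a sketch with placeholder data; real data would pair the engine's reported winrate with the actual result of each game):

import numpy as np

mc_winrate = np.random.rand(5000)                         # placeholder: engine's MC winrate
won = (np.random.rand(5000) < mc_winrate).astype(float)   # placeholder: actual results
bins = np.linspace(0.0, 1.0, 21)
idx = np.digitize(mc_winrate, bins) - 1
curve = [won[idx == b].mean() if np.any(idx == b) else np.nan for b in range(20)]
# curve[b] estimates the true winning probability for winrates falling in bin b.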
Álvaro.
On Wed, Aug 31, 2016 at 8:24 PM, Dan Schmidt wrote:
> Hi Andrew
There are situations where carefully crafting the minibatches makes sense.
For instance, if you are training an image classifier it is good to build
the minibatches so the classes are evenly represented. In the case of
predicting the next move in go I don't expect this kind of thing will make
much difference.
I don't understand the point of using the deeper network to train the
shallower one. If you had enough data to be able to train a model with many
parameters, you have enough to train a model with fewer parameters.
Álvaro.
On Sun, Jun 12, 2016 at 5:52 AM, Michael Markefka <
michael.marke...@gmail
Disclaimer: I haven't actually implemented MCTS with NNs, but I have played
around with both techniques.
Would it make sense to artificially scale down the values before the
SoftMax is applied, so the probability distribution is not as concentrated,
and unlikely moves are not penalized as much?
I just saw the video here: https://www.youtube.com/watch?v=ZdrV2H5zIOM
It's fun to hear the pro making comments as she goes. I had hoped for a
better game, though.
Any comments from the CS camp?
Thanks,
Álvaro.
On Mon, May 16, 2016 at 3:58 AM, Xavier Combelle
wrote:
> That's fantastic
>
> I
What are you doing that uses so much disk space? An extremely naive
computation of required space for what you are doing is:
30M samples * (42 input planes + 1 output plane)/sample * 19*19
floats/plane * 4 bytes/float = 1.7 TB
So that's cutting it close, but I think the inputs and outputs are all
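The arithmetic behind that estimate:

samples, planes, points, bytes_per_float = 30_000_000, 42 + 1, 19 * 19, 4
print(samples * planes * points * bytes_per_float / 2**40)   # ~1.69 TiB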
Hi,
I also did computer chess before go (and checkers before chess). I would
start with a straight-forward implementation and learn with it. If you end
up finding your board representation limiting, rewrite it.
Here's some code from my program:
int const N = 19;
int const XN = N + 2;
int const XN2 = XN * XN;  // assumed completion of the truncated line: board padded with a one-point border
A very simple-minded way of trying to identify what a particular neuron in
the upper layers is doing is to find the 50 positions in the database that
make it produce the highest activation values. If the neuron is in one of
the convolutional layers, you get a full 19x19 image of activation values,
> no lack of respect for DeepMind's achievement was contained in my
> posting; on the contrary, i was as surprised as anyone at how well she
> did and it gave me great pause for thought.
>
Well, you wrote this:
> but convolutional neural networks and monte-carlo simulators have not
> advanced the
I have used TensorFlow to train a CNN that predicts the next move, with a
similar architecture to what others have used (1 layer of 5x5 convolutions
followed by 10 more layers of 3x3 convolutions, with 192 hidden units per
layer and ReLU activation functions) but with much simpler inputs. I found
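A Keras sketch of that architecture (the number of input planes is an assumption, since the post only says the inputs were much simpler):

from tensorflow.keras import layers, Model, Input

x = Input(shape=(19, 19, 8))                                    # assumed plane count
h = layers.Conv2D(192, 5, padding='same', activation='relu')(x)
for _ in range(10):
    h = layers.Conv2D(192, 3, padding='same', activation='relu')(h)
h = layers.Conv2D(1, 1)(h)                                      # one score per point
move_probs = layers.Softmax()(layers.Flatten()(h))              # distribution over the 361 points
model = Model(x, move_probs)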
On Tue, Mar 22, 2016 at 1:40 PM, Nick Wedd wrote:
> On 22 March 2016 at 17:20, Álvaro Begué wrote:
>
>> A very simple-minded analysis is that, if the null hypothesis is that
>> AlphaGo and Lee Sedol are equally strong, AlphaGo would do as well as we
>> observed or be
A very simple-minded analysis is that, if the null hypothesis is that
AlphaGo and Lee Sedol are equally strong, AlphaGo would do as well as we
observed or better 15.625% of the time. That's a p-value that even social
scientists don't get excited about. :)
Álvaro.
On Tue, Mar 22, 2016 at 12:48 PM
Actually the DCNN plays on 9x9 acceptably well (somewhere in the
single-digit kyus).
On Friday, March 18, 2016, Benjamin Teuber wrote:
> This is really cool. Now it just needs to learn 9x9 via reinforcement
> learning ;-)
>
> Josef Moudrik > schrieb am Fr.,
> 18. März 2016 10:21:
>
>> Aha! Than
A while back somebody posted a link to a browser implementation of a DCNN:
https://chrisc36.github.io/deep-go/
Would something like that do?
Álvaro.
On Wed, Mar 16, 2016 at 4:44 PM, Benjamin Teuber wrote:
> Hi everyone,
>
> for a Go beginner website I would like to have a bot that runs in
>
I have experimented with a CNN that predicts ownership, but I found it to
be too weak to be useful. The main difference between what Google did and
what I did is in the dataset used for training: I had tens of thousands of
games (I did several different experiments) and I used all the positions
from each game.
You could express the intended move as a pair of real numbers. A random
offset is then added, following some probability distribution (Gaussian, or
uniform in a disk of a certain radius, or ...), and then the result is
rounded to the nearest point of integer coordinates. What possibilities
does thi
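A sketch of the mechanism (Gaussian offset here; a uniform disk works the same way):

import random

def perturb_move(x, y, sigma=0.5):
    # Intended move as a pair of reals; add a random offset and round to a grid point.
    px = round(x + random.gauss(0, sigma))
    py = round(y + random.gauss(0, sigma))
    return px, py   # may land off the board or on an illegal point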
tion error of the eventual
> result.
> > I'm currently not able to look at the paper, but couldn't you use a
> > softmax output layer with two nodes and take the probability
> > distribution as winrate?
> >
> > On Thu, Feb 4, 20
I am not sure how exactly they define MSE. If you look at the plot in
figure 2b, the MSE at the very beginning of the game (where you can't
possibly know anything about the result) is 0.50. That suggests it's
something other than your [very sensible] interpretation.
Álvaro.
On Thu, Feb 4, 2016 a
sed data set:
> in the referred chapter they state, they have used their kgs dataset
> in a first try (which is in another part of the paper referred to
> being a 6d+ data set).
>
> Am 04.02.2016 um 18:11 schrieb Álvaro Begué:
> > The positions they used are not from high-quality gam
The positions they used are not from high-quality games. They actually
include one last move that is completely random.
Álvaro.
On Thursday, February 4, 2016, Detlef Schmicker wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Hi,
>
> I try to reproduce numbers from section 3: traini
I searched for the file name on the web and found this copy:
http://airesearch.com/wp-content/uploads/2016/01/deepmind-mastering-go.pdf
Álvaro.
On Wed, Feb 3, 2016 at 4:37 AM, Oliver Lewis wrote:
> Is the paper still available for download? The direct link appears to be
> broken.
>
> Thanks
>
Aja,
I read the paper with great interest. [Insert appropriate praises here.]
I am trying to understand the part where you use reinforcement learning to
improve upon the CNN trained by imitating humans. One thing that is not
explained is how to determine that a game is over, particularly when a
p
How about you read the paper first? The conversation would make much more
sense if you actually spent some time trying to understand the details of
what they did. :) <-- (mandatory smiley to indicate I am not upset or
anything)
On Sun, Jan 31, 2016 at 10:20 AM, Greg Schmidt
wrote:
> The articl
It's in the paper: "ladder capture" and "ladder escape" are features that
are fed as inputs into the CNN.
Álvaro.
On Wed, Jan 27, 2016 at 6:03 PM, Ryan Grant wrote:
> To the authors: Did the deep-NN architecture learn ladders on its own,
> or was any extra ladder-evaluation code added to the
Yes, it has been:
http://computer-go.org/pipermail/computer-go/2015-November/008267.html
Are there any news on Google's efforts?
Álvaro.
On Wed, Jan 27, 2016 at 10:10 AM, Richard Lorentz
wrote:
> Not sure if this has been posted here already or not:
> http://arxiv.org/abs/1511.06410
>
>
14 PM, Petr Baudis wrote:
> Hi!
>
> On Fri, Jan 15, 2016 at 04:54:18PM -0500, Álvaro Begué wrote:
> > In their standard configuration, MCTS engines will sometimes let lots of
> > groups die after they know the game is hopeless, or if they have a large
> > advantage
I understand that using games from humans to learn about life and death
introduces all sorts of biases. That's why I tried to use games from an
engine instead.
In their standard configuration, MCTS engines will sometimes let lots of
groups die after they know the game is hopeless, or if they have
> Regarding 9x9, I believe Alvaro Begue has explored this idea in a way
> which perhaps would work better in a go engine. He used pachi to generate a
> database of games by playing against itself and then trained a model in a
> similar fashion to what I did. I'm not sure about the results of his
>
I would much rather use a forum.
Álvaro.
On Fri, Jan 1, 2016 at 8:45 PM, Igor Polyakov
wrote:
> I think most people would rather use a forum than a mailing list. I myself
> didn't sign up for the list until I actually created a separate account for
> it. But if nobody posts on it other than t
nger playouts don't necessarily lead
> to a better evaluation function? (Yes, that's what playouts essentially are, a
> dynamic evaluation function.) This is even under the assumption that we can
> reach the same number of playouts per move.
>
>
> On 08 Dec 2015, at 10:21, Álvaro
I don't think the CPU-GPU communication is what's going to kill this idea.
The latency in actually computing the feed-forward pass of the CNN is going
to be on the order of 0.1 seconds (I am guessing here), which means
finishing the first playout will take many seconds.
So perhaps it would be inte
After reading the relevant code, I realized that val_scale=1.0 should do
precisely what I wanted. I have tested it a bit, and so far so good.
Thanks!
Álvaro.
On Tue, Nov 17, 2015 at 7:12 AM, Petr Baudis wrote:
> Hi!
>
> On Tue, Nov 17, 2015 at 07:05:34AM -0500, Álvaro Begué wrot
I wouldn't say they are "not compatible", since the move that maximizes
score is always in the top class (win>draw>loss) for any setting of komi.
You probably mean it in a practical sense, in that MCTS engines are
stronger when maximizing win probability.
I am more interested in attempting to maxi
n do by just looking at a group, with no search. But
it is very important that groups that are clearly dead end up dying and
groups that are clearly alive end up living.
On Tue, Nov 17, 2015 at 7:12 AM, Petr Baudis wrote:
> Hi!
>
> On Tue, Nov 17, 2015 at 07:05:34AM -0500, Álvaro Beg
Thanks for your answer.
Unfortunately Pachi doesn't seem to really try to maximize score, even with
these settings: Once one side has won by a large enough margin, it will
stop trying to kill small groups and I am precisely trying to generate a
database to learn about life and death. Perhaps I can
Hi,
I am trying to create a database of games to do some machine-learning
experiments. My requirements are:
* that all games be played by the same strong engine on both sides,
* that all games be played to the bitter end (so everything on the board
is alive at the end), and
* that both sides pl
Normalizing the probabilities and re-throwing the frisbee until it lands in
a valid move are equivalent, of course.
On Thu, Nov 12, 2015 at 5:01 AM, David Peters wrote:
> To keep changes to the protocol and number of parameters low, wouldn't it
> be a possibility to consider multiple 'throws'
Oh! You can have a continuous handicap control by giving the players
different epsilons. :)
On Wed, Nov 11, 2015 at 2:25 PM, John Tromp wrote:
> >> Would the game end after two unintentional passes?
>
> > Good point. In principle I would say so.
>
> That makes little sense to me.
> IMO, the pr
1/5 also seems natural (equal chance of hitting each of the 5 possible points).
Álvaro.
On Wed, Nov 11, 2015 at 10:08 AM, John Tromp wrote:
> > By the way: It would also be necessary to decide about
> > the eps for the event. Natural candidates would be
> > eps=0.1 or eps=0.125.
>
> I would say
I get 1107 (954 in the middle + 135 on the edge + 18 on a corner).
Álvaro.
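A brute-force check of that count (my own sketch): enumerate colorings of the 8 neighbors, treat off-board neighbors as a fixed fourth color, and reduce by the symmetries that keep the off-board region in place.

from itertools import product

OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           ( 0, -1),          ( 0, 1),
           ( 1, -1), ( 1, 0), ( 1, 1)]

def symmetries():
    # The 8 rotations/reflections of the square as coordinate maps.
    maps = []
    for r in range(4):
        for f in (False, True):
            def m(p, r=r, f=f):
                x, y = p
                for _ in range(r):
                    x, y = y, -x
                return (x, -y) if f else (x, y)
            maps.append(m)
    return maps

def count(outside):
    syms = [m for m in symmetries() if {m(o) for o in outside} == outside]
    free = [o for o in OFFSETS if o not in outside]
    seen = set()
    for colors in product(range(3), repeat=len(free)):    # 0=empty, 1=black, 2=white
        board = dict(zip(free, colors))
        board.update({o: 3 for o in outside})             # 3 = off the board
        seen.add(min(tuple(board[m(o)] for o in OFFSETS) for m in syms))
    return len(seen)

middle = count(set())
edge   = count({(-1, -1), (-1, 0), (-1, 1)})
corner = count({(-1, -1), (-1, 0), (-1, 1), (0, -1), (1, -1)})
print(middle, edge, corner, middle + edge + corner)       # 954 135 18 1107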
On Tue, Nov 3, 2015 at 2:00 PM, Detlef Schmicker wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Thanks, but I need them reduced by reflection and rotation symmetries
> (and leave the center empty so 3^8 +
If you're only getting 1000 table generations a second, you should look
> into your algorithm. You should get at least 100,000 table generations a
> second!
>
>
> On 2015-10-16 7:21, Álvaro Begué wrote:
>
> That sounds kind of obsessive. I think the probability of having a 0
uring each bit has a similar ratio of occurrences,
> or bruteforcing it.
>
> On 16/10/2015 14:51, Álvaro Begué wrote:
>
>> Btw does anyone have a good initialization vector for the Zobrist table?
>>>
>> The obvious thing to try is random numbers. Another idea is turnin
> Btw does anyone have a good initialization vector for the Zobrist table?
The obvious thing to try is random numbers. Another idea is turning your
Zobrist key into CRC64, which I think is what you get if you generate your
numbers like this:
#include
int main() {
unsigned long long const P =
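The truncated snippet presumably filled the table with successive powers of x modulo a degree-64 polynomial over GF(2); here is a hedged reconstruction of that idea (the polynomial is the CRC-64/ECMA one, not necessarily the constant from the original post):

P = 0x42F0E1EBA9EA3693            # CRC-64/ECMA polynomial (assumed; original constant unknown)
MASK = (1 << 64) - 1
table, x = [], 1
for _ in range(19 * 19 * 2):      # one entry per (point, color)
    table.append(x)
    top = x >> 63
    x = ((x << 1) & MASK) ^ (P if top else 0)   # multiply by x modulo P(x)
# XOR-ing the entries for the occupied (point, color) slots then behaves like a
# CRC-64 of the occupancy vector.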
Could you please stop posting your videos to this list? I find nothing of
value in them. If others disagree, please speak up.
Álvaro.
On Thu, Sep 3, 2015 at 11:31 PM, djhbrown . wrote:
>
>
> https://www.youtube.com/watch?v=IoO7Nhlf_k4&list=PL4y5WtsvtduqNW0AKlSsOdea3Hl1X_v-S&index=10
>
> Pleas
If your PRNG is consuming 40% of your CPU time, your playouts are too light.
Anyway, it's very easy to make a fast PRNG these days. The first thing that
comes to mind is a 64-bit linear congruential generator of which you use
the middle bits, or you can XOR the high 32 bits and the low 32 bits
together.
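A sketch of that kind of generator (the multiplier and increment are Knuth's MMIX LCG constants; the output mixing is the XOR-the-halves variant mentioned above):

MASK64 = (1 << 64) - 1

class LCG64:
    def __init__(self, seed=1):
        self.state = seed & MASK64
    def next32(self):
        self.state = (self.state * 6364136223846793005 + 1442695040888963407) & MASK64
        return (self.state >> 32) ^ (self.state & 0xFFFFFFFF)   # XOR high and low halves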
One normally checks superko in the UCT tree but not in the playouts.
On Sat, Mar 28, 2015 at 11:53 AM, hughperkins2
wrote:
> Oh wait, superko check is not that cheap, but it is so rare, you can
> probably ignore it in playouts, and just check before submitting a move to
> the server. If its su
I am not sure I understand the question. The only thing that is typically
not checked in the playouts is superko. What other "validity checks" are
you performing?
Álvaro.
On Sat, Mar 28, 2015 at 9:54 AM, holger krekel wrote:
> On Sat, Mar 28, 2015 at 08:51 +0100, folkert wrote:
> > Hi,
> >
>
On Fri, Mar 20, 2015 at 8:24 PM, Hugh Perkins wrote:
> On 1/12/15, Álvaro Begué wrote:
> > A CNN that starts with a board and returns a single number will typically
> > have a few fully-connected layers at the end. You could make the komi an
> > extra input in the first on
The human brain is not the most powerful AI, because it fails the "A" test.
I suspect bootstrapping is not very hard. I have recently written a Spanish
checkers program starting with no knowledge and I got it to play top-human
level checkers within a few weeks.
You can build a database of games a
You can keep track of "pseudo-liberties", where you count a liberty
multiple times if it is adjacent to multiple stones in a chain. That seems
to be the easiest way to implement it, although a serious program will
eventually need the actual liberty count, so perhaps you should just do
that from the start.
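A sketch of the difference: pseudo-liberties count an empty point once per adjacent stone of the chain, so no deduplication is needed when merging chains or updating after a move.

def pseudo_liberties(stones, chain_points, N=19):
    # stones: dict of occupied points; chain_points: the points of one chain.
    count = 0
    for (x, y) in chain_points:
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if 0 <= nx < N and 0 <= ny < N and (nx, ny) not in stones:
                count += 1                 # counted once per adjacent stone, by design
    return count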
Ko is not missing: it is a particular case of the prohibition on repeating
positions. Making suicide illegal is an easy patch.
Álvaro.
On Wed, Mar 11, 2015 at 7:08 AM, folkert wrote:
> Hi,
>
> After 3 years of not working on my Go software, I decided to tinker
> again a bit on it.
> First thing