Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread uurtamo
I haven't thought clearly about the 7x7 case, but on 19x19 I think it would suffer from both problems -- you'd count dead stuff as alive quite frequently, and because you're cutting the game off early you might be getting the winner wrong. That's why some people use less ambiguous definit

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread cody2007 via Computer-go
Oh, I see. I believe I am, in fact, using Tromp-Taylor rules for scoring. I was unaware that that's what it was called.
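For readers who have not met the term, Tromp-Taylor scoring gives each player one point per stone on the board plus one point per empty intersection whose region touches only that player's stones. The following is a minimal sketch of that rule, not the code from the posted implementation; the board encoding (a list of lists with 0 = empty, 1 = black, 2 = white) and the function name are assumptions made for illustration.

```python
from collections import deque

# Hypothetical board encoding, chosen only for this sketch.
EMPTY, BLACK, WHITE = 0, 1, 2


def tromp_taylor_score(board):
    """Return (black_points, white_points) for a square board.

    Every stone counts for its owner, and an empty region counts for a
    colour only if that region borders stones of that colour alone.
    """
    n = len(board)
    score = {BLACK: 0, WHITE: 0}
    seen = set()
    for r in range(n):
        for c in range(n):
            colour = board[r][c]
            if colour != EMPTY:
                score[colour] += 1
                continue
            if (r, c) in seen:
                continue
            # Flood-fill this empty region, recording which colours it touches.
            region_size = 0
            borders = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                region_size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < n and 0 <= nx < n:
                        if board[ny][nx] == EMPTY:
                            if (ny, nx) not in seen:
                                seen.add((ny, nx))
                                queue.append((ny, nx))
                        else:
                            borders.add(board[ny][nx])
            if len(borders) == 1:
                score[borders.pop()] += region_size
    return score[BLACK], score[WHITE]
```

Note that this rule never removes dead stones, so a game that is cut off before captures are played out will count them for their owner.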

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread cody2007 via Computer-go
Sorry, just to make sure I understand: your concern is the network may be learning from the scoring system rather than through the self-play? Or are you concerned the scoring is giving sub-par evaluations of games? The scoring I use is to simply count the number of stones each player has on the

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread uurtamo
Imagine that your score estimator has a better idea about the outcome of the game than the players themselves. Then you can build a stronger computer player with the following algorithm: use the score estimator to pick the next move after evaluating all legal moves, by evaluating their after-move
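The algorithm described here amounts to a one-ply greedy search over the score estimator. A minimal sketch, assuming the game is supplied through three hypothetical callables (`legal_moves`, `play`, and `estimate_score`, none of which come from the posted code) and that the estimator returns a value from the point of view of the player who just moved:

```python
def pick_move_by_estimator(position, legal_moves, play, estimate_score):
    """Return the legal move whose resulting position the estimator rates highest.

    legal_moves(position)    -> iterable of moves           (hypothetical)
    play(position, move)     -> position after the move     (hypothetical)
    estimate_score(position) -> number, higher is better for the mover (hypothetical)
    """
    best_move = None
    best_value = float("-inf")
    for move in legal_moves(position):
        value = estimate_score(play(position, move))
        if value > best_value:
            best_move, best_value = move, value
    return best_move  # None if there are no legal moves
```

If a player built this way beats the trained network, the estimator knows more about the outcome than the players do, which is the situation the argument warns about.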

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread cody2007 via Computer-go
>By the way, why only 40 moves? That seems like the wrong place to economize, >but maybe on 7x7 it's fine? I haven't implemented any resign mechanism, so felt it was a reasonable balance to at least see where the players roughly stand. Although, I think I erred on the side of too few turns. >A "scoring e

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread uurtamo
A "scoring estimate" by definition should be weaker than the computer players it's evaluating until there are no more captures possible. Yes? s.

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread uurtamo
By the way, why only 40 moves? That seems like the wrong place to economize, but maybe on 7x7 it's fine? s.

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread Dani
Thanks for the tutorial! I have some questions about training:
a) Do you use Dirichlet noise during training? If so, is it limited to the first 30 or so plies (which is the opening phase of chess)? The AlphaZero paper is not clear about it.
b) Do you need to shuffle batches if you are doing one epoch
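For context on question (a), the root-noise step is usually written as P(a) = (1 - epsilon) * p(a) + epsilon * eta(a), with eta drawn from a symmetric Dirichlet distribution. Below is a minimal standard-library sketch of that mixing step. It is not taken from the posted implementation, the function name is hypothetical, and the defaults (alpha = 0.03, epsilon = 0.25) mirror the values published for Go in the AlphaGo Zero paper. It does not settle whether the noise should be limited to the early plies.

```python
import random


def add_dirichlet_noise(priors, alpha=0.03, epsilon=0.25, rng=random):
    """Mix symmetric Dirichlet(alpha) noise into a list of prior probabilities.

    A Dirichlet sample is obtained by normalising independent
    Gamma(alpha, 1) draws. `rng` may be the random module or a
    random.Random instance (useful for reproducible tests).
    """
    gammas = [rng.gammavariate(alpha, 1.0) for _ in priors]
    total = sum(gammas)
    if total == 0.0:
        # Extremely unlikely underflow with small alpha: leave priors unchanged.
        return list(priors)
    noise = [g / total for g in gammas]
    return [(1.0 - epsilon) * p + epsilon * n for p, n in zip(priors, noise)]
```

In a self-play loop this would be applied to the network's prior over legal moves at the root node only, before the tree search starts.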

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread cody2007 via Computer-go
Thanks for your comments. >looks you made it work on a 7x7 19x19 would probably give better result >especially against yourself if you are a complete novice I'd expect that'd make me win even more against the algorithm since it would explore a far smaller amount of the search space, right? Certa

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread Xavier Combelle
It looks like you made it work on 7x7. 19x19 would probably give better results, especially against yourself if you are a complete novice. To avoid cheating against GNU Go, use GNU Go's --play-out-aftermath option. If I'm not mistaken, a competitive AI would need a lot more training, such as what Leela Zero does

[Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread cody2007 via Computer-go
Hi all, I've posted an implementation of the AlphaZero algorithm and a brief tutorial. The code runs on a single GPU. While performance is not that great, I suspect it's mostly been limited by hardware (my training and evaluation have been on a single Titan X). The network can beat GNU