Re: [Computer-go] New paper by DeepMind

2018-12-09 Thread uurtamo
Thank you for this clarification,

s.

On Sat, Dec 8, 2018, 6:51 PM 甲斐徳本 wrote:
> Those are points that are commonly not well understood.
>
> A patent application does two things: 1. it applies for the eventual
> granting of the patent, and 2. it makes what's described in it public
> knowledge as of the date of filing.
> A patent may be functionally meaningless.  There may be no one to sue.  And
> these are huge issues for point No. 1.  However, strategic patent
> applicants file patent applications for point No. 2, to deny any
> possibility of somebody else obtaining a patent.  (Public knowledge
> cannot be patented.)
>
> Many companies are trying to figure out how to patent DCNN-based AI, and
> Google may be saying "Nope, as long as it is like the DeepMind method, you
> can't patent it."   Google is likely NOT saying "We are hoping to obtain
> the patent, and intend to enforce it."
>
> Despite many differences in patent law from one country to another, two
> basic purposes of patents are universal: 1. to protect the inventor, and 2.
> to promote the use of inventions by making the details public knowledge.
>
>
>
>
> On Sat, Dec 8, 2018 at 12:47 AM uurtamo  wrote:
>
>> What I'm saying is that the patent is functionally meaningless. Who is
>> there to sue?
>>
>> Moreover, there is no enforceable patent on the broad class of algorithms
>> that could reproduce these results. No?
>>
>> s.
>>
>> On Fri, Dec 7, 2018, 4:16 AM Jim O'Flaherty wrote:
>>
>>> Tysvm for the clarification, Tokumoto.
>>>
>>> On Thu, Dec 6, 2018, 8:02 PM 甲斐徳本 wrote:
 What's insane about it?
 To me, what Jim O'Flaherty stated is common sense in the field of
 patents, and any patent attorney would attest to that.  If I may add, Jim's
 last sentence should read "Google's patent application" instead of
 "Google's patent".  The difference is huge, and this may be at the heart of
 the issue, which is not well understood by the general public.

 In other words, thousands of patent applications are filed in the world
 without any hope of the patent eventually being granted, to establish
 "prior art" thereby protecting what's described in it from being patented
 by somebody else.

 Or, am I responding to a troll?

 Tokumoto


 On Fri, Dec 7, 2018 at 10:01 AM uurtamo  wrote:

> You're insane.
>
> On Thu, Dec 6, 2018, 4:13 PM Jim O'Flaherty <
> jim.oflaherty...@gmail.com wrote:
>
>> Remember, patents are a STRATEGIC mechanism as well as a legal
>> mechanism. As soon as a patent is publicly filed (for example, as a
>> utility filing following a provisional), the text and claims in the patent
>> immediately become prior art globally as of the original filing date
>> REGARDLESS of whether the patent is eventually approved or rejected.
>> IOW, a patent filing is a mechanism to ensure no one else can make a
>> similar claim without risking this filing being used as a possible
>> prior art refutation.
>>
>> I know this only because it is a strategy option my company is using
>> in an entirely different, unrelated domain. The patent filing is defensive,
>> such that someone else cannot make a claim and take our inventions away
>> from us just because they coincidentally hit near our inventions.
>>
>> So considering Google's past and their participation in the OIN, it
>> is very likely Google's patent filing is ensuring the ground all around
>> this area is sufficiently salted to stop anyone from attempting to
>> exploit nearby patent claims.
>>
>>
>> Respectfully,
>>
>> Jim O'Flaherty
>>
>>
>> On Thu, Dec 6, 2018 at 5:44 PM Erik van der Werf <
>> erikvanderw...@gmail.com> wrote:
>>
>>> On Thu, Dec 6, 2018 at 11:28 PM Rémi Coulom 
>>> wrote:
>>>
 Also, the AlphaZero algorithm is patented:

 https://patentscope2.wipo.int/search/en/detail.jsf?docId=WO2018215665

>>>
>>> So far it just looks like an application (and I don't think it will
>>> be difficult to oppose, if you care about this)
>>>
>>> Erik
>>>
>>> ___
>>> Computer-go mailing list
>>> Computer-go@computer-go.org
>>> http://computer-go.org/mailman/listinfo/computer-go
>>

Re: [Computer-go] New paper by DeepMind

2018-12-09 Thread Jim O'Flaherty
Tysvm for your excellent explanation.

And now you can see why I mentioned Google's being a member of OIN as a
critical distinction. It strongly increases the weight of purpose 2 and
implicitly reduces the motivation for purpose 1.


On Sat, Dec 8, 2018, 8:51 PM 甲斐徳本 wrote:

> [snip]

Re: [Computer-go] New paper by DeepMind

2018-12-09 Thread David Doshay via Computer-go
Another very important aspect of this discussion is that the US patent office
changed to a ‘first to file’ method of prioritizing patent rights. This
encouraged several patent trolls to try to undercut the true inventors. So, it
is now more important to file for defensive purposes, just to ensure that deep
pockets like Alphabet do not have to pay royalties to others for their own
inventions.

Many years ago, when I worked at NASA, we were researching doing a patent
filing for an image processing technique so that we could release it for
public domain use. We found that someone had successfully obtained a patent
for using a bitmap to represent a black-and-white image! It might well have
been possible to argue successfully in court that this is obvious to anyone in
the industry and thus should not have been granted a patent, but it would be
costly and a bother to have to do so. Likewise for a deep pocket like
Alphabet, who would be an obvious target for patent trolling if they did not
get this technique labeled as public knowledge quickly enough.

Cheers,
David

> On Dec 9, 2018, at 8:30 AM, Jim O'Flaherty wrote:
>
> [snip]

Re: [Computer-go] New paper by DeepMind

2018-12-09 Thread uurtamo
So published prior art isn't a defense? It's pretty widely publicized what
they did and how.

The problem I have with most tech patents is when they're overly broad.

s.

On Sun, Dec 9, 2018, 9:11 AM David Doshay via Computer-go <computer-go@computer-go.org> wrote:

> [snip]

[Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread cody2007 via Computer-go
Hi all,

I've posted an implementation of the AlphaZero algorithm and a brief tutorial.
The code runs on a single GPU. While performance is not that great, I suspect
it's mostly been limited by hardware (my training and evaluation have been on
a single Titan X). The network can beat GNU Go about 50% of the time, although
it "abuses" the scoring a little bit, which I talk a little more about in the
article:

https://medium.com/@cody2007.2/alphazero-implementation-and-tutorial-f4324d65fdfc
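For readers skimming the article, the move-selection rule at the heart of the AlphaZero search (PUCT) can be sketched as below. This is a generic illustration, not code from the linked implementation; the function and field names are made up:

```python
import math

def puct_select(children, c_puct=1.5):
    """Return the child maximizing Q(s,a) + U(s,a), the PUCT rule.

    Each child is a dict with visit count "N", total value "W", and
    network prior "P" (illustrative field names, not the post's code).
    """
    total_n = sum(ch["N"] for ch in children)

    def score(ch):
        q = ch["W"] / ch["N"] if ch["N"] else 0.0  # mean value so far
        # Exploration bonus: high prior, few visits => large bonus.
        u = c_puct * ch["P"] * math.sqrt(total_n + 1) / (1 + ch["N"])
        return q + u

    return max(children, key=score)
```

An unvisited move with a strong prior outranks a well-explored mediocre one, which is what drives the search toward the network's suggestions early and toward measured value later.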

-Cody

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread Xavier Combelle
looks you made it work on a 7x7 19x19 would probably give better result
especially against yourself if you are a complete novice

for not cheating against gnugo, use --play-out-aftermath of gnugo parameter

If I don't mistake a competitive ai would need a lot more training such
what does leela zero https://github.com/gcp/leela-zero

On 10/12/2018 at 01:25, cody2007 via Computer-go wrote:

> [snip]

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread cody2007 via Computer-go
Thanks for your comments.

>Looks like you made it work on 7x7. 19x19 would probably give a better
>result, especially against yourself if you are a complete novice.
I'd expect that would make me win even more against the algorithm, since it
would explore a far smaller fraction of the search space, right?
Certainly something I'd be interested in testing, though I'd expect it to take
many more months of training. It would be interesting to see how much
performance falls apart, if at all.

>To avoid cheating against GNU Go, use GNU Go's --play-out-aftermath parameter.
Yep, I evaluate with that parameter. The problem is more that I only play 20
turns per player per game, and the network seems to like placing stones in
territories "owned" by the other player. My scoring system then no longer
counts that area as owned by the player. Probably playing out more turns
and/or using a more sophisticated scoring system would fix this.
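For reference, a bare-bones area scorer in the Tromp-Taylor style looks like the sketch below (illustrative, not the post's actual code). Because every stone is treated as alive, a stone dropped into the opponent's territory makes the surrounding empty region border both colors and score for neither, which is exactly the distortion described above:

```python
def area_score(board):
    """Tromp-Taylor-style area scoring for a square board.

    `board` is a list of lists holding "b", "w", or "." for empty.
    Every stone is assumed alive: each stone scores one point, and an
    empty region scores for a color only if it borders stones of that
    color alone. Returns (black_points, white_points).
    """
    n = len(board)
    score = {"b": 0, "w": 0}
    seen = set()
    for r in range(n):
        for c in range(n):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1          # stones count as area
            elif (r, c) not in seen:
                # Flood-fill the empty region, recording bordering colors.
                region, borders, stack = 0, set(), [(r, c)]
                seen.add((r, c))
                while stack:
                    y, x = stack.pop()
                    region += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < n and 0 <= nx < n:
                            v = board[ny][nx]
                            if v == "." and (ny, nx) not in seen:
                                seen.add((ny, nx))
                                stack.append((ny, nx))
                            elif v in score:
                                borders.add(v)
                if len(borders) == 1:     # territory of exactly one color
                    score[borders.pop()] += region
    return score["b"], score["w"]
```

Playing the game out further (so hopeless invading stones actually get captured) or marking dead stones before counting would both address the problem.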

>If I'm not mistaken, a competitive AI would need a lot more training, such as
>what Leela Zero does: https://github.com/gcp/leela-zero
Yeah, I agree more training is probably the key here. I'll take a look at
Leela Zero.

‐‐‐ Original Message ‐‐‐
On Sunday, December 9, 2018 7:41 PM, Xavier Combelle
<xavier.combe...@gmail.com> wrote:

> [snip]

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread Dani
Thanks for the tutorial! I have some questions about training:

a) Do you use Dirichlet noise during training, and if so, is it limited to the
first 30 or so plies (the opening phase in chess)?
The AlphaZero paper is not clear about it.

b) Do you need to shuffle batches if you are doing one epoch? Also, after
generating game positions from each game, do you shuffle those positions? I
found the latter to be very important to avoid overfitting.

c) Do you think there is a problem with using the Adam optimizer instead of
SGD with learning-rate drops?
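For what it's worth on (a): as far as the AlphaGo Zero and AlphaZero papers state, the Dirichlet noise is mixed into the prior probabilities at the root of every self-play search, not only the first 30 plies; the 30-move limit applies to the move-sampling temperature. A minimal sketch of the mixing step, P'(a) = (1 - eps) * P(a) + eps * Dir(alpha), built from normalized stdlib gamma draws (function name and defaults are illustrative; the papers use eps = 0.25, with alpha = 0.03 for Go and 0.3 for chess):

```python
import random

def add_root_noise(priors, alpha=0.03, eps=0.25, seed=None):
    """Mix Dirichlet noise into root priors:
    P'(a) = (1 - eps) * P(a) + eps * Dir(alpha).

    `priors` is the list of prior move probabilities at the root.
    A Dirichlet sample is obtained by normalizing Gamma(alpha, 1) draws.
    """
    rng = random.Random(seed)
    gammas = [rng.gammavariate(alpha, 1.0) for _ in priors]
    total = sum(gammas)
    noise = [g / total for g in gammas]
    return [(1.0 - eps) * p + eps * n for p, n in zip(priors, noise)]
```

With a small alpha the noise concentrates on a few random moves, so occasionally an otherwise-ignored move gets enough root prior to be explored during self-play.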

Daniel

On Sun, Dec 9, 2018 at 6:23 PM cody2007 via Computer-go <computer-go@computer-go.org> wrote:

> [snip]

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread uurtamo
By the way, why only 40 moves? That seems like the wrong place to
economize, but maybe on 7x7 it's fine?

s.

On Sun, Dec 9, 2018, 5:23 PM cody2007 via Computer-go <computer-go@computer-go.org> wrote:

> [snip]

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread uurtamo
A "scoring estimate" by definition should be weaker than the computer
players it's evaluating until there are no more captures possible.

Yes?

s.

On Sun, Dec 9, 2018, 5:49 PM uurtamo  By the way, why only 40 moves? That seems like the wrong place to
> economize, but maybe on 7x7 it's fine?
>
> s.
>
> On Sun, Dec 9, 2018, 5:23 PM cody2007 via Computer-go <
> computer-go@computer-go.org wrote:
>
>> Thanks for your comments.
>>
>> >looks you made it work on a 7x7 19x19 would probably give better result
>> especially against yourself if you are a complete novice
>> I'd expect that'd make me win even more against the algorithm since it
>> would explore a far smaller amount of the search space, right?
>> Certainly something I'd be interested in testing though--I just would
>> expect it'd take many months more months of training however, but would be
>> interesting to see how much performance falls apart, if at all.
>>
>> >for not cheating against gnugo, use --play-out-aftermath of gnugo
>> parameter
>> Yep, I evaluate with that parameter. The problem is more that I only play
>> 20 turns per player per game. And the network seems to like placing stones
>> in terrotories "owned" by the other player. My scoring system then no
>> longer counts that area as owned by the player. Probably playing more turns
>> out and/or using a more sophisticated scoring system would fix this.
>>
>> >If I don't mistake a competitive ai would need a lot more training such
>> what does leela zero https://github.com/gcp/leela-zero
>> Yeah, I agree more training is probably the key here. I'll take a look at
>> leela-zero.
>>
>> ‐‐‐ Original Message ‐‐‐
>> On Sunday, December 9, 2018 7:41 PM, Xavier Combelle <
>> xavier.combe...@gmail.com> wrote:
>>
>> looks you made it work on a 7x7 19x19 would probably give better result
>> especially against yourself if you are a complete novice
>>
>> for not cheating against gnugo, use --play-out-aftermath of gnugo
>> parameter
>>
>> If I don't mistake a competitive ai would need a lot more training such
>> what does leela zero https://github.com/gcp/leela-zero
>> Le 10/12/2018 à 01:25, cody2007 via Computer-go a écrit :
>>
>> Hi all,
>>
>> I've posted an implementation of the AlphaZero algorithm and brief
>> tutorial. The code runs on a single GPU. While performance is not that
>> great, I suspect it's mostly been limited by hardware (my
>> training and evaluation has been on a single Titan X). The network can beat
>> GNU go about 50% of the time, although it "abuses" the scoring a little
>> bit--which I talk a little more about in the article:
>>
>>
>> https://medium.com/@cody2007.2/alphazero-implementation-and-tutorial-f4324d65fdfc
>>
>> -Cody
>>
>> ___
>> Computer-go mailing list
>> Computer-go@computer-go.org
>> http://computer-go.org/mailman/listinfo/computer-go
>>

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread cody2007 via Computer-go
>By the way, why only 40 moves? That seems like the wrong place to economize, 
>but maybe on 7x7 it's fine?
I haven't implemented any resign mechanism, so felt it was a reasonable balance 
to at least see where the players roughly stand. Although, I think I erred on 
too few turns.

>A "scoring estimate" by definition should be weaker than the computer players 
>it's evaluating until there are no more captures possible.
Not sure I understand entirely. But would agree that the scoring I use is 
probably a limitation here.

‐‐‐ Original Message ‐‐‐
On Sunday, December 9, 2018 8:51 PM, uurtamo  wrote:

> A "scoring estimate" by definition should be weaker than the computer players 
> it's evaluating until there are no more captures possible.
>
> Yes?
>
> s.
>
On Sun, Dec 9, 2018, 5:49 PM uurtamo wrote:
>> By the way, why only 40 moves? That seems like the wrong place to economize, 
>> but maybe on 7x7 it's fine?
>>
>> s.

[Computer-go] Fw: Re: AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread cody2007 via Computer-go
(resending because I forgot to send this to the mailing list originally)
‐‐‐ Original Message ‐‐‐
On Sunday, December 9, 2018 8:59 PM, cody2007  wrote:

> Hi Daniel,
> Thanks for your thoughts/questions.
>
>>a) Do you use Dirichlet noise during training, if so is it limited to first 
>>30 or so plies ( which is the opening phase of chess) ?
>>The alphazero paper is not clear about it.
> I don't. I had implemented it at one point, but I wasn't sure my 
> implementation was correct (the paper is unclear on this), so I have 
> disabled it. Have you noticed the noise to be useful?
>
>>b) Do you need to shuffle batches if you are doing one epoch?
> To clarify: I run out 128 games in parallel (my batch size) to 20 
> turns/player (with 1000 simulations). So I get 20 turns times 128 games. I 
> train on *all* turns in random order, so 20 gradient descent steps. 
> Otherwise, I'd be training on turn 1, 2, 3... in order. Then I repeat after 
> the gradient steps with new simulations. Is that what you mean by epoch?
>
> I've been concerned I could be overfitting by doing all 20 turns so have been 
> running a test where I randomly select 10 and discard the rest. So far, no 
> difference in performance. Could be I haven't trained out enough to see a 
> difference or that I'd have to reduce the turn sampling even further (or pool 
> the training examples over larger numbers of games to randomize further, 
> which I think the AlphaGo papers did, if I recall).
>
>>do you shuffle those positions? I found the latter to be very important to 
>>avoid overfitting.
> If you mean random rotations and reflections, yes, I do.
>
>>c) Do you think there is a problem with using Adam Optimizer instead of SGD 
>>with learning rate drops?
> I haven't tried--have you? In other domains (some stuff I've done with A3C 
> objectives), I felt like it was unstable in my hands--but maybe that's just 
> me. (A3C in general has been unstable for me on other games I've tried, 
> which is partly why I've gone down the route of exploring the AlphaGo 
> approach.)
>
> Have you written up anything about your experiments?
>
> ‐‐‐ Original Message ‐‐‐
> On Sunday, December 9, 2018 8:34 PM, Dani  wrote:
>
>> Thanks for the tutorial! I have some questions about training
>>
>> a) Do you use Dirichlet noise during training, if so is it limited to first 
>> 30 or so plies ( which is the opening phase of chess) ?
>> The alphazero paper is not clear about it.
>>
>> b) Do you need to shuffle batches if you are doing one epoch? Also after 
>> generating game positions from each game,
>> do you shuffle those positions? I found the latter to be very important to 
>> avoid overfitting.
>>
>> c) Do you think there is a problem with using Adam Optimizer instead of SGD 
>> with learning rate drops?
>>
>> Daniel
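For reference, the root-noise scheme from question (a) can be sketched in a few lines -- a toy illustration of the AlphaGo Zero recipe (Dirichlet alpha=0.03 for 19x19 Go, mixing weight epsilon=0.25, applied to the root priors only), not code from the implementation discussed here:

```python
import numpy as np

def add_dirichlet_noise(priors, alpha=0.03, epsilon=0.25):
    """Mix Dirichlet noise into the root move priors for exploration.
    alpha=0.03 and epsilon=0.25 are the values reported for Go in the
    AlphaGo Zero paper; the noise is applied at the search root only."""
    noise = np.random.dirichlet([alpha] * len(priors))
    return (1 - epsilon) * np.asarray(priors, dtype=float) + epsilon * noise

# The result is still a valid probability distribution over moves.
p = add_dirichlet_noise([0.5, 0.3, 0.2])
```

Because the noise is itself a distribution, the convex mixture always sums to one, so it can be fed straight back into the tree search as the new priors.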

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread uurtamo
Imagine that your score estimator has a better idea about the outcome of
the game than the players themselves.

Then you can build a stronger computer player with the following algorithm:
use the score estimator to pick the next move after evaluating all legal
moves, by evaluating their after-move scores.

If you use something like Tromp-Taylor (not sure what most people use
nowadays) then you can score it less equivocally.

Perhaps I was misunderstanding, but if not then this could be a somewhat
serious problem.

s
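The player described above -- evaluate every legal move with the score estimator and pick the best after-move score -- can be sketched like this. A toy illustration only: `play`, `score_estimate`, and the board representation are hypothetical stand-ins, not code from the project under discussion.

```python
def greedy_estimator_player(board, legal_moves, play, score_estimate):
    """Pick the legal move whose resulting position the score estimator
    rates highest for the player to move. If the estimator really were
    stronger than the players, this loop would be a stronger player."""
    best_move, best_score = None, float("-inf")
    for move in legal_moves:
        after = play(board, move)       # position after playing the move
        score = score_estimate(after)   # estimator's evaluation of it
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```

This is exactly why a score estimator weaker than the players is suspect: if it could reliably outrank them, one could use it directly as above.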


On Sun, Dec 9, 2018, 6:17 PM cody2007 wrote:

> >By the way, why only 40 moves? That seems like the wrong place to
> economize, but maybe on 7x7 it's fine?
> I haven't implemented any resign mechanism, so felt it was a reasonable
> balance to at least see where the players roughly stand. Although, I think
> I erred on too few turns.
>
> >A "scoring estimate" by definition should be weaker than the computer
> players it's evaluating until there are no more captures possible.
> Not sure I understand entirely. But would agree that the scoring I use is
> probably a limitation here.

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread cody2007 via Computer-go
Sorry, just to make sure I understand: your concern is the network may be 
learning from the scoring system rather than through the self-play? Or are you 
concerned the scoring is giving sub-par evaluations of games?

The scoring I use simply counts the number of stones each player has on the 
board, then adds a point for each unoccupied space that is completely 
surrounded by a single player's stones. It is simplistic, and I think it does 
give sub-par evaluations of who the winner is--and is definitely a potentially 
serious deterrent to getting better performance. How much, maybe a lot. What 
do you think?
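The scoring just described (stones plus empty regions bordered by only one colour, i.e. Tromp-Taylor-style area counting) can be sketched with a flood fill. A minimal sketch under an assumed board representation -- a dict mapping (row, col) to "b", "w", or "." -- not the project's actual code:

```python
def area_score(board):
    """Area scoring: each stone counts one point, and each empty region
    bordered by stones of only one colour counts for that colour.
    Dead stones are NOT removed first, which is exactly the limitation
    discussed in this thread. Returns (black_points, white_points)."""
    score = {"b": 0, "w": 0}
    for colour in board.values():
        if colour in score:
            score[colour] += 1
    seen = set()
    for pt, colour in board.items():
        if colour != "." or pt in seen:
            continue
        # Flood-fill this empty region, noting which colours border it.
        region, borders, stack = 0, set(), [pt]
        seen.add(pt)
        while stack:
            r, c = stack.pop()
            region += 1
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb not in board:
                    continue  # off the edge of the board
                if board[nb] == ".":
                    if nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
                else:
                    borders.add(board[nb])
        if len(borders) == 1:
            score[borders.pop()] += region
    return score["b"], score["w"]
```

An empty region touching both colours counts for neither, so stones dropped in the opponent's territory neutralize it unless the game is played out far enough to capture them.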

‐‐‐ Original Message ‐‐‐
On Sunday, December 9, 2018 9:31 PM, uurtamo  wrote:

> Imagine that your score estimator has a better idea about the outcome of the 
> game than the players themselves.
>
> Then you can build a stronger computer player with the following algorithm: 
> use the score estimator to pick the next move after evaluating all legal 
> moves, by evaluating their after-move scores.
>
> If you use something like Tromp-Taylor (not sure what most people use 
> nowadays) then you can score it less equivocally.
>
> Perhaps I was misunderstanding, but if not then this could be a somewhat 
> serious problem.
>
> s

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread cody2007 via Computer-go
Oh, I see. I believe I am, in fact, using Tromp-Taylor rules for scoring. I was 
unaware that that's what it was called.

‐‐‐ Original Message ‐‐‐
On Sunday, December 9, 2018 10:09 PM, cody2007  wrote:

> Sorry, just to make sure I understand: your concern is the network may be 
> learning from the scoring system rather than through the self-play? Or are 
> you concerned the scoring is giving sub-par evaluations of games?
>
> The scoring I use is to simply count the number of stones each player has on 
> the board. Then add a point for each unoccupied space that is surrounded 
> completely by each player. It is simplistic and I think it does give sub-par 
> evaluations of who is the winner--and definitely is a potentially serious 
> deterrent to getting better performance. How much, maybe a lot. What do you 
> think?

Re: [Computer-go] AlphaZero tensorflow implementation/tutorial

2018-12-09 Thread uurtamo
I haven't thought carefully about the 7x7 case, but on 19x19 I think it would
suffer from both problems -- you'd count dead stones as alive quite frequently,
and because you're cutting the game off early you might get wrong who has
actually won. That's why some people use less ambiguous definitions
of the end of the game.

Knowing when the game is unambiguously over and scoring it unambiguously
are good subroutines to have in your toolbelt.

s.
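As a concrete example of such a subroutine, the unambiguous Tromp-Taylor end condition (the game ends when both players pass consecutively) is trivial to check. A toy sketch, with a hypothetical move-history representation where the string "pass" denotes a pass:

```python
def game_over(move_history):
    """Tromp-Taylor end-of-game rule: the game is over once both players
    have passed consecutively. Until then, keep playing instead of
    stopping after a fixed number of turns."""
    return (len(move_history) >= 2
            and move_history[-1] == "pass"
            and move_history[-2] == "pass")
```

Scoring only once this condition holds avoids the fixed-20-turn cutoff discussed above, at the cost of longer self-play games.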

On Sun, Dec 9, 2018, 7:09 PM cody2007 wrote:

> Sorry, just to make sure I understand: your concern is the network may be
> learning from the scoring system rather than through the self-play? Or are
> you concerned the scoring is giving sub-par evaluations of games?
>
> The scoring I use is to simply count the number of stones each player has
> on the board. Then add a point for each unoccupied space that is surrounded
> completely by each player. It is simplistic and I think it does give
> sub-par evaluations of who is the winner--and definitely is a potentially
> serious deterrent to getting better performance. How much, maybe a lot.
> What do you think?