date:20080312

Re: [computer-go] Optimal explore rates for plain UCT

2008-03-12 Thread Jason House



On Mar 11, 2008, at 2:41 PM, Christoph Birk <[EMAIL PROTECTED]> wrote:


> On Tue, 11 Mar 2008, Don Dailey wrote:
>> I am going to keep the 25k playouts running and add a 10k play-out
>> version of UCT. I want to establish a standard testing size so that
>
> Great! That way Jason can also participate.

I will have my bot online either tonight or tomorrow night. The
machine I use for CGOS is out of date...

PS: Don, any luck fixing the client bug that knocks me offline after
one game?

> myCtest-10k-UCT has a long-term rating of about 1250.
> For the 50k version I have just started a test series that experiments
> with various thresholds before creating a new node.
>
> Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/


Re: [computer-go] Optimal explore rates for plain UCT

2008-03-12 Thread Tim Foden


Don Dailey wrote:

> I suggest
> exactly 25,000 play-outs that we should standardize on. 50,000 will
> tax my spare computer which I like to use for modest CGOS tests.
>
> If it is agreed, I will start a 25k test. My prediction is that this
> will finish around 1600 ELO on CGOS.

OK, I added Fluke to this (25k) test (twice), before I saw the later 
comment about using 10k too.


It's looking like your drdGeneric 25k bot is currently around 1475 (147 
games).


Fluke on the other hand looks to be settling at around 1300 (125 
games).  I feel that I've probably got a problem in my implementation!  
:)  (I've felt this for some time actually -- UCT never seemed to work 
well for me at all.)


Details of Fluke's UCT + Random playouts.

1. UCT constant, c = 0.25.  e.g. UCB value = averageScore + c * 
sqrt(log(n)/m).
2. New children are created once a node is visited 1 time (URd) or 2 
times (UR2).

3. Eye rule for random playouts:
  * Solid eyes (all 4 from same group).
  * False non-solid eyes (at least 50% of corners are of opposite colour).
4. Choosing legal moves for playouts:  1st probe is random, then scan.
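
A minimal sketch of the selection rule in item 1, for comparison with other implementations. It assumes n is the parent's visit count and m is the child's visit count (the post does not say), and that unvisited children are tried first. The function names are hypothetical and are not taken from Fluke.

```python
import math

def ucb_value(average_score, parent_visits, child_visits, c=0.25):
    """averageScore + c * sqrt(log(n) / m), with n = parent visits, m = child visits."""
    if child_visits == 0:
        return math.inf  # assumption: unvisited children are always tried first
    return average_score + c * math.sqrt(math.log(parent_visits) / child_visits)

def select_child(children, parent_visits, c=0.25):
    """children is a list of (wins, visits) pairs; returns the index to descend into."""
    def value(stats):
        wins, visits = stats
        average = wins / visits if visits else 0.0
        return ucb_value(average, parent_visits, visits, c)
    return max(range(len(children)), key=lambda i: value(children[i]))
```

With c = 0.25 the exploration term is small, so a child with a clearly better average keeps being selected until the log term has grown considerably.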

Is there anything else that's likely to be significant here?

I guess I'll let it play some more games and see where it ends up.

Cheers, Tim.

Re: [computer-go] Optimal explore rates for plain UCT

2008-03-12 Thread Don Dailey



Tim Foden wrote:
> Don Dailey wrote:
>> I suggest
>> exactly 25,000 play-outs that we should standardize on. 50,000 will
>> tax my spare computer which I like to use for modest CGOS tests.
>> If it is agreed, I will start a 25k test. My prediction is that this
>> will finish around 1600 ELO on CGOS.
> OK, I added Fluke to this (25k) test (twice), before I saw the later
> comment about using 10k too.
>
> It's looking like your drdGeneric 25k bot is currently around 1475 (147
> games).
>
> Fluke on the other hand looks to be settling at around 1300 (125
> games).  I feel that I've probably got a problem in my
> implementation!  :)  (I've felt this for some time actually -- UCT
> never seemed to work well for me at all.)
>
> Details of Fluke's UCT + Random playouts.
>
> 1. UCT constant, c = 0.25.  e.g. UCB value = averageScore + c *
> sqrt(log(n)/m).
> 2. New children are created once a node is visited 1 time (URd) or 2
> times (UR2).
> 3. Eye rule for random playouts:
>   * Solid eyes (all 4 from same group).
>   * False non-solid eyes (at least 50% of corners are of opposite
> colour).
> 4. Choosing legal moves for playouts:  1st probe is random, then scan.
>
> Is there anything else that's likely to be significant here?
 1.  My UCT constant is 1.0. My formula is
     averageScore + c * sqrt( (2.0 * log(n)) / (10.0 * m) );
 2.  New children are created when parent exceeds 100 visits.
 3.  I think the eye rule is the same (you state it differently, but I
believe it's the same.)
 4.  playouts are truly uniform random - yours are not. 

I think point 4 could be significant but I can't be sure.
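
To illustrate why point 4 might matter, here is a small sketch of how a "random probe, then scan" move picker can differ from a uniform one. It assumes the scan walks the points in a fixed order from the probe position and wraps around; the posts do not describe Fluke's scan in detail, so this is only one plausible reading, and all names are hypothetical.

```python
import random
from collections import Counter

def probe_then_scan(points, is_legal, rng):
    """One random probe, then a fixed-order scan (wrapping) to the first legal point."""
    start = rng.randrange(len(points))
    for offset in range(len(points)):
        candidate = points[(start + offset) % len(points)]
        if is_legal(candidate):
            return candidate
    return None  # no legal point at all

def uniform_legal(points, is_legal, rng):
    """Truly uniform choice among the legal points."""
    legal = [p for p in points if is_legal(p)]
    return rng.choice(legal) if legal else None

def scan_landing_counts(points, is_legal):
    """For every possible probe position, count which point the scan lands on."""
    counts = Counter()
    for start in range(len(points)):
        for offset in range(len(points)):
            candidate = points[(start + offset) % len(points)]
            if is_legal(candidate):
                counts[candidate] += 1
                break
    return counts

# Six points, of which only 0, 4 and 5 are legal.
legal = {0, 4, 5}
counts = scan_landing_counts(list(range(6)), lambda p: p in legal)
```

In this toy position, point 4 is reached from four of the six probe positions, while points 0 and 5 are reached from one each. A legal point that follows a run of illegal points is therefore picked more often than under a uniform choice.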

- Don





>
> I guess I'll let it play some more games and see where it ends up.
>
> Cheers, Tim.

Re: [computer-go] Optimal explore rates for plain UCT

2008-03-12 Thread Jason House




On Mar 12, 2008, at 2:17 PM, Don Dailey <[EMAIL PROTECTED]> wrote:




> Tim Foden wrote:
>> Don Dailey wrote:
>>> I suggest
>>> exactly 25,000 play-outs that we should standardize on. 50,000 will
>>> tax my spare computer which I like to use for modest CGOS tests.
>>> If it is agreed, I will start a 25k test. My prediction is that this
>>> will finish around 1600 ELO on CGOS.
>> OK, I added Fluke to this (25k) test (twice), before I saw the later
>> comment about using 10k too.
>>
>> It's looking like your drdGeneric 25k bot is currently around 1475 (147
>> games).
>>
>> Fluke on the other hand looks to be settling at around 1300 (125
>> games).  I feel that I've probably got a problem in my
>> implementation!  :)  (I've felt this for some time actually -- UCT
>> never seemed to work well for me at all.)
>>
>> Details of Fluke's UCT + Random playouts.
>>
>> 1. UCT constant, c = 0.25.  e.g. UCB value = averageScore + c *
>> sqrt(log(n)/m).
>> 2. New children are created once a node is visited 1 time (URd) or 2
>> times (UR2).
>> 3. Eye rule for random playouts:
>>   * Solid eyes (all 4 from same group).
>>   * False non-solid eyes (at least 50% of corners are of opposite
>> colour).
>> 4. Choosing legal moves for playouts:  1st probe is random, then scan.
>>
>> Is there anything else that's likely to be significant here?
>
> 1.  My UCT constant is 1.0  - my formula is  averageScore + c * sqrt(
> (2.0 * log(n)) / (10.0 * m) );

Looks like your constant would be 1/sqrt(5) in his formula... Or about
what he uses.

> 2.  New children are created when parent exceeds 100 visits.
> 3.  I think the eye rule is the same (you state it differently, but I
> believe it's the same.)
> 4.  playouts are truly uniform random - yours are not.
>
> I think point 4 could be significant but I can't be sure.

I haven't seen much of an impact from that.

> - Don
>
>> I guess I'll let it play some more games and see where it ends up.
>>
>> Cheers, Tim.

Re: [computer-go] Optimal explore rates for plain UCT

2008-03-12 Thread Don Dailey





By the way, I never experimented with the formula I use; I just
plagiarized it from information posted here. I'm working on my own
formula in the meantime, and I used to use something different that I
made up.

What I used before worked like this:

   u = (w + n) / (g + n)

   u - what you maximize, same as the UCB value
   w - wins for this move
   g - games for this move
   n - a value related to how many times the parent was visited

n increases as the number of parent visits increases. I tried
log(parent_count) but it increases too slowly. I tried lots of things
here.

Compared to the standard formula, it tends to be too exploitive and
not very explorative. If you use a constant for n, you get a search
which eventually focuses on a single move only, because other moves
can never "catch up" in score (unless the current move is discovered
to have problems).
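
A minimal sketch of the older score described above, just to make its behaviour concrete. How n grows with the parent's visit count is not given in the post, so n is passed in directly here, and the function name is hypothetical.

```python
def old_score(wins, games, n):
    """u = (w + n) / (g + n): behaves like n extra games that were all wins."""
    return (wins + n) / (games + n)

# An untried move scores 1.0 for any n > 0, so it is optimistic by construction.
untried = old_score(0, 0, 2.0)

# For a move with some losses, a larger n pulls the score further toward 1.0.
small_n = old_score(3, 10, 1.0)   # 4 / 11
large_n = old_score(3, 10, 5.0)   # 8 / 15
```

Because the bonus shrinks as g grows and n is shared by all siblings, letting n grow with the parent's visits is what reopens neglected moves; with a fixed n, their scores stay put while they are not being played.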

- Don




Re: [computer-go] Optimal explore rates for plain UCT

2008-03-12 Thread Christoph Birk


On Wed, 12 Mar 2008, Don Dailey wrote:

> 1.  My UCT constant is 1.0  - my formula is  averageScore + c * sqrt(
> (2.0 * log(n)) / (10.0 * m) );

So your constant is 2/10 = 0.2 inside the sqrt(), which is
equivalent to c=0.44?
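
The arithmetic can be checked with a short sketch. It rewrites c * sqrt(2 * log(n) / (10 * m)) in the form c_eff * sqrt(log(n) / m) and confirms that both forms give the same exploration term. The function names are hypothetical, and n and m are taken to be parent and child visit counts.

```python
import math

def effective_c(c, numerator=2.0, denominator=10.0):
    """Fold the factor inside the sqrt into the constant in front of it."""
    return c * math.sqrt(numerator / denominator)

def scaled_term(c, n, m):
    """Exploration term as quoted: c * sqrt((2.0 * log(n)) / (10.0 * m))."""
    return c * math.sqrt((2.0 * math.log(n)) / (10.0 * m))

def plain_term(c, n, m):
    """Exploration term in the other form: c * sqrt(log(n) / m)."""
    return c * math.sqrt(math.log(n) / m)

c_eff = effective_c(1.0)  # sqrt(0.2), which is the same as 1/sqrt(5)
```

Here c_eff comes out at roughly 0.447, so c = 1.0 in the scaled formula corresponds to about 0.45 in the plain one.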

Christoph

[computer-go] Perl Pattern Matching Go Program

2008-03-12 Thread D Gilder

Hi,

I have been working on a Go playing program in Perl, which I have attached as 
a module. I'll try to put it on my webspace too at 
http://homepage.ntlworld.com/daniel.gilder/  (when I figure out why I can't 
log in).

It can play on KGS, using the script kgsbot.pl, where it's about 25 kyu on a 
9x9. It requires the Perl modules Games::Go::GTP, Games::Go::Referee, and 
Games::Go::SGF, which are on CPAN.

It uses pattern matching, by learning patterns from sgf files. These it stores 
in database files (which are not included, so you need to provide the sgf 
files).

I've got to the point where it works, but there are bugs. If anyone would like 
to try and improve it, they are welcome to have a go, and I will try to 
answer any questions about it.

dan


Games-Go-Player-0.05.tar.gz
Description: application/tgz

[computer-go] solving the nakade problem

2008-03-12 Thread Ivan Dubois

Hello,

I think there is a very easy and straightforward solution to the "nakade/seki" 
problem. Here it is:

For moves that are self-atari on a group that contains MORE than 5 stones:
Strictly forbid them, both in the tree and in the playouts (exactly like you 
forbid filling an eye).
(This is to handle seki and have efficient playouts.)

For moves that are self-atari on a group that contains LESS than 5 stones:
Allow them both in the tree and in the playouts. In the playouts, they should 
be played with a low probability, but they should be played 
when there is no other move left. (This is to ensure that groups which are dead by 
nakade are eventually captured in "some" playouts.)

What do you think about this solution? I will probably implement it in Zoe to 
see if it is efficient, unless someone finds a flaw in the logic.
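
The playout side of the proposal could be sketched roughly as follows. This is not Zoe's code: the weight used for "low probability" is a made-up value, a group of exactly 5 stones (which the proposal does not cover) is treated as small here, and the self-atari test and group size are assumed to be supplied by the board code.

```python
import random

SIZE_LIMIT = 5     # threshold from the proposal
LOW_WEIGHT = 0.1   # hypothetical value for "low probability"; not from the post

def playout_weight(is_self_atari, group_size):
    """Relative sampling weight of a candidate playout move."""
    if not is_self_atari:
        return 1.0
    if group_size > SIZE_LIMIT:
        return 0.0  # forbidden, in the same way as filling an eye
    return LOW_WEIGHT  # assumption: exactly 5 stones counts as a small group

def choose_playout_move(candidates, rng):
    """candidates is a list of (move, is_self_atari, group_size) tuples."""
    weights = [playout_weight(sa, size) for _, sa, size in candidates]
    if not any(weights):
        return None  # nothing playable is left, so pass
    moves = [move for move, _, _ in candidates]
    return rng.choices(moves, weights=weights, k=1)[0]
```

Because the choice is weighted rather than filtered, a small self-atari is picked with certainty once it is the only candidate with nonzero weight, which matches the "when there is no other move left" requirement.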

