Re: [computer-go] Optimal explore rates for plain UCT
On Mar 11, 2008, at 2:41 PM, Christoph Birk <[EMAIL PROTECTED]> wrote:

> On Tue, 11 Mar 2008, Don Dailey wrote:
>> I am going to keep the 25k playouts running and add a 10k play-out
>> version of UCT. I want to establish a standard testing size so that
>
> Great! That way Jason can also participate.

I will have my bot online either tonight or tomorrow night. The machine I use for CGOS is out of date...

PS: Don, any luck fixing the client bug that knocks me offline after one game?

> myCtest-10k-UCT has a long-term rating of about 1250.
> For the 50k version I have just started a test series that
> experiments with various thresholds before creating a new node.
>
> Christoph

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] Optimal explore rates for plain UCT
Don Dailey wrote:
> I suggest exactly 25,000 play-outs that we should standardize on. 50,000 will
> tax my spare computer which I like to use for modest CGOS tests.
> If it is agreed, I will start a 25k test. My prediction is that this
> will finish around 1600 ELO on CGOS.

OK, I added Fluke to this (25k) test (twice), before I saw the later comment about using 10k too.

It's looking like your drdGeneric 25k bot is currently around 1475 (147 games).

Fluke, on the other hand, looks to be settling at around 1300 (125 games). I feel that I've probably got a problem in my implementation! :) (I've felt this for some time actually -- UCT never seemed to work well for me at all.)

Details of Fluke's UCT + Random playouts.

1. UCT constant, c = 0.25, e.g. UCB value = averageScore + c * sqrt(log(n)/m).
2. New children are created once a node is visited 1 time (URd) or 2 times (UR2).
3. Eye rule for random playouts:
   * Solid eyes (all 4 from same group).
   * False non-solid eyes (at least 50% of corners are of opposite colour).
4. Choosing legal moves for playouts: 1st probe is random, then scan.

Is there anything else that's likely to be significant here?

I guess I'll let it play some more games and see where it ends up.

Cheers,
Tim.
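The UCB value in item 1 of Tim's list can be sketched in Python as below. This is only an illustration, not Fluke's code: it assumes that n is the parent's visit count, that m is the child's visit count, and that averageScore is the child's mean playout result in [0, 1]. The names ucb_value and select_child are hypothetical, and trying unvisited children first is an assumption too.

```python
import math

def ucb_value(average_score, parent_visits, child_visits, c=0.25):
    """averageScore + c * sqrt(log(n) / m), with n = parent visits, m = child visits."""
    if child_visits == 0:
        # Assumption: an unvisited child is always tried before visited ones.
        return math.inf
    return average_score + c * math.sqrt(math.log(parent_visits) / child_visits)

def select_child(children, parent_visits, c=0.25):
    """children is a list of (name, wins, visits); return the name with the highest UCB value."""
    def value(child):
        _, wins, visits = child
        average = wins / visits if visits else 0.0
        return ucb_value(average, parent_visits, visits, c)
    return max(children, key=value)[0]
```

With c = 0.25 the exploration bonus is small, so a rarely visited child has to show a fairly good average before it outranks a well-explored sibling.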
Re: [computer-go] Optimal explore rates for plain UCT
Tim Foden wrote:
> Don Dailey wrote:
>> I suggest exactly 25,000 play-outs that we should standardize on. 50,000 will
>> tax my spare computer which I like to use for modest CGOS tests.
>> If it is agreed, I will start a 25k test. My prediction is that this
>> will finish around 1600 ELO on CGOS.
>
> OK, I added Fluke to this (25k) test (twice), before I saw the later
> comment about using 10k too.
>
> It's looking like your drdGeneric 25k bot is currently around 1475 (147
> games).
>
> Fluke on the other hand looks to be settling at around 1300 (125
> games). I feel that I've probably got a problem in my
> implementation! :) (I've felt this for some time actually -- UCT
> never seemed to work well for me at all.)
>
> Details of Fluke's UCT + Random playouts.
>
> 1. UCT constant, c = 0.25. e.g. UCB value = averageScore + c *
>    sqrt(log(n)/m).
> 2. New children are created once a node is visited 1 time (URd) or 2
>    times (UR2).
> 3. Eye rule for random playouts:
>    * Solid eyes (all 4 from same group).
>    * False non-solid eyes (at least 50% of corners are of opposite
>      colour).
> 4. Choosing legal moves for playouts: 1st probe is random, then scan.
>
> Is there anything else that's likely to be significant here?

1. My UCT constant is 1.0 - my formula is averageScore + c * sqrt( (2.0 * log(n)) / (10.0 * m) );
2. New children are created when parent exceeds 100 visits.
3. I think the eye rule is the same (you state it differently, but I believe it's the same.)
4. Playouts are truly uniform random - yours are not.

I think point 4 could be significant but I can't be sure.

- Don

> I guess I'll let it play some more games and see where it ends up.
>
> Cheers, Tim.
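To see why point 4 might matter, here is a small Python sketch contrasting the two ways of choosing a playout move. It rests on an assumption about what "1st probe is random, then scan" means: probe one random board point and, if it is illegal, step forward (wrapping around) to the next legal point. All function names are hypothetical, and neither program's real move generator is shown here.

```python
import random

def pick_uniform(points, is_legal, rng=random):
    """Choose uniformly among the legal points (None if there are none)."""
    legal = [p for p in points if is_legal(p)]
    return rng.choice(legal) if legal else None

def pick_probe_then_scan(points, is_legal, rng=random):
    """Probe one random point; if it is illegal, scan forward (wrapping) to the next legal one."""
    if not points:
        return None
    start = rng.randrange(len(points))
    for offset in range(len(points)):
        p = points[(start + offset) % len(points)]
        if is_legal(p):
            return p
    return None

def probe_then_scan_distribution(points, is_legal):
    """Exact probability of each legal point being chosen by probe-then-scan."""
    counts = {p: 0 for p in points if is_legal(p)}
    for start in range(len(points)):
        for offset in range(len(points)):
            p = points[(start + offset) % len(points)]
            if is_legal(p):
                counts[p] += 1
                break
    return {p: n / len(points) for p, n in counts.items()}
```

For six points 0..5 of which only 0, 3 and 4 are legal, probe-then-scan picks point 3 half of the time, point 0 a third of the time and point 4 a sixth of the time, whereas uniform selection gives each of them one third. Under this reading, a legal point that follows a run of illegal points is favoured.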
Re: [computer-go] Optimal explore rates for plain UCT
On Mar 12, 2008, at 2:17 PM, Don Dailey <[EMAIL PROTECTED]> wrote:

> Tim Foden wrote:
>> Don Dailey wrote:
>>> I suggest exactly 25,000 play-outs that we should standardize on. 50,000 will
>>> tax my spare computer which I like to use for modest CGOS tests.
>>> If it is agreed, I will start a 25k test. My prediction is that this
>>> will finish around 1600 ELO on CGOS.
>>
>> OK, I added Fluke to this (25k) test (twice), before I saw the later
>> comment about using 10k too.
>>
>> It's looking like your drdGeneric 25k bot is currently around 1475 (147 games).
>>
>> Fluke on the other hand looks to be settling at around 1300 (125 games).
>> I feel that I've probably got a problem in my implementation! :)
>> (I've felt this for some time actually -- UCT never seemed to work well
>> for me at all.)
>>
>> Details of Fluke's UCT + Random playouts.
>>
>> 1. UCT constant, c = 0.25. e.g. UCB value = averageScore + c * sqrt(log(n)/m).
>> 2. New children are created once a node is visited 1 time (URd) or 2 times (UR2).
>> 3. Eye rule for random playouts:
>>    * Solid eyes (all 4 from same group).
>>    * False non-solid eyes (at least 50% of corners are of opposite colour).
>> 4. Choosing legal moves for playouts: 1st probe is random, then scan.
>>
>> Is there anything else that's likely to be significant here?
>
> 1. My UCT constant is 1.0 - my formula is averageScore + c * sqrt( (2.0 * log(n)) / (10.0 * m) );

Looks like your constant would be 1/sqrt(5) in his formula... Or about what he uses.

> 2. New children are created when parent exceeds 100 visits.
> 3. I think the eye rule is the same (you state it differently, but I believe it's the same.)
> 4. playouts are truly uniform random - yours are not.
>
> I think point 4 could be significant but I can't be sure.

I haven't seen much of an impact from that.

> - Don
>
>> I guess I'll let it play some more games and see where it ends up.
>>
>> Cheers, Tim.
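The 1/sqrt(5) figure follows from sqrt((2.0 * log(n)) / (10.0 * m)) = sqrt(1/5) * sqrt(log(n)/m). A quick numerical check in Python, using the two exploration terms exactly as quoted in the thread (the names tim_term and don_term are made up for this sketch):

```python
import math

def tim_term(c, n, m):
    """Exploration term in Tim's formula: c * sqrt(log(n) / m)."""
    return c * math.sqrt(math.log(n) / m)

def don_term(c, n, m):
    """Exploration term in Don's formula: c * sqrt((2.0 * log(n)) / (10.0 * m))."""
    return c * math.sqrt((2.0 * math.log(n)) / (10.0 * m))

# Don's c = 1.0 corresponds to this constant in Tim's formula (about 0.447).
c_equivalent = 1.0 / math.sqrt(5.0)
```

For any n and m, don_term(1.0, n, m) equals tim_term(c_equivalent, n, m) up to rounding, so Don's effective constant is about 0.447 against Tim's 0.25.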
Re: [computer-go] Optimal explore rates for plain UCT
By the way, I never experimented with the formula I use; I just plagiarized it from information posted here. I'm working on my own formula in the meantime, and I used to use something different that I made up.

What I used before worked like this:

    u = (w + n) / (g + n)

    u - is what you maximize, same as ucb value
    w - wins for this move
    g - games for this move
    n - a constant related to how many times the parent was visited

n increases as the number of parent visits increases. I tried log(parent_count) but it increases too slowly. I tried lots of things here.

When you compare this to the standard formula, it tends to be too exploitive, and not very explorative. If you use a constant for n, you get a search which eventually focuses on a single move only - because other moves can never "catch up" in score (unless the current move is discovered to have problems.)

- Don
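Don's earlier formula is easy to try out. The Python sketch below implements u = (w + n) / (g + n) as written. The helper names are hypothetical, and the numbers in the note are arbitrary illustrations rather than results from any bot.

```python
def old_value(wins, games, n):
    """u = (w + n) / (g + n); n must be positive so that an unplayed move gets u = 1."""
    return (wins + n) / (games + n)

def best_move(stats, n):
    """stats maps move -> (wins, games); return the move that maximizes u."""
    return max(stats, key=lambda move: old_value(stats[move][0], stats[move][1], n))
```

With a constant n = 5, a move with 7 wins in 10 games scores 12/15 = 0.8 and a move with 3 wins in 10 scores 8/15. The second move's value does not change while it goes unplayed, so it is only revisited if the first move's value drops below 8/15. Raising n pushes every value toward 1 and narrows the gap, which is how a growing n brings neglected moves back into play.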
Re: [computer-go] Optimal explore rates for plain UCT
On Wed, 12 Mar 2008, Don Dailey wrote:
> 1. My UCT constant is 1.0 - my formula is averageScore + c * sqrt( (2.0 * log(n)) / (10.0 * m) );

So your constant is 2/10 = 0.2 inside the sqrt(), which is equivalent to c=0.44?

Christoph
[computer-go] Perl Pattern Matching Go Program
Hi,

I have been working on a Go playing program in Perl, which I have attached as a module. I'll try to put it on my webspace too at http://homepage.ntlworld.com/daniel.gilder/ (when I figure out why I can't log in).

It can play on KGS, using the script kgsbot.pl, where it's about 25 kyu on a 9x9 board. It requires the Perl modules Games::Go::GTP, Games::Go::Referee, and Games::Go::SGF, which are on CPAN.

It uses pattern matching, learning patterns from SGF files. These it stores in database files (which are not included, so you need to provide the SGF files).

I've got to the point where it works, but there are bugs. If anyone would like to try to improve it, they are welcome to have a go, and I will try to answer any questions about it.

dan

Games-Go-Player-0.05.tar.gz
Description: application/tgz
[computer-go] solving the nakade problem
Hello,

I think there is a very easy and straightforward solution to the "nakade/seki" problem. Here it is:

For moves that are self-atari on a group that contains MORE than 5 stones:
Both in the tree and the playouts, strictly forbid them (exactly like you forbid filling an eye). (This is to handle seki and have efficient playouts.)

For moves that are self-atari on a group that contains LESS than 5 stones:
Allow them both in the tree and the playouts. In the playouts, they should be played with a low probability, but they should be played when there is no other move left. (This is to ensure that groups which are dead by nakade are eventually captured in "some" playouts.)

What do you think about this solution? I will probably implement it in Zoe to see if it is efficient, unless someone finds a flaw in the logic.
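The rule above might be sketched in Python as follows. This is not Zoe's code, and several things are assumptions: the function names, the 5% value standing in for "a low probability", and the handling of a group of exactly 5 stones, which the proposal does not cover (it is left as a switch here). Detecting self-atari and counting the stones in the group are taken as given.

```python
import random

def self_atari_permitted(group_size, allow_exactly_five=False):
    """Forbid self-atari on groups of more than 5 stones, allow it on groups of fewer than 5."""
    if group_size > 5:
        return False
    if group_size < 5:
        return True
    # The proposal is silent on exactly 5 stones; this switch is an assumption.
    return allow_exactly_five

def choose_playout_move(normal_moves, small_self_ataris, low_probability=0.05, rng=random):
    """Pick a playout move.

    normal_moves: legal moves that are not self-atari.
    small_self_ataris: permitted self-atari moves (large-group ones already filtered out).
    Returns None (pass) when nothing is left to play.
    """
    if not normal_moves:
        # No other move left: the self-atari must be played so dead groups get captured.
        return rng.choice(small_self_ataris) if small_self_ataris else None
    if small_self_ataris and rng.random() < low_probability:
        return rng.choice(small_self_ataris)
    return rng.choice(normal_moves)
```

In the tree, only self_atari_permitted would apply. In the playouts, both functions would be used together.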