On Sat, 2007-09-29 at 22:06 -0400, Don Dailey wrote:
> I'm starting to get curious. What are you doing that is causing it to
> win 11 out of 11 against genAnchor_1k and yet it's only 113 ELO
> stronger? And it's supposedly an identical program? I don't think
> it's actually identical and I don't trust that you did everything the
> same. Of course I could have a bug too, but it doesn't explain why
> your program is much weaker than the results against genAnchor would
> indicate.

One explanation is that ELO is not a perfect predictor of performance.
Some methods are more effective against particular opponents. It also
looks like genAnchor_1k had a rally overnight and it's now 15/21. One
of the losses was on time, as HouseBot got tripped up thinking
indefinitely while in a won position.

It's certainly possible for either of us to have a bug. Maybe mine
makes my bot stronger ;)

> How many all-as-first moves are you considering?

7/8 of the moves until the end of the simulated game. This will vary
from random sim to random sim if the game lengths are different. It
also uses passes as part of the game length calculation. A result of 0
moves is replaced with 1 move. Out of that list, only moves for the
color to play are considered, and only if the opponent didn't play in
that spot first.

> What are the possible
> differences?
>
> 1. eye rule

I thought we beat this one to death. It's an eye if all 4 neighbors
match the color to play (or are off the board), and one of the
following is true:
1. It's a corner and the diagonal position is not an enemy stone
2. It's an edge and neither of the diagonal positions is an enemy stone
3. It's in the center and no more than one of the diagonal positions
   is an enemy stone

> 2. number of all-as-first moves (7/8 for me + 1)

Explained above, but I realize I got one small detail wrong. It's
max(1, min(1000, 7/8*moves))

> 3. quality of RNG

Mersenne Twister (generic implementation available free online)

> 4. correctness of random move selection strategy.

Pick a random empty position. If it's illegal or eye-filling, remove it
from the list under consideration and repeat.

> 5. depth at which you stop a game. (about 1000 moves for me.)

I never stop a game. They always play out to the end. I put an upper
bound of 1000 on the moves to keep for AMAF, but the random game will
continue until it's completed.

> 6. stopping rule. (both program have no non-eye filling moves.)

I use that same rule. If a program hits that condition, it's forced to pass.
Play then continues until there are two passes in a row. Each pass is
considered a move for purposes of AMAF counting.

> I just thought of something. I think I initialize the statistics array
> with 1 draw per move as a cheesy way to avoid divide by zero error.
> Could this be affecting the performance? Perhaps at low levels like
> this it has a noticeable effect? Would it make the program especially
> vulnerable to an identical program that doesn't do this?

Draw? Is that a valid outcome of your random simulations? I would
assume your counters would be integers (and a draw value of 1/2
wouldn't work). Curiously enough, you found a potential bug in my AMAF
implementation. Divide by zero could occur (but doesn't seem to).

Also, we may differ on our handling of ko. I use simple ko rules in my
playouts. I think that differs from other implementations that don't
use ko rules at all. I don't use any super ko logic anywhere in my bot
at the moment.

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
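
For anyone comparing implementations, here is a minimal Python sketch of
the eye rule and the AMAF move selection as described in the message
above. It is an illustration only, not HouseBot's actual code. The board
representation (a square list of lists), the names is_eye,
amaf_move_count and amaf_moves, rounding 7/8*moves down, and crediting
a point only to whoever played it first are all assumptions made for the
example.

```python
EMPTY, BLACK, WHITE = 0, 1, 2  # hypothetical point encoding


def opponent(color):
    return BLACK if color == WHITE else WHITE


def is_eye(board, row, col, color):
    """Eye test for an empty point, following the rule in the message.

    board is assumed to be a square list of lists of EMPTY/BLACK/WHITE.
    """
    size = len(board)

    def on_board(r, c):
        return 0 <= r < size and 0 <= c < size

    # All four orthogonal neighbours must be own stones or off the board.
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if on_board(r, c) and board[r][c] != color:
            return False

    # A corner has 1 diagonal on the board, an edge 2, a centre point 4.
    diagonals = [(row + dr, col + dc) for dr in (-1, 1) for dc in (-1, 1)]
    diagonals = [(r, c) for r, c in diagonals if on_board(r, c)]
    enemy = sum(1 for r, c in diagonals if board[r][c] == opponent(color))

    # Corner and edge: no enemy diagonal. Centre: at most one.
    allowed = 1 if len(diagonals) == 4 else 0
    return enemy <= allowed


def amaf_move_count(game_length, cap=1000):
    """max(1, min(1000, 7/8*moves)); integer floor is an assumption."""
    return max(1, min(cap, (7 * game_length) // 8))


def amaf_moves(playout, color_to_play):
    """Points credited to color_to_play from one random playout.

    playout is a list of (color, point) pairs, with point None for a
    pass. Passes count toward the game length but are never credited.
    """
    limit = amaf_move_count(len(playout))
    seen = set()
    credited = []
    for color, point in playout[:limit]:
        if point is None or point in seen:
            continue
        seen.add(point)  # whoever plays a point first owns it
        if color == color_to_play:
            credited.append(point)
    return credited
```

With these definitions, amaf_move_count(0) gives 1 and any playout
longer than about 1143 moves is capped at 1000 credited moves, which
matches the max/min formula quoted above.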