On Sat, 2007-09-29 at 22:06 -0400, Don Dailey wrote:
> I'm starting to get curious.  What are you doing that is causing it to
> win 11 out of 11 against genAnchor_1k and yet it's only 113 ELO
> stronger?    And it's supposedly an identical program?    I don't think
> it's actually identical and I don't trust that you did everything the
> same.   Of course I could have a bug too,  but it doesn't explain why
> your program is much weaker than the results against genAnchor would
> indicate.

One explanation is that ELO is not a perfect predictor of performance.
Some methods are more effective against certain opponents than against
others.  It also looks like genAnchor_1k had a rally overnight, and it's
now 15/21.  One of the losses was on time, as HouseBot got tripped up
thinking indefinitely while in a won position.

It's certainly possible for either of us to have a bug.  Maybe mine
makes my bot stronger ;)

> How many all-as-first moves are you considering?

7/8 of the moves until the end of the simulated game.  This will vary
from random sim to random sim if the game lengths are different.  It
also uses passes as part of the game length calculation.  A result of 0
moves is replaced with 1 move.  Out of that list, only moves for the
color to play are considered, and only if the opponent didn't play in
that spot first.
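
The filtering described above can be sketched roughly as follows.  This
is an illustration, not HouseBot's code: the (color, point) move record,
the use of None for a pass, and the name amaf_moves are all assumptions,
and the cutoff is taken as a parameter rather than computed here.

```python
def amaf_moves(moves, color_to_play, cutoff):
    """Return the points credited to color_to_play under all-moves-as-first.

    moves is the playout record as (color, point) pairs in play order,
    with point set to None for a pass (assumed representation).
    Only the first `cutoff` entries are examined, passes included.
    """
    first_owner = {}
    for color, point in moves[:cutoff]:
        if point is None:
            continue  # a pass uses up a slot but credits no point
        # setdefault keeps whichever color played this point first
        first_owner.setdefault(point, color)
    # keep only the points where color_to_play got there first
    return [p for p, c in first_owner.items() if c == color_to_play]
```

A point that the opponent occupied first (and that was later recaptured
and replayed by the color to move) is skipped, which matches the "only
if the opponent didn't play in that spot first" condition.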


> What are the possible
> differences?
> 
>   1. eye rule

I thought we beat this one to death.  It's an eye if all 4 neighbors
match the color to play (or are off the board), and one of the following
is true:
1. It's a corner and the diagonal position is not an enemy stone
2. It's an edge and neither of the diagonal positions is an enemy stone
3. It's in the center and no more than one of the diagonal positions is
an enemy stone
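
As a minimal sketch of the three cases, assuming a square board stored
as a list of lists and hypothetical constants EMPTY, BLACK and WHITE
(none of which come from the original post), the test might look like
this.  The caller is assumed to have checked that the point itself is
empty.

```python
EMPTY, BLACK, WHITE = 0, 1, 2  # hypothetical cell values


def is_eye(board, size, row, col, color):
    """Eye test for the empty point (row, col) from `color`'s viewpoint."""
    enemy = WHITE if color == BLACK else BLACK

    def on_board(r, c):
        return 0 <= r < size and 0 <= c < size

    # All orthogonal neighbors must be own stones or off the board.
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if on_board(r, c) and board[r][c] != color:
            return False

    # Count enemy stones on the diagonals that exist on the board.
    diagonals = [(row + dr, col + dc) for dr in (-1, 1) for dc in (-1, 1)]
    present = [(r, c) for r, c in diagonals if on_board(r, c)]
    enemies = sum(1 for r, c in present if board[r][c] == enemy)

    if len(present) == 4:      # center point: tolerate one enemy diagonal
        return enemies <= 1
    return enemies == 0        # corner (1 diagonal) or edge (2 diagonals)
```

The corner and edge cases collapse into one check because both allow
zero enemy stones on whatever diagonals are on the board.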

>   2. number of all-as-first moves  (7/8 for me + 1)

Explained above, but I realize I got one small detail wrong.  It's
max(1, min(1000,7/8*moves))
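
For illustration, the formula can be written out as below.  The name
amaf_cutoff and the rounding down to a whole move count are assumptions;
the post does not say how the fraction is rounded.  The multiplication
is done before the division so that integer arithmetic does not reduce
7/8 to zero.

```python
def amaf_cutoff(game_length, cap=1000):
    """Number of playout moves to keep for AMAF: 7/8 of the game length,
    clamped to the range [1, cap].  Rounding down is an assumption."""
    return max(1, min(cap, (7 * game_length) // 8))
```

For example, a playout of 80 moves (passes included) keeps the first 70,
a playout of 0 or 1 moves keeps 1, and a very long playout keeps 1000.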

>   3. quality of RNG

Mersenne Twister (generic implementation available free online)

>   4. correctness of random move selection strategy.

Pick a random empty position.  If it is illegal or eye-filling, remove
it from the list of candidates and repeat.
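
A rough sketch of that selection loop follows.  The names
pick_random_move and is_playable are hypothetical; is_playable stands in
for the combined "legal and not eye-filling" check.  Python's random
module happens to use the Mersenne Twister as its generator, but nothing
here depends on that.

```python
import random


def pick_random_move(empty_points, is_playable, rng=random):
    """Pick a uniformly random playable point, or None if there is none.

    is_playable(point) is a caller-supplied predicate (assumed) that
    returns True for a legal, non-eye-filling move.
    """
    candidates = list(empty_points)
    while candidates:
        i = rng.randrange(len(candidates))
        point = candidates[i]
        if is_playable(point):
            return point
        # Rejected: drop it by swapping in the last candidate.
        candidates[i] = candidates[-1]
        candidates.pop()
    return None  # nothing playable left, so the side to move must pass
```

Removing rejected points, rather than simply retrying, keeps the choice
uniform over the playable points and guarantees the loop terminates.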

>   5. depth at which you stop a game.  (about 1000 moves for me.)

I never stop a game.  They always play out to the end.  I put an upper
bound of 1000 on the moves to keep for AMAF, but the random game will
continue until it's completed.

>   6. stopping rule.  (both program have no non-eye filling moves.)

I use that same rule.  If a program hits that condition, it's forced to
pass.  Play then continues until there are two passes in a row.  Each pass
is considered a move for purposes of AMAF counting.
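
The stopping rule can be sketched as a loop like the one below.  The
callbacks choose_move and play are hypothetical stand-ins for the real
move generator and board update; choose_move is assumed to return None
when the side to move has no non-eye-filling move.

```python
def run_playout(choose_move, play, color="B"):
    """Play until two consecutive passes; return the full move record.

    choose_move(color) returns a point or None for a forced pass, and
    play(color, point) applies the move (both assumed callbacks).
    Passes are recorded so they count toward the game length.
    """
    moves = []
    consecutive_passes = 0
    while consecutive_passes < 2:
        point = choose_move(color)
        if point is None:
            consecutive_passes += 1
        else:
            consecutive_passes = 0
            play(color, point)
        moves.append((color, point))
        color = "W" if color == "B" else "B"
    return moves
```

There is no move cap in the loop, matching the statement that the
random game always runs to completion.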


> I just thought of something.   I think I initialize the statistics array
> with 1 draw per move as a cheesy way to avoid divide by zero error.
> Could this be affecting the performance?   Perhaps at low levels like
> this it has a noticeable effect?    Would it make the program especially
> vulnerable to an identical program that doesn't do this?

Draw?  Is that a valid outcome of your random simulations?  I would
assume your counters would be integers (and a draw value of 1/2 wouldn't
work).

Curiously enough, you found a potential bug in my AMAF implementation.
A divide by zero could occur (but doesn't seem to).
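
One way to guard against that, shown only as a sketch and not taken from
either program: return a fixed default when a move has no samples.  The
name amaf_win_rate and the default of 0.5 are assumptions.

```python
def amaf_win_rate(wins, visits, default=0.5):
    """Win rate for a move, with a fallback when it was never sampled.

    The default of 0.5 is an arbitrary neutral value (assumption).
    """
    if visits == 0:
        return default
    return wins / visits
```

This keeps the counters as plain integers and avoids seeding the
statistics with an artificial result.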

Also, we may differ on our handling of ko.  I use simple ko rules in my
playouts.  I think that differs from other implementations that don't
use ko rules at all.  I don't use any super ko logic anywhere in my bot
at the moment.

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
