On Tue, 2008-10-28 at 12:48 -0200, Mark Boon wrote:
What do those numbers mean exactly? Are those winning percentages
against your ref-bot with 1,000 playouts?
I'm not getting the same results as Michael does. I see some
improvement, but not nearly as much.
I had to make a few changes however, as I was adding up the results
in a different way than your ref-bot. So what I did was implement
tracking the AMAF statistics exactly the same way as in your ref-bot.
Next I ran a few hundred games, comparing the win/hit ratio of the
newly implemented method with my old one to make sure I got the same
result. When that checked out I made the modifications as described
by Michael. It seems rather straightforward, but after 200 games I
only get a win-rate of ~55% with 2,000 playouts instead of the 63% he
has. That's a considerable difference.
Possibly I introduced a little bug somewhere... Or I need a larger
test-sample.
I do agree with Michael that this idea is more logical and simpler.
It's simpler for sure; that it's more logical I don't agree with. The
simplest thing is just to test every move and not worry about counting
ko moves multiple times, or about giving credit for moves the opponent
plays first (in which case you don't really deserve credit for that
move). That's clearly not very logical, and it plays weaker too. Like
I say, I had to make a decision on this one way or the other, so I
went with the more standard approach, since I believed it to be more
logical, stronger playing, and more correct.
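As a hedged sketch of the update rule described above (credit a point only the first time either side plays it, and only if our side played it first), assuming a playout is a list of (color, point) pairs and per-point stats are (wins, visits) tuples — the names here are illustrative, not the reference bot's actual code:

```python
def update_amaf(stats, playout, side_to_move, result):
    """Update per-point AMAF win/visit counts from one playout.

    stats: dict mapping point -> (wins, visits)  (assumed structure)
    playout: sequence of (color, point) moves, in order
    side_to_move: the color whose statistics we are gathering
    result: True if side_to_move won the playout
    """
    seen = set()  # each point is considered at most once per playout
    for color, point in playout:
        if point in seen:
            continue              # skip replays, e.g. ko recaptures
        seen.add(point)
        if color == side_to_move:  # no credit if the opponent got there first
            wins, visits = stats.get(point, (0, 0))
            stats[point] = (wins + (1 if result else 0), visits + 1)
```

Note that a point first occupied by the opponent still goes into `seen`, so a later move there by our side earns nothing — which is the "don't give credit for moves the opponent plays first" choice.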
However, what I was arguing against with Michael wasn't the update
style but his progressive bias against later moves - clearly good, but
also clearly a performance enhancement and a more complicated
specification. If I accepted that, it would be difficult to refuse
every other little tweak and enhancement that others will be sending
me.

For instance, this is not a new idea - Christoph Birk has used it in
his bot, but he uses a more sophisticated decaying weight that is not
linear. Why should I use the Williams function instead of the Birk
function? Or some other function? Do you see why you have to draw a
line in the sand somewhere?
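To make the "which function?" question concrete, here is a hedged sketch of two possible decay schedules for weighting a move by how early in the playout it occurred. The linear form is in the spirit of Michael Williams' tweak; the nonlinear one merely stands in for something like Christoph Birk's — both exact formulas are assumptions, not the authors' actual code:

```python
def move_weight(move_index, playout_length, scheme="linear"):
    """Weight for the move at 0-based position move_index in a playout."""
    frac = move_index / playout_length
    if scheme == "linear":
        # falls linearly from 1.0 (first move) toward 0.0 (last move)
        return 1.0 - frac
    elif scheme == "birk":
        # illustrative nonlinear decay (assumed shape, not Birk's actual one)
        return (1.0 - frac) ** 2
    raise ValueError("unknown scheme: %s" % scheme)
```

Either function slots into the same AMAF update; the point of the message is precisely that nothing singles one of them out for a reference specification.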
I had implemented a different way to check which side moved at which
location first. I believe it's more efficient, but it doesn't lend
itself to Michael's method. However, your way of checking seems rather
costly, as it's a loop of roughly N^2/4 iterations, where N is the
number of moves (which we know is 111 at the beginning). With
Michael's method this is reduced to N/2 iterations.
If you are worried about big-O characteristics, you don't have to
implement this in a loop like I do. I did it that way to make it
crystal clear what I was doing. A fast way is to use a perfect hash
table to track duplicates. It's totally trivial. In this case a
perfect hash table is a small array (large enough to encompass all the
legal intersections) where the move IS the index (or you could say the
move, which is an integer, IS the hash function).

If there are 81 points, zero out 81 elements in an array before you
start gathering stats, then set the element to 1 when you first
encounter that move. The test is a single array lookup instead of an
expensive loop. However, I didn't bother with any of that because I
felt the code would be somewhat less clear. There are a number of
optimizations like this in my own program that I didn't do here.
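The array-as-hash idea above might look like this — a minimal sketch assuming integer points on a 9x9 board (81 intersections); the function name is my own, not from the reference bot:

```python
BOARD_POINTS = 81  # 9x9 board, as in the message

def first_occurrences(moves):
    """Return each move the first time it appears, in order.

    moves: iterable of integer points in [0, BOARD_POINTS).
    """
    seen = [False] * BOARD_POINTS  # zeroed before gathering stats
    firsts = []
    for m in moves:
        if not seen[m]:            # single array lookup, no inner loop
            seen[m] = True
            firsts.append(m)
    return firsts
```

The move itself indexes the array, so the duplicate test is O(1) per move and the whole pass is O(N) instead of the quadratic pairwise loop.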
This optimization is probably not worth very much. Compared to the
cost of playing an actual game, this is probably not that expensive,
but I will test it for you. Maybe I will be surprised.
The one optimization that I DID do, I probably should not have. The
uniform random move selection loop is too complicated to understand
with a quick glance at the code - but it's a huge performance gain
over the most naive correct method, and at the time it seemed too good
to leave out. Also, I wanted the reference bot to not be too slow,
since I envision it being used to test other references against.
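The reference bot's actual selection loop isn't shown in this thread, but a common scheme of this kind — and plausibly what is being alluded to — keeps candidate points in a list, picks a random index, and swaps rejected points out of the active prefix so they are never re-tested. A hedged sketch, with `is_legal` assumed:

```python
import random

def pick_uniform_random_move(candidates, is_legal):
    """Pick a legal move uniformly at random, or None if none is legal.

    candidates: list of points (reordered in place).
    is_legal: predicate deciding whether a point is a legal move.
    """
    n = len(candidates)
    while n > 0:
        i = random.randrange(n)
        move = candidates[i]
        if is_legal(move):
            return move
        # swap the illegal point past the active prefix and shrink it
        n -= 1
        candidates[i], candidates[n] = candidates[n], candidates[i]
    return None
```

Each legal point in the prefix is equally likely to be drawn first, so the result is uniform over legal moves, while each illegal point is tested at most once per selection.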
- Don
My ref-bot on CGOS is still going. It always seems to be within a few
Elo points of your Jref and Cref bots, so I'm going to stop it, as I
don't see much point in populating CGOS with identical bots. It has
played over 500 games, which I think is enough empirical evidence
that the implementation is 'good enough'.
Mark
On 28-okt-08, at 10:54, Don Dailey wrote:
I have empirical evidence that mwRefbot level 4096 (the Michael
Williams enhanced reference bot at 4096 playouts) is stronger than
the reference bot at the same (and even higher) levels.
Rank Name              Elo    +    -  games  score  oppo.  draws
   1 mwRefbot-004096  2684   23   22   1529    78%   2140     0%
   2 refbot-008192    2670   21   21   1532    73%   2274     0%
   3 refbot-004096    2645   21   21   1528    71%   2251     0%
   4 refbot-002048    2555   21   21   1530    63%   2237     0%
   5 refbot-001024    2431   23   23   1529    54%   2178     0%
   6 refbot-000512    2253   27   28   1531    47%   2066     0%
   7 refbot-000256    2000   34   35   1529    42%   1932     0%
   8 refbot-000128    1646   42   42   1529    45%   1655     0%
   9 refbot-000064    1290   40   40   1529    50%   1321     0%
  10 refbot-000032     970   40   40   1531    52%   1041     0%
  11 refbot-000016     621   33   32   1529    55%    780     0%
  12 refbot-000008     370   25   25   1529    47%    649     0%
  13 refbot-000004     250   23   22   1529    43%    551     0%
  14 refbot-000002     145   22   22   1529    35%    500     0%
  15 refbot-000001       5   22   23   1528    22%    475     0%
  16 refbot-000000       0   22   23   1531    22%    484     0%
- Don
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/