On Tue, 2008-10-28 at 12:48 -0200, Mark Boon wrote:
What do those numbers mean exactly? Are those winning percentages
against your ref-bot with 1,000 playouts?
I'm not getting the same results as Michael does. I see some
improvement, but not nearly as much.
I had to make a few changes however, as I was adding up the results
in a different way than your ref-bot. So what I did was implement
tracking the AMAF statistics exactly the same way as in your ref-bot.
Next I ran a few hundred games, comparing the win/hit ratio of the
newly implemented method with my old one to make sure I got the same
result. When that checked out I made the modifications as described
by Michael. It seems rather straightforward, but after 200 games I
only get a win-rate of ~55% with 2,000 playouts instead of the 63% he
has. That's a considerable difference.
Possibly I introduced a little bug somewhere... Or I need a larger
test-sample.
I do agree with Michael that this idea is more logical and simpler.
It's simpler for sure; that it's more logical I don't agree with. The
simplest thing is just to test every move and not worry about counting
ko moves multiple times, or about giving credit for moves the opponent
plays first (in which case you don't really deserve credit for that
move). That's clearly not very logical, and it plays weaker too. Like
I say, I had to make a decision on this one way or the other, so I
went with the more standard approach, since I believed it to be more
logical, stronger playing, and more correct.
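As a hedged sketch of the update rule described above (credit a point only the first time either side plays it, and only if our side played it first), assuming a playout is a list of (color, point) pairs and per-point stats are (wins, visits) tuples — the names here are illustrative, not the reference bot's actual code:

```python
def update_amaf(stats, playout, side_to_move, result):
    """Update per-point AMAF win/visit counts from one playout.

    stats: dict mapping point -> (wins, visits)  (assumed structure)
    playout: sequence of (color, point) moves, in order
    side_to_move: the color whose statistics we are gathering
    result: True if side_to_move won the playout
    """
    seen = set()  # each point is considered at most once per playout
    for color, point in playout:
        if point in seen:
            continue              # skip replays, e.g. ko recaptures
        seen.add(point)
        if color == side_to_move:  # no credit if the opponent got there first
            wins, visits = stats.get(point, (0, 0))
            stats[point] = (wins + (1 if result else 0), visits + 1)
```

Note that a point first occupied by the opponent still goes into `seen`, so a later move there by our side earns nothing — which is the "don't give credit for moves the opponent plays first" choice.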
However, what I was arguing against with Michael wasn't the update
style but his progressive bias against later moves - clearly good, but
also clearly a performance enhancement and a more complicated
specification. If I accepted that, it would be difficult to refuse
every other little tweak and enhancement that others will be sending
me.

For instance, this is not a new idea - Christoph Birk has used it in
his bot, but he uses a more sophisticated decaying weight that is not
linear. Why should I use the Williams function instead of the Birk
function? Or some other function? Do you see why you have to draw a
line in the sand somewhere?
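To make the "which function?" question concrete, here is a hedged sketch of two possible decay schedules for weighting a move by how early in the playout it occurred. The linear form is in the spirit of Michael Williams' tweak; the nonlinear one merely stands in for something like Christoph Birk's — both exact formulas are assumptions, not the authors' actual code:

```python
def move_weight(move_index, playout_length, scheme="linear"):
    """Weight for the move at 0-based position move_index in a playout."""
    frac = move_index / playout_length
    if scheme == "linear":
        # falls linearly from 1.0 (first move) toward 0.0 (last move)
        return 1.0 - frac
    elif scheme == "birk":
        # illustrative nonlinear decay (assumed shape, not Birk's actual one)
        return (1.0 - frac) ** 2
    raise ValueError("unknown scheme: %s" % scheme)
```

Either function slots into the same AMAF update; the point of the message is precisely that nothing singles one of them out for a reference specification.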
I had implemented a different way to check which side moved at which
location first. I believe it's more efficient, but it doesn't lend
itself to Michael's method. However, your way of checking seems rather
costly, as it's a loop of roughly N^2/4 iterations, where N is the
number of moves (which we know is 111 at the beginning). With
Michael's method this is reduced to N/2 iterations.
If you are worried about big-O characteristics, you don't have to
implement this in a loop like I do. I did it that way to make it
crystal clear what I was doing. A fast way is to use a perfect hash
table to track duplicates. It's totally trivial. In this case a
perfect hash table is a small array (large enough to encompass all the
legal intersections) where the move IS the index (or you could say the
move, which is an integer, IS the hash function).

If there are 81 points, zero out 81 elements in an array before you
start gathering stats, then set the element to 1 when you first
encounter that move. The test is a single array lookup instead of an
expensive loop. However, I didn't bother with any of that because I
felt the code would be somewhat less clear. There are a number of
optimizations like this in my own program that I didn't do here.
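The array-as-hash idea above might look like this — a minimal sketch assuming integer points on a 9x9 board (81 intersections); the function name is my own, not from the reference bot:

```python
BOARD_POINTS = 81  # 9x9 board, as in the message

def first_occurrences(moves):
    """Return each move the first time it appears, in order.

    moves: iterable of integer points in [0, BOARD_POINTS).
    """
    seen = [False] * BOARD_POINTS  # zeroed before gathering stats
    firsts = []
    for m in moves:
        if not seen[m]:            # single array lookup, no inner loop
            seen[m] = True
            firsts.append(m)
    return firsts
```

The move itself indexes the array, so the duplicate test is O(1) per move and the whole pass is O(N) instead of the quadratic pairwise loop.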
This optimization is probably not worth very much. Compared to the
cost of playing an actual game, this is probably not that expensive,
but I will test it for you. Maybe I will be surprised.
The one optimization that I DID do, I probably should not have. The
uniform random move selection loop is too complicated to understand
with a quick glance at the code - but it's a huge performance gain
over the most naive correct method, and at the time it seemed too good
to leave out. Also, I wanted the reference bot to not be too slow,
since I envision it being used to test other references against.
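The reference bot's actual selection loop isn't shown in this thread, but a common scheme of this kind — and plausibly what is being alluded to — keeps candidate points in a list, picks a random index, and swaps rejected points out of the active prefix so they are never re-tested. A hedged sketch, with `is_legal` assumed:

```python
import random

def pick_uniform_random_move(candidates, is_legal):
    """Pick a legal move uniformly at random, or None if none is legal.

    candidates: list of points (reordered in place).
    is_legal: predicate deciding whether a point is a legal move.
    """
    n = len(candidates)
    while n > 0:
        i = random.randrange(n)
        move = candidates[i]
        if is_legal(move):
            return move
        # swap the illegal point past the active prefix and shrink it
        n -= 1
        candidates[i], candidates[n] = candidates[n], candidates[i]
    return None
```

Each legal point in the prefix is equally likely to be drawn first, so the result is uniform over legal moves, while each illegal point is tested at most once per selection.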
- Don
My ref-bot on CGOS is still going. It always seems to be within a few
Elo points of your Jref and Cref bots, so I'm going to stop it, as I
don't see much point in populating CGOS with identical bots. It has
played over 500 games, which I think is enough empirical evidence
that the implementation is 'good enough'.
Mark
On 28-okt-08, at 10:54, Don Dailey wrote:
I have empirical evidence that mwRefbot level 4096 (the Michael
Williams enhanced reference bot at 4096 playouts) is stronger than
the reference bot at the same (and even higher) levels.
Rank Name              Elo    +    -  games  score  oppo.  draws
   1 mwRefbot-004096  2684   23   22   1529    78%   2140     0%
   2 refbot-008192    2670   21   21   1532    73%   2274     0%
   3 refbot-004096    2645   21   21   1528    71%   2251     0%
   4 refbot-002048    2555   21   21   1530    63%   2237     0%
   5 refbot-001024    2431   23   23   1529    54%   2178     0%
   6 refbot-000512    2253   27   28   1531    47%   2066     0%
   7 refbot-000256    2000   34   35   1529    42%   1932     0%
   8 refbot-000128    1646   42   42   1529    45%   1655     0%
   9 refbot-000064    1290   40   40   1529    50%   1321     0%
  10 refbot-000032     970   40   40   1531    52%   1041     0%
  11 refbot-000016     621   33   32   1529    55%    780     0%
  12 refbot-000008     370   25   25   1529    47%    649     0%
  13 refbot-000004     250   23   22   1529    43%    551     0%
  14 refbot-000002     145   22   22   1529    35%    500     0%
  15 refbot-000001       5   22   23   1528    22%    475     0%
  16 refbot-000000       0   22   23   1531    22%    484     0%
- Don
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/