On Tue, 2008-10-28 at 12:48 -0200, Mark Boon wrote:
> What do those numbers mean exactly? Are those winning percentages
> against your ref-bot with 1,000 playouts?
>
> I'm not getting the same results as Michael does. I see some
> improvement, but not nearly as much.
>
> I had to make a few changes, however, as I was adding up the results
> in a different way than your ref-bot. So what I did was implement
> tracking the AMAF statistics exactly the same way as in your ref-bot.
> Next I ran a few hundred games, comparing the win/hit ratio of the
> newly implemented method with my old one to make sure I got the same
> result. When that checked out, I made the modifications as described
> by Michael. It seems rather straightforward, but after 200 games I
> only get a win rate of ~55% with 2,000 playouts instead of the 63% he
> has. That's a considerable difference.
>
> Possibly I introduced a little bug somewhere... Or I need a larger
> test sample.
>
> I do agree with Michael that this idea is more logical and simpler.
It's simpler for sure; more logical, I don't agree. The simplest thing is just to say "test every move" and not worry about counting ko moves multiple times or giving credit for moves the opponent plays first (in which case you don't really deserve credit for that move). It's clearly not very logical, and it plays weaker too. Like I say, I had to make a decision on this one way or the other, so I went with the more standard approach since I believed it to be more logical, stronger playing, and more correct.

However, with Michael what I was arguing against wasn't the update style, but his progressive bias against later moves - clearly good, but clearly a performance enhancement and a more complicated specification. If I accepted that, it would be difficult to refuse every other little tweak and enhancement that others will be sending me. For instance, this is not a new idea - Christoph Birk has used it in his bot, but he uses a more sophisticated decaying weight that is not linear. Why should I use the Williams function instead of the Birk function? Or some other function? Do you see why you have to draw a line in the sand somewhere?

> I had implemented a different way to check which side moved at which
> location first. I believe it's more efficient, but it doesn't lend
> itself to Michael's method. However, your way of checking seems
> rather costly, as it's a loop of 1/4 N^2 iterations, where N is the
> number of moves (which we know is 111 in the beginning). With
> Michael's method this is reduced to 1/2 N iterations.

If you are worried about big-O characteristics, you don't have to implement this in a loop like I do. I did it that way to make it crystal clear what I was doing. A fast way is to use a perfect hash table to track duplicates. It's totally trivial.
In this case a perfect hash table is a small array (large enough to encompass all the legal intersections) where the move IS the index (or you could say the move, which is an integer, IS the hash function). If there are 81 points, zero out 81 elements in an array before you start gathering stats, then set an element to 1 when you first encounter that move. The test is a single array lookup instead of an expensive loop. However, I didn't bother with any of that because I felt the code would be somewhat less clear. There are a number of optimizations like this in my own program that I didn't do here.

This optimization is probably not worth very much. Compared to the cost of playing an actual game it's probably not that expensive, but I will test it for you. Maybe I will be surprised.

The one optimization that I DID do, I probably should not have. The uniform random move selection loop is too complicated to understand with a quick glance at the code, but it's a huge performance gain over the most naive correct method, and at the time it seemed too good to leave out. Also, I wanted the reference bot to not be too slow, since I envision it being used to test other references against.

- Don

> My ref-bot on CGOS is still going. It seems to be always within a few
> ELO points of your Jref and Cref bots. So I'm going to stop it, as I
> don't see much point in populating CGOS with identical bots. It has
> played over 500 games, which I think is enough empirical evidence
> that the implementation is 'good enough'.
>
> Mark
>
> On 28-okt-08, at 10:54, Don Dailey wrote:
>
> > I have empirical evidence that mwRefbot level 4096 (the Michael Williams
> > enhanced reference bot at 4096 playouts) is stronger than the reference
> > bot at the same (and even higher) levels.
> >
> > Rank  Name             Elo    +    -  games  score  oppo.  draws
> >    1  mwRefbot-004096  2684   23   22   1529    78%   2140     0%
> >    2  refbot-008192    2670   21   21   1532    73%   2274     0%
> >    3  refbot-004096    2645   21   21   1528    71%   2251     0%
> >    4  refbot-002048    2555   21   21   1530    63%   2237     0%
> >    5  refbot-001024    2431   23   23   1529    54%   2178     0%
> >    6  refbot-000512    2253   27   28   1531    47%   2066     0%
> >    7  refbot-000256    2000   34   35   1529    42%   1932     0%
> >    8  refbot-000128    1646   42   42   1529    45%   1655     0%
> >    9  refbot-000064    1290   40   40   1529    50%   1321     0%
> >   10  refbot-000032     970   40   40   1531    52%   1041     0%
> >   11  refbot-000016     621   33   32   1529    55%    780     0%
> >   12  refbot-000008     370   25   25   1529    47%    649     0%
> >   13  refbot-000004     250   23   22   1529    43%    551     0%
> >   14  refbot-000002     145   22   22   1529    35%    500     0%
> >   15  refbot-000001       5   22   23   1528    22%    475     0%
> >   16  refbot-000000       0   22   23   1531    22%    484     0%
> >
> > - Don
> >
> > _______________________________________________
> > computer-go mailing list
> > computer-go@computer-go.org
> > http://www.computer-go.org/mailman/listinfo/computer-go/