Hi, I thought I'd report a small change I have made to the plain MC algorithm. I have been unhappy with the fact that the result of the game gets pressed into one bit (win/loose), and all other information is discarded. This leads to silly endgame moves, since the algorithm sees no difference between a 1-point win and a 100-point win.
After thinking a bit about this, I decided to keep the whole board image at the end. In other words, I have one board-sized array of ints, and at the end, when scoring the result of a simulation, I add +1 to every point that belongs to black, and -1 to every point that belongs to white. Finally I divide this by the number of simulations I have done. When evaluating the simulation, I sum up the points on this board. A simple summing didn't work so well, I thought squaring the numbers before summing would be better, but in fact it was worse. Now I end up with the following: sum += sign(v) * pow(v, exp); where exp is something like 0.5 I did run one player with exp=0.8, and it seemed to perform about equally well. As a side benefit, I get to know which points clearly belong to black or white, and can skip unnecessary endgame moves easily (filling own territory to one-point eyes, making silly invasions to enemy territory, etc). I believe this could be used for much more clever pruning, but I have not (yet?) done so. The best of these experiments, Halgo-1.5-500k reached almost 1400 elo on cgos, which I think is pretty good for plain MC without any trickery. Here my notes of the versions I had there: Halgo1 - 1250 - the all first one Halgo-1.0-500k - 477 - Traditional MC. Something wrong with it? Halgo-1.1-500k - 1280 - Simple MC with area evaluation Halgo-1.2-500k - 1053 - Same as 1.1, but with a non-linear scoring function (s*v*v) Halgo-1.3-500k - 1400 - Same as 1.1, but with pow(v,0.8) as the summing function. Halgo-1.4-500k - 492 - Same as 1.1, but with plain sign(v) as the summing function. Halgo-1.5-500k - 1390 - Going for the pow function again, with 0.5. About same as 1.3 I am not sure if there was something wrong with halgo-1.0, it was supposed to play 'traditional' MC, summing just the winner of the games. My code is heavily based on the Efficient Go Library by Lukas Lew. Actually, my own code is only some 360 lines, warts and all. I have put a tarball with Lew's library and my code at http://www.lsd.dk/heikki/halgo-1.5.tgz if you want to look at it. Being derived from a GPL work, it is under GPL itself too, of course. The 500k refers to the number of simulated random games from the given position, equally divided among all legal moves that don't fill one-point eyes. 500k sounds like a large number but the games typically use less than 5 minutes of their clock, so are quite ok to play on cgos. This is on my dual-core AMD desktop (using only one core). I still suspect that 500k is nowhere enough to get reliabe results, at least from the opening position, as the program wants to start at all sort of places. It has some preference to the centerpoint (on 9x9), and some tendency to avoid the first and second line moves, but nowhere near enough to be predictable. I even played a game on 19x19, and although the opening was kind of surreal, it shaped up quite well. I overlooked an atari, and it took advantage of that immediately, saving a group of maybe 20 stones. In the end I won clearly, but Halgo had three separate live groups on the board. It took 2.5 hours, and I was glad I had eliminated the silly endgame moves. In case you are interested in numbers, here are some: Cpuinfo says vendor_id : AuthenticAMD cpu family : 15 model : 43 model name : AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping : 1 cpu MHz : 1000.000 cache size : 512 KB except that MHz will be 2000 when there is load on the machine, as there is when playing. The system is Linux with Debian/testing. GCC says: gcc -v Using built-in specs. Target: i486-linux-gnu Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --with-tune=i686 --enable-checking=release i486-linux-gnu Thread model: posix gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21) Next I guess I will have a look at UCT, it seems to be the hottest topic at the moment. - Heikki -- Heikki Levanto "In Murphy We Turst" heikki (at) lsd (dot) dk _______________________________________________ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/