RE: [computer-go] The effect of the UCT-constant on Valkyria

Magnus Persson Sat, 03 May 2008 09:48:47 -0700

Quoting David Fotland <[EMAIL PROTECTED]>:

So I'm curious then.  With simple UCT (no rave, no priors, no progressive
widening), many people said the best constant was about 0.45.  What are the
new concepts that let you avoid the constant?


Is it RAVE, because the information gathered during the search lets you
focus the search accurately without the UCT term?  Many people have said
that RAVE has no benefit for them.

Yes, it is RAVE, and more specifically RAVE as it was recently presented here on the mailing list by the Mogo team, not as it was originally presented in the Mogo paper. Also, there may be several minor details that are peculiar to my implementation. Actually, I did not understand some aspects of the Mogo method mailed here and just guessed some details. It suddenly worked, and I could feel that the search was unusually strong and selective; since then I have just adjusted some parameters.

I used to do progressive widening, but that is now turned off. RAVE is free to pick any move that is not pruned right away.

Currently I believe that RAVE is only effective if one gets the other parameters right. For me that meant changing the UCT parameter from 0.8 to 0.1. I also know of many pathological situations where Valkyria currently will not find the best move, but rather the second best. It is possible that other programs suffer even more than Valkyria from similar problems, and that this is to some extent because the nature of the playouts may interfere with AMAF. For example, Valkyria either plays forced moves or picks uniformly at random among the moves that are not pruned. Other programs may rely on patterns to pick all moves in the playouts, and this might be bad for AMAF (this is wild speculation).
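
To illustrate why the size of the UCT parameter matters, here is a minimal Python sketch. It assumes the common UCB1-style score (winrate plus c * sqrt(ln(N) / n)); the exact formula and scaling used in Valkyria are not given in this message, and the move names and statistics are made up for the example.

```python
import math

def uct_value(wins, visits, parent_visits, c):
    """UCB1-style score: mean winrate plus an exploration bonus scaled by c.

    Assumption: this is the textbook form, not necessarily Valkyria's.
    """
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select(children, c):
    """Pick the child move with the highest score for a given constant c."""
    parent_visits = sum(visits for _, visits in children.values())
    return max(
        children,
        key=lambda move: uct_value(*children[move], parent_visits, c),
    )

# Hypothetical statistics: move -> (wins, visits).
children = {
    "A": (60, 100),  # well explored, 60% winrate
    "B": (5, 10),    # barely explored, 50% winrate
}

high_c_choice = select(children, 0.8)  # large bonus favors the rarely visited move
low_c_choice = select(children, 0.1)   # small bonus mostly follows the winrate
```

With a large constant the exploration bonus dominates and the little-visited move is tried again; with a small constant the search stays with the move that has the better winrate and relies on some other source of information, such as RAVE, to find alternatives.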

Do most of the strongest programs use RAVE?  I think from Crazystone's
papers, that it does not use RAVE.  Gnugomc does not use rave.

You might not need it if you have strong pattern-matching priors for the tree part, similar to Crazystone. RAVE makes it possible to ignore most bad moves in a given position. The weakness is that often some good moves (with a chance of being the best possible move) are also ignored completely.

Is it the prior values from go knowledge, like opening books, reading
tactics before the search etc?  Do all of the top programs have opening
books now?  I know mogo does.

Valkyria has just 4 moves in a hardcoded opening book. Previous versions used a book with several thousand positions that was both self-learned and modified by hand, but as long as the program keeps changing, the book tends to become inaccurate. So right now I do not use it, and I am planning to write something more efficient than the old one, which kept each position as a file on the hard drive.

Do most of the top programs read tactics before the search?  I know Aya
does.

Valkyria only does some simple tactics in the playouts. It is stronger than anything I ever programmed (on 9x9 at least), so currently I cannot see how to integrate precomputed tactical results in the later search. I think Aya is special because it was very strong doing search before it went MC.

Does it matter how prior values are used to guide the search?  I think mogo
uses prior knowledge to initialize the RAVE values.  Do other programs
include it some other way, by initializing the FPU value, or by initializing
the UCT visits and confidence, or some extra, "prior" term in the equation?

Right now Valkyria sets priors for AMAF so that moves that are a good local response to the last move have a prior 100% winrate with 20-100 visits, depending on the priority of the triggered pattern. I think Mogo has a fixed number of visits for the priorities but modifies the winrate, but I never saw this described in a way that made it clear.
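
A prior of this kind can be sketched as virtual AMAF visits. The Python below is only an illustration: the class name, the priority scale from 0.0 to 1.0, and the linear mapping onto 20-100 visits are assumptions, since the message only states the winrate and the range of visits.

```python
from dataclasses import dataclass

@dataclass
class AmafStats:
    """AMAF win/visit counters for one move (hypothetical structure)."""
    wins: float = 0.0
    visits: float = 0.0

    def update(self, won: bool) -> None:
        """Record one playout in which this move was played."""
        self.visits += 1
        if won:
            self.wins += 1

    def winrate(self) -> float:
        # Neutral value when there is no data at all (an assumption).
        return self.wins / self.visits if self.visits else 0.5

def prior_visits(priority: float, lo: float = 20.0, hi: float = 100.0) -> float:
    """Map a pattern priority in [0, 1] onto lo..hi virtual visits.

    The linear mapping is an assumption made for this sketch.
    """
    return lo + (hi - lo) * priority

def seed_prior(priority: float) -> AmafStats:
    """Start a move with a 100% prior winrate over the virtual visits."""
    n = prior_visits(priority)
    return AmafStats(wins=n, visits=n)

# A low-priority pattern move: 20 virtual wins out of 20 virtual visits.
stats = seed_prior(0.0)
for _ in range(20):
    stats.update(False)  # 20 real AMAF samples that were all losses
```

After as many real losses as there were virtual wins, the AMAF winrate has dropped from 100% to 50%, so a higher-priority pattern (more virtual visits) takes proportionally longer to be overruled by the playouts.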

Previously I biased the UCT values after everything else was computed, but found that this led to some bad behavior. By biasing the AMAF values instead, these biases become less influential over time, because the true winrate gets more weight than the AMAF scores as a move is visited more.
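
The fading effect can be shown with a small Python sketch. The weighting schedule beta = k / (k + n) and the value k = 50 are assumptions chosen for simplicity; they are not the schedule used by Valkyria or Mogo, which this message does not specify.

```python
def blended_value(true_wins, true_visits, amaf_winrate, k=50.0):
    """Mix the AMAF winrate with the true winrate of a move.

    beta is the weight of the AMAF estimate. It starts at 1 and shrinks
    toward 0 as the move collects real visits. The schedule k / (k + n)
    is a hypothetical choice for this sketch.
    """
    if true_visits == 0:
        return amaf_winrate
    beta = k / (k + true_visits)
    true_winrate = true_wins / true_visits
    return beta * amaf_winrate + (1.0 - beta) * true_winrate

# A move whose AMAF value is inflated to 1.0 by a prior,
# while its true winrate is only 0.4.
unvisited = blended_value(0, 0, 1.0)      # the prior decides everything
early = blended_value(20, 50, 1.0)        # prior and real results share the weight
late = blended_value(2000, 5000, 1.0)     # real results dominate
```

Because the bias sits inside the AMAF term, it is scaled down by the same weight that scales down AMAF itself, so a wrong prior is corrected automatically once the move has enough real visits. A bias added on top of the final value would not fade in this way.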

Are there other techniques (not RAVE) that people are using to get
information from the search to guide the move ordering?  I think crazystone
estimates ownership of each point and uses it to set prior values in some
way.

I used to do that a long time ago in Viking (the precursor to Valkyria), which used alpha-beta + MC-eval. As I remember it, it had a great impact on move ordering, which was otherwise quite bad (or even nonexistent) in Viking.

I have tried it in Valkyria but was never able to see an improvement. But I did not try hard enough to tell for sure. Both ownership and AMAF use the same information (playouts), so trying to use it twice is perhaps partially a waste of effort.
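
For readers unfamiliar with ownership statistics, here is a minimal Python sketch of the idea. The board encoding (one string per finished playout, one character per point, "B" or "W") is an assumption made for the example and is not how Viking, Valkyria, or Crazystone store positions.

```python
def ownership(final_positions):
    """Fraction of playouts in which each point ends up owned by Black.

    final_positions: list of equal-length strings, one per playout,
    with "B" or "W" for the owner of each point at the end of the playout
    (hypothetical encoding).
    """
    playouts = len(final_positions)
    points = len(final_positions[0])
    return [
        sum(position[i] == "B" for position in final_positions) / playouts
        for i in range(points)
    ]

# Three-point toy "board" and two playouts.
black_share = ownership(["BBW", "BWW"])
```

Points with a share near 0.5 are still contested, which is what makes them candidates for move ordering. As noted above, the data comes from the same playouts that feed AMAF.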


-Magnus

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
