On Sat, Jan 17, 2009 at 08:29:32PM +0100, Sylvain Gelly wrote:
> A small point: in "PlayoutOutTree", just after "if
> (!played.AlreadyPlayed(move)) {", there should be a "played.Play(move)".
> I believe it does not change the final result (as the check is also done in
> the backup, and the move played in the backup), but I simply forgot that
> line (that should make moves_played_out_tree smaller).
> 
> To avoid confusion, I repost the pseudo code with that correction (and
> hoping the indentation is not broken by the email editor once again).

Thank you so much for this! I have switched my RAVE implementation to
this formula and the bot has gotten noticeably stronger, though I
apparently still have some bugs to chase: it seems to have trouble
considering the opponent's strongest responses and frequently focuses on
unreasonable replies instead of the obvious ones (e.g. keeping a
group of stones in atari). Maybe I need better prior hinting...

I have a few questions. Of course, please feel free to skip questions
about particular constants if you feel that's giving away too much. :-)

> ChooseMove(node, board) {
>   bias = 0.015  // I put a random number here, to be tuned
>   b = bias * bias / 0.25

Maybe it would be cleaner to define b = 1 / rave_equiv, where rave_equiv
is the number of playouts RAVE is thought to be equivalent to? Or is the
meaning of this constant actually different?

What value works best for people? I did not do much tuning yet, but I
use b=1/3000. I see Fuego uses b=1/5000. (This example gives b=1/1111.)
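
As a quick sanity check of the "rave_equiv" reading, here is a minimal
Python sketch. The names rave_weight and rave_equiv are only
illustrative and do not come from the quoted pseudocode. It computes b
from the example bias and shows that, once the RAVE count is very large,
the weight given to the RAVE estimate drops to about one half when the
real playout count reaches 1/b.

```python
def rave_weight(c, rc, b):
    """Weight given to the RAVE estimate, i.e. (1 - coef) in the quoted pseudocode.

    c  -- real playout count of the move
    rc -- RAVE (all-moves-as-first) count of the move
    b  -- the constant derived from the bias
    """
    if rc == 0:
        return 0.0  # no RAVE data, so RAVE gets no weight
    return rc / (rc + c + rc * c * b)


bias = 0.015                 # the example value from the pseudocode
b = bias * bias / 0.25       # 0.0009
rave_equiv = 1 / b           # roughly 1111 playouts

# With a very large RAVE count, the RAVE weight at c == rave_equiv
# approaches 1/2: real playouts and RAVE are trusted about equally there.
weight_at_equiv = rave_weight(rave_equiv, 10**9, b)
```

Under this reading, b=1/3000 simply means RAVE keeps at least half of
the weight until a move has about 3000 real playouts, provided its RAVE
count is much larger than that.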

>   best_value = -1
>   best_move = PASSMOVE
>   for (move in board.allmoves) {
>     c = node.child(move).counts
>     w = node.child(move).wins
>     rc = node.rave_counts[move]
>     rw = node.rave_wins[move]
>     coef = 1 - rc / (rc + c + rc * c * b)
>     // please take care of the c == 0 and rc == 0 cases here
>     value = w / c * coef + rw / rc * (1 - coef)
>     if (value > best_value) {
>       best_value = value
>       best_move = move
>     }
>   }
>   return best_move
> }
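
For reference, a runnable Python sketch of the quoted ChooseMove,
including one possible handling of the c == 0 and rc == 0 cases. The
Node and Child classes, the moves argument and the unvisited_value
default of 0.5 are assumptions made for illustration; the pseudocode
does not say what value a move with no data at all should get.

```python
from dataclasses import dataclass, field

PASS_MOVE = None  # stand-in for PASSMOVE in the pseudocode


@dataclass
class Child:
    counts: int = 0      # real playouts through this child
    wins: float = 0.0    # wins among those playouts


@dataclass
class Node:
    children: dict = field(default_factory=dict)     # move -> Child
    rave_counts: dict = field(default_factory=dict)  # move -> RAVE count
    rave_wins: dict = field(default_factory=dict)    # move -> RAVE wins


def move_value(c, w, rc, rw, b, unvisited_value=0.5):
    """Blend the real win rate w/c with the RAVE win rate rw/rc."""
    if c == 0 and rc == 0:
        return unvisited_value   # assumption: no data at all
    if rc == 0:
        return w / c             # no RAVE data: coef would be 1
    if c == 0:
        return rw / rc           # no real playouts: coef would be 0
    coef = 1 - rc / (rc + c + rc * c * b)
    return w / c * coef + rw / rc * (1 - coef)


def choose_move(node, moves, bias=0.015):
    """Return the move with the highest blended value, or PASS_MOVE if none."""
    b = bias * bias / 0.25
    best_value = -1.0
    best_move = PASS_MOVE
    for move in moves:
        child = node.children.get(move, Child())
        rc = node.rave_counts.get(move, 0)
        rw = node.rave_wins.get(move, 0.0)
        value = move_value(child.counts, child.wins, rc, rw, b)
        if value > best_value:
            best_value = value
            best_move = move
    return best_move
```

The two single-zero branches are just the limits of the formula: with
rc == 0 the coefficient is exactly 1, and with c == 0 it is exactly 0,
so only the divisions need guarding.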

I have two questions here:

* Is the FPU concept abandoned? Or what values are reasonable? It seems
  to me 1.0, which is usually recommended, is obviously too big here
  since that's the upper bound of the value already. So far I have tried
  0.6 and 0.7 but both just make my bot slightly weaker.

* How to accommodate prior knowledge? (I'm using grand-parent heuristics,
  atari liberties, and a few patterns.) Do you use it to fill normal
  counts, RAVE values or both? What count values work best for you?
  I have settled on 50 playouts.
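
To make the second question concrete, here is a small Python sketch of
what "filling counts" with a prior could look like: the prior is added
as a number of virtual playouts at an assumed win rate. The function
name seed_prior and the 0.8 win rate are hypothetical; the same helper
could be applied to the normal counts, the RAVE counts, or both, which
is exactly the open choice asked about above.

```python
def seed_prior(counts, wins, prior_winrate, equiv_playouts=50):
    """Return (counts, wins) after adding virtual playouts for a prior.

    prior_winrate  -- win rate suggested by the heuristic, in [0, 1]
    equiv_playouts -- how many real playouts the prior is worth
    """
    return counts + equiv_playouts, wins + prior_winrate * equiv_playouts


# A fresh move that a heuristic likes (hypothetical 0.8 win rate),
# seeded with 50 virtual playouts.
c, w = seed_prior(0, 0.0, 0.8)

# After 50 real playouts with only 10 wins the prior is half washed out.
c_later, w_later = c + 50, w + 10
winrate_later = w_later / c_later
```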

-- 
                                Petr "Pasky" Baudis
The average, healthy, well-adjusted adult gets up at seven-thirty
in the morning feeling just terrible. -- Jean Kerr
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
