Re: [computer-go] More UCT / Monte-Carlo questions (Effect of rave)

Erik van der Werf Wed, 06 Feb 2008 03:05:09 -0800

Hi Guillaume,

I think we talked about this before, but others may be interested as
well. In my opinion the ICML paper on Rave has several weaknesses.
It's been a while since I read the paper, but here are some I
remember:

(1) They compared Rave to plain UCT. Had they compared it to a more
sophisticated implementation (like the best Mogo before Rave), they
probably could not have shown such a spectacular improvement.

(2) The paper suggests that you should add an upper confidence bound to
the rave values. Although they didn't actually make this explicit,
because IIRC the paper failed to provide an actual value for the
constant c, I guess most people naturally assume it to be positive.
However, Rave is already a greedy heuristic, so if anything you should
probably subtract the confidence bound. Depending on the playout
policy, adding an upper confidence bound to the rave values can push
some terribly bad moves up (like playing on 1-1). The reason seems to
be that such moves are normally sampled very infrequently (so the UCB
will be higher), and when they are selected (e.g. for a big capture
deep in the playout) they correlate more with winning (so the value
will also be higher) without generalizing to earlier positions.
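
A minimal sketch of the effect described in (2), assuming a UCB1-style
term of the form c * sqrt(log(total) / count) added to the rave value.
The function name, the move statistics and the values of c are all made
up for illustration; they are not taken from the paper or from any
program.

```python
import math

def rave_score(rave_mean, rave_count, total_count, c):
    """Rave mean plus a UCB1-style term (assumed form).

    c > 0 adds an upper confidence bound; c < 0 subtracts it,
    turning the score into a lower confidence bound.
    """
    bonus = math.sqrt(math.log(total_count) / rave_count)
    return rave_mean + c * bonus

# Hypothetical rave statistics: move -> (rave_mean, rave_count)
moves = {
    "1-1": (0.60, 5),    # rarely sampled, value inflated by a few lucky playouts
    "D4": (0.55, 400),   # sampled often, value is more reliable
}
total = sum(count for _, count in moves.values())

def best_move(c):
    """Return the move with the highest score for a given constant c."""
    return max(moves, key=lambda m: rave_score(moves[m][0], moves[m][1], total, c))

print("c = +0.5:", best_move(0.5))
print("c = -0.5:", best_move(-0.5))
```

With these made-up numbers, a positive c favours the rarely sampled
1-1 move twice over (higher mean and larger bonus), while a negative c
penalises its small sample count and prefers D4.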

Best,
Erik

On Wed, Feb 6, 2008 at 10:47 AM, Chaslot G (MICC)
<[EMAIL PROTECTED]> wrote:
> I also implemented RAVE in Mango. There were a few points of improvement
> (around 60 Elo points with gnugo as reference), but not as much as in the
> paper of Gelly and Silver :( (around 250 Elo points if I remember well)
>
>  It might be that the effect of RAVE depends a lot on the simulation 
> strategy. Indeed, sometimes my RAVE was playing very good moves but also very 
> bad ones.
>
>  Guillaume
>
>  -----Original Message-----
>  From: [EMAIL PROTECTED] on behalf of Magnus Persson
>  Sent: Wed 06/02/2008 00:42
>  To: computer-go@computer-go.org
>  Subject: Re: [computer-go] More UCT / Monte-Carlo questions
>
>  Quoting Gunnar Farnebäck <[EMAIL PROTECTED]>:
>
>  > I have never managed to implement RAVE successfully. It made my
>  > program significantly slower but no stronger even at a fixed number of
>  > simulations.
>
>  I get a small effect from RAVE. My rationalisation is that if the
>  program is rich with other features to improve performance RAVE may
>  not add that much.
>
>  -Magnus
>
>
>  _______________________________________________
>  computer-go mailing list
>  computer-go@computer-go.org
>  http://www.computer-go.org/mailman/listinfo/computer-go/
