On 6/25/07, Sylvain Gelly <[EMAIL PROTECTED]> wrote:
I have to admit that it took me several weeks to make the RAVE algorithm
actually work, although the idea is so simple. That may explain your
previous results.
The description in the paper should be sufficient to make it work well.
Ok, I'll
It really pays when checking out an idea to be persistent and patient.
You usually don't get it right the first time if it's very complex or
interesting.
- Don
On Mon, 2007-06-25 at 19:31 +0200, Sylvain Gelly wrote:
Hi,
In the paper you only present results of UCT_RAVE with the MoGo
default policy. Did you run tests with UCT_RAVE using "pure" random
playouts too?
Yes we did, and the improvement was also huge, but I don't remember the
exact results.
I'm curious because I've tried millions (well, it feels that way) of
uses for AMAF in my code... but so far all of them have proven
useless
>> Sorry, what is AMAF?
>
>Sorry: All Moves As First :)
OK, I see.
>Q_RLGO is not used in MoGo's versions which play online.
>Q_MoGo(s,a) is:
>- if (self atari(s,a)): 0
>- if one pattern, among the patterns used in MoGo's simulation policy,
>matches for move "a" in position "s", then 1
>- else 0.
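The rule quoted above can be written out as a minimal Python sketch. It only restates the three cases from the message. The predicates is_self_atari and matches_policy_pattern are hypothetical placeholders supplied by the caller, since MoGo's actual atari test and pattern set are not given in this thread.

```python
from typing import Any, Callable

# Hypothetical predicate type: takes a position s and a move a.
Predicate = Callable[[Any, Any], bool]


def q_mogo(s: Any, a: Any,
           is_self_atari: Predicate,
           matches_policy_pattern: Predicate) -> float:
    """Prior value for move a in position s, following the quoted rule."""
    # Case 1: the move puts the player's own stones in atari.
    if is_self_atari(s, a):
        return 0.0
    # Case 2: one of the simulation-policy patterns matches the move.
    if matches_policy_pattern(s, a):
        return 1.0
    # Case 3: nothing matched.
    return 0.0
```

Note that the self-atari test comes first, so a move that both matches a pattern and is a self-atari gets 0, as in the order of the quoted list.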
Sorry, what is AMAF?
Sorry: All Moves As First :)
And I have another question; Don't you use Q_RLGO anymore?
If so, would you explain the detail of the Q_MoGo heuristic?
>> >Using prior knowledge on "normal" uct, and this was the use of prior
>> >knowledge brought about the same improvement.
>>
>> You mean, there is more improvement when using both?
>
>I mean that there is no need to have AMAF to get improvement by using prior
>knowledge.
Sorry, what is AMAF?
And
2007/6/23, Yamato <[EMAIL PROTECTED]>:
>Using prior knowledge on "normal" uct, and this was the use of prior
>knowledge brought about the same improvement.
You mean, there is more improvement when using both?
I mean that there is no need to have AMAF to get improvement by using prior
knowled
>It was gnugo default level, and we thought "default" was 8, but default is
>actually 10. I don't see why it is so surprising,
Hello,
2007/6/23, Yamato <[EMAIL PROTECTED]>:
>The cumulative result is only given using the prior knowledge on top
>of RAVE, but it could have been done the other way round and give the
>same type of results. Each particular improvement is somehow
>independent of the others.
I think I don't understand that.
What do you mean for "the other wa
Hello all,
We just presented our paper describing MoGo's improvements at ICML,
and we thought we would pass on some of the feedback and corrections
we have received.
(http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf)
The way that we incorporate prior knowledge in UCT can be see