Le 12 mai 2010 à 19:30, Don Dailey a écrit :
> Thomas,
>
> You method sound very sensible if I understand it correctly and it seems to
> me to be the best.
>
> I'm not clear on the memory collection when you run out of memory. Are you
> saying that you track the "age" of each node, perhaps bumping a global age
> every time you create a node and just overwrite entries that are old when you
> are looking for a slot?
I keep a "current time" variable that I increment each time I start a new walk
in the tree. During this walk, I set the age of each node I cross to this
value. So when I search for an old node, I look at the difference between node
value and current global time. This value tell for how many step I've not gone
through this node.
This allow me to find the oldest node to free. The best thing would be to
search the oldest node in the full tree but this would be too expensive so,
instead I just search in restricted subset of the hashtable. I've tried to add
an additional check here to check a more larger subset if I didn't find a node
old enough but this doesn't improve significantly the performance.
Freeing nodes on the fly require to be bit carefull about the design of your
hashtable but another scheme allow to do it a lot more simply but less cleaner
in my opinion. You use a very simple hashtable and when you are out of nodes,
you stop a bit just for the time to collect all old nodes and restart
searching. This is like a garbage collection scheme (and can be done during
pondering to not loose time) and also work very well and have almost the same
properties.
> So basically you don't create a big child list until you are sure you need
> it. I'm not sure I fully understand your description except for the gist of
> it. Do you generate only 1 child at a time or all at once (as soon as you
> know the parent will start getting expanded.
I like to see my nodes as node with links to the childs but no childs... Let me
explain a bit more.
Remember I don't use real tree so the notion of child is not real here. I just
have nodes who represent a state of the goban. In this state there is some
valid points to be played and I want to choose the best of them. In order to do
this I keep in the node all informations needed to make an UCB+RAVE choice.
When I walk throught the part of the game tree stored as nodes in the
hashtable, I always use only information stored in that node to make the choice
of the next play and this play allow me to produce the next situation on the
goban and so find an eventual next node in the hashtable.
When I come to a goban position for which no node currenly exist, I look at the
informations used to choose the last play a decide if I'm confident enough that
this is a promising way. If this is the case, I allocate a new heavy node and
initialize it: this imply building all it's internal informations needed to
choose play from this position so I need to find the valid play at this
position and build the list.
For the moment I use a very simple way to choose if I allocate a new node based
only on the number of simulation but you can use any scheme you want based on
all informations you have.
So, to return to the parallel with the tree with light and heavy nodes, the
informations stored in the child list of one of my node can be seen as a light
node in a tree based engine: a small information used by the parent to choose
next move but without informations about their own childs.
The difference come at the expansion time. In tree based engine, you replace
the light node with a full node. From the parent point of view there is no
change as it will still use the same information in the node but if the parent
choose to go in this node then you will use the childs informations.
In my case, the operation equivalent to expansion to heavier node is adding a
node for this child in the hashtable. So the light representation will still be
in the parent list but the full information will be in the new node. Using
hashtable, this make sense as the light informations is needed by the parent to
make it's decision and the heavier one is needed by the child, so each one keep
is own data. (in fact the child also need the light data so it is duplicated)
I'm not sure I'm very clear as english is not my native language but I hope
this clearer. As an example, this is a rought code for doing this in pseudo-C
like. I'm sure there is a lot of errors in it but I hope it will my scheme will
be easier to understand with this as like the adage say "some code is worth
thousand words" (or something like this ;-) :
struct node_s {
int nb_simul;
int nb_win;
int nb_child;
int child_point[nb_child];
int child_simul[nb_child];
int child_win [nb_child];
}
search_in_tree(goban, create) {
node = get_node_for(goban);
// If no node go for playout
if (node == NULL) {
// If needed create the new node
if (create) {
node = get_newnode_for(goban);
initialize(node);
}
// Go for a playout
result = playout(goban);
// Update and return
update_node(node, result);
return result;
}
// Select next point to play
best = -1.0;
for (i = 0; i < node->nb_child; i++) {
sc = calc_score(node, i);
if (sc > best)
next = i;
}
point = node->child_point[next];
// Play the move and go deeper
play_a_stone(goban, point);
result = search_in_tree(goban, (node->ch_simul[next] > 50));
// Update node and return
update_node(node, next, result);
return result;
}
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go