Thanks for the suggestion, Mark.  Below I show the exact code used (C#) and the 
amount of time each version took.

Original:

        for (int mv = 0; mv < NNN; mv++)
        {
            credit[mv] = true; // initial assumption is that credit is awarded 
for any move
        }
        double weight = 1.0;
        double weightDelta = 2.0 / (ctm - savctm + 1);
        for (int i = savctm; i < ctm; i += 2)
        {
            int mv = mvs[i] & MASK;

            if (credit[mv])
            {
                wins[mv] += weight * sc;
                hits[mv] += weight;
                credit[mv] = false;  // do not award credit for this move again
            }

            int opponentMove = mvs[i + 1] & MASK;
            if (credit[opponentMove])
            {
                double opponentWeight = 0.1 * weight;
                wins[opponentMove] -= opponentWeight * sc;
                hits[opponentMove] += opponentWeight;
                credit[opponentMove] = false;  // do not award credit for this 
move again
            }

            weight -= weightDelta;
        }


After Mark's suggestion:

        currentPlayoutId++;
        if (currentPlayoutId == 0)
        {
            // Need to reset
            for (int mv = 0; mv < NNN; mv++)
                playoutInWhichMoveLastUsed[mv] = 0;
            currentPlayoutId = 1;
        }
        double weight = 1.0;
        double weightDelta = 2.0 / (ctm - savctm + 1);
        for (int i = savctm; i < ctm; i += 2)
        {
            int mv = mvs[i] & MASK;

            if (playoutInWhichMoveLastUsed[mv] != currentPlayoutId)
            {
                wins[mv] += weight * sc;
                hits[mv] += weight;
                playoutInWhichMoveLastUsed[mv] = currentPlayoutId;  // do not 
award credit for this move again
            }

            int opponentMove = mvs[i + 1] & MASK;
            if (playoutInWhichMoveLastUsed[opponentMove] != currentPlayoutId)
            {
                double opponentWeight = 0.1 * weight;
                wins[opponentMove] -= opponentWeight * sc;
                hits[opponentMove] += opponentWeight;
                playoutInWhichMoveLastUsed[opponentMove] = currentPlayoutId;  
// do not award credit for this move again
            }

            weight -= weightDelta;
        }


1M playouts on 9x9 (2 test runs each):
        Original          -- 83.23 seconds, 83.66 seconds
        Mark's Suggestion -- 84.45 seconds, 85.69 seconds

100K playouts on 19x19 (2 test runs each):
        Original          -- 51.02 seconds, 50.83 seconds
        Mark's Suggestion -- 51.48 seconds, 50.90 seconds

50K playouts on 29x29 (2 test runs each):
        Original          -- 77.77 seconds, 76.76 seconds
        Mark's Suggestion -- 79.11 seconds, 79.10 seconds


I am unable to explain the results.


Don Dailey wrote:
The marker idea I call aging.   I call the variable age because it seems
to be more descriptive to me.  Each playout has a different age
associated with it.   The idea is used in persistent hash tables in
computer chess and I'm sure in other games too.

However, I seriously doubt this particular enhancement is even worth
doing in this context.   It may even be difficult to measure the speedup
it will give you. I would be surprised if you get 1/2 percent.
I don't initialize the array in a loop either, I use memset in C.   If I
did this is java I would use whatever java method is the equivalent
assuming it is optimized.
If you want to save on initialization (which is pretty fast on modern
processors anyway) you can make this a bit array instead of an int
array.   Then it would initialize about 32 times faster but you would
probably spend a little more time referencing it.
If you were to profile the code, I'll bet this would be one of the very
last sections of code you would bother optimizing.   Still, if you are a
speed freak you would probably at some point get around to doing it just
because you can!
- Don



On Tue, 2008-10-28 at 21:24 -0200, Mark Boon wrote:
Some interesting ideas Michael, keep them coming.

With regards to the code-snippet you posted, this kind of
'mark-to-prevent-double-work' type of construction is very common in
computer-Go. Looping over the complete array to set the initial value
is very expensive and totally avoidable.

The same functionality could be achieved by the following:

        marker++; // This is an int initialised to zero somewhere in
the beginning of your program.

        double weight = 1.0;
        double weightDelta = 2.0 / (ctm - savctm + 1);
        for (int i = savctm; i < ctm; i += 2)
        {
            int mv = mvs[i] & MASK;

            if (credit[mv]!=marker) // Note that credit[] is now an
array of int instead of boolean
            {
                wins[mv] += weight * sc;
                hits[mv] += weight;
                credit[mv] = marker;  // do not award credit for this move
 again
            }

This is slightly simplified, just so you get the idea. To make it 100%
safe you need to check if marker==0 after the increment and clear the
credit array. This is so common I use a little helper class that wraps
this fucntionality for me, called BoardMarker. The code would look as
follows:

       boardMarker.getNewMarker(); // BoardMarker is allocated
somewhere at the beginning of the program: BoardMarker boardMarker =
new BoardMarker();

        double weight = 1.0;
        double weightDelta = 2.0 / (ctm - savctm + 1);
        for (int i = savctm; i < ctm; i += 2)
        {
            int mv = mvs[i] & MASK;

            if (boardMarker.notSet(mv))
            {
                wins[mv] += weight * sc;
                hits[mv] += weight;
                boardMarker.set(mv);  // do not award credit for this move
 again
            }

Disclaimer, my code is not like this (yet) so I didn't try this for
real. But it should work.

On Tue, Oct 28, 2008 at 8:43 PM, Michael Williams
<[EMAIL PROTECTED]> wrote:
       for (int mv = 0; mv < NNN; mv++)
       {
           credit[mv] = true; // initial assumption is that credit is
awarded for any move
       }
       double weight = 1.0;
       double weightDelta = 2.0 / (ctm - savctm + 1);
       for (int i = savctm; i < ctm; i += 2)
       {
           int mv = mvs[i] & MASK;

           if (credit[mv])
           {
               wins[mv] += weight * sc;
               hits[mv] += weight;
               credit[mv] = false;  // do not award credit for this move
again
           }

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

------------------------------------------------------------------------

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to