Re: [computer-go] Re: Mogo scalability

Hideki Kato Tue, 06 May 2008 07:40:31 -0700

Mark Boon: <[EMAIL PROTECTED]>:
>
>On 4-mei-08, at 14:57, Hideki Kato wrote:
>
>>  By my observation (they are running on my PCs and
>> both are Q6600/3GHz with different motherboards), mogo_big_4core's
>> parallelism is around 300% (by the top command), perhaps due to its
>> heavier UCT part (just my guess).
>
>Of course the CPU load doesn't really say how effective  
>parallelization is. Recently I bought an octo-core Mac and have been  
>running some tests. It takes time to get real conclusive data but I  
>have some observations that come purely from some testing and  
>watching. When using eight cores I get a speed-up of around six  
>times. That is in number of playouts per second. I think that's a  
>much more useful metric than looking at the CPU load.


Yes, of course.  It's just as I wrote.
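
To put the playouts-per-second metric in concrete terms, here is a minimal
sketch of how speed-up and parallel efficiency could be computed from measured
rates. The function names and the two rates are made up for illustration; the
rates are chosen only so that the ratio matches the "around six times on
eight cores" quoted above, and they are not measurements.

```python
def speedup(parallel_rate, single_rate):
    """Playouts/s with N cores divided by playouts/s with one core."""
    return parallel_rate / single_rate


def efficiency(parallel_rate, single_rate, cores):
    """Speed-up per core; 1.0 would mean perfect scaling."""
    return speedup(parallel_rate, single_rate) / cores


# Hypothetical rates (assumption, not measured data).
single = 10_000.0
eight = 60_000.0
print(speedup(eight, single))        # 6.0
print(efficiency(eight, single, 8))  # 0.75
```

Unlike the CPU load shown by top, this ratio only goes up when more
simulations actually finish.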

>Still, even number of playouts is not the end-all I believe. I have  
>the distinct impression that eight cores running for one second plays  
>considerably worse than one core running for six seconds, even though  
>the number of playouts is in the same ball-park. I haven't had the  
>time to do an extensive test on that yet but I'm convinced that the  
>picture is more complicated than just looking at total computing power.

I wrote a paper about this issue for GPW 2007 (in Japanese).

Its English abstract follows.  The latter half addresses this problem, 
in which parallel implementations of UCT perform worse than 
single-threaded ones.  The cause is that the UCT part creates and 
evaluates positions _before_ the MC part (threads) has completely 
finished its simulations.
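
The effect can be shown with a toy model. The sketch below is not the code
from the paper: it runs UCB1 on a one-level tree (a bandit) in a single
thread, and the hypothetical parameter in_flight stands for the number of
simulations that have been started but whose results are not yet backed up
when the next selection is made. The function name, the win probabilities
and the fixed lag are all assumptions made for illustration.

```python
import math
import random
from collections import deque


def run_ucb(win_probs, playouts, in_flight, seed=0):
    """UCB1 with delayed backups.

    in_flight=0 is the ordinary single-threaded loop. Larger values
    mimic worker threads whose results arrive late, so selection uses
    stale statistics. Returns the playouts handed out to each move.
    """
    rng = random.Random(seed)
    n = len(win_probs)
    visits = [0] * n    # results already backed up
    wins = [0] * n
    started = [0] * n   # playouts handed out, finished or not
    pending = deque()   # (move, result) pairs still "running"

    def ucb(m, total):
        return wins[m] / visits[m] + math.sqrt(2 * math.log(total) / visits[m])

    for _ in range(playouts):
        # Selection sees only the backed-up statistics.
        untried = [m for m in range(n) if visits[m] == 0]
        if untried:
            move = untried[0]
        else:
            total = sum(visits)
            move = max(range(n), key=lambda m: ucb(m, total))
        started[move] += 1
        result = 1 if rng.random() < win_probs[move] else 0
        pending.append((move, result))
        # A result becomes visible only after in_flight later selections.
        while len(pending) > in_flight:
            m, r = pending.popleft()
            visits[m] += 1
            wins[m] += r
    return started


if __name__ == "__main__":
    probs = [0.4, 0.5, 0.6]  # hypothetical win rates of three moves
    print(run_ucb(probs, 1000, in_flight=0))
    print(run_ucb(probs, 1000, in_flight=3))
```

With in_flight > 0 the same move is chosen several times in a row before
any of its results arrive, which is the kind of behaviour difference the
abstract refers to. How much this costs in playing strength cannot be read
off a toy like this.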
----------------------------------
        A Study on Implementing Parallel MC/UCT Algorithm

                HIDEKI KATO and IKUO TAKEUCHI

We have developed a parallel MC/UCT computer Go program as a test bed for our
research, applied recurrent neural networks. We measured the execution time of
both commonly used shared-tree and client-server implementations on two
different types of systems, Intel Core 2 Quad on a PC and Cell Broadband Engine
on a SONY PLAYSTATION 3. The client-server implementation runs three times
faster and 10% slower than shared-tree on the Playstation 3 and PC,
respectively. Also, the effect of a well-known problem that parallelizing Monte
Carlo simulations may make UCT algorithm behave differently was evaluated with
the winning rates against GNU GO. Our experiments using four cores show that
the winning rates decrease 35 ELO at most and can be improved to 20 ELO.
-----------------------------------
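
To give a feeling for the size of these numbers, the standard logistic Elo
formula converts a rating difference into an expected score. This is only a
rough reading aid, assuming the two versions were rated on the same scale;
the paper itself reports winning rates against GNU Go, and the function name
below is chosen for this sketch.

```python
def expected_score(elo_diff):
    """Expected score of a player rated elo_diff points above the
    opponent, using the standard logistic Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


for loss in (35, 20):
    # A player that is `loss` Elo weaker than its opponent.
    print(loss, round(expected_score(-loss), 3))
```

By this formula a 35 Elo deficit corresponds to an expected score of roughly
45%, and a 20 Elo deficit to roughly 47%.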

-Hideki

>Mark
>
>---- inline file
>_______________________________________________
>computer-go mailing list
>computer-go@computer-go.org
>http://www.computer-go.org/mailman/listinfo/computer-go/
--
[EMAIL PROTECTED] (Kato)
