Thank you for the information. Although I would probably be able to
accelerate several of these functions at once, you are right. That would
speed up the program perhaps twofold, but I think the hardware will only be
valuable if it can speed up the program by at least an order of magnitude.
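
As a rough check of those figures, here is a minimal sketch in Python that applies Amdahl's law to the profile percentages quoted below. The function names and the shortened dictionary keys are my own, and it assumes the accelerated functions become infinitely fast, which is optimistic:

```python
# Fractions of run time taken from the profile quoted below (they sum to 0.58).
# The keys are shortened labels chosen for this sketch.
PROFILE = {
    "generate local moves": 0.12,
    "count liberties (ladder search)": 0.08,
    "match big patterns": 0.08,
    "make a move in playouts": 0.08,
    "generate eye-removing moves": 0.06,
    "count liberties of adjacent groups": 0.05,
    "calculate gamma": 0.04,
    "extract 8x8 bitmap": 0.03,
    "check move legality": 0.02,
    "make a move in ladder search": 0.02,
}


def amdahl_speedup(accelerated_fraction, factor=float("inf")):
    """Overall speedup when `accelerated_fraction` of run time runs `factor` times faster."""
    remaining = (1.0 - accelerated_fraction) + accelerated_fraction / factor
    return 1.0 / remaining


def fraction_needed(target_speedup):
    """Fraction of run time that must be accelerated (infinitely) to reach the target."""
    return 1.0 - 1.0 / target_speedup


# Accelerating only "make a move in playouts" (8% of run time).
only_make_move = amdahl_speedup(PROFILE["make a move in playouts"])

# Accelerating every function listed in the profile (58% of run time).
all_listed = amdahl_speedup(sum(PROFILE.values()))

# Share of run time that would have to be accelerated for a 10x gain.
needed_for_10x = fraction_needed(10.0)

print(f"make-move only: {only_make_move:.2f}x")
print(f"all listed functions: {all_listed:.2f}x")
print(f"fraction needed for 10x: {needed_for_10x:.0%}")
```

Under these assumptions, accelerating all ten listed functions gives a bit more than a twofold gain, while an order of magnitude would need about 90% of the run time to be moved into hardware.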

As for developing a completely new algorithm, that is challenging. The
hardware will cost at least as much as your software, so it should at least
be on par with Many Faces to be valuable. I do not have adequate skills for
that now, so I should probably look for a partner.

Regards,
Dmitry

22.05.2013, 19:40, "David Fotland" <[email protected]>:
> I measured the speed using only one core.  133 ns is the time it takes one 
> core to make one move in the playouts.
>
> In Many Faces there is no small piece of code that uses a lot of time.  The 
> top functions and times are:
>
> 12%: generate local moves
> 8%: count liberties for ladder search in playouts
> 8%: match big patterns on the full board during the tree search
> 8%: make a move in playouts
> 6%: generate moves to remove an eye in the playout move generation
> 5%: count liberties of adjacent enemy group during playout move generation
> 4%: calculate gamma for a move during move generation for tree search
> 3%: extract an 8x8 bitmap for large pattern matching during tree search
> 2%: check a generated move for legality
> 2%: make a move during ladder search in the playouts
>
> Etc.
>
> Many of these functions are complex, and accelerating any one of them will 
> have negligible impact on performance.
>
> You may be able to find some simple algorithm that can play well and that can 
> be accelerated.  The top programs are too complex to gain by this kind of 
> hardware.  Accelerating a few key functions can’t make the program much 
> faster.  Amdahl's law severely limits the gains that are possible.
>
> Regards,
>
> David
>
>>  -----Original Message-----
>>  From: [email protected] [mailto:computer-go-
>>  [email protected]] On Behalf Of ?????????????? ???????
>>  Sent: Wednesday, May 22, 2013 12:34 AM
>>  To: [email protected]
>>  Subject: Re: [Computer-go] Go playing software accelerator development
>>
>>  The i7 is four-core, so it is probably closer to the upper bound. But if
>>  you say that only 8% of the time is spent on calculating the board
>>  position, it changes nothing: it becomes obvious that things other than
>>  board position calculations should be accelerated first. And I would
>>  highly appreciate any hints about what those could be.
>>
>>  Dmitry
>>
>>  22.05.2013, 11:05, "David Fotland" <[email protected]>:
>>>  I just checked a profile for 19x19 on one 3.4 GHz i7-3770 core.
>>>
>>>  8% of the time is spent in making moves in play outs. So the maximum
>>  possible performance benefit of hardware accelerated move making is only
>>  8% higher performance.
>>>  On one core Many Faces is doing about 1200 games per second, of about
>>  500 moves each, so it is making 600,000 moves in 8% of one second.
>>  That’s 133 ns per move made, near the lower end of Mark’s range. It
>>  seems that this hardware won’t make the program any faster.
>>>  Many Faces’ play outs are quite heavy, but the additional time is
>>  mostly in move generation, not making moves.
>>>  The code to update the board state is quite efficient. It includes
>>  things that are probably not included in the hardware below (like
>>  updating the index of the local 3x3 pattern at each empty point),
>>  keeping a list of liberties for each group, and updating local feature
>>  information that will be used by the move generator. The “make move”
>>  function is 400 lines of C, so it’s doing much more than just simple
>>  board state update.
>>>  David
>>>
>>>  From: [email protected] [mailto:computer-go-
>>  [email protected]] On Behalf Of ?????????????? ???????
>>>  Sent: Tuesday, May 21, 2013 11:36 PM
>>>  To: [email protected]
>>>  Subject: Re: [Computer-go] Go playing software accelerator development
>>>
>>>  Incredible, 100 nanoseconds is only about 300 CPU instructions.
>>  Are you talking about 19x19? And 1 microsecond for my design will
>>  probably be the worst case (as I calculate liberties and captures
>>  iteratively). When almost all stones have free points around them it will
>>  be down to ~100 nanoseconds.
>>>  As to the number of possible accelerators on-chip, it depends on
>>  price. I think it can be 5-250, for a price of $250-$5000. So the cost of
>>  a single simple accelerator will be $20-$50.
>>>  Dmitry
>>>
>>>  21.05.2013, 23:13, "Mark Boon" <[email protected]>:
>>>>  Sounds interesting. But 1 microsecond for a move is not particularly
>>  fast. There are already implementations that do that in the 100-300
>>  nanoseconds range on one core. 1 microsecond is probably considered as
>>  'semi-light' playout. I suppose the question then becomes, how many of
>>  these could your accelerator do in parallel?
>>>>  Mark
>>>>
>>>>  On Tue, May 21, 2013 at 8:06 AM, Alexander Kozlovsky
>>  <[email protected]> wrote:
>>>>  ? ???? ?????? ?? ?????, ? ?????????? ??????????, ????? ? ????????????
>>  :)
>>>>  On Tue, May 21, 2013 at 7:02 PM, ?????????????? ???????
>>  <[email protected]> wrote:
>>>>  Hi all,
>>>>
>>>>  I have an idea to create a hardware accelerator for Go playing
>>  software. It will probably be a USB (or maybe PCI-Express) device that
>>  can very quickly do some basic calculations that are very time-consuming
>>  for a general-purpose CPU. For example: load a goban layout, make a
>>  number of random moves (as used in the Monte-Carlo algorithm) and
>>  unload the result back to the computer.
>>>>  Since it will be hardware, it will be able to do only the specified
>>  calculations, but the speed will be very high. For example, making
>>  just a copy of a particular goban layout will typically require only
>>  about 10 nanoseconds (one internal clock cycle). Calculating the
>>  validity and results of a particular move (including a check for ko and
>>  captured stones) will probably take 1 microsecond. As usual, this may
>>  change during debugging, but the draft move calculation engine
>>  I've started to develop is around these figures.
>>>>  My immediate aims here are:
>>>>  - to understand the demand from Go playing software developers, and
>>>>  - to understand which particular calculation chains are most in demand
>>  for hardware acceleration.
>>>>  Dmitry
>>>>  _______________________________________________
>>>>  Computer-go mailing list
>>>>  [email protected]
>>>>  http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
