I just checked a profile for 19x19 on one 3.4 GHz i7-3770 core.
8% of the time is spent making moves in playouts, so the maximum possible benefit of hardware-accelerated move making is only about 8% higher performance.

On one core Many Faces is doing about 1200 games per second, of about 500 moves each, so it is making 600,000 moves in 8% of one second. That's 133 ns per move made, near the lower end of Mark's range. It seems that this hardware won't make the program any faster.

Many Faces' playouts are quite heavy, but the additional time is mostly in move generation, not in making moves. The code to update the board state is quite efficient. It includes things that are probably not included in the hardware described below, such as updating the index of the local 3x3 pattern at each empty point, keeping a list of liberties for each group, and updating local feature information that will be used by the move generator. The "make move" function is 400 lines of C, so it's doing much more than a simple board-state update.

David

From: [email protected] [mailto:[email protected]] On Behalf Of Рождественский Дмитрий
Sent: Tuesday, May 21, 2013 11:36 PM
To: [email protected]
Subject: Re: [Computer-go] Go playing software accelerator development

Incredible; 100 nanoseconds is only about 300 CPU instructions. Are you talking about 19x19?

And 1 microsecond for my design will probably be the worst case (as I calculate liberties and captures iteratively). When almost all stones have free points around them, it will be down to ~100 nanoseconds.

As for the number of accelerators possible on one chip, it depends on the price. I think it can be 5-250 for a price of $250-$5000, so the cost of a single simple accelerator will be $20-$50.

Dmitry

21.05.2013, 23:13, "Mark Boon" <[email protected]>:

Sounds interesting. But 1 microsecond for a move is not particularly fast. There are already implementations that do that in the 100-300 nanosecond range on one core. 1 microsecond would probably be considered a 'semi-light' playout.

I suppose the question then becomes: how many of these could your accelerator do in parallel?

Mark

On Tue, May 21, 2013 at 8:06 AM, Alexander Kozlovsky <[email protected]> wrote:

By the way, I'm from LIAP too, from the fourth faculty; maybe our paths even crossed :)

On Tue, May 21, 2013 at 7:02 PM, Рождественский Дмитрий <[email protected]> wrote:

Hi all,

I have an idea to create a hardware accelerator for Go playing software. It will probably be a USB (or maybe PCI-Express) device that can very quickly do some basic calculations that are very time-consuming for a general-purpose CPU. For example: load a goban layout, make a number of random moves (as used in the Monte-Carlo algorithm), and unload the result back to a computer.

Since it will be hardware, it will only be able to do the specified calculations, but the speed will be very high. For example, making a copy of a particular goban layout will typically require only about 10 nanoseconds (one internal clock cycle). Calculating the validity and results of a particular move (including a check for ko and captured stones) will probably take 1 microsecond. This may, as usual, vary during debugging, but the current draft of the move calculation engine I've started to develop is around these figures.

My immediate aims here are:
- to understand the demand from Go playing software developers, and
- to understand which particular calculation chains are most in demand for hardware acceleration.

Dmitry
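The arithmetic in David's reply at the top of this thread can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the inputs are the figures quoted in the messages (1200 games per second, about 500 moves per game, 8% of the time in move making, a 3.4 GHz clock), and the function names are made up for illustration. The speedup bound assumes that the accelerated part would take zero time and ignores any transfer overhead, which is an idealization not stated in the thread.

```python
# Back-of-the-envelope check of the figures quoted in the thread.
# Inputs are the numbers stated in the messages; nothing here is measured.

def ns_per_move(games_per_sec, moves_per_game, fraction_in_make_move):
    """Average time per move made, in nanoseconds, on one core."""
    moves_per_sec = games_per_sec * moves_per_game
    # fraction_in_make_move is the share of each second spent making moves
    return fraction_in_make_move / moves_per_sec * 1e9

def max_speedup(fraction_accelerated):
    """Upper bound on overall speedup (Amdahl's law) if the
    accelerated part took zero time. Idealized assumption."""
    return 1.0 / (1.0 - fraction_accelerated)

def cycles_in(ns, clock_ghz):
    """Clock cycles that elapse in `ns` nanoseconds at `clock_ghz` GHz."""
    return ns * clock_ghz

if __name__ == "__main__":
    print(f"{ns_per_move(1200, 500, 0.08):.0f} ns per move made")
    print(f"{(max_speedup(0.08) - 1) * 100:.1f}% best-case gain")
    print(f"{cycles_in(100, 3.4):.0f} clock cycles in 100 ns at 3.4 GHz")
```

Under these assumptions the per-move time comes out at roughly 133 ns, matching David's figure, and the best-case gain from removing move making entirely stays below 9%. The cycle count is a count of clock cycles, not instructions; how many instructions fit in that window depends on the code and the CPU.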
_______________________________________________ Computer-go mailing list [email protected] http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
