I copied out the dc_block_cc block from 3.7.8 and ran some performance tests against it, which I've summarized in a table below.
I had to make some modifications to the original code:

* I removed the make wrapper.
* I tested against different containers.
* Different containers have different access/management methods, which meant some changes to the code body (I tried to be consistent).
* On input I passed a std::vector to work() rather than complex*. Although this changes the flavor of work(), I figure the comparison between containers is still relative.
* I only used long_form and deleted the short_form code.

I used the key part of the original code. The three containers are the original std::deque, then std::queue and std::list. The results are interesting. I probably should have looked at other containers such as std::vector, but that might require recoding.

I also compiled with and without -std=c++11, because when I looked at the container source I saw a bunch of #ifdefs for >= c++0x.

These are some of the problems with the original dc_block:

* Passing by value rather than by reference.
* No inlines.
* const missing where const should be.

So in a second copy of dc_block I fixed those things. I found one case (filter()) where it returns by value, and I left that one alone.

The table below summarizes the results:

* "Old" means my reasonable(?) facsimile of the original dc_block.
* "+c11" means I added -std=c++11 to the compile line.
* "Opt" is my optimized copy of the code, where I added references, inlines, etc.
* "Special" is "opt" but with different compile options.

All of the output is included at the end of this message.

The numbers you'll see for old/old+c11/etc. are the time, in seconds, it took to process /one/ sample. In "old+deque" for example (the first item), it took 701us to process a sample.

One of the surprising numbers is how badly std::list does: roughly 2.8x slower than std::deque in the "old" build and roughly 8x slower in the "opt" build. Also, when looking at the assembly language for filter() (copy below) I see reallocs(). That's not surprising, but it is probably bad for performance. (BTW, "CPLX" is: "typedef std::complex<float> CPLX;".)

    inline const CPLX moving_averager_c_list::filter( const CPLX& x )
    {
        d_out_d1 = d_out;
        d_delay_line.push_back(x);
        d_out = d_delay_line.front();
        d_delay_line.pop_front();

        CPLX y = x - d_out_d1 + d_out_d2;
        d_out_d2 = y;

        return (y / (float)(d_length));
    }

The "size" numbers in the table are the text segment sizes reported by "size a.out". The "block size" is simply sizeof(d_delay_line), which is really sizeof(std::deque<CPLX>), for example.

One other note: I compiled "special" with -Ofast and it failed the content integrity check. Probably a bad option to use. :)

My OS: Ubuntu 15.04.
My compiler: gcc version 4.9.2 (Ubuntu 4.9.2-10ubuntu13)
My system: AMD FX(tm)-9590 Eight-Core Processor @ 4.7GHz

I'm happy to send copies of the test code (two files) for review if someone wants to put them on the web. The three main code blocks are pretty simple:

    {
        dc_blocker_cc_deque dc( NUM_ELEM );
        std::cout << "deque:" << std::endl;
        t_start = gr::high_res_timer_now();
        for( int i = 0; i < NUM_LOOPS; ++i )
            for( int j = 0; j < NUM_COMPLEX; ++j )
                dc.work( data, dc_deque );
        timing( t_start, gr::high_res_timer_now(), NUM_LOOPS*NUM_COMPLEX );
    }

    #define NUM_LOOPS   5
    #define NUM_COMPLEX 10000
    #define NUM_ELEM    32

Here's the summary table:

            old          old+c11      opt          opt+c11      special
deque:      0.000701038  0.000705963  0.000235234  0.00023607   0.000234233
queue:      0.00069784   0.000705617  0.00023619   0.00023222   0.000237184
list:       0.00194583   0.00243208   0.00191296   0.00193926   0.00194809
text size:  26502        28902        21712        29574        23112
text orig:  33821        26502

block size:
deque: 80
queue: 80
list:  16

Original facsimile (not c++11):

dennisg@Tori-Radio:~/dc_test$ c++ -O3 main.cc
dennisg@Tori-Radio:~/dc_test$ size a.out
   text    data     bss     dec     hex filename
  28902     856     280   30038    7556 a.out
dennisg@Tori-Radio:~/dc_test$ ./a.out
Building complex number data... Done.
GNURadio hi-res clock tps: 1000000000
GNURadio sizeof(gr_complex): 8
GNURadio sizeof(CPLX): 8
dc_blocker_cc_deque: delay_line size=80
deque:
Done: total_t: 35051914970, sec_t: 35.0519, t/ea: 0.000701038
dc_blocker_cc_queue: delay_line size=80
queue:
Done: total_t: 34892023951, sec_t: 34.892, t/ea: 0.00069784
dc_blocker_cc_list: delay_line size=16
list:
Done: total_t: 97291349192, sec_t: 97.2913, t/ea: 0.00194583

Original facsimile (c++11):

dennisg@Tori-Radio:~/dc_test$ c++ -O3 -std=c++11 main.cc
dennisg@Tori-Radio:~/dc_test$ size a.out
   text    data     bss     dec     hex filename
  21712     848     280   22840    5938 a.out
dennisg@Tori-Radio:~/dc_test$ ./a.out
Building complex number data... Done.
GNURadio hi-res clock tps: 1000000000
GNURadio sizeof(gr_complex): 8
GNURadio sizeof(CPLX): 8
dc_blocker_cc_deque: delay_line size=80
deque:
Done: total_t: 35298153446, sec_t: 35.2982, t/ea: 0.000705963
dc_blocker_cc_queue: delay_line size=80
queue:
Done: total_t: 35280849767, sec_t: 35.2808, t/ea: 0.000705617
dc_blocker_cc_list: delay_line size=16
list:
Done: total_t: 121603777765, sec_t: 121.604, t/ea: 0.00243208

Optimized code (not c++11):

dennisg@Tori-Radio:~/dc_test$ c++ -O3 -finline main_opt.cc
dennisg@Tori-Radio:~/dc_test$ size a.out
   text    data     bss     dec     hex filename
  29574     856     280   30710    77f6 a.out
dennisg@Tori-Radio:~/dc_test$ ./a.out
Building complex number data... Done.
GNURadio hi-res clock tps: 1000000000
GNURadio sizeof(gr_complex): 8
GNURadio sizeof(CPLX): 8
dc_blocker_cc_deque: delay_line size=80
deque:
Done: total_t: 11761720007, sec_t: 11.7617, t/ea: 0.000235234
dc_blocker_cc_queue: delay_line size=80
queue:
Done: total_t: 11809516472, sec_t: 11.8095, t/ea: 0.00023619
dc_blocker_cc_list: delay_line size=16
list:
Done: total_t: 95647805916, sec_t: 95.6478, t/ea: 0.00191296

Optimized code (c++11):

dennisg@Tori-Radio:~/dc_test$ c++ -O3 -finline -std=c++11 main_opt.cc
dennisg@Tori-Radio:~/dc_test$ size a.out
   text    data     bss     dec     hex filename
  23080     848     280   24208    5e90 a.out
dennisg@Tori-Radio:~/dc_test$ ./a.out
Building complex number data... Done.
GNURadio hi-res clock tps: 1000000000
GNURadio sizeof(gr_complex): 8
GNURadio sizeof(CPLX): 8
dc_blocker_cc_deque: delay_line size=80
deque:
Done: total_t: 11803504003, sec_t: 11.8035, t/ea: 0.00023607
dc_blocker_cc_queue: delay_line size=80
queue:
Done: total_t: 11610977298, sec_t: 11.611, t/ea: 0.00023222
dc_blocker_cc_list: delay_line size=16
list:
Done: total_t: 96962902014, sec_t: 96.9629, t/ea: 0.00193926

special (opt+c++11):

dennisg@Tori-Radio:~/dc_test$ c++ -Ofast -Wsign-compare -Wall -Wno-uninitialized -fvisibility=hidden -finline -std=c++11 main_opt.cc
dennisg@Tori-Radio:~/dc_test$ size a.out
   text    data     bss     dec     hex filename
  23112     856     280   24248    5eb8 a.out
dennisg@Tori-Radio:~/dc_test$ ./a.out
Building complex number data... Done.
GNURadio hi-res clock tps: 1000000000
GNURadio sizeof(gr_complex): 8
GNURadio sizeof(CPLX): 8
dc_blocker_cc_deque: delay_line size=80
deque:
Done: total_t: 11711630308, sec_t: 11.7116, t/ea: 0.000234233
dc_blocker_cc_queue: delay_line size=80
queue:
Done: total_t: 11859205796, sec_t: 11.8592, t/ea: 0.000237184
dc_blocker_cc_list: delay_line size=16
list:
Done: total_t: 97404287524, sec_t: 97.4043, t/ea: 0.00194809
Data error i=0
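
P.S. On the std::vector idea mentioned near the top: the recoding looks small if the delay line is treated as a fixed-size circular buffer, so nothing is pushed or popped per sample. What follows is only a sketch, not benchmarked and not checked against GNU Radio itself. The names moving_averager_ring, moving_averager_deque_ref and outputs_match() are made up for illustration, and the delay line being zero-initialized with length-1 elements is an assumption about how the original is constructed.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <deque>
#include <vector>

typedef std::complex<float> CPLX;

// Reference version: same arithmetic as the filter() quoted earlier,
// on std::deque. The constructor (length-1 zeros) is an assumption.
class moving_averager_deque_ref {
public:
    explicit moving_averager_deque_ref(std::size_t length)
        : d_length(length),
          d_delay_line(length > 0 ? length - 1 : 0, CPLX(0, 0)) {}

    CPLX filter(const CPLX& x) {
        d_out_d1 = d_out;
        d_delay_line.push_back(x);
        d_out = d_delay_line.front();
        d_delay_line.pop_front();
        CPLX y = x - d_out_d1 + d_out_d2;
        d_out_d2 = y;
        return y / static_cast<float>(d_length);
    }

private:
    std::size_t d_length;
    std::deque<CPLX> d_delay_line;
    CPLX d_out, d_out_d1, d_out_d2;  // std::complex defaults to (0,0)
};

// Hypothetical std::vector version: the oldest sample is read from the
// current slot, then overwritten by the new sample, then the index wraps.
class moving_averager_ring {
public:
    explicit moving_averager_ring(std::size_t length)
        : d_length(length),
          d_index(0),
          d_delay_line(length > 0 ? length - 1 : 0, CPLX(0, 0)) {}

    CPLX filter(const CPLX& x) {
        d_out_d1 = d_out;
        if (d_delay_line.empty()) {
            d_out = x;  // matches push_back + front + pop_front on an empty deque
        } else {
            d_out = d_delay_line[d_index];
            d_delay_line[d_index] = x;
            if (++d_index == d_delay_line.size())
                d_index = 0;
        }
        CPLX y = x - d_out_d1 + d_out_d2;
        d_out_d2 = y;
        return y / static_cast<float>(d_length);
    }

private:
    std::size_t d_length;
    std::size_t d_index;
    std::vector<CPLX> d_delay_line;
    CPLX d_out, d_out_d1, d_out_d2;
};

// Feeds the same made-up samples through both versions and reports
// whether every output is identical.
inline bool outputs_match(std::size_t length, std::size_t num_samples) {
    moving_averager_deque_ref ref(length);
    moving_averager_ring ring(length);
    for (std::size_t i = 0; i < num_samples; ++i) {
        const CPLX x(static_cast<float>(i % 7) - 3.0f,
                     static_cast<float>(i % 5) * 0.5f);
        if (ref.filter(x) != ring.filter(x))
            return false;
    }
    return true;
}
```

Calling outputs_match(32, 1000) runs both versions side by side with the same NUM_ELEM I used above. The vector is sized once in the constructor and never grows, so filter() itself should not need to allocate; whether that actually beats std::deque would have to be measured.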