http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51078
Bug #: 51078 Summary: [PATCH] performance improvement of std::count algorithm Classification: Unclassified Product: gcc Version: unknown Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: libstdc++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: grygoriy.fuched...@gmail.com Created attachment 25778 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25778 patch itself This patch improves performance of std::count algorithm in case of random access iterators by loop unrolling. It doesn't affect std::count algorithm for usual input iterators. This is done by choosing different versions of function for different iterator types at compile time, similar technique is used for std::find algorithm. Besides the patch itself I've attached performance test which can be used to test this changes without even applying them. Test program has count_local function which implements improvements of this patch. Test program generates test bunch of test containers of different types and performs count_local and original count algorithm on them. It outputs table with following columns: container_size, type1_local_time, type1_orig_time, type1_performance_boost, type2_local_time, type2_orig_time, type2_performance_boost, ... container_size is length of data passed to each call to count algorithm. typeX_local_time is time elapsed counting bunch of containers of type X using count_local function. typeX_orig_time is time elapsed counting bunch of containers of type X using original count implementation. typeX_performance_boost is (typeX_orig_time - typeX_local_time)/typeX_orig_time. Number of containers for testing decreases with increasing of container_size to keep testing time relatively constant. There is also data file containing test results and gnuplot script to plot them. Results were obtained on my machine with i7 cpu. Also tried another machine with core2 processor, got similar results. Also tried changing range of generated random data from [1-128] to [1-2]. Got similar results. I also attached plots of performance boost against container size, first one is linear by size, second one is logarithmic. As you can see there is performance decrease for very small containers with length less then 10 elements. And around 30% performance boost for larger containers with fundamental types. Containers of complex data vector<vector<int> > have small performace boost around 1%. There is also list<int> to demonstrate that this patch doesn't affect non random access iterators.