http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51078

             Bug #: 51078
           Summary: [PATCH] performance improvement of std::count
                    algorithm
    Classification: Unclassified
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: libstdc++
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: grygoriy.fuched...@gmail.com


Created attachment 25778
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25778
patch itself

This patch improves performance of std::count algorithm in case of random
access iterators by loop unrolling. It doesn't affect std::count algorithm for
usual input iterators. This is done by choosing different versions of function
for different iterator types at compile time, similar technique is used for
std::find algorithm.

Besides the patch itself I've attached performance test which can be used to
test this changes without even applying them. Test program has count_local
function which implements improvements of this patch. Test program generates
test bunch of test containers of different types and performs count_local and
original count algorithm on them. It outputs table with following columns:
container_size, type1_local_time, type1_orig_time, type1_performance_boost,
type2_local_time, type2_orig_time, type2_performance_boost, ...

container_size is length of data passed to each call to count algorithm.
typeX_local_time is time elapsed counting bunch of containers of type X using
count_local function.
typeX_orig_time is time elapsed counting bunch of containers of type X using
original count implementation.
typeX_performance_boost is (typeX_orig_time -
typeX_local_time)/typeX_orig_time.
Number of containers for testing decreases with increasing of container_size to
keep testing time relatively constant.

There is also data file containing test results and gnuplot script to plot
them. Results were obtained on my machine with i7 cpu. Also tried another
machine with core2 processor, got similar results. Also tried changing range of
generated random data from [1-128] to [1-2]. Got similar results.

I also attached plots of performance boost against container size, first one is
linear by size, second one is logarithmic.

As you can see there is performance decrease for very small containers with
length less then 10 elements. And around 30% performance boost for larger
containers with fundamental types. Containers of complex data
vector<vector<int> > have small performace boost around 1%.

There is also list<int> to demonstrate that this patch doesn't affect non
random access iterators.

Reply via email to