On 15/05/2017 21:31, Marc Glisse wrote:
On Mon, 15 May 2017, François Dumont wrote:

I also added some optimizations, in particular replacing std::fill with calls to __builtin_memset. Has anyone ever proposed optimizing std::fill in this way? It would require a test on the value used to fill the range, but it might be worth this additional runtime check, no?
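
To make the idea of a runtime check concrete, here is a rough sketch for a range of int. It is only an illustration: fill_ints is a hypothetical helper, not libstdc++ code, and __builtin_memset is assumed to be available (GCC and Clang provide it). memset writes a single byte value, so it can only stand in for std::fill when every byte of the fill value is identical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of libstdc++): fill [first, last) with value,
// taking the memset path only when all bytes of value are the same.
inline void fill_ints(int* first, int* last, int value)
{
    assert(first <= last);
    if (first == last)
        return;

    // Runtime check on the fill value: inspect its object representation.
    unsigned char bytes[sizeof(int)];
    std::memcpy(bytes, &value, sizeof(int));
    bool uniform = true;
    for (std::size_t i = 1; i < sizeof(int); ++i)
        if (bytes[i] != bytes[0])
            uniform = false;

    if (uniform)
        __builtin_memset(first, bytes[0],
                         static_cast<std::size_t>(last - first) * sizeof(int));
    else
        std::fill(first, last, value);  // general case, unchanged behaviour
}
```

Values such as 0 and -1 take the memset path, while something like 0x01020304 falls back to the plain loop.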

Note that with -O3, gcc recognizes the pattern in std::fill and generates a call to memset (there is a bit too much extra code around the memset, but a couple of match.pd transformations should fix that).

Good to know, at least g++ will be able to spend more time on other optimizations :-) What is match.pd?

That doesn't mean we can't save it the work. If you want to save the runtime check, there is always __builtin_constant_p...

Good point, I will give it a try.
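
As a rough sketch of the __builtin_constant_p suggestion, assuming a GCC-compatible compiler: the memset path is taken only when the compiler can prove at compile time that the value is one of the easy cases, so no runtime test remains in the generated code. fill_ints_cp is a hypothetical name, and only the values 0 and -1 are handled here to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of libstdc++). __builtin_constant_p yields 1
// only when the compiler knows the argument is a compile-time constant, which
// typically requires inlining and optimization; otherwise it yields 0 and the
// ordinary std::fill path is used.
inline void fill_ints_cp(int* first, int* last, int value)
{
    assert(first <= last);
    if (first == last)
        return;

    if (__builtin_constant_p(value) && (value == 0 || value == -1))
        // Both 0 and -1 have all bytes equal (0x00 or 0xFF respectively).
        __builtin_memset(first, value == 0 ? 0 : 0xFF,
                         static_cast<std::size_t>(last - first) * sizeof(int));
    else
        std::fill(first, last, value);
}
```

Both branches produce the same result, so the outcome does not depend on whether the compiler manages to fold the test.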


The __fill_bvector part of the fill overload for vector<bool> could do with some improvements as well. Looping is unnecessary: one just needs to produce the right mask and AND or OR with it, which shouldn't take more than 4 instructions or so.
Yes, good idea, I'll submit another patch after this one.
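
For illustration, the mask approach for a bit range that lies within a single word might look like the sketch below. This is not the actual __fill_bvector code: word_t and fill_bits_in_word are hypothetical names, and the half-open range [lo, hi) is an assumed convention.

```cpp
#include <cassert>
#include <climits>

// Hypothetical stand-in for the storage word of a bit vector.
using word_t = unsigned long;

// Set (value == true) or clear (value == false) bits [lo, hi) of *word
// without looping over individual bits.
inline void fill_bits_in_word(word_t* word, unsigned lo, unsigned hi, bool value)
{
    const unsigned word_bits = sizeof(word_t) * CHAR_BIT;
    assert(lo < hi && hi <= word_bits);

    // Ones in positions [0, hi) combined with ones in positions [lo, word_bits).
    // Both shift counts stay below word_bits, so the shifts are well defined.
    const word_t mask = (~word_t(0) >> (word_bits - hi)) & (~word_t(0) << lo);

    if (value)
        *word |= mask;   // OR with the mask to set the range
    else
        *word &= ~mask;  // AND with the complement to clear it
}
```

A fill spanning several words would apply this to the partial first and last words and could use memset for the full words in between.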

There was a time when I suggested overloading std::count and std::find in order to use __builtin_popcount, etc. But from what I've seen of committee discussions, I expect that there will be specialized algorithms (possibly member functions) eventually, making the overload less useful.
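
As a sketch of what a popcount-based count could look like over the underlying words (count_set_bits is a hypothetical free function, not a proposed or existing overload, and __builtin_popcountl is assumed to be available as in GCC and Clang):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of set bits in n full words of bit storage.
// A real overload for bit iterators would also have to mask the partial
// words at both ends of the range.
inline std::size_t count_set_bits(const unsigned long* words, std::size_t n)
{
    assert(words != nullptr || n == 0);
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += static_cast<std::size_t>(__builtin_popcountl(words[i]));
    return total;
}
```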

OK, thanks for the feedback.

François
