[Bug c++/118033] [Missing optimization] Keep __builtin_unreachable for asserts in the release build

2024-12-13 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118033 --- Comment #3 from Dmytro Ovdiienko --- I believe that in 99% of cases whatever is passed to assert() is a legal expression that returns bool. And there is an opportunity to optimize the output assembly in case we want to reuse that expression f

[Bug c++/118033] [Missing optimization] Keep __builtin_unreachable for asserts in the release build

2024-12-13 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118033 --- Comment #1 from Dmytro Ovdiienko --- I'm not sure about how to handle the side effects caused by the expression. The code in the expression must not be executed but used by the compiler only for the optimization.

[Bug c++/118033] New: [Missing optimization] Keep __builtin_unreachable for asserts in the release build

2024-12-13 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: dmitriy.ovdienko at gmail dot com Target Milestone: --- Could you define `assert` macro as following in case if `NDEBUG` macro is defined: #if defined(NDEBUG
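
The macro text in the report is cut off above, so the following is only a minimal sketch of the general idea, not the reporter's definition. The names `ASSUME_ASSERT` and `divide_by_positive` are hypothetical. It relies on the GCC/Clang builtin `__builtin_unreachable()`.

```cpp
#include <cassert>

// Hypothetical macro name; the definition proposed in the report is truncated
// and may differ from this sketch.
#if defined(NDEBUG)
// Release build: tell the optimizer that a false condition cannot happen.
// Note that the expression is still evaluated here, so any side effects
// it has would run.
#  define ASSUME_ASSERT(expr) \
       do { if (!(expr)) __builtin_unreachable(); } while (0)
#else
// Debug build: behave like a regular assert.
#  define ASSUME_ASSERT(expr) assert(expr)
#endif

// Hypothetical example function: the optimizer may use divisor > 0
// when NDEBUG is defined.
int divide_by_positive(int value, int divisor) {
    ASSUME_ASSERT(divisor > 0);
    return value / divisor;
}
```

Because this form evaluates the expression, it does not address the side-effect concern raised in Comment #1. The C++23 `[[assume(expr)]]` attribute is specified not to evaluate its expression and may be a closer fit where it is available.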

[Bug c++/109127] New: More advanced constexpr value compile time evaluation

2023-03-14 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: dmitriy.ovdienko at gmail dot com Target Milestone: --- Hello, I'd like to report the idea which could improve the application performance. The idea is related to `constexpr` math, which can be perform

[Bug c++/98840] Why does baz call the delete operator for moved unique_ptr

2021-01-26 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98840 --- Comment #4 from Dmitriy Ovdienko --- What if introduce new ABI version and encode into function name (function name mangling). And then have two options: * Either compile code and store both versions into lib file (ABI v1 and v2). Applies

[Bug c++/98840] Why does baz call the delete operator for moved unique_ptr

2021-01-26 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98840 --- Comment #3 from Dmitriy Ovdienko --- > This is not a GCC bug. No it is not. But can we improve that? That approach increases the binary size. In case if `baz` is called from many places, that is going to increase the binary size.

[Bug c++/98840] New: Why does baz call the delete operator for moved unique_ptr

2021-01-26 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: dmitriy.ovdienko at gmail dot com Target Milestone: --- I'm trying to evaluate the overhead of the `unique_ptr` and I do not understand why does Gcc execute the destructor of the `uniqu

[Bug c++/97641] Wrong codegen if optimizer is enabled

2020-10-30 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97641 --- Comment #7 from Dmitriy Ovdienko --- If I change the body of the loop like this, it also works ``` while ('\x01' != *ptr) { result = result * 10 - '0' + *ptr++; } ``` Looks like integer overflow happens on last iteration and compiler
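
The ordering issue described in this comment can be sketched as follows. This is an illustration under assumptions, not code from the report: the function name `parse_digits` is hypothetical, and the input is assumed to be decimal digits followed by a `'\x01'` terminator. Parenthesizing the digit conversion means the value added to `result * 10` is at most 9, so no intermediate sum exceeds the final value, and parsing `2147483647` no longer overflows a 32-bit `int`.

```cpp
#include <cassert>

// Hypothetical helper: parses decimal digits up to a '\x01' terminator.
// Assumes the text represents a non-negative value that fits in int.
int parse_digits(char const* ptr) noexcept {
    assert(ptr != nullptr);
    int result = 0;
    while ('\x01' != *ptr) {
        // Convert the character to its digit value first, then accumulate.
        // Writing `result * 10 + *ptr++ - '0'` instead adds the raw
        // character code (48..57) before subtracting, which can overflow
        // on the last digit of a value near INT_MAX.
        result = result * 10 + (*ptr++ - '0');
    }
    return result;
}
```

Signed integer overflow is undefined behavior in C++, which is why the two orderings can behave differently once the optimizer is enabled.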

[Bug c++/97641] Wrong codegen if optimizer is enabled

2020-10-30 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97641 --- Comment #6 from Dmitriy Ovdienko --- This code does not work ``` #include int Parse1(char const* ptr) noexcept { int result = 0; while ('\x01' != *ptr) { result = result * 10 + *ptr++ - '0'; } return result; } i

[Bug c++/97641] Wrong codegen if optimizer is enabled

2020-10-30 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97641 Dmitriy Ovdienko changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|INVALI

[Bug c++/97641] Wrong codegen if optimizer is enabled

2020-10-30 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97641 --- Comment #4 from Dmitriy Ovdienko --- It happens to 2147483646, 2147483647 and std::numeric_limits::min().

[Bug c++/97641] Wrong codegen if optimizer is enabled

2020-10-30 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97641 --- Comment #1 from Dmitriy Ovdienko --- OS: Windows 10 Distribution: MSys2 (https://www.msys2.org/) Version: (Rev4, Built by MSYS2 project) 10.2.0 I tried to reproduce this issue on https://gcc.godbolt.org/. gcc (trunk) is also unable to compil

[Bug c++/97641] New: Wrong codegen if optimizer is enabled

2020-10-30 Thread dmitriy.ovdienko at gmail dot com via Gcc-bugs
++ Assignee: unassigned at gcc dot gnu.org Reporter: dmitriy.ovdienko at gmail dot com Target Milestone: --- g++ optimizer produces wrong code in case if -O3 is used. In case if -O2 and -O1 are used, app works as expected. Expected output: matches In fact output: does not match

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-10 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #42 from Dmitriy Ovdienko --- > The master branch has been updated by Jonathan Wakely : Very good commit and comment. I hope this change, made for a synthetic benchmark, won't affect real production applications.

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-10 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #41 from Dmitriy Ovdienko --- @Jonathan Did you have chance to review why default-constructed M-B-R works faster than another one constructed with the initial buffer size?

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-10 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #38 from Dmitriy Ovdienko --- Wohoo! Accepted: https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/binarytrees.html

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-09 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #37 from Dmitriy Ovdienko --- > I assume you can't just preallocate a buffer for the pool? I dunno... here is a requirement: * When possible, use default GC; otherwise use per node allocation or use a library memory pool. * As a

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-09 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #36 from Dmitriy Ovdienko --- > It doesn't seem to make much difference. It is visible in the assembly. In case if you use __unlikelly, compiler moves this code out of hot path minimizing the amount of instructions to decode.
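
A branch hint of the kind discussed here can be sketched with the GCC/Clang builtin `__builtin_expect`. Everything below is a hypothetical illustration and not libstdc++'s `monotonic_buffer_resource` code: `SKETCH_UNLIKELY`, `BumpArena`, `arena_exhausted`, and `arena_allocate` are made-up names, and alignment handling is deliberately omitted.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical hint macro built on the GCC/Clang builtin.
#define SKETCH_UNLIKELY(cond) __builtin_expect(!!(cond), 0)

// Hypothetical bump allocator state: a cursor into a caller-owned buffer.
struct BumpArena {
    char*       next;       // next free byte
    std::size_t remaining;  // bytes left in the buffer
};

// Cold path, kept out of line so the hot path stays short.
// A real resource would fetch a new block here; this sketch just fails.
__attribute__((noinline, cold)) inline void* arena_exhausted() {
    return nullptr;
}

// Hot path: bump the cursor; running out of space is marked as unlikely.
inline void* arena_allocate(BumpArena& arena, std::size_t bytes) {
    assert(arena.next != nullptr);
    if (SKETCH_UNLIKELY(bytes > arena.remaining))
        return arena_exhausted();
    void* p = arena.next;
    arena.next += bytes;
    arena.remaining -= bytes;
    return p;
}
```

Whether such a hint changes measured performance depends on the workload; its effect is usually easiest to see by comparing the generated assembly, as the comment suggests.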

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-08 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #33 from Dmitriy Ovdienko --- @Jonathan Wakely I have one idea to improve code of p_m_r I expect that allocation are very rare operation. If true, it makes sense to add __unlikelly to `if (!__p)` inside the `do_allocate` member func

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-08 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #32 from Dmitriy Ovdienko --- What bothers me is why Rust generates fewer CPU cache-references.

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-08 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #31 from Dmitriy Ovdienko --- Created attachment 49202 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49202&action=edit Modified solution with 2-level memory pools I believe I'm done with this task. Attached is a solution based

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-08 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #30 from Dmitriy Ovdienko --- > Dividing estimated size by 2 to counter the over-allocation effect: Good idea... but it smells bad :) What if someone changes the allocation algorithm...? > Since the poolSize function actually returns siz

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-08 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #29 from Dmitriy Ovdienko --- Table above isn't readable. Value for `cache-references` .. faults are divided by 1000 Sorry for flood. | CPU counter | Rust | C++ before |C++ now | C++ malloc | |--|-

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-08 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #28 from Dmitriy Ovdienko --- Added CPU counters for malloc-based allocator as a base point | CPU counter | Rust | C++ before |C++ now | C++ malloc | |--|---:|---

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-08 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #27 from Dmitriy Ovdienko --- Following are CPU counters I've got for improved C++ test vs Rust and original C++ test by Danial Klimkin | CPU counter | Rust | C++ before |C++ now | |--|-

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-08 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #26 from Dmitriy Ovdienko --- Created attachment 49201 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49201&action=edit Modified solution (thread per iteration) Attached is a code similar to what Rust sample is doing (parallel

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-08 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #21 from Dmitriy Ovdienko --- > This is only the second time > I've ever received any indication anybody is even using the > header, so I've not wasted my time tuning it. I used it to create an order book :). It helped me to impro

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-08 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #10 from Dmitriy Ovdienko --- Looks like I know why C++ sample does not use all the CPU resources. C++ does not load threads equally. Last thread gets the most heavy task (MAX_DEPTH) and performs N iterations alone. Rust code instea

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-07 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #8 from Dmitriy Ovdienko --- Same as above for Depth = 19 | | PMR |Malloc | |---|---|---| | cache-references |16,571,923 |16,260,256 | | cache-misse

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-07 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #7 from Dmitriy Ovdienko --- Following are CPU counters for single threaded code. Pre-allocation is enabled. Memory pool is created inside the loop. ```cpp int poolSize(int depth) { return (1 << (depth + 1)) * sizeof(Node);
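
The pre-allocation setup described in this comment can be sketched as below. The benchmark's own code is truncated above, so this is an assumption-laden illustration: the `Node` layout and the helpers `pool_size`, `make_tree`, and `count_nodes` are hypothetical. Only `std::pmr::monotonic_buffer_resource` and its initial-size constructor come from the standard library (C++17).

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>

// Assumed node layout; the benchmark's actual Node is not shown in the snippet.
struct Node {
    Node* left;
    Node* right;
};

// A full binary tree of the given depth has 2^(depth+1) - 1 nodes,
// so this size covers the whole tree.
std::size_t pool_size(int depth) {
    assert(depth >= 0);
    return (std::size_t{1} << (depth + 1)) * sizeof(Node);
}

// Builds a full binary tree, taking every node from the given resource.
Node* make_tree(std::pmr::memory_resource& pool, int depth) {
    void* mem = pool.allocate(sizeof(Node), alignof(Node));
    Node* node = new (mem) Node{nullptr, nullptr};
    if (depth > 0) {
        node->left  = make_tree(pool, depth - 1);
        node->right = make_tree(pool, depth - 1);
    }
    return node;
}

// Counts the nodes reachable from the given root.
int count_nodes(Node const* node) {
    return node ? 1 + count_nodes(node->left) + count_nodes(node->right) : 0;
}
```

A caller would create the pool inside the loop, for example `std::pmr::monotonic_buffer_resource pool(pool_size(depth));`, build the tree with `make_tree(pool, depth)`, and let the pool's destructor release all nodes at once.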

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-07 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #6 from Dmitriy Ovdienko --- > looking at cache-misses counter does not make sense here Well, if you compare Rust and C++, cache-misses CPU counter differs dramatically... and page-faults too... while amount of instructions is the sa

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-07 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #4 from Dmitriy Ovdienko --- Created attachment 49190 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49190&action=edit Modified solution with custom allocator based on malloc (simplified, single threaded) Attached is a benchmar

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-07 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #3 from Dmitriy Ovdienko --- Created attachment 49189 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49189&action=edit Original implementation (simplified, single threaded) Attached is a simplified original version of the bench

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-05 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #2 from Dmitriy Ovdienko --- Created attachment 49185 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49185&action=edit Modified solution with custom allocator based on malloc

[Bug libstdc++/96942] std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-05 Thread dmitriy.ovdienko at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96942 --- Comment #1 from Dmitriy Ovdienko --- Created attachment 49184 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49184&action=edit Original implementation with preallocated buffer

[Bug libstdc++/96942] New: std::pmr::monotonic_buffer_resource causes CPU cache misses

2020-09-05 Thread dmitriy.ovdienko at gmail dot com
Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: dmitriy.ovdienko at gmail dot com Target Milestone: --- Created attachment 49183 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49183&action=edit Original implementation