Guys, I have filed a ticket [1] to improve the TracingSpi interface and then reuse it for cluster profiling. I'll assign it to myself and start working on it after the profiling tool is merged, if nobody minds.
[1] https://issues.apache.org/jira/browse/IGNITE-13850

Mon, 14 Dec 2020 at 13:38, Alexander Lapin <lapin1...@gmail.com>:
>
> Ok from my side.
> A few more details about the tracing SPI updates, based on the discussion
> with Nikolay and Nikita mentioned above.
>
> Tracing provides enough data for a performance profiling tool; actually,
> only root spans are required. However, according to Nikita,
> root-span tracing has a 7-8% performance drop, compared to the 1-2%
> performance drop of the performance profiling tool. That is the main
> reason to keep the given tool as is for now. In order to reuse TracingSpi
> for the profiling tool internals, a few modifications should be made to
> increase tracing performance:
>
> - Add support for non-string tags and log points: primitives, etc.
> - Add the ability to postpone adding span tags and log points to the very
>   end of span tree creation.
> - Probably some sort of tag caching could also help.
>
> Best regards,
> Alexander
>
> Mon, 14 Dec 2020 at 12:48, Nikolay Izhikov <nizhi...@apache.org>:
>
> > Hello, Igniters.
> >
> > We discussed this feature privately with Alexander and Nikita.
> > Here are the results we want to share with the community:
> >
> > 0. In the end, both the performance statistics tool and tracing should
> > use the same API.
> > 1. We should improve the Tracing API so that it can be used for
> > gathering information about all operations without a significant
> > performance drop.
> >
> > I propose to go as follows:
> >
> > 1. Merge the current PR as is after the final review. My intention is to
> > provide a tool for users that can be used in a real-world production
> > environment.
> > 2. Improve the Tracing API.
> > 3. Combine both tools under the same API.
> >
> > > 14 Dec 2020, at 10:42, Alexander Lapin <lapin1...@gmail.com> wrote:
> > >
> > > Hello Igniters,
> > >
> > >> The tracing causes a 52% performance drop [4] and cannot be
> > >> used for collecting statistics about all queries in production
> > >> deployments. The performance drop of the profiling tool is less than
> > >> 2% and it can be used in production. I have benchmarked the tracing
> > >> and got the results:
> > >>
> > >> -2% when configured OpenCensusTracingSpi and all scopes disabled
> > >> -52% for TX scope (IgnitePutTxBenchmark)
> > >> -58% for SQL scope (IgniteSqlQueryBenchmark)
> > >>
> > >> Such a performance drop is significant enough to rule out using the
> > >> tracing in production.
> > >>
> > > We've rerun the tracing benchmarks based on more realistic scenarios
> > > and got a 10-15% performance drop with sampling-rate 1 (all
> > > transactions were traced). "More realistic" means that we don't test
> > > tracing performance while the system is in an overdraft state, but add
> > > some micro-throttling (1 millisecond) between operations, transactions
> > > in our case.
> > >
> > > *IgnitePutTxBenchmark*
> > >
> > > Green: Case 1: NoopTracingSpi
> > > Blue: Case 2: OpenCensusTracingSpi (disabled)
> > > Red: Case 3: OpenCensusTracingSpi, --scope TX --sampling-rate 0.1
> > > Black: Case 5: *ControlCenter* + OpenCensusTracingSpi, --scope TX
> > > --sampling-rate 0.1
> > > Violet: Case 4: OpenCensusTracingSpi, --scope TX --sampling-rate 1
> > > Yellow: Case 6: ControlCenter + OpenCensusTracingSpi, --scope TX
> > > --sampling-rate
> > >
> > >> I have considered the possibility to reuse the tracing API. If
> > >> statistics collection is implemented with the TracingSpi, we
> > >> get a performance drop due to:
> > >> - Transferring tracing context over the network.
> > >> - Using ThreadLocal for spans.
> > >> - Converting primitives and objects to strings and vice versa (the
> > >>   API supports only String-based tags and values).
> > >> - Generating span objects.
> > >>
> > > @Nikita Amelchev Could you please share numbers?
> > >
> > > Best regards,
> > > Alexander
> > >
> > > Mon, 7 Dec 2020 at 17:24, Nikolay Izhikov <nizhi...@apache.org>:
> > >
> > >> Hello, Nikita.
> > >>
> > >> Makes sense.
> > >>
> > >> I will take a look.
> > >>
> > >>> 7 Dec 2020, at 15:29, Nikita Amelchev <nsamelc...@gmail.com> wrote:
> > >>>
> > >>> Hello, Igniters.
> > >>>
> > >>> I have implemented the profiling tool [1, 2]. It writes the duration
> > >>> and other parameters of user operations (scan, SQL query,
> > >>> transactions, tasks, jobs, CQ, etc.) to a local file. This info can
> > >>> be used in various cases. The main goal is to build a performance
> > >>> report to analyze the count and duration of user queries [3].
> > >>>
> > >>> We already have the tracing with similar functionality, but I think
> > >>> Ignite should have both tools: tracing and profiling.
> > >>>
> > >>> The tracing causes a 52% performance drop [4] and cannot be
> > >>> used for collecting statistics about all queries in production
> > >>> deployments. The performance drop of the profiling tool is less than
> > >>> 2% and it can be used in production. I have benchmarked the tracing
> > >>> and got the results:
> > >>>
> > >>> -2% when configured OpenCensusTracingSpi and all scopes disabled
> > >>> -52% for TX scope (IgnitePutTxBenchmark)
> > >>> -58% for SQL scope (IgniteSqlQueryBenchmark)
> > >>>
> > >>> Such a performance drop is significant enough to rule out using the
> > >>> tracing in production.
> > >>>
> > >>> I have considered the possibility to reuse the tracing API. If
> > >>> statistics collection is implemented with the TracingSpi, we
> > >>> get a performance drop due to:
> > >>> - Transferring tracing context over the network.
> > >>> - Using ThreadLocal for spans.
> > >>> - Converting primitives and objects to strings and vice versa (the
> > >>>   API supports only String-based tags and values).
> > >>> - Generating span objects.
> > >>>
> > >>> I have benchmarked both implementations on yardstick's
> > >>> IgniteGetBenchmark. Transferring the tracing context over the
> > >>> network was disabled. The results:
> > >>> - Tracing API implementation - 8% performance drop.
> > >>> - Proposed implementation - 2% performance drop.
> > >>>
> > >>> I think this is a significant drop, and the implementation that
> > >>> reuses the tracing API should not be used. The cluster profiling
> > >>> should have as little performance drop as possible to be used in
> > >>> production. The tracing will be used for detailed investigation.
> > >>>
> > >>> WDYT?
> > >>>
> > >>> The tool is ready to be reviewed [3, 5].
> > >>>
> > >>> [1] https://issues.apache.org/jira/browse/IGNITE-12666
> > >>> [2] https://cwiki.apache.org/confluence/display/IGNITE/Cluster+performance+profiling+tool
> > >>> [3] https://github.com/apache/ignite-extensions/pull/16
> > >>> [4] https://issues.apache.org/jira/secure/attachment/13016636/Tracing%20benchmarks.docx
> > >>> [5] https://github.com/apache/ignite/pull/7693
> > >>>
> > >>> Wed, 24 Jun 2020 at 23:31, Saikat Maitra <saikat.mai...@gmail.com>:
> > >>>>
> > >>>> Hi Nikita,
> > >>>>
> > >>>> The changes in this PR look good.
> > >>>>
> > >>>> https://github.com/apache/ignite-extensions/pull/16
> > >>>>
> > >>>> Regards,
> > >>>> Saikat
> > >>>>
> > >>>> On Mon, Jun 22, 2020 at 12:03 PM Nikolay Izhikov
> > >>>> <nizhi...@apache.org> wrote:
> > >>>>
> > >>>>> Hello, Igniters.
> > >>>>>
> > >>>>> I think that inside Ignite core we should name this feature
> > >>>>> «performance statistics».
> > >>>>> We already have «cache statistics».
> > >>>>> Data that is collected by performance statistics can be used not
> > >>>>> only for profiling but also to solve other tasks.
> > >>>>>
> > >>>>>> 22 June 2020, at 14:00, Nikita Amelchev <nsamelc...@gmail.com> wrote:
> > >>>>>>
> > >>>>>> Hi, guys.
> > >>>>>>
> > >>>>>> I have mentioned the components under the MIT license in the
> > >>>>>> LICENSE file.
> > >>>>>>
> > >>>>>> Saikat, I have fixed the PR according to your suggestions. Thanks
> > >>>>>> for taking a look.
> > >>>
> > >>> --
> > >>> Best wishes,
> > >>> Amelchev Nikita

--
Best wishes,
Amelchev Nikita
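
For illustration only, two of the tracing improvements discussed in this thread (non-string tags and postponing tag conversion until the span is finished) could look roughly like the sketch below. The class DeferredTagSpan and its methods are hypothetical names invented for this example; they are not part of TracingSpi or any existing Ignite API, and the real design is up to IGNITE-13850.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical span that keeps numeric tags as primitives and defers
 * string conversion until the span ends. Not part of the Ignite API.
 */
final class DeferredTagSpan {
    private final String name;

    // Parallel arrays avoid boxing and per-tag String allocation on the hot path.
    private String[] keys = new String[4];
    private long[] vals = new long[4];
    private int size;

    DeferredTagSpan(String name) {
        this.name = name;
    }

    /** Records a numeric tag without boxing or string conversion. */
    DeferredTagSpan addTag(String key, long val) {
        if (size == keys.length) {
            // Grow both arrays together when the initial capacity is exhausted.
            keys = Arrays.copyOf(keys, size * 2);
            vals = Arrays.copyOf(vals, size * 2);
        }
        keys[size] = key;
        vals[size] = val;
        size++;
        return this;
    }

    String name() {
        return name;
    }

    /** Ends the span: only here are the tags converted to their String form. */
    Map<String, String> end() {
        Map<String, String> tags = new LinkedHashMap<>();
        for (int i = 0; i < size; i++)
            tags.put(keys[i], Long.toString(vals[i]));
        return tags;
    }
}
```

A caller would write something like new DeferredTagSpan("tx.commit").addTag("duration.nanos", 1500L).end(), so the String conversion cost is paid once per finished span rather than on every tag. Whether this narrows the gap between the 8% and 2% figures reported above would have to be measured.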