Clarification: point 5 is about buffer pool and disk access statistics, not network.
On Tue, Nov 27, 2018 at 6:02 PM Vladimir Ozerov <voze...@gridgain.com> wrote:

> Alexey,
>
> I would say that to implement this feature successfully, we first need to
> clearly understand the specific scenarios we want to target, and only then
> plan the implementation in small iterations. Please take into account the
> following points:
>
> 1) Splitting by message type might be too fine-grained for the first
> iteration. For example, we have N different message types processed in the
> system pool. Add several messages causing long-running tasks (e.g. invoke),
> and dequeue time will grow for all other message types in the same queue. I
> think we can start with simpler metrics - per-pool queueing time, pool
> throughput, pool task latencies. This would be enough for a wide range of
> use cases. We may expand it in the future if needed.
> 2) Some IO statistics were already implemented as part of IGNITE-6868 [1].
> See TcpCommunicationSpiMBeanImpl. You may find it useful for your task.
> 3) The public API should be designed upfront. Normally it should include
> JMX and system SQL views (e.g. like [2]). JMX is useful for external tools;
> SQL views allow access to performance data from all platforms at once. This
> is critical.
> 4) It should be possible to enable/disable such metrics at runtime.
> Probably the infrastructure from IGNITE-369 [3] could be reused for this
> purpose (see GridCacheProcessor.EnableStatisticsFuture).
> 5) Page IO access statistics are already under development [4]. The first
> part will be merged very soon.
> 6) Performance effects must be measured extremely carefully in all modes
> (in-memory, persistence, background/log_only/fsync), because you are likely
> to change very performance-sensitive code pieces. We had a lot of
> performance issues when implementing IGNITE-6868 [1].
>
> Vladimir.
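[Editorial aside: the per-pool metrics suggested in point 1 (queueing time, throughput, task latency) can be sketched by wrapping tasks submitted to an executor. This is a minimal illustration against a plain java.util.concurrent pool; the class and field names are hypothetical, not Ignite API.]

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: per-pool queue-wait and run-time accounting. */
public class PoolMetricsSketch {
    public static final AtomicLong totalQueueWaitNanos = new AtomicLong();
    public static final AtomicLong totalRunNanos = new AtomicLong();
    public static final AtomicLong completedTasks = new AtomicLong();

    /** Wraps a task so enqueue-to-start (queue wait) and start-to-finish (run) times are recorded. */
    public static Runnable instrument(Runnable task) {
        long enqueuedAt = System.nanoTime(); // captured at submit time
        return () -> {
            long startedAt = System.nanoTime();
            totalQueueWaitNanos.addAndGet(startedAt - enqueuedAt);
            try {
                task.run();
            }
            finally {
                totalRunNanos.addAndGet(System.nanoTime() - startedAt);
                completedTasks.incrementAndGet();
            }
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();

        for (int i = 0; i < 10; i++)
            pool.submit(instrument(() -> { /* simulated message processing */ }));

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);

        System.out.println("tasks=" + completedTasks.get());
    }
}
```

In real code the wrapping would live inside the pool itself (e.g. ThreadPoolExecutor's beforeExecute/afterExecute hooks) so callers need no changes.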
> [1] https://issues.apache.org/jira/browse/IGNITE-6868
> [2] https://issues.apache.org/jira/browse/IGNITE-7700
> [3] https://issues.apache.org/jira/browse/IGNITE-369
> [4] https://cwiki.apache.org/confluence/display/IGNITE/IEP-27%3A+Page+IO+statistics
>
> On Tue, Nov 27, 2018 at 5:04 PM Alexey Kukushkin <kukushkinale...@gmail.com> wrote:
>
>> Hi Alexei,
>>
>> Did you consider general-purpose APM
>> <https://en.wikipedia.org/wiki/Application_performance_management> tools
>> like the free InspectIT <https://github.com/inspectIT/inspectIT> or the
>> commercial DynaTrace <https://www.dynatrace.com/> or AppDynamics
>> <https://www.appdynamics.com/>?
>>
>> Java APM tools do not require writing code or even instrumenting Ignite
>> binaries: you just attach them as a javaagent
>> <https://zeroturnaround.com/rebellabs/how-to-inspect-classes-in-your-jvm/>
>> to the Ignite JVM and then configure "sensors" to track whatever API you
>> want. APM tools collect metrics from the sensors and provide sophisticated
>> analysis tools.
>>
>> DynaTrace claims 2%-7% overhead (depending on the application), but you
>> can always detach the tool if you do not need it in production all the
>> time.
>>
>> I have not tried APM for Ignite myself, but it might work well.
>>
>> On Tue, Nov 27, 2018 at 4:37 PM Alexei Scherbakov <alexey.scherbak...@gmail.com> wrote:
>>
>> > Igniters,
>> >
>> > At work I often have to solve performance issues with an Ignite cluster
>> > without having access to the source code of the running user application.
>> >
>> > Ignite seems to have limited capabilities for identifying bottlenecks
>> > without extensive profiling on the server and client side (JFR
>> > recordings, sampling profilers, regular thread dumps, etc.), which is
>> > not always possible.
>> >
>> > Even when profiling data is available, it is not always helpful for
>> > determining several types of bottlenecks, for example, whether there is
>> > contention on a single key/partition.
>> >
>> > I propose to implement a new feature: lightweight profiling of message
>> > processing.
>> >
>> > The feature will provide a view of message processing statistics for
>> > each node and for all grid nodes.
>> >
>> > In short, it is necessary to track each message execution in the
>> > executors and record execution statistics such as synchronous execution
>> > time in the executor thread and waiting time in the queue.
>> >
>> > Full description:
>> >
>> > 1. Implement detailed tracking of message waiting in the queue and of
>> > actual processing by the executors, split into several time bins.
>> > Example of detailed statistics for each processed message:
>> >
>> > Processing time (message count per bin)
>> > Message                                       Total  Avg(ms)       <1ms     <10ms  <30ms  <50ms  <100ms  <250ms  <500ms  <750ms  <1000ms  >1000ms
>> > -----------------------------------------------------------------------------------------------------------------------------------------------
>> > GridNearSingleGetRequest                  904311372    0.023  904240521     57394   7242   3961    1932     229      61      24        4        4
>> > GridNearSingleGetResponse                 340134416    0.041  340118791     11660   1167    729     901    1001     158       8        1        0
>> > GridNearLockRequest                        77088689    0.079   77073458     11945   2299    643     311      31       2       0        0        0
>> > GridNearAtomicSingleUpdateInvokeRequest    39645752    0.298   39580914     28222   6469   4638    9870   13414    2087     137        1        0
>> > GridDhtAtomicSingleUpdateRequest           37636829    0.277   37579375     23247   5915   4210    8954   12917    2048     162        1        0
>> > GridDhtAtomicDeferredUpdateResponse        33580198    0.002   33579805       337     51      3       1       1       0       0        0        0
>> > GridNearTxPrepareRequest                   21667790    0.238   21078069    580126   1622   1261    2531    3631     496      40        0       14
>> > GridDhtTxPrepareResponse                   20949873    0.316   17130161   3803105   4615   3162    4489    3721     577      34        1        8
>> > GridDhtTxPrepareRequest                    20949838    0.501   16158732   4750217  16183   5735    8472    8994    1353      88       11       53
>> > GridDhtTxFinishResponse                    13835065    0.007   13832519      2476     27     28      14       1       0       0        0        0
>> > GridDhtTxFinishRequest                     13835028    0.547   12084106   1736789   8971   2340    1792     807     118      41        4       60
>> > GridNearTxFinishRequest                    13762197    0.725   11811828   1942499   4441   1400    1201     524      89      34       19      162
>> > GridDhtAtomicNearResponse                   2784422    0.122    2783393      1022      5      2       0       0       0       0        0        0
>> > GridNearGetRequest                          2360483    0.484    2345937     14129    244    101      64       8       0       0        0        0
>> > GridNearGetResponse                         1984243    0.054    1981905      2327      8      1       1       1       0       0        0        0
>> > GridNearTxPrepareResponse                    192856    0.153     192660       188      1      5       1       1       0       0        0        0
>> > GridNearLockResponse                         192780    0.091     192667       107      3      0       3       0       0       0        0        0
>> > GridNearTxFinishResponse                        177    0.822        129        47      1      0       0       0       0       0        0        0
>> > GridNearAtomicSingleUpdateRequest               124    4.803         52        53     19      0       0       0       0       0        0        0
>> > GridNearAtomicUpdateResponse                    120    0.448        110        10      0      0       0       0       0       0        0        0
>> > -----------------------------------------------------------------------------------------------------------------------------------------------
>> > Total                                    1544912252          1531765132  12965900  59283  28219   40537   45281    6989     568       42      301
>> >
>> > Queue waiting time (message count per bin)
>> > Message                                       Total  Avg(ms)       <1ms     <10ms  <30ms  <50ms  <100ms  <250ms  <500ms  <750ms  <1000ms  >1000ms
>> > -----------------------------------------------------------------------------------------------------------------------------------------------
>> > GridNearSingleGetRequest                  904311372    0.023  903130400   1051499  33248  18049   34044   38809    4714     212        8      389
>> > GridNearSingleGetResponse                 340134416    0.041  339708055    419479   3732   1004    1222     737     117       2        1       67
>> > GridNearLockRequest                        77088689    0.079   76949792    129666   2003   1624    2585    2531     390      18        2       78
>> > GridNearAtomicSingleUpdateInvokeRequest    39645752    0.298   39572801     53913   3949   2960    5487    5583     963      45        5       46
>> > GridDhtAtomicSingleUpdateRequest           37636829    0.277   37561043     57294   3721   2736    5339    5716     879      54        6       41
>> > GridDhtAtomicDeferredUpdateResponse        33580198    0.002   33512734     45873   4427   3340    6342    6363    1033      50       10       26
>> > GridNearTxPrepareRequest                   21667790    0.238   21584416     76283   1681   1026    1948    2027     330      31        2       46
>> > GridDhtTxPrepareResponse                   20949873    0.316   20862632     83578   1214    633     821     817     118      26        8       26
>> > GridDhtTxPrepareRequest                    20949838    0.501   20867551     72264   2265   1516    2773    2905     479      38        6       41
>> > GridDhtTxFinishResponse                    13835065    0.007   13808274     26487    150     68      55      21       3       1        0        6
>> > GridDhtTxFinishRequest                     13835028    0.547   13809380     25142    188    127     102      59       9       2        4       15
>> > GridNearTxFinishRequest                    13762197    0.725   13743925     17365    241    171     278     191      16       4        0        6
>> > GridDhtAtomicNearResponse                   2784422    0.122    2780600      3767     29      9       7       9       1       0        0        0
>> > GridNearGetRequest                          2360483    0.484    2355963      3762    194    114     184     228      38       0        0        0
>> > GridNearGetResponse                         1984243    0.054    1979404      4721     31      9      49      28       1       0        0        0
>> > GridNearTxPrepareResponse                    192856    0.153     192852         2      0      1       1       0       0       0        0        0
>> > GridNearLockResponse                         192780    0.091     192777         3      0      0       0       0       0       0        0        0
>> > GridNearTxFinishResponse                        177    0.822        177         0      0      0       0       0       0       0        0        0
>> > GridNearAtomicSingleUpdateRequest               124    4.803        124         0      0      0       0       0       0       0        0        0
>> > GridNearAtomicUpdateResponse                    120    0.448        119         1      0      0       0       0       0       0        0        0
>> > -----------------------------------------------------------------------------------------------------------------------------------------------
>> > Total                                    1544912252          1542613019   2071099  57073  33387   61237   66024    9091     483       52      787
>> >
>> > This will let us understand the cluster load type and spot problems
>> > with increased latency in message processing. In general, all messages
>> > should be processed in under 30 ms.
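[Editorial aside: the time bins in the tables above (< 1 ms up to > 1000 ms) amount to a small fixed histogram. A minimal sketch of the bucketing logic, with hypothetical names, not the actual IGNITE-10418 code:]

```java
/** Hypothetical sketch of the fixed latency bins used in the statistics tables. */
public class TimeBins {
    /** Upper bounds in milliseconds for bins 0..8; bin 9 is "> 1000 ms". */
    static final long[] BOUNDS_MS = {1, 10, 30, 50, 100, 250, 500, 750, 1000};

    /** Returns the bin index (0..9) for a duration in milliseconds. */
    public static int binFor(long durationMs) {
        for (int i = 0; i < BOUNDS_MS.length; i++)
            if (durationMs < BOUNDS_MS[i])
                return i;

        return BOUNDS_MS.length; // Overflow bin: > 1000 ms.
    }
}
```

Per message type, one long[10] counter array per node is then enough to produce a table row, which keeps the hot-path cost to one compare-and-increment.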
>> >
>> > 2. Log messages that have waited or been processed for too long
>> > (against a configurable threshold).
>> >
>> > Track the following: message enqueue time, time spent waiting in the
>> > queue until processing starts, the head of the queue before waiting
>> > starts, the queue size before waiting starts, total waiting time, total
>> > processing time, the queue size after processing, and the message
>> > content. Example:
>> >
>> > Slow message: *enqueueTs*=2018-11-27 15:10:22.241, *waitTime*=0.048,
>> > *procTime*=305.186, *messageId*=3a3064a9, *queueSzBefore*=0,
>> > *headMessageId*=null, *queueSzAfter*=0, *message*=GridNearTxFinishRequest
>> > [miniId=1, mvccSnapshot=null, super=GridDistributedTxFinishRequest
>> > [topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0],
>> > futId=199a3155761-f379f312-ad4b-4181-acc5-0aacb3391f07, threadId=296,
>> > commitVer=null, invalidate=false, commit=true, baseVer=null, txSize=0,
>> > sys=false, plc=2, subjId=dda703a0-69ee-47cf-9b9a-bf3dc9309feb,
>> > taskNameHash=0, flags=32, syncMode=FULL_SYNC, txState=IgniteTxStateImpl
>> > [activeCacheIds=[644280847], recovery=false, mvccEnabled=false,
>> > txMap=HashSet [IgniteTxEntry [key=KeyCacheObjectImpl [part=8, val=8,
>> > hasValBytes=true], cacheId=644280847, txKey=IgniteTxKey
>> > [key=KeyCacheObjectImpl [part=8, val=8, hasValBytes=true],
>> > cacheId=644280847], val=[op=READ, val=null], prevVal=[op=NOOP, val=null],
>> > oldVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1,
>> > conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null,
>> > filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=false,
>> > entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=8, val=8,
>> > hasValBytes=true], val=null, ver=GridCacheVersion [topVer=0, order=0,
>> > nodeOrder=0], hash=8, extras=GridCacheObsoleteEntryExtras
>> > [obsoleteVer=GridCacheVersion [topVer=2147483647, order=0, nodeOrder=0]],
>> > flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry
>> > [rdrs=ReaderId[] [], part=8, super=], prepared=0, locked=false,
>> > nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false,
>> > flags=0, partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion
>> > [topVer=154800617, order=1543320621588, nodeOrder=1]], IgniteTxEntry
>> > [key=KeyCacheObjectImpl [part=11, val=11, hasValBytes=true],
>> > cacheId=644280847, txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=11,
>> > val=11, hasValBytes=true], cacheId=644280847], val=[op=READ, val=null],
>> > prevVal=[op=NOOP, val=null], oldVal=[op=NOOP, val=null],
>> > entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null,
>> > explicitVer=null, dhtVer=null, filters=CacheEntryPredicate[] [],
>> > filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry
>> > [key=KeyCacheObjectImpl [part=11, val=11, hasValBytes=true], val=null,
>> > ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=11,
>> > extras=GridCacheObsoleteEntryExtras [obsoleteVer=GridCacheVersion
>> > [topVer=2147483647, order=0, nodeOrder=0]],
>> > flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry
>> > [rdrs=ReaderId[] [], part=11, super=], prepared=0, locked=false,
>> > nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false,
>> > flags=0, partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion
>> > [topVer=154800617, order=1543320621588, nodeOrder=1]]]],
>> > super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=154800617,
>> > order=1543320621587, nodeOrder=4], committedVers=null,
>> > rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=0]]]]
>> >
>> > This will let us understand the type of activity that produces latency
>> > issues and possibly identify the culprit code.
>> >
>> > 3. Track all synchronous waits during message processing: checkpoint
>> > locks, segment locks, page locks, cold page reads from disk, etc.
>> >
>> > 4. Allow extensions that collect additional statistics for a specific
>> > message type.
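[Editorial aside: the slow-message log line in point 2 boils down to a threshold check on wait and processing time plus a formatted record. A minimal sketch, with a hypothetical threshold and class name, not the actual patch:]

```java
import java.util.Locale;

/** Hypothetical sketch: detect and format messages exceeding a latency threshold. */
public class SlowMessageLog {
    /** Assumed configurable threshold; 100 ms is an arbitrary example value. */
    static final double THRESHOLD_MS = 100.0;

    /** A message is "slow" if either its queue wait or its processing time exceeds the threshold. */
    public static boolean isSlow(double waitTimeMs, double procTimeMs) {
        return waitTimeMs > THRESHOLD_MS || procTimeMs > THRESHOLD_MS;
    }

    /** Formats a record in the shape of the example log line above. */
    public static String format(String enqueueTs, double waitMs, double procMs,
        String messageId, int queueSzBefore, int queueSzAfter, String message) {
        return String.format(Locale.US,
            "Slow message: enqueueTs=%s, waitTime=%.3f, procTime=%.3f, messageId=%s, "
                + "queueSzBefore=%d, queueSzAfter=%d, message=%s",
            enqueueTs, waitMs, procMs, messageId, queueSzBefore, queueSzAfter, message);
    }
}
```

In the example above, waitTime=0.048 is below any sensible threshold while procTime=305.186 is not, so the message is logged for its processing time.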
For example, for *TxFinishRequest we could track the total number of
>> > entries written for each cache and their size in bytes. For
>> > GridNearLockRequest we could track the keys with the longest lock
>> > queues to identify contention.
>> >
>> > 5. Implement additional logging, JMX metrics and a control.sh command
>> > for printing the currently collected profiling data.
>> >
>> > 6. The implementation should be as lightweight as possible.
>> >
>> > Maybe something else.
>> >
>> > BTW, I have already started working on this and want to contribute my
>> > work to the community. Ticket with a (not yet finished) PR [1]; check
>> > CacheMessageStatsTest and CacheMessageStatsIndexingTest.
>> >
>> > [1] https://issues.apache.org/jira/browse/IGNITE-10418
>> >
>> > Thoughts?
>> >
>> > --
>> > Best regards,
>> > Alexei Scherbakov
>>
>> --
>> Best regards,
>> Alexey