Igniters,

At work I often have to solve performance issues with an Ignite cluster without having access to the source code of the running user application.
It looks like Ignite has limited capabilities for identifying bottlenecks without extensive profiling on the server and client side (JFR recordings, sampling profilers, regular thread dumps, etc.), which is not always possible. Even when profiling data is available, it is not always helpful for determining certain types of bottlenecks, for example, contention on a single key/partition.

I propose to implement a new feature: lightweight profiling of message processing. The feature will provide a view of message processing statistics for each node and for all grid nodes. In short, it is necessary to track each message's execution in the executors and record execution statistics such as synchronous execution time in the executor thread and waiting time in the queue.

Full description:

1. Implement detailed tracking of message waiting in the queue and of actual processing by executors, split into several time bins. Example of detailed statistics for each processed message type (the bin columns are counts of messages falling into each latency range; a minimal sketch of such a histogram follows the tables):

Processing time

Message                                        Total  Average(ms)      < 1 ms     < 10 ms     < 30 ms     < 50 ms    < 100 ms    < 250 ms    < 500 ms    < 750 ms   < 1000 ms   > 1000 ms
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GridNearSingleGetRequest                   904311372     0.023000   904240521       57394        7242        3961        1932         229          61          24           4           4
GridNearSingleGetResponse                  340134416     0.041000   340118791       11660        1167         729         901        1001         158           8           1           0
GridNearLockRequest                         77088689     0.079000    77073458       11945        2299         643         311          31           2           0           0           0
GridNearAtomicSingleUpdateInvokeRequest     39645752     0.298000    39580914       28222        6469        4638        9870       13414        2087         137           1           0
GridDhtAtomicSingleUpdateRequest            37636829     0.277000    37579375       23247        5915        4210        8954       12917        2048         162           1           0
GridDhtAtomicDeferredUpdateResponse         33580198     0.002000    33579805         337          51           3           1           1           0           0           0           0
GridNearTxPrepareRequest                    21667790     0.238000    21078069      580126        1622        1261        2531        3631         496          40           0          14
GridDhtTxPrepareResponse                    20949873     0.316000    17130161     3803105        4615        3162        4489        3721         577          34           1           8
GridDhtTxPrepareRequest                     20949838     0.501000    16158732     4750217       16183        5735        8472        8994        1353          88          11          53
GridDhtTxFinishResponse                     13835065     0.007000    13832519        2476          27          28          14           1           0           0           0           0
GridDhtTxFinishRequest                      13835028     0.547000    12084106     1736789        8971        2340        1792         807         118          41           4          60
GridNearTxFinishRequest                     13762197     0.725000    11811828     1942499        4441        1400        1201         524          89          34          19         162
GridDhtAtomicNearResponse                    2784422     0.122000     2783393        1022           5           2           0           0           0           0           0           0
GridNearGetRequest                           2360483     0.484000     2345937       14129         244         101          64           8           0           0           0           0
GridNearGetResponse                          1984243     0.054000     1981905        2327           8           1           1           1           0           0           0           0
GridNearTxPrepareResponse                     192856     0.153000      192660         188           1           5           1           1           0           0           0           0
GridNearLockResponse                          192780     0.091000      192667         107           3           0           3           0           0           0           0           0
GridNearTxFinishResponse                         177     0.822000         129          47           1           0           0           0           0           0           0           0
GridNearAtomicSingleUpdateRequest                124     4.803000          52          53          19           0           0           0           0           0           0           0
GridNearAtomicUpdateResponse                     120     0.448000         110          10           0           0           0           0           0           0           0           0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                          1544912252                1531765132    12965900       59283       28219       40537       45281        6989         568          42         301

Queue waiting time

Message                                        Total  Average(ms)      < 1 ms     < 10 ms     < 30 ms     < 50 ms    < 100 ms    < 250 ms    < 500 ms    < 750 ms   < 1000 ms   > 1000 ms
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GridNearSingleGetRequest                   904311372     0.023000   903130400     1051499       33248       18049       34044       38809        4714         212           8         389
GridNearSingleGetResponse                  340134416     0.041000   339708055      419479        3732        1004        1222         737         117           2           1          67
GridNearLockRequest                         77088689     0.079000    76949792      129666        2003        1624        2585        2531         390          18           2          78
GridNearAtomicSingleUpdateInvokeRequest     39645752     0.298000    39572801       53913        3949        2960        5487        5583         963          45           5          46
GridDhtAtomicSingleUpdateRequest            37636829     0.277000    37561043       57294        3721        2736        5339        5716         879          54           6          41
GridDhtAtomicDeferredUpdateResponse         33580198     0.002000    33512734       45873        4427        3340        6342        6363        1033          50          10          26
GridNearTxPrepareRequest                    21667790     0.238000    21584416       76283        1681        1026        1948        2027         330          31           2          46
GridDhtTxPrepareResponse                    20949873     0.316000    20862632       83578        1214         633         821         817         118          26           8          26
GridDhtTxPrepareRequest                     20949838     0.501000    20867551       72264        2265        1516        2773        2905         479          38           6          41
GridDhtTxFinishResponse                     13835065     0.007000    13808274       26487         150          68          55          21           3           1           0           6
GridDhtTxFinishRequest                      13835028     0.547000    13809380       25142         188         127         102          59           9           2           4          15
GridNearTxFinishRequest                     13762197     0.725000    13743925       17365         241         171         278         191          16           4           0           6
GridDhtAtomicNearResponse                    2784422     0.122000     2780600        3767          29           9           7           9           1           0           0           0
GridNearGetRequest                           2360483     0.484000     2355963        3762         194         114         184         228          38           0           0           0
GridNearGetResponse                          1984243     0.054000     1979404        4721          31           9          49          28           1           0           0           0
GridNearTxPrepareResponse                     192856     0.153000      192852           2           0           1           1           0           0           0           0           0
GridNearLockResponse                          192780     0.091000      192777           3           0           0           0           0           0           0           0           0
GridNearTxFinishResponse                         177     0.822000         177           0           0           0           0           0           0           0           0           0
GridNearAtomicSingleUpdateRequest                124     4.803000         124           0           0           0           0           0           0           0           0           0
GridNearAtomicUpdateResponse                     120     0.448000         119           1           0           0           0           0           0           0           0           0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                          1544912252                1542613019     2071099       57073       33387       61237       66024        9091         483          52         787

This will make it possible to understand the cluster load type and to see whether there are problems with increased latency in message processing. In general, all messages should be processed in under 30 ms.
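To give an idea of how cheap such per-message accounting can be, here is a minimal sketch of a lock-free per-message-type histogram using the same time bins as the report above. All names here are illustrative only, not the ones used in the PR:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

/** Sketch of a per-message-type time histogram (illustrative, not the PR code). */
class MessageTimeHistogram {
    /** Upper bucket bounds in milliseconds; one extra implicit bucket holds "> 1000 ms". */
    private static final long[] BOUNDS = {1, 10, 30, 50, 100, 250, 500, 750, 1000};

    /** Per-message-type bucket counters. */
    private final ConcurrentMap<Class<?>, LongAdder[]> histograms = new ConcurrentHashMap<>();

    /** Records one measured duration (queue waiting or processing time) for a message type. */
    void record(Class<?> msgType, long durationNanos) {
        LongAdder[] buckets = histograms.computeIfAbsent(msgType, t -> {
            LongAdder[] arr = new LongAdder[BOUNDS.length + 1];

            for (int i = 0; i < arr.length; i++)
                arr[i] = new LongAdder();

            return arr;
        });

        long millis = durationNanos / 1_000_000;

        int idx = 0;

        // Find the first bucket whose upper bound is above the measured duration.
        while (idx < BOUNDS.length && millis >= BOUNDS[idx])
            idx++;

        // LongAdder keeps the hot path cheap under contention (see item 6).
        buckets[idx].increment();
    }
}

The executor would call record() twice per message, once with the measured queue waiting time and once with the synchronous processing time, which is enough to build both tables above.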
2. Log messages that wait in the queue too long or take too long to process, against a configurable threshold. Track the following: message enqueue time, waiting time in the queue until processing starts, the head of the queue before starting to wait, the queue size before starting to wait, total waiting time, total processing time, queue size after processing, and the message content. Example:

Slow message: *enqueueTs*=2018-11-27 15:10:22.241, *waitTime*=0.048, *procTime*=305.186, *messageId*=3a3064a9, *queueSzBefore*=0, *headMessageId*=null, *queueSzAfter*=0, *message*=GridNearTxFinishRequest [miniId=1, mvccSnapshot=null, super=GridDistributedTxFinishRequest [topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], futId=199a3155761-f379f312-ad4b-4181-acc5-0aacb3391f07, threadId=296, commitVer=null, invalidate=false, commit=true, baseVer=null, txSize=0, sys=false, plc=2, subjId=dda703a0-69ee-47cf-9b9a-bf3dc9309feb, taskNameHash=0, flags=32, syncMode=FULL_SYNC, txState=IgniteTxStateImpl [activeCacheIds=[644280847], recovery=false, mvccEnabled=false, txMap=HashSet [IgniteTxEntry [key=KeyCacheObjectImpl [part=8, val=8, hasValBytes=true], cacheId=644280847, txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=8, val=8, hasValBytes=true], cacheId=644280847], val=[op=READ, val=null], prevVal=[op=NOOP, val=null], oldVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=8, val=8, hasValBytes=true], val=null, ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=8, extras=GridCacheObsoleteEntryExtras [obsoleteVer=GridCacheVersion [topVer=2147483647, order=0, nodeOrder=0]], flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry [rdrs=ReaderId[] [], part=8, super=], prepared=0, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion [topVer=154800617, order=1543320621588, nodeOrder=1]], IgniteTxEntry [key=KeyCacheObjectImpl [part=11, val=11, hasValBytes=true], cacheId=644280847, txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=11, val=11, hasValBytes=true], cacheId=644280847], val=[op=READ, val=null], prevVal=[op=NOOP, val=null], oldVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=CacheEntryPredicate[] [], filtersPassed=false, filtersSet=false, entry=GridCacheMapEntry [key=KeyCacheObjectImpl [part=11, val=11, hasValBytes=true], val=null, ver=GridCacheVersion [topVer=0, order=0, nodeOrder=0], hash=11, extras=GridCacheObsoleteEntryExtras [obsoleteVer=GridCacheVersion [topVer=2147483647, order=0, nodeOrder=0]], flags=2]GridDistributedCacheEntry [super=]GridDhtCacheEntry [rdrs=ReaderId[] [], part=11, super=], prepared=0, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=0, partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion [topVer=154800617, order=1543320621588, nodeOrder=1]]]], super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=154800617, order=1543320621587, nodeOrder=4], committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=0]]]]

This will make it possible to understand the type of activity that produces latency issues and possibly to identify the culprit code. (A sketch of the detection logic follows after item 4.)

3. Track all synchronous waits that occur while processing a message: checkpoint locks, segment locks, page locks, cold page reads from disk, etc.

4. Allow extensions that collect additional statistics for specific message types (one possible shape of such an extension point is sketched below). For example, for *TxFinishRequest we could track the total number of entries written per cache and their size in bytes; for GridNearLockRequest we could track the keys with the maximum lock queue length to identify contention.
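As promised under item 2, below is a minimal sketch of the detection logic, assuming the executor can snapshot the queue at enqueue time. The class and field names are mine, and the log format only approximates the example above:

import java.util.Queue;

/** Sketch of slow-message detection (illustrative, not the PR code). */
class SlowMessageDetector {
    /** Reporting threshold in milliseconds (configurable in the proposal). */
    private final long thresholdMillis;

    SlowMessageDetector(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    /** State captured when a message is enqueued. */
    static class Enqueued {
        final Object msg;
        final long enqueueTs = System.currentTimeMillis();
        final long enqueueNanos = System.nanoTime();
        final Object headMsg;      // Head of the queue before starting to wait.
        final int queueSzBefore;   // Queue size before starting to wait.

        Enqueued(Object msg, Queue<?> queue) {
            this.msg = msg;
            this.headMsg = queue.peek();
            this.queueSzBefore = queue.size();
        }
    }

    /** Wraps actual processing in the executor thread and reports slow messages. */
    void process(Enqueued e, Runnable handler, Queue<?> queue) {
        long startNanos = System.nanoTime();
        long waitNanos = startNanos - e.enqueueNanos;

        handler.run();

        long procNanos = System.nanoTime() - startNanos;

        if ((waitNanos + procNanos) / 1_000_000 >= thresholdMillis) {
            System.out.printf("Slow message: enqueueTs=%tF %<tT.%<tL, waitTime=%.3f, " +
                    "procTime=%.3f, queueSzBefore=%d, headMessage=%s, queueSzAfter=%d, message=%s%n",
                e.enqueueTs, waitNanos / 1e6, procNanos / 1e6,
                e.queueSzBefore, e.headMsg, queue.size(), e.msg);
        }
    }
}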
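And one possible shape for the extension point from item 4; again, this interface is purely illustrative and does not reflect the actual PR API:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical extension contributing extra statistics for one message type. */
interface MessageStatsExtension<M> {
    /** Invoked after a message of the registered type has been processed. */
    void onProcessed(M msg, long queueWaitNanos, long procNanos);
}

/** Registry dispatching processed messages to type-specific extensions. */
class MessageStatsExtensions {
    private final ConcurrentMap<Class<?>, MessageStatsExtension<?>> exts =
        new ConcurrentHashMap<>();

    <M> void register(Class<M> type, MessageStatsExtension<M> ext) {
        exts.put(type, ext);
    }

    @SuppressWarnings("unchecked")
    <M> void notifyProcessed(M msg, long queueWaitNanos, long procNanos) {
        MessageStatsExtension<M> ext = (MessageStatsExtension<M>)exts.get(msg.getClass());

        if (ext != null)
            ext.onProcessed(msg, queueWaitNanos, procNanos);
    }
}

A *TxFinishRequest extension would then aggregate written entries per cache inside onProcessed(), and a GridNearLockRequest extension could maintain per-key maxima of the lock queue length.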
5. Implement additional logging, JMX metrics, and a control.sh command for printing the currently collected profiling data.

6. Keep the implementation as lightweight as possible.

Maybe something else.

BTW, I have already started working on this and want to contribute my work to the community. The ticket with the (not yet finished) PR is [1]; see CacheMessageStatsTest and CacheMessageStatsIndexingTest there.

[1] https://issues.apache.org/jira/browse/IGNITE-10418

Thoughts?

--
Best regards,
Alexei Scherbakov