[ 
https://issues.apache.org/jira/browse/HDDS-11345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17915830#comment-17915830
 ] 

Aswin Shakil edited comment on HDDS-11345 at 1/22/25 8:37 PM:
--------------------------------------------------------------

The metrics for reconciliation tasks are already available in the 
*ReplicationSupervisor* class, which includes:
 * numRequestedContainerReconciliations - Number of requested reconciliation tasks
 * numQueuedContainerReconciliations - Number of queued tasks
 * numTimeoutContainerReconciliations - Number of timed-out tasks
 * numSuccessContainerReconciliations - Number of successful tasks
 * numFailureContainerReconciliations - Number of failed tasks
 * numSkippedContainerReconciliations - Number of skipped tasks
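
As a rough illustration of what a test over these counters could check, here is a minimal sketch. The class ReconciliationCounters, its fields, and its methods are hypothetical stand-ins named after the metrics above, not the ReplicationSupervisor API. It also assumes that a requested task stays queued until it finishes and that each task ends in exactly one outcome.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the reconciliation counters; NOT the ReplicationSupervisor API.
class ReconciliationCounters {
  enum Outcome { SUCCESS, FAILURE, TIMEOUT, SKIPPED }

  final AtomicLong numRequested = new AtomicLong();
  final AtomicLong numQueued = new AtomicLong();
  final AtomicLong numTimeout = new AtomicLong();
  final AtomicLong numSuccess = new AtomicLong();
  final AtomicLong numFailure = new AtomicLong();
  final AtomicLong numSkipped = new AtomicLong();

  // Assumption: a requested task is counted once and sits in the queue until it finishes.
  void request() {
    numRequested.incrementAndGet();
    numQueued.incrementAndGet();
  }

  // Assumption: every task ends in exactly one outcome and then leaves the queue.
  void complete(Outcome outcome) {
    numQueued.decrementAndGet();
    switch (outcome) {
      case SUCCESS: numSuccess.incrementAndGet(); break;
      case FAILURE: numFailure.incrementAndGet(); break;
      case TIMEOUT: numTimeout.incrementAndGet(); break;
      case SKIPPED: numSkipped.incrementAndGet(); break;
    }
  }

  // Sum of all terminal outcomes recorded so far.
  long finished() {
    return numSuccess.get() + numFailure.get() + numTimeout.get() + numSkipped.get();
  }
}
```

Under these assumptions, a test can assert that finished() plus the queued count always equals the requested count.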
 
Latency and count metrics for the tasks are exposed by *CommandHandlerMetrics* for 
*ReconcileContainerCommandHandler*:
* TotalRunTimeMs - The total run time of the command handler in milliseconds
* AvgRunTimeMs - The average run time of the command handler in milliseconds
* QueueWaitingTaskCount - The number of queued tasks waiting for execution
* InvocationCount - The number of times the command handler has been invoked
* ThreadPoolActivePoolSize - The number of active threads in the thread pool
* ThreadPoolMaxPoolSize - The maximum number of threads in the thread pool
* CommandReceivedCount - The number of received SCM commands for each command type
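
To make the relationship between the run-time and invocation metrics concrete, here is a minimal sketch. RunTimeStats is a hypothetical class, not the CommandHandlerMetrics API, and it assumes the average is the plain mean of all recorded run times; the real handler may compute it differently.

```java
// Hypothetical run-time tracker; NOT the CommandHandlerMetrics API.
class RunTimeStats {
  private long totalRunTimeMs;
  private long invocationCount;

  // Record one handler invocation that took runTimeMs milliseconds.
  synchronized void record(long runTimeMs) {
    totalRunTimeMs += runTimeMs;
    invocationCount++;
  }

  synchronized long getTotalRunTimeMs() {
    return totalRunTimeMs;
  }

  synchronized long getInvocationCount() {
    return invocationCount;
  }

  // Assumption: the average is the plain mean over all invocations, 0 when there are none.
  synchronized double getAvgRunTimeMs() {
    return invocationCount == 0 ? 0.0 : (double) totalRunTimeMs / invocationCount;
  }
}
```

With run times of 10, 20, and 60 ms recorded, this sketch reports a total of 90 ms over 3 invocations and an average of 30 ms.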

Other container reconciliation-related metrics are encapsulated in the following 
class,
*ContainerMerkleTreeMetrics*:
 * numMerkleTreeWriteFailure - Number of Merkle tree write failures
 * numMerkleTreeReadFailure - Number of Merkle tree read failures
 * numMerkleTreeDiffFailure - Number of Merkle tree diff failures
 * numNoRepairContainerDiff - Number of container diffs that don't require repair
 * numRepairContainerDiff - Number of container diffs that require repair
 * merkleTreeWriteLatencyNS - Merkle tree write latency in nanoseconds
 * merkleTreeReadLatencyNS - Merkle tree read latency in nanoseconds
 * merkleTreeCreateLatencyNS - Merkle tree creation latency in nanoseconds
 * merkleTreeDiffLatencyNS - Merkle tree diff latency in nanoseconds
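
The failure counters and nanosecond latencies above follow a common pattern: time an operation, and count it as a failure if it throws. A minimal sketch of that pattern follows. TimedOperation is a hypothetical helper, not the ContainerMerkleTreeMetrics API, and it assumes latency is recorded only for successful calls.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper showing the time-and-count pattern; NOT the ContainerMerkleTreeMetrics API.
class TimedOperation {
  final AtomicLong failureCount = new AtomicLong();
  final AtomicLong lastLatencyNs = new AtomicLong();

  // Runs op, records its latency in nanoseconds on success, and counts a failure if it throws.
  <T> T run(Callable<T> op) throws Exception {
    long start = System.nanoTime();
    try {
      T result = op.call();
      // Assumption: latency is only recorded when the operation succeeds.
      lastLatencyNs.set(System.nanoTime() - start);
      return result;
    } catch (Exception e) {
      failureCount.incrementAndGet();
      throw e;
    }
  }
}
```

A test of the real metrics could follow the same shape: trigger one successful and one failing operation, then check that the failure counter moved by exactly one.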

This task can be reduced to testing the reconciliation task metrics in 
Replication Supervisor.


was (Author: aswinshakil):
The metrics for reconciliation tasks are already available as a part of 
*ReplicationSupervisor* class which includes:
* Number of reconciliation tasks
* Number of queued tasks
* Number of timed-out tasks
* Number of Success
* Number of Failures
* Number of Skipped Tasks
* Latency metrics for the tasks

Other container reconciliation-related tasks are encapsulated in the following 
classes,
*ContainerMerkleTreeMetrics*:
* Number of Merkle tree write failure
* Number of Merkle tree read failure
* Number of Merkle tree diff failure
* Number of container diff that doesn't require repair
* Number of container diff that require repair
* Merkle tree write latency
* Merkle tree read latency
* Merkle tree creation latency
* Merkle tree diff latency

This task can be reduced to testing the reconciliation task metrics in 
Replication Supervisor.

> Add metrics specific to reconciliation tasks
> --------------------------------------------
>
>                 Key: HDDS-11345
>                 URL: https://issues.apache.org/jira/browse/HDDS-11345
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Ethan Rose
>            Assignee: Aswin Shakil
>            Priority: Major
>              Labels: pull-request-available
>
> HDDS-10373 added metrics specific to merkle tree generation and HDDS-11254 
> made reconciliation count towards the replication supervisor's metrics. 
> However, we still need metrics specific to full reconciliation tasks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
