HADOOP-17450 added an API for IOStatistics: anything can be an IOStatistics source, whose statistics can be collected, aggregated and reported (there's a serializable form). This goes beyond simple counters - we include minimums, maximums and means too.
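
To give a flavour of the client-side API, here's a minimal sketch of pulling the stats off a stream and the filesystem and aggregating them into the serializable snapshot form. Class and method names are from the org.apache.hadoop.fs.statistics package as I remember them; treat the exact signatures as illustrative rather than definitive:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataInputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.statistics.IOStatistics;
  import org.apache.hadoop.fs.statistics.IOStatisticsLogging;
  import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;
  import org.apache.hadoop.fs.statistics.IOStatisticsSupport;

  public class IOStatsDemo {
    public static void main(String[] args) throws Exception {
      Path path = new Path(args[0]);
      // serializable snapshot used to aggregate stats from several sources
      IOStatisticsSnapshot aggregate = new IOStatisticsSnapshot();
      try (FileSystem fs = FileSystem.get(path.toUri(), new Configuration());
           FSDataInputStream in = fs.open(path)) {
        in.read(new byte[4096]);
        // returns null if the stream isn't an IOStatisticsSource
        IOStatistics streamStats = IOStatisticsSupport.retrieveIOStatistics(in);
        if (streamStats != null) {
          aggregate.aggregate(streamStats);
        }
        // the filesystem instance may also be a source
        IOStatistics fsStats = IOStatisticsSupport.retrieveIOStatistics(fs);
        if (fsStats != null) {
          aggregate.aggregate(fsStats);
        }
      }
      // report; this is where the counters/min/max/mean output below comes from
      System.out.println(IOStatisticsLogging.ioStatisticsToString(aggregate));
    }
  }

Because the snapshot is serializable, the same pattern works for shipping the aggregated stats back from tasks and including them in job-level reports.
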
S3A and ABFS support this in their streams; s3a also does it for its remote iterators and for the filesystem instance itself, and ABFS is adding it for the FS. MR's LocatedFileStatusFetcher also collects and aggregates any statistics from the filesystems it iterates over.

Independent of any sampling-based collection of stats on the server (for that I'd point to OpenTelemetry), it'd be great if DFSClient collected IOStatistics for the FS and for its input/output streams, which could then be included in reports.

Example: the stats of a single test case. Really interesting to see those mean costs of operations - but note the min/max values too.

counters=((action_http_head_request=2)
(audit_access_check_failure=1)
(audit_request_execution=7)
(audit_span_start=5)
(directories_created=1)
(object_list_request=4)
(object_metadata_request=2)
(object_put_request=1)
(object_put_request_completed=1)
(op_access=1)
(op_access.failures=1)
(op_mkdirs=2)
(store_io_request=7));

gauges=();

minimums=((action_http_head_request.min=31)
(object_list_request.min=41)
(object_put_request.min=226)
(op_access.failures.min=73)
(op_mkdirs.min=385));

maximums=((action_http_head_request.max=45)
(object_list_request.max=1039)
(object_put_request.max=226)
(op_access.failures.max=73)
(op_mkdirs.max=1062));

means=((action_http_head_request.mean=(samples=2, sum=76, mean=38.0000))
(object_list_request.mean=(samples=4, sum=1171, mean=292.7500))
(object_put_request.mean=(samples=1, sum=226, mean=226.0000))
(op_access.failures.mean=(samples=1, sum=73, mean=73.0000))
(op_mkdirs.mean=(samples=2, sum=1447, mean=723.5000)));

On Mon, 12 Apr 2021 at 17:30, Stephen O'Donnell
<sodonn...@cloudera.com.invalid> wrote:

> I have not tried to do this, but as we (Cloudera) deal with more and more
> performance-related problems, I feel something like this is needed.
>
> It is a tricky problem due to the number of requests the NN handles and
> how performance sensitive it is.
>
> At the IPC Server level, we should be able to know the request queue
> time, processing time, response queue time and the type of request.
>
> If we sampled X% of requests and then emitted one log line per interval
> (eg per minute), we could perhaps build a histogram of queue size, queue
> times and processing times per request type.
>
> From JMX, we can get the request counts and queue length, but I am not
> sure if we can get something like percentiles of queue time and
> processing time over the previous minute, for example?
>
> Even given the above details, if we see a long queue length, it may still
> remain a mystery what was causing that queue. Often it is due to a
> long-running request (eg contentSummary, snapshotDiff etc) holding the NN
> lock in write mode for too long.
>
> What would be very useful is a way to see the percentage of time the NN
> lock is held in exclusive mode (write), shared mode (read) or not held at
> all (rare on a busy cluster). Even better if we can somehow bubble up the
> top requests holding the lock in exclusive mode.
>
> Perhaps sampling the time spent waiting to acquire the lock could be
> useful too.
>
> I also think it would be useful to expose response times from the client
> perspective.
>
> https://issues.apache.org/jira/browse/HDFS-14084 seemed interesting and
> could be worth finishing.
>
> I also found https://issues.apache.org/jira/browse/HDFS-12861 some time
> back, to get the client to log data read speeds.
>
> Have you made any attempts in this area so far, and did you have any
> success?
>
> Thanks,
>
> Stephen.
>
> On Thu, Mar 18, 2021 at 5:41 AM Fengnan Li <loyal...@gmail.com> wrote:
>
> > Hi community,
> >
> > Has someone ever tried to implement sampling logging for the ipc Server?
> > We would like to gain more observability for all of the traffic to the
> > Namenode.
> >
> > Thanks,
> > Fengnan