Steve Loughran created HADOOP-18650:
---------------------------------------

             Summary: improve s3a committer stats collected
                 Key: HADOOP-18650
                 URL: https://issues.apache.org/jira/browse/HADOOP-18650
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
    Affects Versions: 3.3.5
            Reporter: Steve Loughran


We can improve the stats collected in the s3a committer and saved to the JSON.

Key ones:
# number of task manifests read; duration of loads
# size of each manifest

I think we would also benefit if we could set the commit thread pools to be big, but then shared across all jobs (i.e. a demand-created thread pool in the s3a fs). That would allow for a pool size of, say, 500, but still support many jobs actively committing at the same time (busy Spark driver).

Finally: should the file commit pool size be greater than the size of the pool of manifest readers? I think it could be, but the ratio should be fairly low.
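
To make the shared-pool idea concrete, here is a minimal sketch using only java.util.concurrent. It is not the s3a implementation: the names SharedCommitPool, JobSubmitter and POOL_SIZE are hypothetical, and the per-job semaphore is just one assumed way to stop a single job from taking all of the shared threads.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: one large, demand-created pool shared by all jobs. */
public class SharedCommitPool {

  /** Illustrative size, taken from the "say, 500" figure above. */
  static final int POOL_SIZE = 500;

  private static ThreadPoolExecutor shared;

  /** Create the shared pool on first use; idle threads time out after 60s. */
  static synchronized ExecutorService sharedPool() {
    if (shared == null) {
      shared = new ThreadPoolExecutor(
          POOL_SIZE, POOL_SIZE, 60, TimeUnit.SECONDS,
          new LinkedBlockingQueue<>(),
          r -> {
            // daemon threads, so an idle pool never blocks JVM exit
            Thread t = new Thread(r, "shared-commit");
            t.setDaemon(true);
            return t;
          });
      // threads are started only as work arrives and released when idle
      shared.allowCoreThreadTimeOut(true);
    }
    return shared;
  }

  /** Per-job view: caps how many shared threads one job can occupy. */
  static final class JobSubmitter {
    private final Semaphore permits;

    JobSubmitter(int maxActive) {
      permits = new Semaphore(maxActive);
    }

    <T> Future<T> submit(Callable<T> task) throws InterruptedException {
      // blocks the caller while this job is already at its limit
      permits.acquire();
      try {
        return sharedPool().submit(() -> {
          try {
            return task.call();
          } finally {
            permits.release();
          }
        });
      } catch (RuntimeException e) {
        // task was rejected and will never run: hand the permit back
        permits.release();
        throw e;
      }
    }
  }
}
```

With this shape, each committing job would create its own JobSubmitter with a modest limit, while the driver as a whole never runs more than POOL_SIZE commit threads. The manifest-reader and file-commit stages could use two submitters with different limits to express the ratio discussed above.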