Maxim Martynov created HADOOP-18838:
---------------------------------------

             Summary: Some fs.s3a.* config values are different in sources and documentation
                 Key: HADOOP-18838
                 URL: https://issues.apache.org/jira/browse/HADOOP-18838
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs/s3
    Affects Versions: 3.3.6
            Reporter: Maxim Martynov


For the config option {{fs.s3a.retry.throttle.interval}}, the default value in the source code is {{500ms}}:
{code:java}
public static final String RETRY_THROTTLE_INTERVAL_DEFAULT = "500ms";
{code}
https://github.com/apache/hadoop/blob/rel/release-3.3.6/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java#L921

In {{core-default.xml}} it has the value {{100ms}}:
{code:xml}
<property>
  <name>fs.s3a.retry.throttle.interval</name>
  <value>100ms</value>
  <description>
    Initial between retry attempts on throttled requests, +/- 50%. chosen at random.
    i.e. for an intial value of 3000ms, the initial delay would be in the range 1500ms to 4500ms.
    Backoffs are exponential; again randomness is used to avoid the thundering heard problem.
    500ms is the default value used by the AWS S3 Retry policy.
  </description>
</property>
{code}
https://github.com/apache/hadoop/blob/rel/release-3.3.6/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml#L1750

This change was introduced in HADOOP-16823.

In the Hadoop-AWS module index page it has the value {{1000ms}}:
{code:xml}
<property>
  <name>fs.s3a.retry.throttle.interval</name>
  <value>1000ms</value>
  <description>
    Interval between retry attempts on throttled requests.
  </description>
</property>
{code}
https://github.com/apache/hadoop/blob/rel/release-3.3.6/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md?plain=1#L1223

The file was created in HADOOP-13786, and the value has been left unchanged since then.

In the performance tuning page it has the up-to-date value {{500ms}}:
{code:xml}
<property>
  <name>fs.s3a.retry.throttle.interval</name>
  <value>500ms</value>
  <description>
    Interval between retry attempts on throttled requests.
  </description>
</property>
{code}
https://github.com/apache/hadoop/blob/rel/release-3.3.6/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/performance.md?plain=1#L435

The same issue applies to:
* {{fs.s3a.retry.throttle.limit}} - in source code it has value {{20}}, but some documents still show the old value ${fs.s3a.attempts.maximum}
* {{fs.s3a.connection.establish.timeout}} - in source code it has value {{50_000}}, in config file & documentation {{5_000}}
* {{fs.s3a.attempts.maximum}} - in source code it has value {{10}}, in config file & documentation {{20}}
* {{fs.s3a.threads.max}} - in source code & documentation it has value {{10}}, in config file {{64}}
* {{fs.s3a.max.total.tasks}} - in source code & config file it has value {{32}}, in documentation {{5}}
* {{fs.s3a.connection.maximum}} - in source code & config file it has value {{96}}, in documentation {{15}} or {{30}}

Please sync these values; outdated documentation is very painful to work with.

As an idea, would it be possible to use {{core-default.xml}} directly in the documentation, or to generate this documentation from docstrings in the Java code?
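
To illustrate the kind of sync check that could catch these mismatches, here is a minimal sketch. It assumes a hypothetical helper class {{S3ADefaultsCheck}}, which is not part of Hadoop. The source-code defaults are typed in by hand from the values listed in this issue; a real check would have to read them from {{Constants.java}} and load the actual {{core-default.xml}} instead of an inline string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper (not part of Hadoop): reports defaults that differ from an XML config. */
public class S3ADefaultsCheck {

  /** Parses property/name/value entries into a name -> value map. */
  public static Map<String, String> parseProperties(String xml) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    Map<String, String> props = new LinkedHashMap<>();
    NodeList nodes = doc.getElementsByTagName("property");
    for (int i = 0; i < nodes.getLength(); i++) {
      Element p = (Element) nodes.item(i);
      String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
      String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
      props.put(name, value);
    }
    return props;
  }

  /** Returns name -> "source=X, xml=Y" for every key present in both maps with different values. */
  public static Map<String, String> mismatches(Map<String, String> sourceDefaults,
                                               Map<String, String> xmlDefaults) {
    Map<String, String> diff = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : sourceDefaults.entrySet()) {
      String xmlValue = xmlDefaults.get(e.getKey());
      if (xmlValue != null && !xmlValue.equals(e.getValue())) {
        diff.put(e.getKey(), "source=" + e.getValue() + ", xml=" + xmlValue);
      }
    }
    return diff;
  }

  public static void main(String[] args) throws Exception {
    // Inline stand-in for core-default.xml, using config-file values reported in this issue.
    String xml = "<configuration>"
        + "<property><name>fs.s3a.retry.throttle.interval</name><value>100ms</value></property>"
        + "<property><name>fs.s3a.threads.max</name><value>64</value></property>"
        + "<property><name>fs.s3a.max.total.tasks</name><value>32</value></property>"
        + "</configuration>";

    // Source-code defaults as reported in this issue (entered by hand, an assumption of the sketch).
    Map<String, String> source = new LinkedHashMap<>();
    source.put("fs.s3a.retry.throttle.interval", "500ms");
    source.put("fs.s3a.threads.max", "10");
    source.put("fs.s3a.max.total.tasks", "32");

    mismatches(source, parseProperties(xml))
        .forEach((name, detail) -> System.out.println(name + ": " + detail));
  }
}
```

With the sample values above, only {{fs.s3a.retry.throttle.interval}} and {{fs.s3a.threads.max}} are reported, since {{fs.s3a.max.total.tasks}} agrees in both places. The same comparison could be pointed at the property blocks embedded in the markdown pages, though extracting them is left out of this sketch.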