[ 
https://issues.apache.org/jira/browse/HADOOP-19295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886182#comment-17886182
 ] 

Steve Loughran commented on HADOOP-19295:
-----------------------------------------

Example 1: failure on 3.4.1 RC1 when uploading the binary tarball itself. The 
part uploads time out, resulting in command failure.

{code}
> time bin/hadoop fs -put 
> ../../../downloads/hadoop-3.4.1-RC2/hadoop-3.4.1.tar.gz 
> s3a://stevel-london/hadoop-3.4.1.tar.gz
2024-09-30 19:11:42,485 [s3a-transfer-stevel-london-bounded-pool1-t2] INFO  
s3a.WriteOperationHelper (WriteOperationHelper.java:operationRetried(182)) - 
upload part #2 upload ID 
URs6yPM33nCd6epcixIrEO2_BUTBBIoESd4wkLh1KShNeZe.Jd_6ctBOFEyobUzvCJ7OKhwTjvrLZBLaRC1qxVgDyiHZNK8DfD0XF4g0HYMKBlLproB6fvH1DdBTE6g.
 on hadoop-3.4.1.tar.gz._COPYING_: Retried 0: 
org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: upload part #2 upload ID 
URs6yPM33nCd6epcixIrEO2_BUTBBIoESd4wkLh1KShNeZe.Jd_6ctBOFEyobUzvCJ7OKhwTjvrLZBLaRC1qxVgDyiHZNK8DfD0XF4g0HYMKBlLproB6fvH1DdBTE6g.
 on hadoop-3.4.1.tar.gz._COPYING_: 
software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client execution 
did not complete before the specified timeout configuration: 60000 millis
2024-09-30 19:11:42,485 [s3a-transfer-stevel-london-bounded-pool1-t1] INFO  
s3a.WriteOperationHelper (WriteOperationHelper.java:operationRetried(182)) - 
upload part #1 upload ID 
URs6yPM33nCd6epcixIrEO2_BUTBBIoESd4wkLh1KShNeZe.Jd_6ctBOFEyobUzvCJ7OKhwTjvrLZBLaRC1qxVgDyiHZNK8DfD0XF4g0HYMKBlLproB6fvH1DdBTE6g.
 on hadoop-3.4.1.tar.gz._COPYING_: Retried 0: 
org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: upload part #1 upload ID 
URs6yPM33nCd6epcixIrEO2_BUTBBIoESd4wkLh1KShNeZe.Jd_6ctBOFEyobUzvCJ7OKhwTjvrLZBLaRC1qxVgDyiHZNK8DfD0XF4g0HYMKBlLproB6fvH1DdBTE6g.
 on hadoop-3.4.1.tar.gz._COPYING_: 
software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client execution 
did not complete before the specified timeout configuration: 60000 millis
2024-09-30 19:11:42,503 [s3a-transfer-stevel-london-bounded-pool1-t3] INFO  
s3a.WriteOperationHelper (WriteOperationHelper.java:operationRetried(182)) - 
upload part #3 upload ID 
URs6yPM33nCd6epcixIrEO2_BUTBBIoESd4wkLh1KShNeZe.Jd_6ctBOFEyobUzvCJ7OKhwTjvrLZBLaRC1qxVgDyiHZNK8DfD0XF4g0HYMKBlLproB6fvH1DdBTE6g.
 on hadoop-3.4.1.tar.gz._COPYING_: Retried 0: 
org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: upload part #3 upload ID 
URs6yPM33nCd6epcixIrEO2_BUTBBIoESd4wkLh1KShNeZe.Jd_6ctBOFEyobUzvCJ7OKhwTjvrLZBLaRC1qxVgDyiHZNK8DfD0XF4g0HYMKBlLproB6fvH1DdBTE6g.
 on hadoop-3.4.1.tar.gz._COPYING_: 
software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client execution 
did not complete before the specified timeout configuration: 60000 millis
2024-09-30 19:11:42,520 [s3a-transfer-stevel-london-bounded-pool1-t4] INFO  
s3a.WriteOperationHelper (WriteOperationHelper.java:operationRetried(182)) - 
upload part #4 upload ID 
URs6yPM33nCd6epcixIrEO2_BUTBBIoESd4wkLh1KShNeZe.Jd_6ctBOFEyobUzvCJ7OKhwTjvrLZBLaRC1qxVgDyiHZNK8DfD0XF4g0HYMKBlLproB6fvH1DdBTE6g.
 on hadoop-3.4.1.tar.gz._COPYING_: Retried 0: 
org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: upload part #4 upload ID 
URs6yPM33nCd6epcixIrEO2_BUTBBIoESd4wkLh1KShNeZe.Jd_6ctBOFEyobUzvCJ7OKhwTjvrLZBLaRC1qxVgDyiHZNK8DfD0XF4g0HYMKBlLproB6fvH1DdBTE6g.
 on hadoop-3.4.1.tar.gz._COPYING_: 
software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client execution 
did not complete before the specified timeout configuration: 60000 millis
^C2024-09-30 19:11:49,672 [main] ERROR util.BlockingThreadPoolExecutorService 
(BlockingThreadPoolExecutorService.java:rejectedExecution(141)) - Could not 
submit task to executor 
java.util.concurrent.ThreadPoolExecutor@54d901aa[Shutting down, pool size = 2, 
active threads = 2, queued tasks = 0, completed tasks = 2]
2024-09-30 19:11:49,719 [main] ERROR util.BlockingThreadPoolExecutorService 
(BlockingThreadPoolExecutorService.java:rejectedExecution(141)) - Could not 
submit task to executor 
java.util.concurrent.ThreadPoolExecutor@54d901aa[Shutting down, pool size = 1, 
active threads = 1, queued tasks = 0, completed tasks = 3]
2024-09-30 19:11:49,746 [main] ERROR util.BlockingThreadPoolExecutorService 
(BlockingThreadPoolExecutorService.java:rejectedExecution(141)) - Could not 
submit task to executor 
java.util.concurrent.ThreadPoolExecutor@54d901aa[Shutting down, pool size = 1, 
active threads = 1, queued tasks = 0, completed tasks = 3]
{code}

> S3A: fs.s3a.connection.request.timeout too low for large uploads over slow 
> links
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-19295
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19295
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0, 3.4.1
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>
> The value of {{fs.s3a.connection.request.timeout}} (default = 60s) is too low 
> for large uploads over slow connections.
> I suspect something changed between the v1 and v2 SDK versions, and that put 
> used to be exempt from the normal timeouts. It is not exempt now, which 
> surfaces as failures to upload 1+ GB files over slower network connections. 
> Smaller files (for example, 128 MB) work.
> The parallel queuing of writes in the S3ABlockOutputStream is helping create 
> this problem, as it queues multiple blocks at the same time, so per-block 
> bandwidth becomes (available bandwidth)/(number of blocks); four blocks cut 
> the per-block capacity down to a quarter.
> The fix is straightforward: use a much bigger timeout. I'm going to propose 
> 15 minutes. We need to strike a balance between upload time allocation and 
> other requests timing out.
> I do worry about other consequences; we've found that timeout exceptions are 
> happy to hide the underlying causes of retry failures, so in fact this may be 
> better for all but a server hanging after the HTTP request is initiated.
> Too bad we can't alter the timeout for different requests.
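
The bandwidth argument in the description can be checked with a little arithmetic. The sketch below is only an illustration: the 64 MB part size and the 2 MB/s uplink are assumed figures, not values taken from the log, and it assumes the link is shared equally between the blocks in flight. The function names are hypothetical.

```python
def part_upload_seconds(part_bytes, link_bytes_per_sec, active_parts):
    """Estimate the time to upload one part when the link is shared equally."""
    per_part_rate = link_bytes_per_sec / active_parts
    return part_bytes / per_part_rate


def exceeds_timeout(part_bytes, link_bytes_per_sec, active_parts, timeout_sec):
    """True if the estimated part upload time is longer than the request timeout."""
    estimate = part_upload_seconds(part_bytes, link_bytes_per_sec, active_parts)
    return estimate > timeout_sec


MB = 1024 * 1024
PART_SIZE = 64 * MB   # assumed part size, for illustration only
LINK_RATE = 2 * MB    # hypothetical uplink: 2 MB/s
ACTIVE_PARTS = 4      # four blocks uploading in parallel, as in the description

seconds = part_upload_seconds(PART_SIZE, LINK_RATE, ACTIVE_PARTS)
print(f"{seconds:.0f}s per part with {ACTIVE_PARTS} parts in flight")

# Compare against the 60s default and the proposed 15 minute timeout.
print("fails at 60s:", exceeds_timeout(PART_SIZE, LINK_RATE, ACTIVE_PARTS, 60))
print("fails at 900s:", exceeds_timeout(PART_SIZE, LINK_RATE, ACTIVE_PARTS, 15 * 60))
```

Under these assumptions a single part on its own would take 32 seconds and fit inside the 60 second limit, while four parts in flight take 128 seconds each and time out. With a 15 minute timeout the same upload fits comfortably.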



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
