Re: S3 Timeout waiting for connection from pool

2024-11-28 Thread William Wallace
hi, In order to solve this you would either need to: - compile from source code (as you mentioned), Flink 1.18.2 is not yet released - change `state.storage.fs.memory-threshold` (could work as a temporary fix in your case until 1.18.2 is released) ``` # The minimum size of state data files. # Al

Re: S3 Timeout waiting for connection from pool

2024-11-27 Thread 王业楼
I saw the related fix commit on GitHub, but now I want to use the fixed 1.18 version. Where can I find it? The 1.18 version provided on the official website is still from 2023, or can I only compile from the source code? On 2024-11-27 at 17:56, William Wallace wrote: hi, It seems similar to is

S3 Timeout waiting for connection from pool

2024-11-27 Thread William Wallace
hi, It seems similar to the issue described here: https://lists.apache.org/thread/g8yb4rlj0mlf1vgjl71815nts8r1w51p where we were not able to restore state because of the high number of S3 reads (in your case it might encounter the connection limitation first). Have a look at https://issues.apache.

S3 Timeout waiting for connection from pool

2024-11-26 Thread wangye...@yeah.net
hi My Flink cluster uses S3 for storing the state backend. However, an exception occurs when the task runs for a long period of time. The content of the exception is "Timeout waiting for connection from pool". What could be the reason for this? The following is the specific error message. ja

Re: Flink snapshotting to S3 - Timeout waiting for connection from pool

2017-01-26 Thread Shannon Carey
Haha, I see. Thanks. On 1/26/17, 1:48 PM, "Chen Qin" wrote: >We worked around S3 and had a beer with our Hadoop engineers...

Re: Flink snapshotting to S3 - Timeout waiting for connection from pool

2017-01-26 Thread Chen Qin
We worked around S3 and had a beer with our Hadoop engineers...

Re: Flink snapshotting to S3 - Timeout waiting for connection from pool

2017-01-24 Thread Shannon Carey
y, January 24, 2017 at 8:30 AM To: user@flink.apache.org Subject: Re: Flink snapshotting to S3 - Timeout waiting for connection from pool Hi Shannon! I was wondering if you still see this issue in Flink 1.1.4? Just thinking that another possible cause for the issue could be that there is a c

Re: Flink snapshotting to S3 - Timeout waiting for connection from pool

2017-01-24 Thread Stephan Ewen
Hi Shannon! I was wondering if you still see this issue in Flink 1.1.4? Just thinking that another possible cause for the issue could be that there is a connection leak somewhere (Flink code or user code or vendor library) and thus the S3 connector's connection pool starves. For Flink 1.2, there
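
To illustrate the connection-leak hypothesis above, here is a minimal sketch in Python of how unclosed streams can starve a bounded connection pool. It is a toy model only: `ConnectionPool`, `PooledStream`, `read_leaky`, and `read_clean` are hypothetical names invented for this illustration, and none of it is the actual Flink, Hadoop, or AWS SDK code.

```python
import threading


class ConnectionPool:
    """Toy bounded pool (hypothetical; not the AWS SDK or Flink S3 connector)."""

    def __init__(self, max_connections):
        # One semaphore slot per connection the pool may hand out.
        self._slots = threading.BoundedSemaphore(max_connections)

    def acquire(self, timeout):
        # Wait up to `timeout` seconds for a free slot, then give up.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("Timeout waiting for connection from pool")
        return PooledStream(self)

    def _release(self):
        self._slots.release()


class PooledStream:
    """Stands in for an open S3 stream holding one pooled connection."""

    def __init__(self, pool):
        self._pool = pool
        self._closed = False

    def close(self):
        # Return the connection exactly once, even if close() is called twice.
        if not self._closed:
            self._closed = True
            self._pool._release()

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
        return False


def read_leaky(pool, n, timeout=0.05):
    # Opens n streams and never closes them: every slot stays occupied.
    return [pool.acquire(timeout) for _ in range(n)]


def read_clean(pool, n, timeout=0.05):
    # Closes each stream before opening the next, so one slot is enough.
    for _ in range(n):
        with pool.acquire(timeout):
            pass
    return n
```

With a pool of two connections, `read_clean(pool, 10)` completes, while `read_leaky(pool, 3)` raises `TimeoutError` on the third acquire. In this model, raising the pool size only delays the failure if streams keep leaking.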

Re: Flink snapshotting to S3 - Timeout waiting for connection from pool

2017-01-12 Thread Shannon Carey
Good to know someone else has had the same problem... What did you do about it? Did it resolve on its own? -Shannon On 1/12/17, 11:55 AM, "Chen Qin" wrote: >We have seen this issue back to Flink 1.0. Our finding back then was traffic >congestion to AWS in internal network. Many teams too d

Re: Flink snapshotting to S3 - Timeout waiting for connection from pool

2017-01-12 Thread Shannon Carey
I can't predict when it will occur, but usually it's after Flink has been running for at least a week. Yes, I do believe we had several job restarts caused by an exception from a Cassandra node being down for maintenance and therefore a query failing to meet the QUORUM consistency level requeste

Re: Flink snapshotting to S3 - Timeout waiting for connection from pool

2017-01-12 Thread Chen Qin
We have seen this issue as far back as Flink 1.0. Our finding back then was traffic congestion to AWS in our internal network. Many teams are too dependent on S3 and bandwidth is shared, which causes traffic congestion from time to time. Hope it helps! Thanks Chen > On Jan 12, 2017, at 03:30, Ufuk Celebi wrote: >

Re: Flink snapshotting to S3 - Timeout waiting for connection from pool

2017-01-12 Thread Ufuk Celebi
Hey Shannon! Is this always reproducible and how long does it take to reproduce it? I've not seen this error before but as you say it indicates that some streams are not closed. Did the jobs do any restarts before this happened? Flink 1.1.4 contains fixes for more robust releasing of resources i

Flink snapshotting to S3 - Timeout waiting for connection from pool

2017-01-11 Thread Shannon Carey
I'm having pretty frequent issues with the exception below. It basically always ends up killing my cluster after forcing a large number of job restarts. I just can't keep Flink up & running. I am running Flink 1.1.3 on EMR 5.2.0. I already tried updating the emrfs-site config fs.s3.maxConnectio