These were the parameters that I set btw:

akka.watch.heartbeat.interval: 100
akka.transport.heartbeat.interval: 1000

On Fri, Feb 19, 2016 at 7:43 PM, Saiph Kappa <saiph.ka...@gmail.com> wrote:

> I am not sure.
>
> For that particular machine I get messages like these:
> «
> myip:6123/user/jobmanager#291801197])) at akka://flink/user/$a from
> Actor[akka://flink/deadLetters].
> [INFO] o.a.f.r.c.JobClientActor    - Connected to new
> JobManager akka.tcp://flink@myip:6123/user/jobmanager.
>
> [INFO] o.a.f.r.c.JobClientActor    - Sending message to
> JobManager akka.tcp://flink@myip:6123/user/jobmanager to submit job JOB1
> (5f9cef0c2e4b69530bf1e2485e94d326) and wait for progress
>
>
> [DEBUG] o.a.f.r.c.JobClientActor    - Handled message
> LeaderSessionMessage(null,JobManagerActorRef(Actor[akka.tcp://flink@myip:6123/user/jobmanager#291801197]))
> in 48 ms from Actor[akka://flink/deadLetters].
>
>
> [DEBUG] o.a.f.r.c.JobClientActor    - Handled message
> LeaderSessionMessage(null,JobManagerActorRef(Actor[akka.tcp://flink@myip:6123/user/jobmanager#291801197]))
> in 48 ms from Actor[akka://flink/deadLetters].
>
> [DEBUG] o.a.f.r.c.JobClientActor    - Received message
> JobSubmitSuccess(2575d5ff5c10336beb7820a052a63623) at akka://flink/user/$a
> from Actor[akka.tcp://flink@myip:6123/user/jobmanager#1144818256].
> »
>
> I tried setting the heartbeat interval on the cluster, but it didn't solve
> the problem. Should I try setting it in the client instead, and if so, how?
> I see no other errors or exceptions in the log files.
>
>
>
>
> On Fri, Feb 19, 2016 at 7:07 PM, Robert Metzger <rmetz...@apache.org>
> wrote:
>
>> Hi Saiph,
>>
>> are you sure that the jobs are cancelled because the client disconnects?
>>
>> For the different timeouts, check the configuration page:
>> https://ci.apache.org/projects/flink/flink-docs-release-0.10/setup/config.html
>> and search for "heartbeat".
>>
>> On Fri, Feb 19, 2016 at 8:04 PM, Saiph Kappa <saiph.ka...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have a Flink client application that launches jobs to remote clusters.
>>> However I'm getting my jobs cancelled:
>>> "18:25:29,650 WARN
>>> akka.remote.ReliableDeliverySupervisor                        - Association
>>> with remote system [akka.tcp://flink@127.0.0.1:52929] has failed,
>>> address is now gated for [5000] ms. Reason is: [Disassociated]."
>>>
>>> How can I increase the Akka heartbeat interval? Where should I set that
>>> configuration parameter, in the client or in the Flink clusters, and in
>>> which file?
>>>
>>> Thanks.
>>>
>>>
>>
>
