My job is a batch job, not a streaming one. Could the cause still be the
one you mentioned?

On Mon, 14 May 2018, 14:23 Stefan Richter, <s.rich...@data-artisans.com>
wrote:

> Hi,
>
> that looks like a known issue where Flink did not wait for the shutdown of
> the timer service before disposing state backends. This problem is fixed in
> the >= 1.4 branches.
>
> Best,
> Stefan
>
> On 14.05.2018 at 14:12, Flavio Pompermaier <pomperma...@okkam.it> wrote:
>
> Hi all,
> I have a Flink 1.3.1 job that runs multiple times.
> Everything goes well for some time (e.g. 10 jobs). Then, one or more TMs
> suddenly die.
>
> In the .out file I find something like this:
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007f6f3897712f, pid=18794, tid=140110535448320
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0_72-b15) (build
> 1.8.0_72-b15)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.72-b15 mixed mode
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [libc.so.6+0x7f12f]
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /home/user/hs_err_pid18794.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> #
>
>
> I have attached the error report that was produced. Do you see anything
> useful in it? I can also send you the job's jar with the data, but it is
> about 200 MB.
>
> Best,
> Flavio
> <hs_err_pid18794.log>
>
>
>
