Yes, they're both set to the same value. These are the JVM Options as
reported by the TM:

17:36:12,137 INFO  org.apache.flink.runtime.taskmanager.TaskManager  -  JVM Options:
17:36:12,137 INFO  org.apache.flink.runtime.taskmanager.TaskManager  -     -Xms51355M
17:36:12,137 INFO  org.apache.flink.runtime.taskmanager.TaskManager  -     -Xmx51355M
17:36:12,137 INFO  org.apache.flink.runtime.taskmanager.TaskManager  -     -XX:MaxDirectMemorySize=51355M

Same goes for JM:

17:36:10,837 INFO  org.apache.flink.runtime.jobmanager.JobManager  -  JVM Options:
17:36:10,837 INFO  org.apache.flink.runtime.jobmanager.JobManager  -     -Xms25677m
17:36:10,837 INFO  org.apache.flink.runtime.jobmanager.JobManager  -     -Xmx25677m
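
To double-check the flags from inside a running JVM rather than from the
log, something like the following might work. It is only a sketch and not
part of Flink: the class and method names (HeapFlagsCheck, findFlag,
parseSize) are made up for illustration, and it assumes the flags are
passed in the plain -Xms<size>/-Xmx<size> form shown above.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not Flink code: compares -Xms and -Xmx of the current JVM.
final class HeapFlagsCheck {

    /** Returns the value of the last argument starting with prefix, or null if absent. */
    static String findFlag(List<String> jvmArgs, String prefix) {
        String value = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                value = arg.substring(prefix.length());
            }
        }
        return value;
    }

    /** Parses sizes such as 51355M, 25677m, 2g, 1024k, or a plain byte count. */
    static long parseSize(String size) {
        char unit = Character.toLowerCase(size.charAt(size.length() - 1));
        long multiplier;
        switch (unit) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024; break;
            case 'g': multiplier = 1024L * 1024 * 1024; break;
            default: return Long.parseLong(size); // no unit suffix: bytes
        }
        return Long.parseLong(size.substring(0, size.length() - 1)) * multiplier;
    }

    public static void main(String[] args) {
        // Arguments the JVM itself was started with (not the program arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        String xms = findFlag(jvmArgs, "-Xms");
        String xmx = findFlag(jvmArgs, "-Xmx");
        if (xms == null || xmx == null) {
            System.out.println("-Xms and/or -Xmx not set explicitly");
        } else if (parseSize(xms) == parseSize(xmx)) {
            System.out.println("Initial and maximum heap match: " + xmx);
        } else {
            System.out.println("Heap may grow at runtime: -Xms" + xms + " vs -Xmx" + xmx);
        }
    }
}
```

Comparing parsed sizes rather than raw strings means that 51355M and
51355m are treated as the same value.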

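To see how long plain heap allocation of that size takes on these nodes,
independent of Flink, a rough timing sketch could look like the one below.
It assumes the managed memory is pre-allocated as on-heap byte arrays of
32768 bytes each (the segment size from the log), which is my reading of
the explanation quoted below and not taken from Flink's code. The class
and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone timing sketch, not Flink code.
final class SegmentAllocationTimer {

    /** Allocates count byte arrays of segmentSize bytes each and keeps them reachable. */
    static List<byte[]> allocateSegments(int count, int segmentSize) {
        List<byte[]> segments = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            segments.add(new byte[segmentSize]);
        }
        return segments;
    }

    public static void main(String[] args) {
        int segmentSize = 32768; // segment size reported in the TM log
        // Defaults to 64 MB so the sketch runs on any machine; pass a larger
        // value in MB (e.g. 34395) together with matching -Xms/-Xmx settings.
        long megabytes = args.length > 0 ? Long.parseLong(args[0]) : 64L;
        int count = (int) (megabytes * 1024 * 1024 / segmentSize);

        long start = System.nanoTime();
        List<byte[]> segments = allocateSegments(count, segmentSize);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("Allocated " + segments.size() + " segments ("
                + megabytes + " MB) in " + elapsedMs + " ms");
    }
}
```

Running it with the same -Xms/-Xmx values as the TM should show whether
the delay comes from the allocation itself or from something else.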

On Fri, Oct 2, 2015 at 5:35 PM, Stephan Ewen <se...@apache.org> wrote:

> The delay you see happens when the TaskManager allocates the memory for
> its memory manager. Allocating that much in a JVM can take a bit, although
> 40 seconds looks like a lot to me...
>
> How do you start the JVM? Are Xmx and Xms set to the same value? If not,
> the JVM incrementally grows through multiple garbage collections, which
> makes it quite slow.
>
> If the JVM starts with a large heap, it should actually not take as long
> as in your case...
>
> On Fri, Oct 2, 2015 at 5:26 PM, Robert Schmidtke <ro.schmid...@gmail.com>
> wrote:
>
>> Hi everyone,
>>
>> I'm wondering about the startup times of the TMs:
>>
>> ...
>> 17:03:33,255 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>>        - Starting TaskManager actor
>> 17:03:33,262 INFO  org.apache.flink.runtime.io.network.netty.NettyConfig
>>         - NettyConfig [server address: cumu02-05/130.73.144.64, server
>> port: 45731, memory segment size (bytes): 32768, transport type: NIO,
>> number of server threads: 0 (use Netty's default), number of client
>> threads: 0 (use Netty's default), server connect backlog: 0 (use Netty's
>> default), client connect timeout (sec): 120, send/receive buffer size
>> (bytes): 0 (use Netty's default)]
>> 17:03:33,266 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>>        - Messages between TaskManager and JobManager have a max timeout of
>> 100000 milliseconds
>> 17:03:33,268 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>>        - Temporary file directory '/tmp': total 44 GB, usable 37 GB (84.09%
>> usable)
>> 17:03:33,295 INFO
>>  org.apache.flink.runtime.io.network.buffer.NetworkBufferPool  - Allocated
>> 64 MB for network buffer pool (number of memory segments: 2048, bytes per
>> segment: 32768).
>> 17:03:33,554 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>>        - Using 0.7 of the currently free heap space for Flink managed heap
>> memory (34395 MB).
>>
>> // almost 40 seconds //
>>
>> 17:04:12,445 INFO  org.apache.flink.runtime.io.disk.iomanager.IOManager
>>        - I/O manager uses directory
>> /tmp/flink-io-922d9bf4-254e-41e7-b151-525157cd5bfe for spill files.
>> 17:04:12,455 INFO  org.apache.flink.runtime.filecache.FileCache
>>        - User file cache uses directory
>> /tmp/flink-dist-cache-792cf7f2-e2be-4950-a39f-d7a21326f054
>> 17:04:12,617 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>>        - Starting TaskManager actor at
>> akka://flink/user/taskmanager#1341641688.
>> 17:04:12,617 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>>        - TaskManager data connection information: cumu02-05.zib.de
>> (dataPort=45731)
>> 17:04:12,618 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>>        - TaskManager has 16 task slot(s).
>> 17:04:12,618 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>>        - Memory usage stats: [HEAP: 35502/49216/49216 MB, NON HEAP:
>> 25/52/214 MB (used/committed/max)]
>> 17:04:12,623 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>>        - Trying to register at JobManager akka.tcp://
>> flink@130.73.144.59:6123/user/jobmanager (attempt 1, timeout: 500
>> milliseconds)
>> 17:04:12,773 INFO  org.apache.flink.runtime.taskmanager.TaskManager
>>        - Successful registration at JobManager (akka.tcp://
>> flink@130.73.144.59:6123/user/jobmanager), starting network stack and
>> library cache.
>> ...
>>
>>
>> The same goes for the JM (obviously).
>>
>> ...
>> 17:03:31,632 INFO  org.apache.flink.runtime.jobmanager.JobManager
>>        - Starting JobManger web frontend
>> 17:03:31,636 INFO  org.apache.flink.runtime.jobmanager.web.WebInfoServer
>>         - Setting up web info server, using web-root directory
>> jar:file:/nfs/csr/bzcschmi/flink/flink-dist/target/flink-0.10-SNAPSHOT-bin/flink-0.10-SNAPSHOT/lib/flink-dist-0.10-SNAPSHOT.jar!/web-docs-infoserver.
>> 17:03:31,753 INFO  org.eclipse.jetty.util.log
>>        - jetty-0.10-SNAPSHOT
>> 17:03:31,806 INFO  org.eclipse.jetty.util.log
>>        - Started SelectChannelConnector@0.0.0.0:8081
>> 17:03:31,806 INFO  org.apache.flink.runtime.jobmanager.web.WebInfoServer
>>         - Started web info server for JobManager on 0.0.0.0:8081
>>
>> // almost 35 seconds //
>>
>> 17:04:05,091 INFO  org.apache.flink.runtime.instance.InstanceManager
>>         - Registered TaskManager at cumu02-02 (akka.tcp://
>> flink@130.73.144.61:53549/user/taskmanager) as
>> e5ae92397a912c7360524524cf2d172a. Current number of registered hosts is 1.
>> Current number of alive task slots is 16.
>> ...
>>
>>
>> Is this to be expected? Any ideas what's happening in the meantime? I'm
>> asking because I'm running into errors when submitting my job too early
>> (and not enough TMs have connected).
>>
>> Cheers
>> Robert
>>
>> --
>> My GPG Key ID: 336E2680
>>
>
>


-- 
My GPG Key ID: 336E2680
