Hi all,
I am running Flink on a standalone cluster and seeing very long
execution times for streaming jobs such as WordCount over a fixed text
file. My VM runs Debian 10 with 16 CPU cores and 32 GB of RAM, and the
input is a 2 GB text file. On a standalone cluster with one JobManager
and one TaskManager with a 25 GB heap, counting the file took around two
hours, while a simple Python script does it in around 7 minutes. I also
ran the job on a cluster with six TaskManagers, but it still took about
25 minutes. Increasing the JVM heap size did not reduce the execution
time. What is wrong with my setup? I have attached the log file and the
Flink configuration file to this email.
Best,
Habib
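
The Python script used for the comparison is not included in this email. For reference, a minimal sketch of such a baseline could look like the following. It is an assumption, not the actual script: it assumes whitespace tokenisation and lower-casing, and the function name and the tiny demo file are hypothetical.

```python
import os
import tempfile
from collections import Counter


def count_words(path):
    """Count lower-cased, whitespace-separated tokens, reading line by line."""
    counts = Counter()
    # Streaming over lines keeps memory bounded by the vocabulary size,
    # not by the size of the input file.
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            counts.update(line.lower().split())
    return counts


# Tiny self-contained demo on a temporary file (stand-in for the 2 GB input).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("to be or not to be\nTo be is to do\n")
    demo_path = tmp.name

demo_counts = count_words(demo_path)
os.remove(demo_path)
for word, n in demo_counts.most_common(3):
    print(word, n)
```

Such a script runs in a single process and prints only a summary at the end, so it is a rough point of comparison rather than an equivalent of the streaming job.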
################################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################
#==============================================================================
# Common
#==============================================================================
# The external address of the host on which the JobManager runs and can be
# reached by the TaskManagers and any clients which want to connect. This setting
# is only used in Standalone mode and may be overwritten on the JobManager side
# by specifying the --host <hostname> parameter of the bin/jobmanager.sh executable.
# In high availability mode, if you use the bin/start-cluster.sh script and setup
# the conf/masters file, this will be taken care of automatically. Yarn/Mesos
# automatically configure the host name based on the hostname of the node where the
# JobManager runs.
jobmanager.rpc.address: localhost
# The RPC port where the JobManager is reachable.
jobmanager.rpc.port: 6123
# The heap size for the JobManager JVM
jobmanager.heap.size: 25000m
# The heap size for the TaskManager JVM
taskmanager.heap.size: 25000m
# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
taskmanager.numberOfTaskSlots: 1
# The parallelism used for programs that did not specify any other parallelism.
parallelism.default: 1
# The default file system scheme and authority.
#
# By default file paths without scheme are interpreted relative to the local
# root file system 'file:///'. Use this to override the default and interpret
# relative paths relative to a different file system,
# for example 'hdfs://mynamenode:12345'
#
# fs.default-scheme
#==============================================================================
# High Availability
#==============================================================================
# The high-availability mode. Possible options are 'NONE' or 'zookeeper'.
#
# high-availability: zookeeper
# The path where metadata for master recovery is persisted. While ZooKeeper stores
# the small ground truth for checkpoint and leader election, this location stores
# the larger objects, like persisted dataflow graphs.
#
# Must be a durable file system that is accessible from all nodes
# (like HDFS, S3, Ceph, nfs, ...)
#
# high-availability.storageDir: hdfs:///flink/ha/
# The list of ZooKeeper quorum peers that coordinate the high-availability
# setup. This must be a list of the form:
# "host1:clientPort,host2:clientPort,..." (default clientPort: 2181)
#
# high-availability.zookeeper.quorum: localhost:2181
# ACL options are based on https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_BuiltinACLSchemes
# It can be either "creator" (ZOO_CREATE_ALL_ACL) or "open" (ZOO_OPEN_ACL_UNSAFE)
# The default value is "open" and it can be changed to "creator" if ZK security is enabled
#
# high-availability.zookeeper.client.acl: open
#==============================================================================
# Fault tolerance and checkpointing
#==============================================================================
# The backend that will be used to store operator state checkpoints if
# checkpointing is enabled.
#
# Supported backends are 'jobmanager', 'filesystem', 'rocksdb', or the
# <class-name-of-factory>.
#
# state.backend: filesystem
# Directory for checkpoints filesystem, when using any of the default bundled
# state backends.
#
# state.checkpoints.dir: hdfs://namenode-host:port/flink-checkpoints
# Default target directory for savepoints, optional.
#
# state.savepoints.dir: hdfs://namenode-host:port/flink-checkpoints
# Flag to enable/disable incremental checkpoints for backends that
# support incremental checkpoints (like the RocksDB state backend).
#
# state.backend.incremental: false
#==============================================================================
# Rest & web frontend
#==============================================================================
# The port to which the REST client connects to. If rest.bind-port has
# not been specified, then the server will bind to this port as well.
#
#rest.port: 8081
# The address to which the REST client will connect to
#
#rest.address: 0.0.0.0
# Port range for the REST and web server to bind to.
#
#rest.bind-port: 8080-8090
# The address that the REST & web server binds to
#
#rest.bind-address: 0.0.0.0
# Flag to specify whether job submission is enabled from the web-based
# runtime monitor. Uncomment to disable.
#web.submit.enable: false
#==============================================================================
# Advanced
#==============================================================================
# Override the directories for temporary files. If not specified, the
# system-specific Java temporary directory (java.io.tmpdir property) is taken.
#
# For framework setups on Yarn or Mesos, Flink will automatically pick up the
# containers' temp directories without any need for configuration.
#
# Add a delimited list for multiple directories, using the system directory
# delimiter (colon ':' on unix) or a comma, e.g.:
# /data1/tmp:/data2/tmp:/data3/tmp
#
# Note: Each directory entry is read from and written to by a different I/O
# thread. You can include the same directory multiple times in order to create
# multiple I/O threads against that directory. This is for example relevant for
# high-throughput RAIDs.
#
# io.tmp.dirs: /tmp
# Specify whether TaskManager's managed memory should be allocated when starting
# up (true) or when memory is requested.
#
# We recommend to set this value to 'true' only in setups for pure batch
# processing (DataSet API). Streaming setups currently do not use the TaskManager's
# managed memory: The 'rocksdb' state backend uses RocksDB's own memory management,
# while the 'memory' and 'filesystem' backends explicitly keep data as objects
# to save on serialization cost.
#
# taskmanager.memory.preallocate: false
# The classloading resolve order. Possible values are 'child-first' (Flink's default)
# and 'parent-first' (Java's default).
#
# Child first classloading allows users to use different dependency/library
# versions in their application than those in the classpath. Switching back
# to 'parent-first' may help with debugging dependency issues.
#
# classloader.resolve-order: child-first
# The amount of memory going to the network stack. These numbers usually need
# no tuning. Adjusting them may be necessary in case of an "Insufficient number
# of network buffers" error. The default min is 64MB, the default max is 1GB.
#
# taskmanager.network.memory.fraction: 0.1
# taskmanager.network.memory.min: 64mb
# taskmanager.network.memory.max: 1gb
#==============================================================================
# Flink Cluster Security Configuration
#==============================================================================
# Kerberos authentication for various components - Hadoop, ZooKeeper, and connectors -
# may be enabled in four steps:
# 1. configure the local krb5.conf file
# 2. provide Kerberos credentials (either a keytab or a ticket cache w/ kinit)
# 3. make the credentials available to various JAAS login contexts
# 4. configure the connector to use JAAS/SASL
# The below configure how Kerberos credentials are provided. A keytab will be used instead of
# a ticket cache if the keytab path and principal are set.
# security.kerberos.login.use-ticket-cache: true
# security.kerberos.login.keytab: /path/to/kerberos/keytab
# security.kerberos.login.principal: flink-user
# The configuration below defines which JAAS login contexts
# security.kerberos.login.contexts: Client,KafkaClient
#==============================================================================
# ZK Security Configuration
#==============================================================================
# Below configurations are applicable if ZK ensemble is configured for security
# Override below configuration to provide custom ZK service name if configured
# zookeeper.sasl.service-name: zookeeper
# The configuration below must match one of the values set in "security.kerberos.login.contexts"
# zookeeper.sasl.login-context-name: Client
#==============================================================================
# HistoryServer
#==============================================================================
# The HistoryServer is started and stopped via bin/historyserver.sh (start|stop)
# Directory to upload completed jobs to. Add this directory to the list of
# monitored directories of the HistoryServer as well (see below).
#jobmanager.archive.fs.dir: hdfs:///completed-jobs/
# The address under which the web-based HistoryServer listens.
#historyserver.web.address: 0.0.0.0
# The port under which the web-based HistoryServer listens.
#historyserver.web.port: 8082
# Comma separated list of directories to monitor for completed jobs.
#historyserver.archive.fs.dir: hdfs:///completed-jobs/
# Interval in milliseconds for refreshing the monitored directories.
#historyserver.archive.fs.refresh-interval: 10000
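
Most of the configuration file above is commented out. To see which settings are actually in effect, the uncommented `key: value` lines can be extracted with a short script. This is only a sketch: `active_settings` is a hypothetical helper, the sample text is a shortened excerpt of the file above, and it assumes the flat `key: value` layout used there.

```python
def active_settings(text):
    """Return {key: value} for uncommented 'key: value' lines of a flat config."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not 'key: value'.
        if not line or line.startswith("#") or ": " not in line:
            continue
        key, value = line.split(": ", 1)
        settings[key.strip()] = value.strip()
    return settings


# Shortened excerpt of the attached configuration (assumption: same layout).
sample = """\
jobmanager.rpc.address: localhost
jobmanager.rpc.port: 6123
jobmanager.heap.size: 25000m
# The heap size for the TaskManager JVM
taskmanager.heap.size: 25000m
taskmanager.numberOfTaskSlots: 1
parallelism.default: 1
# state.backend: filesystem
#rest.port: 8081
"""

for key, value in active_settings(sample).items():
    print(f"{key} = {value}")
```

For the attached file this leaves six active keys, which matches the six "Loading configuration property" entries at the start of the log.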
2019-10-28 15:26:20,900 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --------------------------------------------------------------------------------
2019-10-28 15:26:20,903 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting StandaloneSessionClusterEntrypoint (Version: 1.8.2, Rev:6322618, Date:04.09.2019 @ 22:07:41 CST)
2019-10-28 15:26:20,904 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - OS current user: xxx
2019-10-28 15:26:20,905 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Current Hadoop/Kerberos user: <no hadoop dependency found>
2019-10-28 15:26:20,905 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM: OpenJDK 64-Bit Server VM - AdoptOpenJDK - 1.8/25.232-b09
2019-10-28 15:26:20,906 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Maximum heap size: 23958 MiBytes
2019-10-28 15:26:20,906 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JAVA_HOME: (not set)
2019-10-28 15:26:20,907 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - No Hadoop Dependency available
2019-10-28 15:26:20,907 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM Options:
2019-10-28 15:26:20,907 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xms25000m
2019-10-28 15:26:20,908 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xmx25000m
2019-10-28 15:26:20,908 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dlog.file=/home/xxx/flink-1.8.2/log/flink-xxx-standalonesession-0-xxx.log
2019-10-28 15:26:20,908 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dlog4j.configuration=file:/home/xxx/flink-1.8.2/conf/log4j.properties
2019-10-28 15:26:20,909 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dlogback.configurationFile=file:/home/xxx/flink-1.8.2/conf/logback.xml
2019-10-28 15:26:20,909 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Program Arguments:
2019-10-28 15:26:20,909 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --configDir
2019-10-28 15:26:20,910 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - /home/xxx/flink-1.8.2/conf
2019-10-28 15:26:20,910 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --executionMode
2019-10-28 15:26:20,910 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - cluster
2019-10-28 15:26:20,911 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Classpath: /home/xxx/flink-1.8.2/lib/log4j-1.2.17.jar:/home/xxx/flink-1.8.2/lib/slf4j-log4j12-1.7.15.jar:/home/xxx/flink-1.8.2/lib/flink-dist_2.12-1.8.2.jar:::
2019-10-28 15:26:20,911 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --------------------------------------------------------------------------------
2019-10-28 15:26:20,915 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Registered UNIX signal handlers for [TERM, HUP, INT]
2019-10-28 15:26:20,960 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, localhost
2019-10-28 15:26:20,961 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2019-10-28 15:26:20,961 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.size, 25000m
2019-10-28 15:26:20,962 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.size, 25000m
2019-10-28 15:26:20,963 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2019-10-28 15:26:20,963 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
2019-10-28 15:26:21,210 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting StandaloneSessionClusterEntrypoint.
2019-10-28 15:26:21,210 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install default filesystem.
2019-10-28 15:26:21,224 INFO org.apache.flink.core.fs.FileSystem - Hadoop is not in the classpath/dependencies. The extended set of supported File Systems via Hadoop is not available.
2019-10-28 15:26:21,244 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install security context.
2019-10-28 15:26:21,267 INFO org.apache.flink.runtime.security.modules.HadoopModuleFactory - Cannot create Hadoop Security Module because Hadoop cannot be found in the Classpath.
2019-10-28 15:26:21,294 INFO org.apache.flink.runtime.security.SecurityUtils - Cannot install HadoopSecurityContext because Hadoop cannot be found in the Classpath.
2019-10-28 15:26:21,295 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Initializing cluster services.
2019-10-28 15:26:22,487 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Trying to start actor system at localhost:6123
2019-10-28 15:26:23,908 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2019-10-28 15:26:24,117 INFO akka.remote.Remoting - Starting remoting
2019-10-28 15:26:24,599 INFO akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://flink@localhost:6123]
2019-10-28 15:26:24,656 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Actor system started at akka.tcp://flink@localhost:6123
2019-10-28 15:26:24,699 INFO org.apache.flink.configuration.Configuration - Config uses fallback configuration key 'jobmanager.rpc.address' instead of key 'rest.address'
2019-10-28 15:26:24,719 INFO org.apache.flink.runtime.blob.BlobServer - Created BLOB server storage directory /tmp/blobStore-019a456e-6b81-4c27-a845-a97f85958620
2019-10-28 15:26:24,726 INFO org.apache.flink.runtime.blob.BlobServer - Started BLOB server at 0.0.0.0:39917 - max concurrent requests: 50 - max backlog: 1000
2019-10-28 15:26:24,761 INFO org.apache.flink.runtime.metrics.MetricRegistryImpl - No metrics reporter configured, no metrics will be exposed/reported.
2019-10-28 15:26:24,765 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Trying to start actor system at localhost:0
2019-10-28 15:26:24,829 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2019-10-28 15:26:24,859 INFO akka.remote.Remoting - Starting remoting
2019-10-28 15:26:24,903 INFO akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://flink-metrics@localhost:42399]
2019-10-28 15:26:24,909 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Actor system started at akka.tcp://flink-metrics@localhost:42399
2019-10-28 15:26:24,922 INFO org.apache.flink.runtime.dispatcher.FileArchivedExecutionGraphStore - Initializing FileArchivedExecutionGraphStore: Storage directory /tmp/executionGraphStore-c401b6f3-3747-4a3f-a81c-1578d747230b, expiration time 3600000, maximum cache size 52428800 bytes.
2019-10-28 15:26:24,990 INFO org.apache.flink.runtime.blob.TransientBlobCache - Created BLOB cache storage directory /tmp/blobStore-40b09dc7-6ba2-4bdf-a5c6-f8d9586ab180
2019-10-28 15:26:25,039 INFO org.apache.flink.configuration.Configuration - Config uses fallback configuration key 'jobmanager.rpc.address' instead of key 'rest.address'
2019-10-28 15:26:25,041 WARN org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Upload directory /tmp/flink-web-d6048716-73f8-4f20-9b45-3560e90a356e/flink-web-upload does not exist, or has been deleted externally. Previously uploaded files are no longer available.
2019-10-28 15:26:25,043 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Created directory /tmp/flink-web-d6048716-73f8-4f20-9b45-3560e90a356e/flink-web-upload for file uploads.
2019-10-28 15:26:25,046 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Starting rest endpoint.
2019-10-28 15:26:25,586 INFO org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined location of main cluster component log file: /home/xxx/flink-1.8.2/log/flink-xxx-standalonesession-0-xxx.log
2019-10-28 15:26:25,587 INFO org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined location of main cluster component stdout file: /home/xxx/flink-1.8.2/log/flink-xxx-standalonesession-0-xxx.out
2019-10-28 15:26:25,961 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Rest endpoint listening at localhost:8081
2019-10-28 15:26:25,964 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - http://localhost:8081 was granted leadership with leaderSessionID=00000000-0000-0000-0000-000000000000
2019-10-28 15:26:25,964 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Web frontend listening at http://localhost:8081.
2019-10-28 15:26:26,127 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.resourcemanager.StandaloneResourceManager at akka://flink/user/resourcemanager .
2019-10-28 15:26:26,174 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.dispatcher.StandaloneDispatcher at akka://flink/user/dispatcher .
2019-10-28 15:26:26,223 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - ResourceManager akka.tcp://flink@localhost:6123/user/resourcemanager was granted leadership with fencing token 00000000000000000000000000000000
2019-10-28 15:26:26,233 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Starting the SlotManager.
2019-10-28 15:26:26,245 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Dispatcher akka.tcp://flink@localhost:6123/user/dispatcher was granted leadership with fencing token 00000000-0000-0000-0000-000000000000
2019-10-28 15:26:26,252 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Recovering all persisted jobs.
2019-10-28 15:26:30,918 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Registering TaskManager with ResourceID 773f4482afa0b68e1027625563e9d382 (akka.tcp://flink@130.149.221.178:39095/user/taskmanager_0) at ResourceManager
2019-10-28 15:26:30,960 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Registering TaskManager with ResourceID 773f4482afa0b68e1027625563e9d382 (akka.tcp://flink@130.149.221.178:39095/user/taskmanager_0) at ResourceManager
2019-10-28 15:26:40,162 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Received JobGraph submission 1baba702980285493001f69a9531e507 (Streaming WordCount).
2019-10-28 15:26:40,163 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Submitting job 1baba702980285493001f69a9531e507 (Streaming WordCount).
2019-10-28 15:26:40,204 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.jobmaster.JobMaster at akka://flink/user/jobmanager_0 .
2019-10-28 15:26:40,226 INFO org.apache.flink.runtime.jobmaster.JobMaster - Initializing job Streaming WordCount (1baba702980285493001f69a9531e507).
2019-10-28 15:26:40,253 INFO org.apache.flink.runtime.jobmaster.JobMaster - Using restart strategy NoRestartStrategy for Streaming WordCount (1baba702980285493001f69a9531e507).
2019-10-28 15:26:40,296 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job recovers via failover strategy: full graph restart
2019-10-28 15:26:40,366 INFO org.apache.flink.runtime.jobmaster.JobMaster - Running initialization on master for job Streaming WordCount (1baba702980285493001f69a9531e507).
2019-10-28 15:26:40,367 INFO org.apache.flink.runtime.jobmaster.JobMaster - Successfully ran initialization on master in 0 ms.
2019-10-28 15:26:40,416 INFO org.apache.flink.runtime.jobmaster.JobMaster - No state backend has been configured, using default (Memory / JobManager) MemoryStateBackend (data in heap memory / checkpoints to JobManager) (checkpoints: 'null', savepoints: 'null', asynchronous: TRUE, maxStateSize: 5242880)
2019-10-28 15:26:40,439 INFO org.apache.flink.runtime.jobmaster.JobManagerRunner - JobManager runner for job Streaming WordCount (1baba702980285493001f69a9531e507) was granted leadership with session id 00000000-0000-0000-0000-000000000000 at akka.tcp://flink@localhost:6123/user/jobmanager_0.
2019-10-28 15:26:40,446 INFO org.apache.flink.runtime.jobmaster.JobMaster - Starting execution of job Streaming WordCount (1baba702980285493001f69a9531e507) under job master id 00000000000000000000000000000000.
2019-10-28 15:26:40,449 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job Streaming WordCount (1baba702980285493001f69a9531e507) switched from state CREATED to RUNNING.
2019-10-28 15:26:40,457 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom File Source (1/1) (b634190f95d13e90edb6b5c72e10aa92) switched from CREATED to SCHEDULED.
2019-10-28 15:26:40,485 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Cannot serve slot request, no ResourceManager connected. Adding as pending request [SlotRequestId{0bd9ac89763c59b40e9d2d0b6f1b65cd}]
2019-10-28 15:26:40,499 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Split Reader: Custom File Source -> Flat Map (1/1) (354e2c664884e3fb95e61bedebf58192) switched from CREATED to SCHEDULED.
2019-10-28 15:26:40,501 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Keyed Aggregation -> Sink: Print to Std. Out (1/1) (a9a0bcd06a8b162edc353479781da143) switched from CREATED to SCHEDULED.
2019-10-28 15:26:40,506 INFO org.apache.flink.runtime.jobmaster.JobMaster - Connecting to ResourceManager akka.tcp://flink@localhost:6123/user/resourcemanager(00000000000000000000000000000000)
2019-10-28 15:26:40,514 INFO org.apache.flink.runtime.jobmaster.JobMaster - Resolved ResourceManager address, beginning registration
2019-10-28 15:26:40,515 INFO org.apache.flink.runtime.jobmaster.JobMaster - Registration at ResourceManager attempt 1 (timeout=100ms)
2019-10-28 15:26:40,518 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Registering job manager 00000000000000000000000000000...@akka.tcp://flink@localhost:6123/user/jobmanager_0 for job 1baba702980285493001f69a9531e507.
2019-10-28 15:26:40,529 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Registered job manager 00000000000000000000000000000...@akka.tcp://flink@localhost:6123/user/jobmanager_0 for job 1baba702980285493001f69a9531e507.
2019-10-28 15:26:40,534 INFO org.apache.flink.runtime.jobmaster.JobMaster - JobManager successfully registered at ResourceManager, leader id: 00000000000000000000000000000000.
2019-10-28 15:26:40,535 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Requesting new slot [SlotRequestId{0bd9ac89763c59b40e9d2d0b6f1b65cd}] and profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} from resource manager.
2019-10-28 15:26:40,538 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Request slot with profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} for job 1baba702980285493001f69a9531e507 with allocation id 2e43a4446ce8f27f110f806c8716d18c.
2019-10-28 15:26:40,730 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom File Source (1/1) (b634190f95d13e90edb6b5c72e10aa92) switched from SCHEDULED to DEPLOYING.
2019-10-28 15:26:40,731 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Source: Custom File Source (1/1) (attempt #0) to 773f4482afa0b68e1027625563e9d382 @ xxx.inet.tu-berlin.de (dataPort=39277)
2019-10-28 15:26:40,741 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Split Reader: Custom File Source -> Flat Map (1/1) (354e2c664884e3fb95e61bedebf58192) switched from SCHEDULED to DEPLOYING.
2019-10-28 15:26:40,742 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Split Reader: Custom File Source -> Flat Map (1/1) (attempt #0) to 773f4482afa0b68e1027625563e9d382 @ xxx.inet.tu-berlin.de (dataPort=39277)
2019-10-28 15:26:40,750 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Keyed Aggregation -> Sink: Print to Std. Out (1/1) (a9a0bcd06a8b162edc353479781da143) switched from SCHEDULED to DEPLOYING.
2019-10-28 15:26:40,750 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Deploying Keyed Aggregation -> Sink: Print to Std. Out (1/1) (attempt #0) to 773f4482afa0b68e1027625563e9d382 @ xxx.inet.tu-berlin.de (dataPort=39277)
2019-10-28 15:26:41,008 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Keyed Aggregation -> Sink: Print to Std. Out (1/1) (a9a0bcd06a8b162edc353479781da143) switched from DEPLOYING to RUNNING.
2019-10-28 15:26:41,010 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom File Source (1/1) (b634190f95d13e90edb6b5c72e10aa92) switched from DEPLOYING to RUNNING.
2019-10-28 15:26:41,039 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Split Reader: Custom File Source -> Flat Map (1/1) (354e2c664884e3fb95e61bedebf58192) switched from DEPLOYING to RUNNING.
2019-10-28 15:26:41,804 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom File Source (1/1) (b634190f95d13e90edb6b5c72e10aa92) switched from RUNNING to FINISHED.
2019-10-28 17:19:44,538 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Split Reader: Custom File Source -> Flat Map (1/1) (354e2c664884e3fb95e61bedebf58192) switched from RUNNING to FINISHED.
2019-10-28 17:19:44,967 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Keyed Aggregation -> Sink: Print to Std. Out (1/1) (a9a0bcd06a8b162edc353479781da143) switched from RUNNING to FINISHED.
2019-10-28 17:19:44,971 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job Streaming WordCount (1baba702980285493001f69a9531e507) switched from state RUNNING to FINISHED.
2019-10-28 17:19:44,972 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Stopping checkpoint coordinator for job 1baba702980285493001f69a9531e507.
2019-10-28 17:19:44,972 INFO org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore - Shutting down
2019-10-28 17:19:44,992 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Job 1baba702980285493001f69a9531e507 reached globally terminal state FINISHED.
2019-10-28 17:19:45,035 INFO org.apache.flink.runtime.jobmaster.JobMaster - Stopping the JobMaster for job Streaming WordCount(1baba702980285493001f69a9531e507).
2019-10-28 17:19:45,051 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Suspending SlotPool.
2019-10-28 17:19:45,053 INFO org.apache.flink.runtime.jobmaster.JobMaster - Close ResourceManager connection 77033e10cf42bad0f219e67e03d5ff09: JobManager is shutting down..
2019-10-28 17:19:45,053 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Stopping SlotPool.
2019-10-28 17:19:45,058 INFO org.apache.flink.runtime.jobmaster.JobManagerRunner - JobManagerRunner already shutdown.
2019-10-28 17:19:45,059 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Disconnect job manager 00000000000000000000000000000...@akka.tcp://flink@localhost:6123/user/jobmanager_0 for job 1baba702980285493001f69a9531e507 from the resource manager.
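
The job's wall-clock time can be read off the two ExecutionGraph entries in the log above where the job switches to RUNNING and then to FINISHED. The sketch below does that arithmetic; the two timestamps are copied from the log, while the `elapsed` helper and variable names are hypothetical.

```python
from datetime import datetime, timedelta

# Timestamp layout used by the log above: date, time, comma, milliseconds.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed(start, end):
    """Return the time between two log timestamps as a timedelta."""
    return datetime.strptime(end, LOG_TIME_FORMAT) - datetime.strptime(
        start, LOG_TIME_FORMAT
    )


# Copied from the "switched from state CREATED to RUNNING" and
# "switched from state RUNNING to FINISHED" entries for the job.
job_running = "2019-10-28 15:26:40,449"
job_finished = "2019-10-28 17:19:44,971"

job_duration = elapsed(job_running, job_finished)
print(job_duration)
```

This gives about 1 hour 53 minutes, consistent with the "around two hours" reported for the single-TaskManager run.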