Dear Flink community,

First, I need to provide some minimal information about my deployment scenario:
I'm running the application inside a Flink Docker container; below is the original Dockerfile:
-----------------------------------------------------------------------------------------------------------

FROM flink:1.13.0-scala_2.11-java11

# Copy log and monitoring related JARs to flink lib dir
COPY kafka-clients-2.4.1.jar /opt/flink/lib/

RUN chmod 777 /tmp
RUN apt-get update && apt-get install -y htop

# configuration files
COPY Log4cpp.properties /opt/flink/
COPY Log4j.properties /opt/flink/conf/log4j.properties
COPY SessionOrganizer.json /opt/flink/


COPY flink-conf.yaml /opt/flink/conf/
COPY slaves /opt/flink/conf/
# job file
COPY KafkaToSessions-shade.jar /opt/flink/lib/

# libraries
ADD libs /usr/local/lib/

# Add /usr/local/lib to ldconfig
RUN echo "/usr/local/lib/" > /etc/ld.so.conf.d/ips.conf && \
    ldconfig && \
    ulimit -c 0

RUN mkdir /opt/flink/ip-collection/ && \
    mkdir /opt/flink/checkpoints/ && \
    mkdir /opt/flink/ip-collection/incorrectIcs && \
    mkdir /opt/flink/ip-collection/storage && \
    mkdir /opt/flink/ip-collection/logs

CMD /opt/flink/bin/start-cluster.sh && /opt/flink/bin/flink run /opt/flink/lib/KafkaToSessions-shade.jar

-------------------------------------------------------------------------------------------------------------------------------
If we ignore the irrelevant parts of the Dockerfile, only two things remain
(besides the FROM statement):
1. The overwritten flink-conf.yaml + slaves files.
2. The CMD, which executes start-cluster.sh and then submits the job.

My flink-conf.yaml:
---------------------------------------------------------------------------------------------------------

rest.address: localhost
rest.port: 8088
state.backend: filesystem
state.checkpoints.dir: file:///opt/flink/checkpoints
jobmanager.memory.process.size: 2224m
jobmanager.rpc.port: 6123
jobmanager.rpc.address: localhost
taskmanager.memory.flink.size: 2224m
taskmanager.memory.task.heap.size: 1000m
taskmanager.numberOfTaskSlots: 12
taskmanager.rpc.port: 50100
taskmanager.data.port: 50000
parallelism.default: 6
heartbeat.timeout: 120000
heartbeat.interval: 20000
env.java.opts: "-XX:+UseG1GC -XX:MaxGCPauseMillis=300"

---------------------------------------------------------------------------------------------------------
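To make the symptom below concrete, this is roughly how I check which value actually ends up in the config after startup. The heredoc file just stands in for the container's flink-conf.yaml so the snippet is self-contained; against the real container it would be a `docker exec <container> grep ...` (the container name is whatever yours is):

```shell
#!/bin/sh
# Stand-in for /opt/flink/conf/flink-conf.yaml inside the container.
conf=$(mktemp)
cat > "$conf" <<'EOF'
taskmanager.numberOfTaskSlots: 12
parallelism.default: 6
EOF

# Against a running container this would be:
#   docker exec <container> grep numberOfTaskSlots /opt/flink/conf/flink-conf.yaml
slots=$(awk -F': ' '/taskmanager.numberOfTaskSlots/ {print $2}' "$conf")
echo "effective slots: $slots"

rm -f "$conf"
```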

The slaves file contains a single line: localhost.
After starting the container, I noticed that the application doesn't work due
to a lack of slots. When I checked flink-conf.yaml, I saw that
taskmanager.numberOfTaskSlots had been reset to 1.
P.S. The first time, the daemon scripts complained that they didn't have write
permission to change flink-conf.yaml; after I added
chown flink:flink /opt/flink/conf/flink-conf.yaml,
they stopped complaining and the taskmanager.numberOfTaskSlots change occurred.

Any suggestions?
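For what it's worth, a sketch of the workaround I'm considering for the permission complaint: set ownership at copy time instead of a separate chown, assuming the `flink` user/group from the official base image (COPY --chown needs Docker 17.09+):

```dockerfile
# Give the flink user ownership of the overwritten configs at copy time,
# so the daemon scripts can rewrite them without a manual chown.
COPY --chown=flink:flink flink-conf.yaml /opt/flink/conf/
COPY --chown=flink:flink slaves /opt/flink/conf/
```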

Best regards,
Alexander
