Several new questions:
- Stoppable job
I read threads mentioning that a streaming job can be stopped [1][2].
However, it looks like stop can only be invoked through the command
line. Is it possible to programmatically stop the streaming job from
within the job itself? For instance, when a Kafka consumer streaming
job reaches a predefined condition, can it call stop() from within,
e.g., a MapFunction?
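For reference, the only stop hook I could find is the StoppableFunction
interface from [2]; as far as I understand it is meant to be implemented
by the source and invoked externally (e.g. via the CLI "stop" command),
not from a downstream MapFunction. A simplified, untested sketch of what
I mean (class name and emitted records are placeholders):

import org.apache.flink.api.common.functions.StoppableFunction;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class MyStoppableSource implements SourceFunction<String>, StoppableFunction {

    private volatile boolean isRunning = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        while (isRunning) {
            // emit records here (placeholder)
            ctx.collect("record");
        }
    }

    @Override
    public void cancel() {
        // hard cancel, called when the job is cancelled
        isRunning = false;
    }

    @Override
    public void stop() {
        // graceful stop, called (as far as I understand) when the job is stopped
        isRunning = false;
    }
}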

- Web UI (jobmanager-host:8081) information
I have a Kafka consumer which reads records from Kafka. In the web UI's
Subtasks tab there is a "Records sent" column; does it reflect the
records read by the consumer? For instance, if I deliver, say, 1k string
records (SimpleStringSchema) to Kafka, can I expect 1k "Records sent"
to be displayed in the web UI once all those records have been read by
the consumer?
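For context, the source side of my job looks roughly like the following
(the connector version, topic name, and properties are placeholders,
not the exact values I use):

import java.util.Properties;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaReadJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "my-group");                // placeholder

        // read strings from Kafka and print them, just to exercise the pipeline
        env.addSource(new FlinkKafkaConsumer09<>(
                "my-topic", new SimpleStringSchema(), props))      // placeholder topic
           .print();

        env.execute("kafka read job");
    }
}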

This leads to another question. I have a streaming job which uses a map
function, e.g. stream.map(new MyMapFunction()). Within the MyMapFunction
implementation I count each input record and write the count to an
external store. Later on I sum the counts across the parallelism
supplied. So, for example, I run map(MyMapFunction) with a parallelism
of 4, the MyMapFunction instances process 400, 500, 400, and 500 records
respectively, and the sum of all counts is 1800. However, this sum
differs from the web UI, which shows a higher "Records sent" value,
e.g. 8k. Does that mean "Records sent" in the web UI is not the number
of records processed by MyMapFunction? How do I interpret the value in
this column, or how can I know whether all messages delivered to Kafka
were fully processed, i.e. 1k records delivered to Kafka and 1k records
read out of Kafka?
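To make the counting part concrete, MyMapFunction does roughly the
following (a simplified sketch; in the real job the per-subtask count is
written to an external store rather than printed, and the record type
may differ):

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class MyMapFunction extends RichMapFunction<String, String> {

    private transient long count;

    @Override
    public void open(Configuration parameters) {
        count = 0L;
    }

    @Override
    public String map(String value) {
        count++;       // one increment per record this subtask sees
        return value;  // pass the record through unchanged
    }

    @Override
    public void close() {
        // placeholder: the real job writes this per-subtask count to an
        // external place, then the 4 subtask counts are summed afterwards
        System.out.println("subtask count = " + count);
    }
}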

Thanks.

[1] http://mail-archives.apache.org/mod_mbox/flink-user/201604.mbox/%3c57155c30.8010...@apache.org%3E
[2] https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/functions/StoppableFunction.html
