I have an improvement suggestion; could you please check whether it is valid? To check whether a system is adequately reliable, testers often deliberately delete files.

Steps:
1. Go to "flink\build-target\log".
2. Delete the "flink-xx-jobmanager-linux-3lsu.log" file.
3. Run jobs that write log output. The system does not report any error, even though the log output can no longer be written correctly.
4. When a job fails, go to the log directory to find the reason. The log file cannot be found.

The only way to regenerate the log file is to restart the Job Manager, and only then can jobs continue to run with logging.
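
To illustrate the kind of check I have in mind, here is a minimal sketch in Java. It is not Flink code: the class name LogFileCheck, the method ensureLogFile, and the demo file name are all hypothetical, and I am assuming the check could be called periodically or before a job is submitted. It only detects the missing file, recreates it, and prints a warning. A real fix would probably also need the logging appender to reopen its output stream, since a handle that is already open keeps pointing at the deleted file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LogFileCheck {

    /**
     * Hypothetical helper, not part of Flink.
     * Returns true if the log file was missing and had to be recreated,
     * false if it was already present.
     */
    static boolean ensureLogFile(Path logFile) throws IOException {
        if (Files.exists(logFile)) {
            return false;
        }
        // Recreate the directory as well, in case it was deleted too.
        Path parent = logFile.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        Files.createFile(logFile);
        // Surface the problem instead of failing silently.
        System.err.println("WARN: log file was missing and has been recreated: " + logFile);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Demo in a temporary directory; the file name is made up.
        Path dir = Files.createTempDirectory("flink-log-demo");
        Path log = dir.resolve("jobmanager-demo.log");

        ensureLogFile(log);   // creates the file the first time
        Files.delete(log);    // simulate the tester deleting the log file
        boolean recreated = ensureLogFile(log);
        System.out.println("recreated after delete: " + recreated);
    }
}
```

With something along these lines, the deletion in step 2 would at least produce a visible warning, and the file would exist again when someone looks for it in step 4.
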
Regards,
Liang