I'm developing a Flink application and I'm still learning. For simplicity, most of the time I test by running the main method of the entry class as a regular Java application. Would that be running on what's called a mini cluster? I find it quite convenient, and it makes debugging a job really easy. My question is: if a job is small enough that it can be executed on a single machine, is there a performance penalty for running it this way versus starting a Flink instance in local mode, or a full-fledged Flink cluster? For jobs with a low workload, is there any downside to just running them like a regular Java application?
A side question: when running with the mini cluster, I watch the log messages, and the process seems to focus on one operator for a while, then switch to another. For example, my simple Flink application has a Kafka source, a windowed aggregator, and an Elasticsearch sink. I see a lot of SourceFunction log messages as records are pumped into the pipeline, then those logs stop for a while. I then see some AggregateFunction logs as the records come in, and SinkFunction logs after that when the window is up. After that, there can be a pause of 5-10 seconds or longer before SourceFunction logs show up again. Because of the windowing operation, I expect the SinkFunction to fire only once in a while, but I was expecting to see SourceFunction and AggregateFunction logs interleaved, showing that all operators are running at the same time, instead of a run of logs from the loop inside SourceFunction.run() followed by a run of logs from the AggregateFunction.add() method. Is this because I'm running it with the mini cluster, or is this how things are expected to work?

Thanks in advance,
Jack
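
P.S. To make the expectation concrete, here is a plain-Java sketch of what I thought I would see. This is not Flink code and says nothing about how Flink schedules operators. The class name InterleavingSketch, the runPipeline helper, and the bounded queue standing in for the hand-off between operators are all hypothetical, made up for illustration. A "source" thread and an "aggregator" thread run at the same time and write to a shared log.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for a two-stage pipeline; not Flink code.
public class InterleavingSketch {
    // Marker value telling the aggregator that the source is done.
    static final int POISON = -1;

    // Runs a source thread and an aggregator thread concurrently and
    // returns the log lines in the order they were written.
    static List<String> runPipeline(int records, int queueCapacity)
            throws InterruptedException {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(queueCapacity);

        Thread source = new Thread(() -> {
            try {
                for (int i = 0; i < records; i++) {
                    log.add("source emit " + i);
                    queue.put(i); // blocks while the queue is full
                }
                queue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread aggregator = new Thread(() -> {
            try {
                long sum = 0;
                while (true) {
                    int value = queue.take(); // blocks while the queue is empty
                    if (value == POISON) {
                        break;
                    }
                    sum += value;
                    log.add("aggregate add " + value);
                }
                // Plays the role of the sink firing once at the end.
                log.add("sink sum " + sum);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        source.start();
        aggregator.start();
        source.join();
        aggregator.join();
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        for (String line : runPipeline(10, 2)) {
            System.out.println(line);
        }
    }
}
```

In this sketch a small queue capacity keeps the two kinds of log lines close together, while a larger capacity lets the source run further ahead before the aggregator's lines appear. The exact ordering varies from run to run because it depends on thread scheduling.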