I'm trying to figure out the best design for this in Flink. I'm reading from a Kafka topic whose messages contain pointers to files to be processed. My idea is to somehow kick off a batch job per file, unless there is an easy way to get a separate dataset per file. I can do almost all of this in the stream: parse each file with a flatMap, explode its contents into multiple data elements, map them, and so on. One of these steps would be to grab another dataset from a JDBC source and join it with the stream's contents. I think I am mixing the two concepts here, and the right approach might be to kick off a mini batch job per file, where I'd have the file dataset plus the JDBC dataset to join with.
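For concreteness, here is a rough sketch of the all-streaming variant I have in mind. Topic names, connection strings, the record format, and the JDBC loading strategy (loading the side table once in open()) are all placeholders, not a finished design:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.HashMap;
import java.util.Map;

public class FilePointerPipeline {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Kafka messages are assumed to be plain file paths.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")           // placeholder
                .setTopics("file-pointers")                  // placeholder topic
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> paths =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "file-pointers");

        // Explode each file into one element per line, then enrich from JDBC.
        paths.flatMap(new FileExploder())
             .flatMap(new JdbcEnricher())
             .print();

        env.execute("file-pointer-pipeline");
    }

    /** Reads the pointed-to file and emits its lines as individual records. */
    public static class FileExploder extends RichFlatMapFunction<String, String> {
        @Override
        public void flatMap(String path, Collector<String> out) throws Exception {
            for (String line : Files.readAllLines(Paths.get(path))) {
                out.collect(line);
            }
        }
    }

    /** Loads the JDBC side table once in open() and joins against it per record. */
    public static class JdbcEnricher extends RichFlatMapFunction<String, String> {
        private transient Map<String, String> sideTable;

        @Override
        public void open(Configuration parameters) throws Exception {
            sideTable = new HashMap<>();
            try (Connection conn = DriverManager.getConnection("jdbc:postgresql://db/example"); // placeholder URL
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT k, v FROM lookup")) {                 // placeholder query
                while (rs.next()) {
                    sideTable.put(rs.getString("k"), rs.getString("v"));
                }
            }
        }

        @Override
        public void flatMap(String record, Collector<String> out) {
            String key = record.split(",")[0]; // assume the first CSV field is the join key
            String enriched = sideTable.get(key);
            if (enriched != null) {
                out.collect(record + "," + enriched);
            }
        }
    }
}
```

This works as a stream, but the JDBC join feels like it really wants to be a batch-style dataset join per file, which is what leads me to the question below.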
So how would I go about kicking off a batch job from within a streaming job? Or is there a better way to structure this?

Thanks,
Tomas