Hello team,
I'm configuring a Flink job that reads files from a source directory and
writes their contents to a Kafka sink. I already have a working job that
streams the file contents and sends them to Kafka. My requirement goes a
step further, though: once all of a file's contents have been processed,
the job should move that file to a different directory and then move on
to the next file in the source directory. I'm also unsure how the job
would behave if it were restarted. Is this achievable, and if so, could
you provide guidance or sample code?
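
For context, here is roughly what the working part of the job looks like
today. This is a minimal sketch assuming Flink 1.15+ with the
flink-connector-files and flink-connector-kafka dependencies on the
classpath; the input directory, topic name, and broker address are
placeholders:

import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineInputFormat;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FileToKafkaJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Checkpointing so source progress and the Kafka sink state
        // survive a restart.
        env.enableCheckpointing(30_000);

        // Continuously watch the source directory for new files and
        // emit one record per line.
        FileSource<String> source = FileSource
                .forRecordStreamFormat(new TextLineInputFormat(),
                        new Path("/data/incoming"))        // placeholder path
                .monitorContinuously(Duration.ofSeconds(10))
                .build();

        KafkaSink<String> sink = KafkaSink.<String>builder()
                .setBootstrapServers("localhost:9092")     // placeholder brokers
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("file-contents")         // placeholder topic
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                .setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "file-source")
           .sinkTo(sink);

        env.execute("file-to-kafka");
    }
}

What I'm missing is the "move the file once it is fully processed" part,
and how that interacts with checkpoints and restarts.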

I'd also like to understand how the job behaves when the source runs with
a parallelism greater than 2 or 3, and how it can reliably detect that
all of the contents of a given file have been read.
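
To make the parallelism question concrete: in the sketch above I would
set it on the source operator like this (the value 3 is just an example):

// Continuing the sketch above: each parallel instance of the file
// source is assigned its own subset of the discovered files.
env.fromSource(source, WatermarkStrategy.noWatermarks(), "file-source")
   .setParallelism(3)
   .sinkTo(sink);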

Thanks and Regards,
Arjun
