I am coming from Apache Storm world. I am planning to switch from storm to flink. I was reading Flink documentation but, I couldn't find some requirements in Flink which was present in Storm.
I need to have a streaming pipeline Kafka->flink-> ElasticSearch. In storm, I have seen that I can specify number of tasks per bolt. Typically databases are slow in writes and hence I need more writers to the database. Reading from kafka is pretty fast when compared to ES writes. This means that I need to have more ES writer tasks than Kafka consumers. How can I achieve it in Flink? What are the concepts in Flink similar to Storm Parallelism concepts like workers, executors, tasks? I saw the implementation of elasticsearch sink in Flink which can do batching of messsges before writes. How can I batch data based on a custom logic? For eg: batch writes grouped on one of the message keys. This is possible in Storm via FieldGrouping. But I couldn't find an equivalent way to do grouping in Flink and control the overall number of writes to ES. Please help me with above questions and some pointers to flink parallelism.