Hi, firstly excuse me for the long post. I already read the documentation about parallelism, slots and the API about it but I still have some doubts about practical implementations of them. My program is composed essentially by three operations:
- get data from a kafka source - perform a machine learning operator on the retrieved stream - push out data to a cassandra sink I'd like to investigate and trying implement them in two different situations: 1) FIRST ONE Imagine I have a single dual core physical node and suppose I set NumberOfTaskSlot = NumberOfCore (As suggested by the doc). <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t985/tonabble.png> I suppose I can divide in a fixed way the operations into slots as described in the figure. Is this possible? Can I do that using slotSharingGroup(groupname) method ? Or have I to use startNewChain() between the operator? Example: *DataStream<MyEvent> stream = env .addSource(new FlinkKafkaConsumer010<>(TOPIC, new CustomDeserializer(), properties)) .assignTimestampsAndWatermarks(new CustomTimestampExtractor()) .map(...) .slotSharingGroup("source");* or *DataStream<MyEvent> stream = env .addSource(new FlinkKafkaConsumer010<>(TOPIC, new CustomDeserializer(), properties)) .startNewChain() .assignTimestampsAndWatermarks(new CustomTimestampExtractor()) .startNewChain() .map(...); * 2) SECOND ONE Imagine I have 3 dual core physical nodes. <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t985/tonabble2.png> I suppose I can reserve one physical NODE for each operation. Is this possible? In this case honestly I don't know how to implement that at level code. Moreover, I don't know if it would has sense set NumberTaskSlot = NumberOfCores or to leave this option to Flink's choice. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/