I am writing a SystemProducer and I would be grateful for any comments or documentation.
I want my SystemProducer to collect messages until flush is invoked. That is, I don't want to transmit each message 'one by one'. I want to wait until flush is invoked and then transmit 'in bulk'. I don't know how to do this without violating the 'at least once' guarantee. I am concerned that any message that is 'sent' from my StreamTask will be included in a checkpoint. The message might not have been transmitted, however. The message may reside in my SystemProducer pending flush. If the StreamTask is re-started from the checkpoint then the message will not be replayed. [The task input stream is a Kafka topic] How do I co-ordinate the checkpoint mechanism with SystemProducer so that the checkpoint is delayed until the message is flushed?