Hi, the data we write to Kafka is transactional in nature, so I am trying to understand exactly-once semantics (EOS) better. My present understanding is as follows:
I have read that when a producer starts it is assigned a new PID, which is valid only for that session. Does that mean a single producer session is a prerequisite for exactly-once guarantees? I presume it is not. As I understand it, this is where transactional.id comes into the picture: it is user-defined and can therefore survive producer restarts.

I have a few questions about this:

1. If the above is correct, why do we need the PID in the first place, rather than using transactional.id everywhere?
2. I understand that the sequence number is generated by the producer and increases monotonically. Does that mean the sequence number changes across producer restarts, along with the new PID?
3. Is the PID meant mainly for idempotence, whereas transactional.id is for transactional support?
4. On the consumer side, only one config parameter is defined, namely isolation.level. I presume that for EOS it must be set to 'read_committed' and never to 'read_uncommitted'. Is that correct?
5. What is the impact of setting 'enable.idempotence' to true without setting 'transactional.id' on the producer side? Does it have any (side) effect?
6. How does EOS work for compacted topics? Is the EOS behaviour any different for them?
7. How does EOS work when a transaction is written across two different log segments?

Can anyone please help me understand the nuances of the EOS guarantees?

Thanks
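P.S. To make questions 4 and 5 concrete, here is a minimal sketch of the configuration I have in mind. It only builds the property sets and does not connect to a broker. The config keys are the ones named above plus bootstrap.servers, acks and group.id. The class name, the broker address, the group id and the transactional id value are placeholders I made up for illustration.

```java
import java.util.Properties;

public class EosConfigSketch {

    // Producer settings. Pass null for transactionalId to get the
    // "idempotence only" case from question 5.
    public static Properties producerProps(String transactionalId) {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        p.setProperty("enable.idempotence", "true");
        p.setProperty("acks", "all"); // idempotence requires acks=all
        if (transactionalId != null) {
            // User-defined id, intended to stay the same across producer restarts.
            p.setProperty("transactional.id", transactionalId);
        }
        return p;
    }

    // Consumer settings for question 4: only read committed transactional data.
    public static Properties consumerProps() {
        Properties c = new Properties();
        c.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        c.setProperty("group.id", "eos-demo-group");          // placeholder group
        c.setProperty("isolation.level", "read_committed");
        return c;
    }

    public static void main(String[] args) {
        Properties transactional = producerProps("orders-writer-1"); // placeholder id
        Properties idempotentOnly = producerProps(null);
        System.out.println("transactional.id (transactional): "
                + transactional.getProperty("transactional.id"));
        System.out.println("transactional.id (idempotent only): "
                + idempotentOnly.getProperty("transactional.id"));
        System.out.println("isolation.level: "
                + consumerProps().getProperty("isolation.level"));
    }
}
```

With a real client these properties would be passed to the KafkaProducer and KafkaConsumer constructors, together with the usual serializer and deserializer settings, which I have left out here.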