Done. Feel free to extend/correct/complete etc. -Matthias
On 2/20/19 9:56 AM, Guozhang Wang wrote:
> Since we've seen quite a lot of questions recently about EOS on the
> mailing list, I think it is worth adding an FAQ entry here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ
>
> so that we can refer future questions to the page rather than answering
> them repeatedly. @Matthias J Sax <matth...@confluent.io>: would you
> like to do it?
>
>
> Guozhang
>
> On Tue, Feb 19, 2019 at 3:12 PM Matthias J. Sax <matth...@confluent.io>
> wrote:
>
>     Even if the question was sent 4 times to the mailing list, I am only
>     answering it exactly-once (sorry for the bad joke -- could not
>     resist...)
>
>
>     You have to distinguish between "idempotent producer" and
>     "transactional producer".
>
>     If you enable idempotent writes (config `enable.idempotence`), your
>     producer will get a cluster-wide unique PID assigned. This PID, together
>     with the sequence number, is used broker side to de-duplicate messages
>     on write (in case the producer retries). Different producers can use the
>     same sequence numbers, so PIDs are used to distinguish different
>     producers and get unique PID-seqNum pairs.
>
>     Idempotent writes apply to single messages in isolation only. Consumer
>     side, there is no change because no transactions are used
>     (`isolation.level` config has no impact).
>
>
>     If you want to write multiple messages in an atomic manner (i.e., write
>     all 5 messages or none of them), you need to use transactions. For
>     this case you also assign a `transactional.id` producer side and should
>     configure consumers with `read_committed` mode. The `transactional.id`
>     is required to abort in-flight transactions in case a producer has an
>     open transaction, crashes, and is restarted. (A PID is not sufficient,
>     because it's lost on a crash.) When there is an open transaction, and a
>     producer crashes and is restarted, the broker will detect the open
>     transaction (i.e., same `transactional.id`) and abort it automatically.
>
>     Compacted topics and multi-segment transactions are no special case.
>     They work like regular transactions.
>
>
>     -Matthias
>
>
>     On 2/19/19 5:14 AM, Greenhorn Techie wrote:
>     > Hi,
>     >
>     > Our data getting into Kafka is transactional in nature and hence I am
>     > trying to understand EOS better. My present understanding is as below:
>     >
>     > It is mentioned that when a producer starts, it will have a new PID,
>     > but only valid till the session. Does that mean it is a pre-requisite
>     > to have the same / single producer session for exactly-once guarantees?
>     > I presume it is not required. As per my understanding, this is where
>     > transactional.id comes into the picture, which is user defined and
>     > hence can survive producer restarts.
>     >
>     > I have a few questions regarding the same:
>     >
>     > 1. If the above statement is correct, why do we need a PID in the
>     > first place, instead of using transactional.id all over?
>     > 2. I understand that the sequence number is something that is
>     > generated by the producer and increases monotonically. Does that mean
>     > the sequence number changes across producer restarts along with a new
>     > PID?
>     > 3. Is the PID meant mainly for idempotence, whereas transactional.id
>     > is for transactional support?
>     > 4. On the consumer side, only one config parameter is defined, i.e.
>     > isolation.level. For EOS, I presume this needs to be set to
>     > ‘read_committed’ only. For EOS, it should never be set to
>     > ‘read_uncommitted’.
>     > 5. What is the impact of setting ‘enable.idempotence’ to true without
>     > setting ‘transactional.id’ on the producer side? Does it have any
>     > (side) effect?
>     > 6. How does EOS work for compacted topics? Will the EOS behaviour be
>     > any different for compacted topics?
>     > 7. How does EOS work when transactions are written to two different
>     > log segments?
>     >
>     > Can anyone please help me understand the nuances around EOS
>     > guarantees?
>     >
>     > Thanks
>     >
>
>
>
> --
> -- Guozhang
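
For the FAQ entry, the broker-side de-duplication described above (PID plus sequence number) could be illustrated with a toy model along these lines. This is only a sketch under simplifying assumptions: `ToyPartitionLog` and its `append` method are hypothetical names made up for illustration, not Kafka classes, and the real broker tracks considerably more state than a single "last sequence number" per PID.

```python
class ToyPartitionLog:
    """Toy model of one partition log that de-duplicates on (PID, seqNum).

    Hypothetical illustration only; this is not how the Kafka broker is
    implemented, it just mirrors the idea described in the thread.
    """

    def __init__(self):
        self.records = []   # appended (pid, seq, value) tuples, in order
        self.last_seq = {}  # PID -> highest sequence number appended so far

    def append(self, pid, seq, value):
        """Append the record unless this PID already wrote this sequence number.

        Returns True if the record was appended, False if it was dropped
        as a duplicate (for example, a producer retry).
        """
        if seq <= self.last_seq.get(pid, -1):
            return False  # same PID-seqNum pair seen before: drop the retry
        self.records.append((pid, seq, value))
        self.last_seq[pid] = seq
        return True


log = ToyPartitionLog()
first = log.append(pid=1, seq=0, value="a")       # initial write
retry = log.append(pid=1, seq=0, value="a")       # retry of the same write
other = log.append(pid=2, seq=0, value="b")       # other producer, same seqNum
restarted = log.append(pid=3, seq=0, value="a")   # producer restarted with a new PID
```

In this model the retry is dropped, while the write from PID 2 is kept even though it reuses sequence number 0, because the pair (PID, seqNum) differs. The last call shows why idempotence alone does not span producer restarts: the restarted producer has a new PID, so its resend of "a" is not recognised as a duplicate.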