Done. Feel free to extend/correct/complete etc.

-Matthias

On 2/20/19 9:56 AM, Guozhang Wang wrote:
> Since we've seen quite a lot of questions recently about EOS on the
> mailing list, I think it is worth adding an FAQ entry here:
> 
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ
> 
> That way we can refer future questions to the page rather than answering
> them repeatedly. @Matthias J Sax <mailto:matth...@confluent.io> : would you
> like to do it?
> 
> 
> Guozhang
> 
> On Tue, Feb 19, 2019 at 3:12 PM Matthias J. Sax <matth...@confluent.io
> <mailto:matth...@confluent.io>> wrote:
> 
>     Even if the question was sent 4 times to the mailing list, I am only
>     answering it exactly-once (sorry for the bad joke -- could not
>     resist...)
> 
> 
>     You have to distinguish between "idempotent producer" and "transactional
>     producer".
> 
>     If you enable idempotent writes (config `enable.idempotence`), your
>     producer will get a cluster-wide unique producer ID (PID) assigned. This
>     PID, together with the sequence number, is used broker side to
>     de-duplicate messages on write (in case the producer retries). Different
>     producers can use the same sequence numbers, so PIDs are used to
>     distinguish different producers and get unique PID-seqNum pairs.
> 
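To make the PID-seqNum de-duplication above concrete, here is a minimal
sketch in Python. It is a toy model only, not Kafka's actual broker code:
the class `ToyBrokerLog` and its method names are made up for illustration,
and it tracks just one "last sequence number" per PID.

```python
class ToyBrokerLog:
    """Toy model of broker-side de-duplication; not Kafka's implementation."""

    def __init__(self):
        self.records = []    # payloads that were actually appended
        self.last_seq = {}   # PID -> highest sequence number appended so far

    def append(self, pid, seq, payload):
        """Append unless this (PID, seqNum) pair was already written.

        Returns True if the payload was appended, False if it was dropped
        as a duplicate.
        """
        if seq <= self.last_seq.get(pid, -1):
            return False     # retry of an already-written message
        self.last_seq[pid] = seq
        self.records.append(payload)
        return True


log = ToyBrokerLog()
log.append(pid=1, seq=0, payload="a")
log.append(pid=1, seq=0, payload="a")  # producer retry: same PID and seqNum
log.append(pid=2, seq=0, payload="b")  # other producer, same seqNum, new pair
```

The retried write is dropped, while the write from the second producer is
kept because its PID makes the PID-seqNum pair unique.
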
>     Idempotent writes apply to single messages in isolation only. Consumer
>     side, there is no change because no transactions are used
>     (`isolation.level` config has no impact).
> 
> 
>     If you want to write multiple messages in an atomic manner (ie, write all
>     5 messages or none of them), you need to use transactions. For
>     this case you also assign a `transactional.id` producer side and should
>     configure consumers with `read_committed` mode. The `transactional.id`
>     is required to abort in-flight transactions in case a producer has an
>     open transaction, crashes, and is restarted. (A PID is not sufficient,
>     because it's lost on a crash.) When there is an open transaction, and a
>     producer crashes and is restarted, the broker will detect the open
>     transaction (ie, same `transactional.id`) and abort it automatically.
> 
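The crash-and-restart behaviour above can be sketched with another toy
model in Python. Everything here is an assumption for illustration: the
class `ToyTxnLog`, its methods, and the id "orders-producer" are made up,
and the model ignores offsets, partitions, and ordering details.
`init_producer` only loosely plays the role of the Java client's
`initTransactions()` call.

```python
class ToyTxnLog:
    """Toy model of transactional writes; not Kafka's implementation."""

    def __init__(self):
        self.log = []        # (transaction number, payload) in write order
        self.state = {}      # transaction number -> "open"/"committed"/"aborted"
        self.open_txn = {}   # transactional.id -> its open transaction number
        self._next = 0

    def init_producer(self, transactional_id):
        """Producer (re)start: abort a transaction left open under this id."""
        txn = self.open_txn.pop(transactional_id, None)
        if txn is not None:
            self.state[txn] = "aborted"

    def begin(self, transactional_id):
        txn = self._next
        self._next += 1
        self.state[txn] = "open"
        self.open_txn[transactional_id] = txn

    def send(self, transactional_id, payload):
        self.log.append((self.open_txn[transactional_id], payload))

    def commit(self, transactional_id):
        self.state[self.open_txn.pop(transactional_id)] = "committed"

    def read(self, isolation_level):
        """read_committed hides messages of open and aborted transactions."""
        if isolation_level == "read_uncommitted":
            return [p for _, p in self.log]
        return [p for t, p in self.log if self.state[t] == "committed"]


broker = ToyTxnLog()
broker.init_producer("orders-producer")
broker.begin("orders-producer")
broker.send("orders-producer", "m1")
broker.send("orders-producer", "m2")
# Producer crashes here with the transaction still open, then restarts
# with the same transactional.id:
broker.init_producer("orders-producer")
broker.begin("orders-producer")
broker.send("orders-producer", "m3")
broker.commit("orders-producer")

committed = broker.read("read_committed")
everything = broker.read("read_uncommitted")
```

In this sketch a `read_committed` reader only sees the message from the
committed transaction, while a `read_uncommitted` reader also sees the two
messages of the aborted one.
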
>     Compacted topics and multi-segment transactions are not special cases.
>     They work like regular transactions.
> 
> 
>     -Matthias
> 
> 
>     On 2/19/19 5:14 AM, Greenhorn Techie wrote:
>     > Hi,
>     >
>     > Our data getting into Kafka is transactional in nature and hence I am
>     > trying to understand EOS better. My present understanding is as below:
>     >
>     > It is mentioned that when a producer starts, it will have a new PID,
>     > but one that is only valid for that session. Does that mean it is a
>     > pre-requisite to have the same / single producer session for
>     > exactly-once guarantees? I presume it is not required. As per my
>     > understanding, this is where transactional.id comes into the picture,
>     > which is user defined and hence can survive producer restarts.
>     >
>     > I have a few questions regarding the same:
>     >
>     > 1. If the above statement is correct, why do we need PID in the first
>     > place, instead of using transactional.id all over?
>     > 2. I understand that the sequence number is something that is generated
>     > by the producer and increases monotonically. Does that mean the
>     > sequence number changes across producer restarts along with a new PID?
>     > 3. Is PID meant mainly for idempotence whereas transactional.id is for
>     > transactional support?
>     > 4. On the consumer side, only one config parameter is defined, i.e.
>     > isolation.level. For EOS, I presume this needs to be set to
>     > ‘read_committed’ only. For EOS, it should never be set to
>     > ‘read_uncommitted’.
>     > 5. What is the impact of setting ‘enable.idempotence’ to true without
>     > setting ‘transactional.id’ on the producer side? Does it have any
>     > (side)effect?
>     > 6. How does EOS work for compacted topics? Will the EOS behaviour be
>     > any different for compacted topics?
>     > 7. How does EOS work when transactions are written to two different log
>     > segments?
>     >
>     > Can anyone please help me understand the nuances around EOS guarantees?
>     >
>     > Thanks
>     >
> 
> 
> 
> -- 
> -- Guozhang
