Hi, Apurva,

Thanks for the reply. A couple of comment below.

On Wed, Feb 15, 2017 at 9:45 PM, Apurva Mehta <apu...@confluent.io> wrote:

> Hi Jun,
>
> Answers inline:
>
> 210. Pid snapshots: Is the number of pid snapshot configurable or hardcoded
> > with 2? When do we decide to roll a new snapshot? Based on time, byte, or
> > offset? Is that configurable too?
> >
>
> These are good questions. We haven't fleshed out the policy by which the
> snapshots will be generated. I guess there will be some scheduled task that
> takes snapshots, and we will retain the two latest ones. These will be map
> to different points in time, but each will be the complete view of the
> PID->Sequence map at the point it time it was created. At start time, the
> Pid mapping will be built from the latest snapshot unless it is somehow
> corrupt, in which case the older one will be used. Otherwise, the older one
> will be ignored.
>
> I don't think there are good reason to keep more than one, to be honest. If
> the snapshot is corrupt, we can always rebuild the map from the log itself.
> I have updated the doc to state that there will be exactly one snapshot
> file.
>
> With one snapshot, we don't need additional configs. Do you agree?
>
>
>
When a replica becomes a follower, we do a bit log truncation. Having an
older snapshot allows us to recover the PID->sequence mapping much quicker
than rescanning the whole log.



> >
> > 211. I am wondering if we should store ExpirationTime in the producer
> > transactionalId mapping message as we do in the producer transaction
> status
> > message. If a producer only calls initTransactions(), but never publishes
> > any data, we still want to be able to expire and remove the producer
> > transactionalId mapping message.
> >
> >
> Actually, the document was inaccurate. The transactionalId will be expired
> only if there is no active transaction, and the age of the last transaction
> with that transactionalId is older than the transactioanlId expiration
> time. With these semantics, storing the expiration time in the
> transactionalId mapping message won't be useful, since the expiration time
> is a moving target based on transaction activity.
>
> I have updated the doc with a clarification.
>
>
>
Currently, the producer transactionalId mapping message doesn't carry
ExpirationTime, but the producer transaction status message does.  It would
be useful if they are consistent.


>
> > 212. The doc says "The LSO is always equal to one less than the minimum
> of
> > the initial offsets across all active transactions". This implies that
> LSO
> > is inclusive. However, currently, both high watermark and log end offsets
> > are exclusive. For consistency, it seems that we should make LSO
> exclusive
> > as well.
> >
>
> Sounds good. Doc updated.
>
>
> >
> > 213. The doc says "If the topic is configured for compaction and
> deletion,
> > we will use the topic’s own retention limit. Otherwise, we will use the
> > default topic retention limit. Once the last message set produced by a
> > given PID has aged beyond the retention time, the PID will be expired."
> For
> > topics configured with just compaction, it seems it's more intuitive to
> > expire PID based on transactional.id.expiration.ms?
> >
>
> This could work. The problem is that all idempotent producers get a PID,
> but only transactional producers have a transactionalId. Since they are
> separate concepts, it seems better not to conflate PID expiration with the
> settings for transactionalId expiration.
>
> Another way of putting it: it seems more natural for the retention of the
> PID to be based on the retention of the messages in a topic. The
> transactionalId expiration is based on the transaction activity of a
> producer, which is quite a different thing.
>
> But I see your point: having these separate can put us in a situation where
> the PID gets expired even if the transactionalId is not. To solve this we
> should set the default transactionalId expiration time to be less than the
> default topic retention time, so that the transactionalId can be expired
> before the PID. Does that seem reasonable?
>
>
> >
> > 214. In the Coordinator-Broker request handling section, the doc says "If
> > the broker has a corresponding PID, verify that the received epoch is
> > greater than or equal to the current epoch. " Is the epoch the
> coordinator
> > epoch or the producer epoch?
> >
>
> This is the producer epoch, as known by the coordinator. I have clarified
> this in the document.  However, there may be producers with the same PID on
> a different epoch, which will be fenced out future calls.
>
>
> > 215. The doc says "Append to the offset topic, but skip updating the
> offset
> > cache in the delayed produce callback, until a WriteTxnMarkerRequest from
> > the transaction coordinator is received including the offset topic
> > partitions." How do we do this efficiently? Do we need to cache pending
> > offsets per pid?
> >
>
> I think we would need to cache the consumerGroup/partition-> offset for the
> written but uncommitted offsets. Once the transaction commits, these
> entries can be moved to the main cache.
>
> However, I think that the overhead for these duplicate temporary entries
> should be marginal because it would be proportional to the number of input
> topics per transaction.
>
>
> >
> > 216. Do we need to add any new JMX metrics? For example, on the broker
> and
> > transaction coordinator side, it would be useful to know the number of
> live
> > pids.
> >
>
> Yes, I think we will definitely need to introduce new metrics. Some would
> be:
>
> 1. Number of live PIDs (a proxy for the size of the PID->Sequence map)
> 2. Current LSO per partition (useful to detect stuck consumers and lost
> commit/abort markers).
> 3. Number of active transactionalIds (proxy for the memory consumed by the
> transaction coordinator).
>
>  I have added these to the docs, though I think we will probably think of
> more as the work progresses.
>

Thanks,

Jun

Reply via email to