Re: [DISCUSS] KIP-932: Queues for Kafka

Jun Rao Fri, 09 Feb 2024 15:15:34 -0800

Hi, Andrew,

Thanks for the KIP. A few comments below.

10. ShareFetchResponse: To consume transactional data, currently
FetchResponse includes the AbortedTransactions fields for the client to
properly skip aborted records. ShareFetchResponse doesn't include that. How
do we prevent the consumer from reading aborted records in a share group?

11. "adding "share"  to the existing group types of "consumer"  and
"classic" "
What's the "classic" type?

12. bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-name
group --entity-name G1 --alter --add-config group.type=share
So, one could change the group type? What happens to the states associated
with the group (members, epoch, offsets, etc)?

13. Why is Server Assignor at member level, instead of group level?

14. Member.metadata: How is that being used? It isn't exposed to the client.

15. What public interface does SimpleAssignor implement?

16. "This means that existing members will have to rejoin the share group
following a coordinator failover."
When an existing member issues a ShareGroupHeartbeatRequest to the new
coordinator, does the coordinator return UNKNOWN_MEMBER_ID and a new
memberId?

17. auto.offset.reset has the option to throw an exception to the client if
the current offset does not exist any more on the server (e.g. due to
retention). Should group.share.auto.offset.reset support  that too?

18. SHARE_CHECKPOINT and SHARE_DELTA records:
18.1 When does the coordinator write them?
18.2 If the client commits the acknowledgement successfully, could
the acknowledgement be lost on the broker if the coordinator fails over?

19. In the current consumer model, coordinator failover doesn't cause
duplicate records in consumers. In the share group model, I guess this is
no longer true since we are not persisting the acquired state?

20. "The calls to KafkaShareConsumer.acknowledge(ConsumerRecord,
AcknowledgeType) must be issued in the order in which the records appear in
the ConsumerRecords object, which will be in order of increasing offset for
each share-partition."
20.1  What happens if the acknowledge() call doesn't follow this? Does
the caller get an exception?
20.2 The example with Acknowledge 119. It seems the acknowledgement is out
of order since records at offset 111-118 haven't been acknowledged?

21. "Indeed, these message sets are not returned to consumer". Are we
excluding those control records for non-shared consumers too?

22. The design doesn't seem to support fetching from the followers. This
might be ok, but it will be useful to explicitly mention this.

23. Examples with control records for SHARE_DELTA:
23.1 Some of the state changes contain cumulative state instead of delta.
For example, "record 110 (available, delivery count 1), records 111-118
acquired, record 119 acknowledged" for "Acknowledge 119".
23.2 SHARE_DELTA sometimes include available records with DeliveryCount of
0. But we don't do that for every record. What's the convention?
    {
      "BaseOffset": 111,
      "LastOffset": 118,
      "State": 0 (Available),
      "DeliveryCount": 0
    }

24. "when a broker becomes the leader of a share-partition, it must read
the most recent SHARE_CHECKPOINT": How does a broker find this efficiently
on restart?

25. AcknowledgeCommitCallback: How would an application use it? It doesn't
indicate which record's acknowledgement has failed.

26. AlterShareGroupOffsetsResult alterShareGroupOffsets(String groupId,
Map<TopicPartition, OffsetAndMetadata> offsets): How is the metadata used?
It doesn't seem there is an API to use it in either the client application
or the broker.

27. It would be useful to add a section on downgradability since the KIP
changes the record format in the internal offset topic.

Thanks,

Jun

On Wed, Oct 11, 2023 at 8:25 AM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi Jack,
> Thanks for your comments.
>
> I have added a new section on Log Retention which describes the behaviour
> of the SPSO as the LSO advances. That makes total sense
> and was an omission from the KIP.
>
> I have added the other ideas as potential future work. I do like the idea
> of having the SPSO influence the advancements of the LSO
> for topics which are primarily being using with share groups.
>
> I have published an updated version of the KIP.
>
> Thanks,
> Andrew
>
> > On 4 Oct 2023, at 10:09, Jack Vanlightly <vanligh...@apache.org> wrote:
> >
> > I would like to see more explicit discussion of topic retention and
> share groups. There are a few options here from simple to more
> sophisticated. There are also topic-level and share-group level options.
> >
> > The simple thing would be to ensure that the SPSO of each share group is
> bounded by the Log Start Offset (LSO) of each partition which itself is
> managed by the retention policy. This is a topic-level control which
> applies to all share-groups. I would say that this shared retention is the
> largest drawback of modeling queues on shared logs and this is worth noting.
> >
> > More sophisticated approaches can be to allow the LSO to advance not
> (only) by retention policy but by the advancement of the lowest SPSO. This
> can keep the amount of data lower by garbage collecting messages that have
> been acknowledged by all share groups. Some people may like that behaviour
> on those topics where share groups are the only consumption model and no
> replay is needed.
> >
> > There are per-share-group possibilities such as share-group TTLs where
> messages can be archived on a per share group basis.
> >
> > Thanks
> > Jack
>
>

Re: [DISCUSS] KIP-932: Queues for Kafka

Reply via email to