hi Jun > For 1, is it better to skip the missing value in the return map or include it? To me, including every item passed in from the input seems more intuitive and less error prone for the users.
Yes, I agree that using null for now is suitable. Especially considering the growing number of offset-related APIs, changing the specification would be a substantial change. @Jiunn-Yang WDYT? Best, Chia-Ping Jun Rao <j...@confluent.io.invalid> 於 2025年5月16日 週五 上午2:25寫道: > Hi, Chia-Ping, > > Yes, there are two things to consider. > > 1. What's the desired way for returning a map when some of the values are > not present? > > 2. Once we agreed on a desired way, how to make that consistent across all > APIs. > > For 1, is it better to skip the missing value in the return map or include > it? To me, including every item passed in from the input seems more > intuitive and less error prone for the users. > > Thanks, > > Jun > > > On Thu, May 15, 2025 at 1:14 AM Chia-Ping Tsai <chia7...@gmail.com> wrote: > > > hi Jun > > > > I understand the use case where the "Result" is expected to represent > both > > "exist" and "non-exist." However, I'm concerned about consistency across > > our APIs. Not all of Admin APIs follow this pattern; for example, > > listPartitionReassignments and describeProducers do not. Designing a > single > > API structure to fit all use cases is challenging, and we currently have > > many offset-related APIs that use null values. Therefore, it would be > > acceptable to maintain the current pattern. > > > > Additionally, I've opened KAFKA-19284 to add documentation regarding the > > null value in all offset-related results. > > > > Best, > > Chia-Ping > > > > > > > > > > Jun Rao <j...@confluent.io.invalid> 於 2025年5月15日 週四 上午1:47寫道: > > > > > Hi, Jiunn-Yang, > > > > > > Thanks for the reply. > > > > > > If we omit the non-existing value in the return map, for applications > > that > > > care about the missing value, is it convenient for them to write the > > code? > > > They need to save the input to find this out, right? > > > > > > Jun > > > > > > > > > > > > On Fri, May 9, 2025 at 4:39 AM 黃竣陽 <s7133...@gmail.com> wrote: > > > > > > > Hi Jun, chia > > > > > > > > Thanks for great examples. > > > > > > > > I still believe that when a map contains a key, its value should not > be > > > > null. Now that we've introduced > > > > the allow.null.offsets.entries config as a bridge between the two > > > > behaviors, I trust that most users who > > > > rely on null values will have a reasonable path to transition during > > this > > > > period. > > > > > > > > I can update the KIP to include the Kafka Streams use case and > > emphasize > > > > that the allow.null.offsets.entries > > > > config will remain available until the next major release, giving > most > > > > users enough time to adapt. > > > > > > > > Best Regards, > > > > Jiunn-Yang > > > > > > > > > Chia-Ping Tsai <chia7...@gmail.com> 於 2025年5月6日 凌晨1:53 寫道: > > > > > > > > > > hi Jun > > > > > > > > > > Thanks for the great example. > > > > > > > > > > Whether null has value depends on user expectations regarding the > > "map > > > > > structure." If users expect it to represent both existence and > > > > > non-existence, then null is meaningful. Otherwise, for users > > expecting > > > > the > > > > > map structure to represent only existence, null might lead them to > > add > > > > > extra filters to prevent NPEs [0][1]. > > > > > > > > > > [0] > > > > > > > > > > > > > > > https://github.com/akka/alpakka-kafka/blob/61cae8c04eff05f1408de8d7d0080a0be4c00aac/core/src/main/scala/akka/kafka/internal/KafkaConsumerActor.scala#L365 > > > > > [1] > > > > > > > > > > > > > > > https://github.com/zio/zio-kafka/blob/adfff39bdd86b75f6035bdfb90935f7800d30b52/zio-kafka/src/main/scala/zio/kafka/consumer/Consumer.scala#L630 > > > > > > > > > > Best, > > > > > Chia-Ping > > > > > > > > > > > > > > > Jun Rao <j...@confluent.io.invalid> 於 2025年5月6日 週二 上午1:19寫道: > > > > > > > > > >> Hi, Jiunn-Yang, > > > > >> > > > > >> Looking at the code again, it seems that the assumption "For a > > > > real-world > > > > >> example, we can see that developers typically do not check for > null > > > when > > > > >> using this." is not true. > > > > >> > > > > >> For example, the following is how Streams uses committed(). > > > > >> // those which do not have a committed offset would > > default > > > > to > > > > >> 0 > > > > >> committedOffsets = > > > > >> consumer.committed(partitions).entrySet().stream() > > > > >> .collect(Collectors.toMap(Map.Entry::getKey, e -> > > > > >> e.getValue() == null ? 0L : e.getValue().offset())); > > > > >> > > > > >> The following is how Streams uses offsetsForTimes(). > > > > >> for (final Map.Entry<TopicPartition, > > > > >> OffsetAndTimestamp> partitionAndOffset : > > > > >> mainConsumer.offsetsForTimes(seekToTimestamps).entrySet()) { > > > > >> final TopicPartition partition = > > > > >> partitionAndOffset.getKey(); > > > > >> final OffsetAndTimestamp seekOffset = > > > > >> partitionAndOffset.getValue(); > > > > >> if (seekOffset != null) { > > > > >> mainConsumer.seek(partition, new > > > > >> OffsetAndMetadata(seekOffset.offset())); > > > > >> } else { > > > > >> log.debug( > > > > >> "Cannot reset offset to > non-existing > > > > >> timestamp {} (larger than timestamp of last record)" + > > > > >> " for partition {}. Seeking to > > end > > > > >> instead.", > > > > >> seekToTimestamps.get(partition), > > > > >> partition > > > > >> ); > > > > >> > > > > >> > > > > > > > > > > mainConsumer.seekToEnd(Collections.singleton(partitionAndOffset.getKey())); > > > > >> } > > > > >> } > > > > >> > > > > >> In both cases, knowing that the offset doesn't exist is actually > > > > important > > > > >> to the application. Instead of ignoring the missing offset, the > > > > application > > > > >> wants to explicitly reset the offset to a default value. So the > > > > suggested > > > > >> change in the KIP actually will make it harder for the application > > > > >> developer to write the correct code. > > > > >> > > > > >> Thanks, > > > > >> > > > > >> Jun > > > > >> > > > > >> On Fri, May 2, 2025 at 4:51 AM 黃竣陽 <s7133...@gmail.com> wrote: > > > > >> > > > > >>> Hi chia, Jun, > > > > >>> > > > > >>>> It would be useful to think through the consistency with > > > > >>> adminClient.listOffset() > > > > >>> > > > > >>> While the returned map entries are never null, a value of -1 is > > used > > > to > > > > >>> indicate the absence of an offset. > > > > >>> Given this approach, I believe we could apply a similar strategy > by > > > > >>> introducing a new (deprecated) configuration > > > > >>> for the Admin client. This would give users the ability to opt > into > > > the > > > > >>> new behavior in Kafka 4.x. > > > > >>> > > > > >>>> We should also take into account the behavior of > > > > >>> ListOffsetsResult.partitionResult(TopicPartition partition). > > > > >>> > > > > >>> There are four methods that currently follow a similar pattern: > > > > >>> 1. ListOffsetsResult#partitionResult > > > > >>> 2. ListConsumerGroupOffsetsResult#partitionsToOffset > > > > >>> 3. DescribeTransactionsResult#description > > > > >>> 4. DescribeProducersResult#partitionResult However, > > > > >>> > > > > >>> I'm not entirely sure how these methods relate to the issue of > map > > > > >> entries > > > > >>> being null. It seems that these result classes > > > > >>> behave differently from others in this regard. > > > > >>> > > > > >>>> It seems that consumer.committed() > > > > >>>> > https://github.com/apache/kafka/pull/19578#discussion_r2068913749 > > > > >>> > > > > >>> I will address these inconsistency method into KIP. > > > > >>> > > > > >>> Best Regards, > > > > >>> Jiunn-Yang > > > > >>> > > > > >>>> Chia-Ping Tsai <chia7...@gmail.com> 於 2025年5月1日 凌晨3:20 寫道: > > > > >>>> > > > > >>>> hi Jun > > > > >>>> > > > > >>>>> It seems that consumer.committed() has the same behavior. If > you > > > pass > > > > >>> in a > > > > >>>> non-existing partition, it returns a map with a null value. So, > if > > > we > > > > >>> want > > > > >>>> to make a change, it would be useful to make it consistent too. > > > > >>>> > > > > >>>> yes, both consumers have same behavior. > > > > >>>> > > > > >>>> For another, `listStreamsGroupOffsets` and > `listShareGroupOffsets` > > > > also > > > > >>> add > > > > >>>> null value. They are not in production, but maybe we should keep > > the > > > > >>>> behavior for consistency. > > > > >>>> > > > > >>>> Best, > > > > >>>> Chia-Ping > > > > >>>> > > > > >>>> Jun Rao <j...@confluent.io.invalid> 於 2025年5月1日 週四 上午2:38寫道: > > > > >>>> > > > > >>>>> Hi, Chia-Ping, > > > > >>>>> > > > > >>>>> It seems that consumer.committed() has the same behavior. If > you > > > pass > > > > >>> in a > > > > >>>>> non-existing partition, it returns a map with a null value. So, > > if > > > we > > > > >>> want > > > > >>>>> to make a change, it would be useful to make it consistent too. > > > > >>>>> > > > > >>>>> Thanks, > > > > >>>>> > > > > >>>>> Jun > > > > >>>>> > > > > >>>>> On Wed, Apr 30, 2025 at 9:05 AM Chia-Ping Tsai < > > chia7...@gmail.com > > > > > > > > >>> wrote: > > > > >>>>> > > > > >>>>>> hi Ken, > > > > >>>>>> > > > > >>>>>> there is another inconsistent case for the offset-related > APIs. > > > see > > > > >>>>>> > > https://github.com/apache/kafka/pull/19578#discussion_r2068913749 > > > > >>>>>> > > > > >>>>>> Best, > > > > >>>>>> Chia-Ping > > > > >>>>>> > > > > >>>>>> Chia-Ping Tsai <chia7...@gmail.com> 於 2025年4月30日 週三 下午7:03寫道: > > > > >>>>>> > > > > >>>>>>> hi all, > > > > >>>>>>> > > > > >>>>>>>> Also, it would be useful to think through the behavior > > > > >>>>>>> of ListOffsetsResult.partitionResult(final TopicPartition > > > > partition) > > > > >>>>> too. > > > > >>>>>>> > > > > >>>>>>> It seems there are three options: > > > > >>>>>>> > > > > >>>>>>> 1. Keep the current behavior - it's inconsistent, as other > > > result > > > > >>>>>>> classes, such as DeleteRecordsResult, don't have a similar > > > method. > > > > >>>>>>> 2. Deprecate the current method and add a new method that > > > returns > > > > >>>>> all > > > > >>>>>>> futures. I prefer this approach as it would align with other > > > > >> result > > > > >>>>>> classes. > > > > >>>>>>> 3. Return null or KafkaFuture<null> - this is a bad idea, so > > > let's > > > > >>>>> not > > > > >>>>>>> consider it. > > > > >>>>>>> > > > > >>>>>>> By the way, DescribeProducersResult#partitionResult(final > > > > >>>>> TopicPartition > > > > >>>>>>> partition) needs to be considered as well. > > > > >>>>>>> Best, > > > > >>>>>>> Chia-Ping > > > > >>>>>>> > > > > >>>>>>> Jun Rao <j...@confluent.io.invalid> 於 2025年4月30日 週三 上午6:30寫道: > > > > >>>>>>> > > > > >>>>>>>> Hi, Jiunn-Yang, > > > > >>>>>>>> > > > > >>>>>>>> Thanks for the KIP. > > > > >>>>>>>> > > > > >>>>>>>> It would be useful to think through the consistency with > > > > >>>>>>>> adminClient.listOffset(). Currently, if the offset for a > > > timestamp > > > > >>>>>> doesn't > > > > >>>>>>>> exist, consumer.offsetsForTimes() returns a null value for > > that > > > > >>>>>> partition > > > > >>>>>>>> while adminClient.listOffset().all().get() returns a > > > > >>>>>> ListOffsetsResultInfo > > > > >>>>>>>> with -1 as the offset. Should we make the behavior more > > > consistent > > > > >> in > > > > >>>>>> the > > > > >>>>>>>> KIP? Also, it would be useful to think through the behavior > > > > >>>>>>>> of ListOffsetsResult.partitionResult(final TopicPartition > > > > >> partition) > > > > >>>>>> too. > > > > >>>>>>>> > > > > >>>>>>>> Jun > > > > >>>>>>>> > > > > >>>>>>>> On Fri, Mar 14, 2025 at 4:33 AM 黃竣陽 <s7133...@gmail.com> > > wrote: > > > > >>>>>>>> > > > > >>>>>>>>> Hello everyone, > > > > >>>>>>>>> > > > > >>>>>>>>> I would like to start a discussion on KIP-1140: Avoid to > > return > > > > >> null > > > > >>>>>>>> value > > > > >>>>>>>>> in Map from public api of consumer > > > > >>>>>>>>> <https://cwiki.apache.org/confluence/x/mIuMEw> > > > > >>>>>>>>> > > > > >>>>>>>>> This proposal aims to improve the Kafka consumer API by > > > ensuring > > > > >>>>> that > > > > >>>>>>>> the > > > > >>>>>>>>> Map it returns contains only non-null values, > > > > >>>>>>>>> aligning with the design philosophy of Java collections. > This > > > > >> change > > > > >>>>>>>>> provides significantly more benefits than drawbacks, > > > > >>>>>>>>> enhancing API consistency and usability while reducing > errors > > > > >> caused > > > > >>>>>> by > > > > >>>>>>>>> developer misuse. > > > > >>>>>>>>> > > > > >>>>>>>>> Best Regards, > > > > >>>>>>>>> Jiunn-Yang > > > > >>>>>>>> > > > > >>>>>>> > > > > >>>>>> > > > > >>>>> > > > > >>> > > > > >>> > > > > >> > > > > > > > > > > > > > >