Hi Neha, 6. It seems like #4 can be avoided by using Map<TopicPartition, >> Long> or Map<TopicPartition, TopicPartitionOffset> as the argument type. > > How? lastCommittedOffsets() is independent of positions(). I'm not sure I > understood your suggestion.
I think of subscription as you're subscribing to a Set of TopicPartitions. Because the argument to positions() is TopicPartitionOffset ... it's conceivable that the method can be called with two offsets for the same TopicPartition. One way to handle this, is to accept either the first or the last offset for a TopicPartition. However, if the argument type is changed to Map<TopicPartition, Long> it precludes the possibility of getting duplicate offsets of the same TopicPartition. 7. To address #3, maybe we can return List<TopicPartitionOffset> that are >> invalid. > > I don't particularly see the advantage of returning a list of invalid partitions from position(). It seems a bit awkward to return a list to indicate what is obviously a bug. Prefer throwing an error since the user > should just fix that logic. I'm not sure if an Exception is needed or desirable here. I don't see this as a catastrophic failure or a non-recoverable failure. Even if we just write the bad offsets to a log file and call it a day, I'm ok with that. But my main goal is to communicate to the API users somehow that they've provided bad offests which are simply being ignored. Hi Jay, I would also like to shorten the name TopicOffsetPosition. Offset and > Position are duplicative of each other. So perhaps we could call it a > PartitionOffset or a TopicPosition or something like that. In general class > names that are just a concatenation of the fields (e.g. > TopicAndPartitionAndOffset) seem kind of lazy to me since the name doesn't > really describe it just enumerates. But that is more of a nit pick. 1. Did you mean to say TopicPartitionOffset instead of TopicOffsetPosition? 2. +1 on PartitionOffset The lastCommittedPosition is particular bothersome because: > 1. The name is weird and long > 2. It returns a list of results. But how can you use the list? The only way > to use the list is to make a map of tp=>offset and then look up results in > this map (or do a for loop over the list for the partition you want). This is sort of what I was talking about in my previous email. My suggestion was to change the return type to Map<TopicPartition, Long>. What if we made it: > long position(TopicPartition tp) > void seek(TopicOffsetPosition p) > long committed(TopicPartition tp) > void commit(TopicOffsetPosition...); 1. Absolutely love the idea of position(TopicPartition tp). 2. I think we also need to provide a method for accessing all positions positions() which maybe returns a Map<TopicPartition, Long>? 3. What is the difference between position(TopicPartition tp) and committed(TopicPartition tp)? 4. +1 on commit(PartitionOffset...) 5. +1 on seek(PartitionOffset p) 6. We should also provide a seek(PartitionOffset... offsets) Finally, in all the methods where we're using varargs, we should use an appropriate Collection data structure. For example, for the subscribe(TopicPartition... partitions) method, I think a more accurate API would be subscribe(Set<TopicPartition> partitions). This allows for the code to be self-documenting.