Actually, what I said above is inaccurate. In testSeekAndCommitWithBrokerFailures, TestUtils.waitUntilTrue blocks, not seek. My assumption is that seek did not update correctly. I will be digging further into this.
On Sat, Mar 17, 2018 at 4:16 PM, Richard Yu <yohan.richard...@gmail.com> wrote: > One more thing: when looking through tests, I have realized that seek() > methods can potentially block indefinitely. As you well know, seek() is > called when pollOnce() or position() is active. Thus, if position() blocks > indefinitely, then so would seek(). Should bounding seek() also be included > in this KIP? > > Thanks, Richard > > On Sat, Mar 17, 2018 at 1:16 PM, Richard Yu <yohan.richard...@gmail.com> > wrote: > >> Thanks for the advice, Jason >> >> I have modified KIP-266 to include the java doc for committed() and other >> blocking methods, and I also >> mentioned poll() which will also be bounded. Let me know if there is >> anything else. :) >> >> Sincerely, Richard >> >> >> >> >> >> On Sat, Mar 17, 2018 at 12:00 PM, Jason Gustafson <ja...@confluent.io> >> wrote: >> >>> Hi Richard, >>> >>> Thanks for the updates. I'm really glad you picked this up. A couple >>> minor >>> comments: >>> >>> 1. Can you list the full set of new APIs explicitly in the KIP? >>> Currently I >>> only see the javadoc for `position()`. >>> >>> 2. We should consider adding `TimeUnit` to the new methods to avoid unit >>> confusion. I know it's inconsistent with the poll() API, but I think it >>> was >>> probably a mistake not to include it there, so better not to double down >>> on >>> that mistake. And note that we do already have `close(long, TimeUnit)`. >>> >>> Other than that, I think the current KIP seems reasonable. >>> >>> Thanks, >>> Jason >>> >>> On Wed, Mar 14, 2018 at 5:00 PM, Richard Yu <yohan.richard...@gmail.com> >>> wrote: >>> >>> > Note to all: I have included bounding commitSync() and committed() in >>> this >>> > KIP. >>> > >>> > On Sun, Mar 11, 2018 at 5:05 PM, Richard Yu < >>> yohan.richard...@gmail.com> >>> > wrote: >>> > >>> > > Hi all, >>> > > >>> > > I updated the KIP where overloading position() is now the favored >>> > approach. >>> > > Bounding position() using requestTimeoutMs has been listed as >>> rejected. >>> > > >>> > > Any thoughts? >>> > > >>> > > On Tue, Mar 6, 2018 at 6:00 PM, Guozhang Wang <wangg...@gmail.com> >>> > wrote: >>> > > >>> > >> I agree that adding the overloads is most flexible. But going for >>> that >>> > >> direction we'd do that for all the blocking call that I've listed >>> above, >>> > >> with this timeout value covering the end-to-end waiting time. >>> > >> >>> > >> >>> > >> Guozhang >>> > >> >>> > >> On Tue, Mar 6, 2018 at 10:02 AM, Ted Yu <yuzhih...@gmail.com> >>> wrote: >>> > >> >>> > >> > bq. The most flexible option is to add overloads to the consumer >>> > >> > >>> > >> > This option is flexible. >>> > >> > >>> > >> > Looking at the tail of SPARK-18057, Spark dev voiced the same >>> choice. >>> > >> > >>> > >> > +1 for adding overload with timeout parameter. >>> > >> > >>> > >> > Cheers >>> > >> > >>> > >> > On Mon, Mar 5, 2018 at 2:42 PM, Jason Gustafson < >>> ja...@confluent.io> >>> > >> > wrote: >>> > >> > >>> > >> > > @Guozhang I probably have suggested all options at some point or >>> > >> another, >>> > >> > > including most recently, the current KIP! I was thinking that >>> > >> practically >>> > >> > > speaking, the request timeout defines how long the user is >>> willing >>> > to >>> > >> > wait >>> > >> > > for a response. The consumer doesn't really have a complex send >>> > >> process >>> > >> > > like the producer for any of these APIs, so I wasn't sure how >>> much >>> > >> > benefit >>> > >> > > there would be from having more granular control over timeouts >>> (in >>> > the >>> > >> > end, >>> > >> > > KIP-91 just adds a single timeout to control the whole send). >>> That >>> > >> said, >>> > >> > it >>> > >> > > might indeed be better to avoid overloading the config as you >>> > suggest >>> > >> > since >>> > >> > > at least it avoids inconsistency with the producer's usage. >>> > >> > > >>> > >> > > The most flexible option is to add overloads to the consumer so >>> that >>> > >> > users >>> > >> > > can pass the timeout directly. I'm not sure if that is more or >>> less >>> > >> > > annoying than a new config, but I've found config timeouts a >>> little >>> > >> > > constraining in practice. For example, I could imagine users >>> wanting >>> > >> to >>> > >> > > wait longer for an offset commit operation than a position >>> lookup; >>> > if >>> > >> the >>> > >> > > latter isn't timely, users can just pause the partition and >>> continue >>> > >> > > fetching on others. If you cannot commit offsets, however, it >>> might >>> > be >>> > >> > > safer for an application to wait availability of the coordinator >>> > than >>> > >> > > continuing. >>> > >> > > >>> > >> > > -Jason >>> > >> > > >>> > >> > > On Sun, Mar 4, 2018 at 10:14 PM, Guozhang Wang < >>> wangg...@gmail.com> >>> > >> > wrote: >>> > >> > > >>> > >> > > > Hello Richard, >>> > >> > > > >>> > >> > > > Thanks for the proposed KIP. I have a couple of general >>> comments: >>> > >> > > > >>> > >> > > > 1. I'm not sure if piggy-backing the timeout exception on the >>> > >> > > > existing requestTimeoutMs configured in "request.timeout.ms" >>> is a >>> > >> good >>> > >> > > > idea >>> > >> > > > since a) it is a general config that applies for all types of >>> > >> requests, >>> > >> > > and >>> > >> > > > 2) using it to cover all the phases of an API call, including >>> > >> network >>> > >> > > round >>> > >> > > > trip and potential metadata refresh is shown to not be a good >>> > idea, >>> > >> as >>> > >> > > > illustrated in KIP-91: >>> > >> > > > >>> > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP- >>> > >> > > > 91+Provide+Intuitive+User+Timeouts+in+The+Producer >>> > >> > > > >>> > >> > > > In fact, I think in KAFKA-4879 which is aimed for the same >>> issue >>> > as >>> > >> > > > KAFKA-6608, >>> > >> > > > Jason has suggested we use a new config for the API. Maybe >>> this >>> > >> would >>> > >> > be >>> > >> > > a >>> > >> > > > more intuitive manner than reusing the request.timeout.ms >>> config. >>> > >> > > > >>> > >> > > > >>> > >> > > > 2. Besides the Consumer.position() call, there are a couple of >>> > more >>> > >> > > > blocking calls today that could result in infinite blocking: >>> > >> > > > Consumer.commitSync() and Consumer.committed(), should they be >>> > >> > considered >>> > >> > > > in this KIP as well? >>> > >> > > > >>> > >> > > > 3. There are a few other APIs that are today relying on >>> > >> > > request.timeout.ms >>> > >> > > > already for breaking the infinite blocking, namely >>> > >> > > Consumer.partitionFor(), >>> > >> > > > Consumer.OffsetAndTimestamp() and Consumer.listTopics(), if >>> we are >>> > >> > making >>> > >> > > > the other blocking calls to be relying a new config as >>> suggested >>> > in >>> > >> 1) >>> > >> > > > above, should we also change the semantics of these API >>> functions >>> > >> for >>> > >> > > > consistency? >>> > >> > > > >>> > >> > > > >>> > >> > > > Guozhang >>> > >> > > > >>> > >> > > > >>> > >> > > > >>> > >> > > > >>> > >> > > > On Sun, Mar 4, 2018 at 11:13 AM, Richard Yu < >>> > >> > yohan.richard...@gmail.com> >>> > >> > > > wrote: >>> > >> > > > >>> > >> > > > > Hi all, >>> > >> > > > > >>> > >> > > > > I would like to discuss a potential change which would be >>> made >>> > to >>> > >> > > > > KafkaConsumer: >>> > >> > > > > https://cwiki.apache.org/confluence/pages/viewpage. >>> > >> > > > action?pageId=75974886 >>> > >> > > > > >>> > >> > > > > Thanks, >>> > >> > > > > Richard Yu >>> > >> > > > > >>> > >> > > > >>> > >> > > > >>> > >> > > > >>> > >> > > > -- >>> > >> > > > -- Guozhang >>> > >> > > > >>> > >> > > >>> > >> > >>> > >> >>> > >> >>> > >> >>> > >> -- >>> > >> -- Guozhang >>> > >> >>> > > >>> > > >>> > >>> >> >> >