@Jungtaek, thanks for the explanation!
@Colin, please see the attached code in my previous mail for all the
details.

On Sun, Aug 11, 2019 at 1:20 PM Jungtaek Lim <kabh...@gmail.com> wrote:

> Btw, I'd like to ask you to move on thread for KIP discussion, as it will
> make us reaching conclusion faster and have single channel to discuss.
>
> On Sun, Aug 11, 2019 at 8:16 PM Jungtaek Lim <kabh...@gmail.com> wrote:
>
> > So we have some use case which we don't just rely on everything what
> Kafka
> > consumer provides. We want to know current assignment on this consumer,
> and
> > to get the latest assignment, we called the hack `poll(0)`.
> >
> > That said, we don't want to pull any records here, and there's no way to
> > accomplish this. Please correct me if I'm missing something.
> >
> > Thanks,
> > Jungtaek Lim (HeartSaVioR)
> >
> > On Sat, Aug 10, 2019 at 7:32 AM Colin McCabe <cmcc...@apache.org> wrote:
> >
> >> Hi Gabor,
> >>
> >> What is it that you want to do here?  If you just want to check that the
> >> partitions exist, but not fetch any data, you could use
> >> AdminClient#describeTopics for that.  If you want to create the topics,
> you
> >> could use AdminClient#createTopics.
> >>
> >> best,
> >> Colin
> >>
> >>
> >> On Fri, Aug 9, 2019, at 11:23, Gabor Somogyi wrote:
> >> > > Each KafkaConsumer method that returns metadata will already block
> >> until
> >> > such metadata is available
> >> > The old API was waiting infinitely but the new has a timeout which has
> >> > effect on the metadata fetch as well.
> >> >
> >> > Spark is interested in only the assigned partitions and/or
> >> > latest/earliest/... offsets.
> >> > Please see
> >> >
> >>
> https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala
> >> > At the moment the old poll(0) is used which can block infinitely and
> the
> >> > API is marked deprecated.
> >> > We would like to switch to the new API, see the background here:
> >> > https://github.com/apache/spark/pull/25135
> >> > However in such case there is no need to fetch any kind of data since
> >> > driver is not doing any data processing.
> >> >
> >> > G
> >> >
> >> >
> >> > On Fri, Aug 9, 2019 at 7:39 PM Ryanne Dolan <ryannedo...@gmail.com>
> >> wrote:
> >> >
> >> > > > pull some records even they're only interested in metadata.
> >> > >
> >> > > Jungtaek, what is the use-case here? Each KafkaConsumer method that
> >> returns
> >> > > metadata will already block until such metadata is available... so
> why
> >> > > would you need to apply this "hack" in the first place?
> >> > >
> >> > > Ryanne
> >> > >
> >> > > On Wed, Aug 7, 2019 at 2:24 AM Jungtaek Lim <kabh...@gmail.com>
> >> wrote:
> >> > >
> >> > > > Hi devs,
> >> > > >
> >> > > > I'm trying to replace deprecated poll(long) with poll(Duration),
> and
> >> > > > realized there's no alternative which behaves exactly same as
> >> poll(0), as
> >> > > > poll(0) has been used as a hack to only update metadata instead of
> >> > > pulling
> >> > > > records. poll(Duration.ZERO) wouldn't behave same since even
> >> updating
> >> > > > metadata will be timed-out. So now end users would need to give
> more
> >> > > > timeout and even pull some records even they're only interested in
> >> > > > metadata.
> >> > > >
> >> > > > I looked back some KIPs which brought the change, and "discarded"
> >> KIP
> >> > > > (KIP-288 [1]) actually proposed a new API which only pulls
> metadata.
> >> > > > KIP-266 [2] is picked up instead but it didn't cover all the
> things
> >> what
> >> > > > KIP-288 proposed. I'm seeing some doc explaining poll(0) hasn't
> been
> >> > > > supported officially, but the hack has been widely used and they
> >> can't be
> >> > > > ignored.
> >> > > >
> >> > > > Kafka test code itself relies on either deprecated poll(0),
> >> > > > or updateAssignmentMetadataIfNeeded, which seems to be private API
> >> only
> >> > > for
> >> > > > testing.
> >> > > > (Btw, I'd try out replacing poll(0) to
> >> updateAssignmentMetadataIfNeeded
> >> > > as
> >> > > > avoiding deprecated method - if it works I'll submit a PR.)
> >> > > >
> >> > > > I'm feeling that it would be ideal to expose
> >> > > > `updateAssignmentMetadataIfNeeded` to the public API, maybe with
> >> renaming
> >> > > > as `waitForAssignment` which was proposed in KIP-288 if it feels
> too
> >> > > long.
> >> > > >
> >> > > > What do you think? If it sounds feasible I'd like to try out
> >> contribution
> >> > > > on this. I'm new to contribute Kafka community, so not sure it
> would
> >> > > > require a new KIP or not.
> >> > > >
> >> > > > Thanks,
> >> > > > Jungtaek Lim (HeartSaVioR)
> >> > > >
> >> > > > 1.
> >> > > >
> >> > > >
> >> > >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-288%3A+%5BDISCARDED%5D+Consumer.poll%28%29+timeout+semantic+change+and+new+waitForAssignment+method
> >> > > > 2.
> >> > > >
> >> > > >
> >> > >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-266%3A+Fix+consumer+indefinite+blocking+behavior
> >> > > >
> >> > >
> >> >
> >>
> >
> >
> > --
> > Name : Jungtaek Lim
> > Blog : http://medium.com/@heartsavior
> > Twitter : http://twitter.com/heartsavior
> > LinkedIn : http://www.linkedin.com/in/heartsavior
> >
>
>
> --
> Name : Jungtaek Lim
> Blog : http://medium.com/@heartsavior
> Twitter : http://twitter.com/heartsavior
> LinkedIn : http://www.linkedin.com/in/heartsavior
>

Reply via email to