Hi Gabor,

What is it that you want to do here?  If you just want to check that the 
partitions exist, but not fetch any data, you could use 
AdminClient#describeTopics for that.  If you want to create the topics, you 
could use AdminClient#createTopics.

best,
Colin


On Fri, Aug 9, 2019, at 11:23, Gabor Somogyi wrote:
> > Each KafkaConsumer method that returns metadata will already block until
> such metadata is available
> The old API was waiting infinitely but the new has a timeout which has
> effect on the metadata fetch as well.
> 
> Spark is interested in only the assigned partitions and/or
> latest/earliest/... offsets.
> Please see
> https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala
> At the moment the old poll(0) is used which can block infinitely and the
> API is marked deprecated.
> We would like to switch to the new API, see the background here:
> https://github.com/apache/spark/pull/25135
> However in such case there is no need to fetch any kind of data since
> driver is not doing any data processing.
> 
> G
> 
> 
> On Fri, Aug 9, 2019 at 7:39 PM Ryanne Dolan <ryannedo...@gmail.com> wrote:
> 
> > > pull some records even they're only interested in metadata.
> >
> > Jungtaek, what is the use-case here? Each KafkaConsumer method that returns
> > metadata will already block until such metadata is available... so why
> > would you need to apply this "hack" in the first place?
> >
> > Ryanne
> >
> > On Wed, Aug 7, 2019 at 2:24 AM Jungtaek Lim <kabh...@gmail.com> wrote:
> >
> > > Hi devs,
> > >
> > > I'm trying to replace deprecated poll(long) with poll(Duration), and
> > > realized there's no alternative which behaves exactly same as poll(0), as
> > > poll(0) has been used as a hack to only update metadata instead of
> > pulling
> > > records. poll(Duration.ZERO) wouldn't behave same since even updating
> > > metadata will be timed-out. So now end users would need to give more
> > > timeout and even pull some records even they're only interested in
> > > metadata.
> > >
> > > I looked back some KIPs which brought the change, and "discarded" KIP
> > > (KIP-288 [1]) actually proposed a new API which only pulls metadata.
> > > KIP-266 [2] is picked up instead but it didn't cover all the things what
> > > KIP-288 proposed. I'm seeing some doc explaining poll(0) hasn't been
> > > supported officially, but the hack has been widely used and they can't be
> > > ignored.
> > >
> > > Kafka test code itself relies on either deprecated poll(0),
> > > or updateAssignmentMetadataIfNeeded, which seems to be private API only
> > for
> > > testing.
> > > (Btw, I'd try out replacing poll(0) to updateAssignmentMetadataIfNeeded
> > as
> > > avoiding deprecated method - if it works I'll submit a PR.)
> > >
> > > I'm feeling that it would be ideal to expose
> > > `updateAssignmentMetadataIfNeeded` to the public API, maybe with renaming
> > > as `waitForAssignment` which was proposed in KIP-288 if it feels too
> > long.
> > >
> > > What do you think? If it sounds feasible I'd like to try out contribution
> > > on this. I'm new to contribute Kafka community, so not sure it would
> > > require a new KIP or not.
> > >
> > > Thanks,
> > > Jungtaek Lim (HeartSaVioR)
> > >
> > > 1.
> > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-288%3A+%5BDISCARDED%5D+Consumer.poll%28%29+timeout+semantic+change+and+new+waitForAssignment+method
> > > 2.
> > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-266%3A+Fix+consumer+indefinite+blocking+behavior
> > >
> >
>

Reply via email to