@Jungtaek, thanks for the explanation! @Colin, please see the attached code in my previous mail for all the details.
On Sun, Aug 11, 2019 at 1:20 PM Jungtaek Lim <kabh...@gmail.com> wrote: > Btw, I'd like to ask you to move on thread for KIP discussion, as it will > make us reaching conclusion faster and have single channel to discuss. > > On Sun, Aug 11, 2019 at 8:16 PM Jungtaek Lim <kabh...@gmail.com> wrote: > > > So we have some use case which we don't just rely on everything what > Kafka > > consumer provides. We want to know current assignment on this consumer, > and > > to get the latest assignment, we called the hack `poll(0)`. > > > > That said, we don't want to pull any records here, and there's no way to > > accomplish this. Please correct me if I'm missing something. > > > > Thanks, > > Jungtaek Lim (HeartSaVioR) > > > > On Sat, Aug 10, 2019 at 7:32 AM Colin McCabe <cmcc...@apache.org> wrote: > > > >> Hi Gabor, > >> > >> What is it that you want to do here? If you just want to check that the > >> partitions exist, but not fetch any data, you could use > >> AdminClient#describeTopics for that. If you want to create the topics, > you > >> could use AdminClient#createTopics. > >> > >> best, > >> Colin > >> > >> > >> On Fri, Aug 9, 2019, at 11:23, Gabor Somogyi wrote: > >> > > Each KafkaConsumer method that returns metadata will already block > >> until > >> > such metadata is available > >> > The old API was waiting infinitely but the new has a timeout which has > >> > effect on the metadata fetch as well. > >> > > >> > Spark is interested in only the assigned partitions and/or > >> > latest/earliest/... offsets. > >> > Please see > >> > > >> > https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala > >> > At the moment the old poll(0) is used which can block infinitely and > the > >> > API is marked deprecated. > >> > We would like to switch to the new API, see the background here: > >> > https://github.com/apache/spark/pull/25135 > >> > However in such case there is no need to fetch any kind of data since > >> > driver is not doing any data processing. > >> > > >> > G > >> > > >> > > >> > On Fri, Aug 9, 2019 at 7:39 PM Ryanne Dolan <ryannedo...@gmail.com> > >> wrote: > >> > > >> > > > pull some records even they're only interested in metadata. > >> > > > >> > > Jungtaek, what is the use-case here? Each KafkaConsumer method that > >> returns > >> > > metadata will already block until such metadata is available... so > why > >> > > would you need to apply this "hack" in the first place? > >> > > > >> > > Ryanne > >> > > > >> > > On Wed, Aug 7, 2019 at 2:24 AM Jungtaek Lim <kabh...@gmail.com> > >> wrote: > >> > > > >> > > > Hi devs, > >> > > > > >> > > > I'm trying to replace deprecated poll(long) with poll(Duration), > and > >> > > > realized there's no alternative which behaves exactly same as > >> poll(0), as > >> > > > poll(0) has been used as a hack to only update metadata instead of > >> > > pulling > >> > > > records. poll(Duration.ZERO) wouldn't behave same since even > >> updating > >> > > > metadata will be timed-out. So now end users would need to give > more > >> > > > timeout and even pull some records even they're only interested in > >> > > > metadata. > >> > > > > >> > > > I looked back some KIPs which brought the change, and "discarded" > >> KIP > >> > > > (KIP-288 [1]) actually proposed a new API which only pulls > metadata. > >> > > > KIP-266 [2] is picked up instead but it didn't cover all the > things > >> what > >> > > > KIP-288 proposed. I'm seeing some doc explaining poll(0) hasn't > been > >> > > > supported officially, but the hack has been widely used and they > >> can't be > >> > > > ignored. > >> > > > > >> > > > Kafka test code itself relies on either deprecated poll(0), > >> > > > or updateAssignmentMetadataIfNeeded, which seems to be private API > >> only > >> > > for > >> > > > testing. > >> > > > (Btw, I'd try out replacing poll(0) to > >> updateAssignmentMetadataIfNeeded > >> > > as > >> > > > avoiding deprecated method - if it works I'll submit a PR.) > >> > > > > >> > > > I'm feeling that it would be ideal to expose > >> > > > `updateAssignmentMetadataIfNeeded` to the public API, maybe with > >> renaming > >> > > > as `waitForAssignment` which was proposed in KIP-288 if it feels > too > >> > > long. > >> > > > > >> > > > What do you think? If it sounds feasible I'd like to try out > >> contribution > >> > > > on this. I'm new to contribute Kafka community, so not sure it > would > >> > > > require a new KIP or not. > >> > > > > >> > > > Thanks, > >> > > > Jungtaek Lim (HeartSaVioR) > >> > > > > >> > > > 1. > >> > > > > >> > > > > >> > > > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-288%3A+%5BDISCARDED%5D+Consumer.poll%28%29+timeout+semantic+change+and+new+waitForAssignment+method > >> > > > 2. > >> > > > > >> > > > > >> > > > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-266%3A+Fix+consumer+indefinite+blocking+behavior > >> > > > > >> > > > >> > > >> > > > > > > -- > > Name : Jungtaek Lim > > Blog : http://medium.com/@heartsavior > > Twitter : http://twitter.com/heartsavior > > LinkedIn : http://www.linkedin.com/in/heartsavior > > > > > -- > Name : Jungtaek Lim > Blog : http://medium.com/@heartsavior > Twitter : http://twitter.com/heartsavior > LinkedIn : http://www.linkedin.com/in/heartsavior >