The underlying kafka consumer does the new partition discovery.

On Tue, Aug 16, 2016 at 2:17 PM, Srikanth <srikanth...@gmail.com> wrote:
> Yes, SubscribePattern detects new partitions. It also has a comment saying
>
>> Subscribe to all topics matching specified pattern to get dynamically
>> assigned partitions.
>> * The pattern matching will be done periodically against topics existing
>> at the time of check.
>> * @param pattern pattern to subscribe to
>> * @param kafkaParams Kafka
>
> Who discovers the new partitions? The underlying kafka consumer or
> spark-streaming-kafka-0-10-assembly?
>
> Srikanth
>
> On Fri, Aug 12, 2016 at 5:15 PM, Cody Koeninger <c...@koeninger.org> wrote:
>>
>> Hrrm, that's interesting. Did you try with subscribe pattern, out of
>> curiosity?
>>
>> I haven't tested repartitioning on the underlying new Kafka consumer, so
>> it's possible I misunderstood something.
>>
>> On Aug 12, 2016 2:47 PM, "Srikanth" <srikanth...@gmail.com> wrote:
>>>
>>> I did try a test with spark 2.0 + spark-streaming-kafka-0-10-assembly.
>>> Partitions were increased using "bin/kafka-topics.sh --alter" after the
>>> spark job was started.
>>> I don't see messages from the new partitions in the DStream.
>>>
>>>> KafkaUtils.createDirectStream[Array[Byte], Array[Byte]](
>>>>     ssc, PreferConsistent,
>>>>     Subscribe[Array[Byte], Array[Byte]](topics, kafkaParams) )
>>>>   .map(r => (r.key(), r.value()))
>>>
>>> Also, the number of partitions did not increase either.
>>>
>>>> dataStream.foreachRDD( (rdd, curTime) => {
>>>>   logger.info(s"rdd has ${rdd.getNumPartitions} partitions.")
>>>
>>> Should I be setting some parameter/config? Is the doc for the new
>>> integration available?
>>>
>>> Thanks,
>>> Srikanth
>>>
>>> On Fri, Jul 22, 2016 at 2:15 PM, Cody Koeninger <c...@koeninger.org>
>>> wrote:
>>>>
>>>> No, restarting from a checkpoint won't do it, you need to re-define the
>>>> stream.
>>>>
>>>> Here's the jira for the 0.10 integration:
>>>>
>>>> https://issues.apache.org/jira/browse/SPARK-12177
>>>>
>>>> I haven't gotten docs completed yet, but there are examples at
>>>>
>>>> https://github.com/koeninger/kafka-exactly-once/tree/kafka-0.10
>>>>
>>>> On Fri, Jul 22, 2016 at 1:05 PM, Srikanth <srikanth...@gmail.com> wrote:
>>>> > In Spark 1.x, if we restart from a checkpoint, will it read from new
>>>> > partitions?
>>>> >
>>>> > If you can, please point us to a doc/link that talks about the Kafka
>>>> > 0.10 integration in Spark 2.0.
>>>> >
>>>> > On Fri, Jul 22, 2016 at 1:33 PM, Cody Koeninger <c...@koeninger.org>
>>>> > wrote:
>>>> >>
>>>> >> For the kafka 0.8 integration, you are literally starting a
>>>> >> streaming job against a fixed set of topicpartitions. It will not
>>>> >> change throughout the job, so you'll need to restart the spark job
>>>> >> if you change kafka partitions.
>>>> >>
>>>> >> For the kafka 0.10 / spark 2.0 integration, if you use Subscribe
>>>> >> or SubscribePattern, it should pick up new partitions as they are
>>>> >> added.
>>>> >>
>>>> >> On Fri, Jul 22, 2016 at 11:29 AM, Srikanth <srikanth...@gmail.com>
>>>> >> wrote:
>>>> >> > Hello,
>>>> >> >
>>>> >> > I'd like to understand how Spark Streaming (direct) handles Kafka
>>>> >> > partition addition. Will a running job be aware of new partitions
>>>> >> > and read from them, since it uses the Kafka APIs to query offsets
>>>> >> > and offsets are handled internally?
>>>> >> >
>>>> >> > Srikanth
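
Below is a minimal sketch (not from the thread) of the SubscribePattern-based direct stream discussed above, using the spark-streaming-kafka-0-10 API with Spark 2.0. The topic pattern "events.*", the broker address, the group id, and the metadata.max.age.ms value are placeholder assumptions; the point is only that a pattern subscription lets the underlying Kafka consumer pick up partitions (and matching topics) created after the job starts, subject to its periodic metadata refresh.

import java.util.regex.Pattern

import org.apache.kafka.common.serialization.ByteArrayDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.SubscribePattern

object SubscribePatternExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("subscribe-pattern-example")
    val ssc = new StreamingContext(conf, Seconds(10))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "localhost:9092",            // placeholder broker list
      "key.deserializer"   -> classOf[ByteArrayDeserializer],
      "value.deserializer" -> classOf[ByteArrayDeserializer],
      "group.id"           -> "example-group",             // placeholder group id
      "auto.offset.reset"  -> "latest",
      // How often the consumer refreshes topic metadata; new partitions are
      // discovered no faster than this interval (Kafka's default is 5 minutes).
      "metadata.max.age.ms" -> "30000"
    )

    val stream = KafkaUtils.createDirectStream[Array[Byte], Array[Byte]](
      ssc,
      PreferConsistent,
      // Matches topics whose names fit the pattern; partitions added to
      // matching topics are assigned on a later periodic check.
      SubscribePattern[Array[Byte], Array[Byte]](Pattern.compile("events.*"), kafkaParams)
    )

    stream.foreachRDD { rdd =>
      // The RDD partition count should track the current Kafka partition count.
      println(s"rdd has ${rdd.getNumPartitions} partitions")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}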