You can set the "auto.offset.reset" configuration through the
"kafkaParams" parameter, which is available in some of the other
overloaded createStream APIs.

By default Kafka will pick up data from the latest offset unless you
explicitly set it otherwise; this is the behavior of Kafka, not Spark.
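A minimal sketch of what that could look like, assuming the Spark 1.x
spark-streaming-kafka module and the old (pre-0.9) consumer, where the
value is "smallest"; the ZK quorum, group id, and topic names below are
placeholders:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.kafka.KafkaUtils

// Pass kafkaParams to the overloaded createStream so a topic with no
// committed offsets for this group is read from the beginning.
val kafkaParams = Map[String, String](
  "zookeeper.connect" -> "zkHost:2181",  // placeholder ZK quorum
  "group.id" -> "my-consumer-group",     // placeholder consumer group
  "auto.offset.reset" -> "smallest"      // "smallest" for the old consumer
)

val kafkaStream = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
  streamingContext,
  kafkaParams,
  Map("my-topic" -> 1),  // per-topic number of partitions to consume
  StorageLevel.MEMORY_AND_DISK_SER
)
```

Note that auto.offset.reset only takes effect when the group has no
committed offset for the topic, which is exactly the new-topic case you
describe.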

Thanks
Saisai

On Mon, Feb 22, 2016 at 5:52 PM, Paul Leclercq <paul.lecle...@tabmo.io>
wrote:

> Hi,
>
> Do you know why, with the receiver approach
> <http://spark.apache.org/docs/latest/streaming-kafka-integration.html#approach-1-receiver-based-approach>
> and a *consumer group*, a new topic is not read from the beginning but
> from the latest offset?
>
> Code example :
>
>  val kafkaStream = KafkaUtils.createStream(streamingContext,
>      [ZK quorum], [consumer group id],
>      [per-topic number of Kafka partitions to consume])
>
>
> Is there a way, *only for new topics*, to read from the beginning?
>
> From the Confluence FAQ:
>
>> Alternatively, you can configure the consumer by setting
>> auto.offset.reset to "earliest" for the new consumer in 0.9 and "smallest"
>> for the old consumer.
>
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whydoesmyconsumernevergetanydata?
>
> Thanks
> --
>
> Paul Leclercq
>
