Hi,
I am using the new experimental Direct Stream API. Everything is working
fine, but I am not sure how to achieve fault tolerance.
Presently my Kafka config map looks like this:
configMap.put("zookeeper.connect", "192.168.51.98:2181");
configMap.put("group.id", UUID.randomUUID().toString());
configMap.put("auto.offset.reset", "smallest");
configMap.put("auto.commit.enable", "true");
configMap.put("topics", "IPDR31");
configMap.put("kafka.consumer.id", "kafkasparkuser");
configMap.put("bootstrap.servers", "192.168.50.124:9092");
Set<String> topic = new HashSet<String>();
topic.add("IPDR31");
JavaPairInputDStream<byte[], byte[]> kafkaData =
    KafkaUtils.createDirectStream(js, byte[].class, byte[].class,
        DefaultDecoder.class, DefaultDecoder.class, configMap, topic);
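For context, my understanding is that the direct stream exposes the offsets it consumed through HasOffsetRanges, so they could be saved and replayed after a failure. This is only a sketch of what I mean (the println is just illustrative, and I believe older Spark Java APIs expect the foreachRDD function to return Void):

```java
import org.apache.spark.streaming.kafka.HasOffsetRanges;
import org.apache.spark.streaming.kafka.OffsetRange;

// Sketch: read the Kafka offset ranges consumed by each batch of the
// direct stream, so they could be stored externally for fault tolerance.
kafkaData.foreachRDD(rdd -> {
    OffsetRange[] offsets = ((HasOffsetRanges) rdd.rdd()).offsetRanges();
    for (OffsetRange o : offsets) {
        System.out.println(o.topic() + " partition " + o.partition()
            + " offsets " + o.fromOffset() + " to " + o.untilOffset());
    }
    return null; // Function<JavaPairRDD<...>, Void> in Spark 1.3/1.4
});
```

Is tracking offsets like this the intended way to get fault tolerance with the direct stream, or is checkpointing alone enough?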
Questions -
Q1 - Is my Kafka configuration correct, or should it be changed?
Q2 - I also looked into checkpointing, but in my use case data
checkpointing is not required, only metadata checkpointing. Can I
achieve this, i.e. enable metadata checkpointing without data
checkpointing?
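To show what I mean by metadata checkpointing, this is roughly what I am considering. My understanding is that setting a checkpoint directory enables metadata checkpointing, while data (RDD) checkpointing only kicks in for stateful transformations, which I am not using. The HDFS path below is just a placeholder:

```java
// Sketch: create the context through getOrCreate so a restarted driver
// can recover from checkpointed metadata. Path is a placeholder.
JavaStreamingContext jssc = JavaStreamingContext.getOrCreate(
    "hdfs://namenode:8020/spark/checkpoints",
    () -> {
        JavaStreamingContext context =
            new JavaStreamingContext(sparkConf, Durations.seconds(10));
        context.checkpoint("hdfs://namenode:8020/spark/checkpoints");
        // ... create the direct stream and set up processing here ...
        return context;
    });
```

Is setting only the checkpoint directory like this sufficient to get metadata checkpointing without data checkpointing?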
Thanks
Abhishek Patel
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Kafka-Direct-Streaming-tp23685.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]