Re: Resume checkpoint failed with Spark Streaming Kafka via createDirectStream under heavy reprocessing

2015-07-29 Thread Nicolas Phung
> On Tue, Jul 28, 2015 at 9:30 AM, Nicolas Phung wrote:
>
>> Hi,
>>
>> After using KafkaUtils.createDirectStream[Object, Object,
>> KafkaAvroDecoder, KafkaAvroDecoder, Option[AnalyticEventEnriched]](ssc,
>> kafkaParams, map, messageHandler), I'
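
A minimal sketch of how that call and its messageHandler fit together, per the advice downthread: catch the decoder failure inside the handler and return an Option, so a corrupt record becomes a None instead of failing the batch. The decode function and the surrounding def are placeholders for illustration, not code from the thread:

import scala.reflect.ClassTag

import io.confluent.kafka.serializers.KafkaAvroDecoder
import kafka.common.TopicAndPartition
import kafka.message.MessageAndMetadata
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.kafka.KafkaUtils

def avroStream[T: ClassTag](ssc: StreamingContext,
                            kafkaParams: Map[String, String],
                            fromOffsets: Map[TopicAndPartition, Long],
                            decode: Object => T) = {
  // mmd.message() runs the KafkaAvroDecoder lazily, so a corrupt record
  // throws here, inside the handler, where it can be caught and skipped.
  val messageHandler = (mmd: MessageAndMetadata[Object, Object]) =>
    try Some(decode(mmd.message()))
    catch { case e: Exception => None } // log and skip the bad offset

  KafkaUtils.createDirectStream[Object, Object, KafkaAvroDecoder,
    KafkaAvroDecoder, Option[T]](ssc, kafkaParams, fromOffsets, messageHandler)
    .flatMap(opt => opt) // keep only the records that decoded cleanly
}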

Re: Resume checkpoint failed with Spark Streaming Kafka via createDirectStream under heavy reprocessing

2015-07-28 Thread Nicolas Phung
> failure when deserializing the message from the MessageAndMetadata, I'd
> just go ahead and do the work in the messageHandler.
>
> On Fri, Jul 24, 2015 at 2:46 AM, Nicolas Phung wrote:
>
>> Hello,
>>
>> I manage to read all my data back with skipping offset tha

Re: Resume checkpoint failed with Spark Streaming Kafka via createDirectStream under heavy reprocessing

2015-07-24 Thread Nicolas Phung
Thank you for your help Cody.

Regards,
Nicolas PHUNG

On Tue, Jul 21, 2015 at 4:53 PM, Cody Koeninger wrote:

> Yeah, I'm referring to that api.
>
> If you want to filter messages in addition to catching that exception,
> have your messageHandler return an option, so the type R

Re: Resume checkpoint failed with Spark Streaming Kafka via createDirectStream under heavy reprocessing

2015-07-21 Thread Nicolas Phung
7:08 PM, Cody Koeninger wrote:

> Yeah, in the function you supply for the messageHandler parameter to
> createDirectStream, catch the exception and do whatever makes sense for
> your application.
>
> On Mon, Jul 20, 2015 at 11:58 AM, Nicolas Phung wrote:
>
>> Hello,

Re: Resume checkpoint failed with Spark Streaming Kafka via createDirectStream under heavy reprocessing

2015-07-20 Thread Nicolas Phung
KafkaUtils.createDirectStream?

Regards,
Nicolas PHUNG

On Mon, Jul 20, 2015 at 3:46 PM, Cody Koeninger wrote:

> I'd try logging the offsets for each message, see where problems start,
> then try using the console consumer starting at those offsets and see if
> you can reproduce the p
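
Cody's debugging suggestion plugs into the same messageHandler slot; a sketch, not code from the thread:

import kafka.message.MessageAndMetadata

// Trace every record's position; the last offset logged before the failure
// is the one to replay with a consumer to reproduce the problem.
val loggingHandler = (mmd: MessageAndMetadata[Array[Byte], Array[Byte]]) => {
  println(s"topic=${mmd.topic} partition=${mmd.partition} offset=${mmd.offset}")
  mmd.message()
}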

Re: Resume checkpoint failed with Spark Streaming Kafka via createDirectStream under heavy reprocessing

2015-07-20 Thread Nicolas Phung
mations or suggestions, please tell me.

Regards,
Nicolas PHUNG

On Thu, Jul 16, 2015 at 7:08 PM, Cody Koeninger wrote:

> Not exactly the same issue, but possibly related:
>
> https://issues.apache.org/jira/browse/KAFKA-1196
>
> On Thu, Jul 16, 2015 at 12:03 PM, Cody Koeninger

Resume checkpoint failed with Spark Streaming Kafka via createDirectStream under heavy reprocessing

2015-07-16 Thread Nicolas Phung
it resumes the checkpoint as expected without missing data.

Did someone encounter something similar? How did you solve this?

Regards,
Nicolas PHUNG
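
For context, the checkpoint recovery being discussed is the standard StreamingContext.getOrCreate idiom; a minimal sketch, with the checkpoint path and batch interval as placeholders:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "hdfs:///tmp/streaming-checkpoint" // placeholder path

def createContext(): StreamingContext = {
  val ssc = new StreamingContext(new SparkConf().setAppName("app"), Seconds(10))
  ssc.checkpoint(checkpointDir)
  // set up KafkaUtils.createDirectStream and its transformations here;
  // this function only runs when no checkpoint exists yet
  ssc
}

// On restart this rebuilds the context from the checkpoint, and the direct
// stream resumes from the offsets stored in it.
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()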

Spark Streaming Kafka Consumer, Confluent Platform, Avro & StorageLevel

2015-04-13 Thread Nicolas Phung
ser": "Chrome", "browserVersion": "41.0.2272.101", "browserRenderer": "WEBKIT", "physicalServerOrigin": "01.front.local"} Avro content: {"group": "", "domainType": "job", "action": "apply_freely", "entity": "14564564132", "user": "user", "session": "session", "date": "20150326T154052.000+0100", "ip": "192.168.0.1", "userAgent": "Mozilla\/5.0 (X11; Linux x86_64) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/41.0.2272.101 Safari\/537.36", "referer": "http:\/\/www.kb.com\/offre\/controleur-de-gestion-h-f-9775029", "os": "Linux", "deviceType": "Computer", "browser": "Chrome", "browserVersion": "41.0.2272.101", "browserRenderer": "WEBKIT", "physicalServerOrigin": "01.front.local"} If Kryo is disabled, the Java serialization seems to be able to read the record as expected in both _SER / NON _SER StorageLevel. I've tried several things with Kryo serialization with no good results (even just activate Kryo without registering any class manually). I don't know maybe I'm doing something wrong with the StorageLevel in Spark ? People in Confluent Platform mailing list <https://groups.google.com/forum/#!topic/confluent-platform/tNdzyR2Ce00> suggest me to use chill-avro 0.5.2, so I try the following Kryo registration with no luck (I've got the same issue described). kryo.register(classOf[GenericRecord], AvroSerializer.GenericRecordSerializer[GenericRecord]()) kryo.register(classOf[Event], AvroSerializer.SpecificRecordSerializer[Event]) I've tried with the serializer from ADAMKryoRegistrator.scala <https://github.com/bigdatagenomics/adam/blob/master/adam-core/src/main/scala/org/bdgenomics/adam/serialization/ADAMKryoRegistrator.scala> too but it doesn't work either. Regards, Nicolas PHUNG

Re: Spark streaming with Kafka, multiple partitions fail, single partition ok

2015-03-31 Thread Nicolas Phung
Hello,

@Akhil Das I'm trying to use the experimental API:
https://github.com/apache/spark/blob/master/examples/scala-2.10/src/main/scala/org/apache/spark/examples/streaming/DirectKafkaWordCount.scala
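
That example reduces to roughly the following, with broker address and topic name as placeholders. The direct stream creates one Spark partition per Kafka partition, so a multi-partition topic needs no extra handling:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val ssc = new StreamingContext(
  new SparkConf().setAppName("DirectKafkaWordCount"), Seconds(2))

val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Set("mytopic"))

messages.map(_._2) // drop the key, keep the message payload
  .flatMap(_.split(" "))
  .map(word => (word, 1L))
  .reduceByKey(_ + _)
  .print()

ssc.start()
ssc.awaitTermination()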

Spark streaming with Kafka, multiple partitions fail, single partition ok

2015-03-30 Thread Nicolas Phung
wrong? Can you help me find the right way to write this with a Kafka topic with multiple partitions?

Regards,
Nicolas PHUNG