Thanks Cody (forgot to reply-all earlier, apologies)!
One more question for the list: I'm now seeing a java.lang.ClassNotFoundException for kafka.OffsetRange upon relaunching the streaming job after a previous run (via spark-submit):

15/08/24 13:07:11 INFO CheckpointReader: Attempting to load checkpoint from file hdfs://namenode***/shared/sand_checkpoint/checkpoint-1440445995000
15/08/24 13:07:11 WARN CheckpointReader: Error reading checkpoint from file hdfs://namenode***/shared/sand_checkpoint/checkpoint-1440445995000
java.io.IOException: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.OffsetRange
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1242)
        at org.apache.spark.streaming.DStreamGraph.readObject(DStreamGraph.scala:188)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
        ...

Is there something I'm missing with checkpointing to cause the above error? I found this discussion for kafkaRDDPartition: https://github.com/apache/spark/pull/3798#discussion_r24019256, but it seems like that was resolved afterwards.

Thanks!

On Mon, Aug 24, 2015 at 10:22 AM, Cody Koeninger <c...@koeninger.org> wrote:

> It doesn't matter if shuffling occurs. Just update ZK from the driver,
> inside the foreachRDD, after all your dynamodb updates are done. Since
> you're just doing it for monitoring purposes, that should be fine.
>
> On Mon, Aug 24, 2015 at 12:11 PM, suchenzang <suchenz...@gmail.com> wrote:
>
>> Forgot to include the PR I was referencing:
>> https://github.com/apache/spark/pull/4805/
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Direct-Streaming-With-ZK-Updates-tp24423p24424.html
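
For reference, the suggestion quoted above (update ZK from the driver, inside the foreachRDD, after the DynamoDB updates are done) might look roughly like the sketch below. It assumes the Kafka direct stream API from spark-streaming-kafka. The broker address, topic, group name, ZooKeeper path layout, and the writeToDynamo and updateZk helpers are all hypothetical placeholders and not taken from the actual job. Checkpoint setup is left out.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

object ZkOffsetMonitoringSketch {

  // Hypothetical ZooKeeper path layout; use whatever the monitoring tool expects.
  def zkPath(group: String, topic: String, partition: Int): String =
    s"/consumers/$group/offsets/$topic/$partition"

  // Placeholder for the real DynamoDB write; this one only drains the iterator.
  def writeToDynamo(records: Iterator[(String, String)]): Unit =
    records.foreach(_ => ())

  // Placeholder for a real ZooKeeper client call; this one only prints.
  def updateZk(path: String, offset: Long): Unit =
    println(s"$path -> $offset")

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("zk-offset-monitoring-sketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Assumed broker and topic, for illustration only.
    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
    val topics = Set("events")

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    stream.foreachRDD { rdd =>
      // Read the offsets from the RDD that comes straight from the direct
      // stream, before any transformation is applied to it.
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges

      // Runs on the executors. foreachPartition is an action, so the driver
      // blocks here until every partition has been written.
      rdd.foreachPartition(records => writeToDynamo(records))

      // Back on the driver: record the offsets only after the writes finished.
      offsetRanges.foreach { o =>
        updateZk(zkPath("sand-monitor", o.topic, o.partition), o.untilOffset)
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Because the offsets are captured on the driver before the output action runs, later shuffles inside the batch do not affect them, which matches the point that it doesn't matter if shuffling occurs.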