Thanks. Can you point me to the patch that fixes the serialization stack? Maybe I can pull it in and rerun my job.
Chen

On Wed, Jul 15, 2015 at 4:40 PM, Tathagata Das <t...@databricks.com> wrote:

> Your streaming job may have been seemingly running ok, but the DStream
> checkpointing must have been failing in the background. It would have been
> visible in the log4j logs. In 1.4.0, we enabled fast failure for that so
> that checkpointing failures don't get hidden in the background.
>
> The fact that the serialization stack is not being shown correctly is a
> known bug in Spark 1.4.0, but it is fixed in 1.4.1, which is about to come
> out in the next couple of days. That should help you narrow down the
> culprit preventing serialization.
>
> On Wed, Jul 15, 2015 at 1:12 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Can you show us your function(s)?
>>
>> Thanks
>>
>> On Wed, Jul 15, 2015 at 12:46 PM, Chen Song <chen.song...@gmail.com>
>> wrote:
>>
>>> The streaming job had been running ok in 1.2 and 1.3. After I upgraded
>>> to 1.4, I started seeing the error below. It appears to fail in the
>>> validate method in StreamingContext. Has anything changed in 1.4.0
>>> w.r.t. DStream checkpointing?
>>>
>>> Detailed error from the driver:
>>>
>>> 15/07/15 18:00:39 ERROR yarn.ApplicationMaster: User class threw
>>> exception: *java.io.NotSerializableException: DStream checkpointing has
>>> been enabled but the DStreams with their functions are not serializable*
>>> Serialization stack:
>>>
>>> java.io.NotSerializableException: DStream checkpointing has been enabled
>>> but the DStreams with their functions are not serializable
>>> Serialization stack:
>>>
>>> at org.apache.spark.streaming.StreamingContext.validate(StreamingContext.scala:550)
>>> at org.apache.spark.streaming.StreamingContext.liftedTree1$1(StreamingContext.scala:587)
>>> at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:586)
>>>
>>> --
>>> Chen Song
>>>
>>
>

--
Chen Song
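
[Editor's note: for context, a minimal sketch of the usual cause of this
NotSerializableException. When DStream checkpointing is enabled,
StreamingContext.start() serializes the entire DStream graph, including the
closures attached to each transformation, so any non-serializable object
captured by a closure fails fast in validate(). The LookupClient class, the
socket source, and the checkpoint path below are hypothetical stand-ins for
whatever the real job captures.]

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical non-serializable helper, standing in for a client or
// connection object created on the driver.
class LookupClient {
  def lookup(key: String): String = key.toUpperCase
}

object CheckpointRepro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("checkpoint-repro").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint("/tmp/checkpoint-repro") // enables DStream checkpointing

    val client = new LookupClient() // lives on the driver, not Serializable

    val lines = ssc.socketTextStream("localhost", 9999)

    // BROKEN: this closure captures `client`, so the DStream graph is not
    // serializable and start() throws the NotSerializableException seen in
    // the stack trace above, from StreamingContext.validate().
    // lines.map(line => client.lookup(line)).print()

    // FIX: create the non-serializable object inside the closure, per
    // partition, so nothing unserializable is captured.
    lines.mapPartitions { iter =>
      val localClient = new LookupClient()
      iter.map(localClient.lookup)
    }.print()

    ssc.start()
    ssc.awaitTermination()
  }
}

[Instantiating the helper inside mapPartitions keeps it out of the
checkpointed graph; marking the captured field as a @transient lazy val on a
serializable wrapper is another common workaround.]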