Spark streaming data loss

2017-06-19 Thread vasanth kumar
Hi, I have a Spark Kafka streaming job running in Yarn cluster mode with spark.task.maxFailures=4 (default), spark.yarn.max.executor.failures=8, number of executors=1, spark.streaming.stopGracefullyOnShutdown=false, and checkpointing enabled. When there is a RuntimeException in a batch in the executor, then the same
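The setup described above can be sketched as a spark-submit invocation; the application class and JAR names below are placeholders, not anything from the thread:

```shell
# Sketch of the job submission described above; com.example.StreamingJob
# and streaming-job.jar are placeholder names.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 1 \
  --conf spark.task.maxFailures=4 \
  --conf spark.yarn.max.executor.failures=8 \
  --conf spark.streaming.stopGracefullyOnShutdown=false \
  --class com.example.StreamingJob \
  streaming-job.jar
```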

Re: Spark Streaming Data loss on failure to write BlockAdditionEvent failure to WAL

2016-11-17 Thread Dirceu Semighini Filho
In case we want to use a pseudo file-system (like S3) which does not support append, what are our options? I am not familiar with the code yet, but is it possible to generate a new file whenever a conflict of this sort happens? Th
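For WALs on stores without append support such as S3, Spark 1.6+ documents properties that close and roll the WAL file after every write instead of appending. A sketch of the relevant settings (whether they fully address the poster's case is an assumption):

```
spark.streaming.receiver.writeAheadLog.enable=true
# Close the WAL file after each write instead of appending; intended
# for file systems such as S3 that do not support append
spark.streaming.receiver.writeAheadLog.closeFileAfterWrite=true
spark.streaming.driver.writeAheadLog.closeFileAfterWrite=true
```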

Re: Spark Streaming Data loss on failure to write BlockAdditionEvent failure to WAL

2016-11-17 Thread Arijit
From: Dirceu Semighini Filho Sent: Thursday, November 17, 2016 6:50:28 AM To: Arijit Cc: Tathagata Das; user@spark.apache.org Subject: Re: Spark Streaming Data loss on failure to write BlockAdditionEvent failure to WAL Hi Arijit, Have you found a solution for this? I'

Re: Spark Streaming Data loss on failure to write BlockAdditionEvent failure to WAL

2016-11-17 Thread Dirceu Semighini Filho
ijit -- From: Tathagata Das Sent: Monday, November 7, 2016 7:59:06 PM To: Arijit Cc: user@spark.apache.org Subject: Re: Spark Streaming Data loss on failure to write BlockAdditionEvent failure to WAL For WAL in Spark t

Spark streaming data loss due to timeout in writing BlockAdditionEvent to WAL by the driver

2016-11-14 Thread Arijit
Hi, We are seeing another case of data loss/drop when the following exception happens. This particular exception, treated as a WARN, resulted in dropping 2095 events from processing. 16/10/26 19:24:08 WARN ReceivedBlockTracker: Exception thrown while writing record: BlockAdditionEvent(ReceivedB
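The timeout behind this warning comes from the driver's batched WAL writer. A hedged sketch of the knobs involved, with property names and defaults as documented for Spark 1.6 (whether raising the timeout avoids the drop in this case is not confirmed in the thread):

```
# Whether the driver batches WAL records before writing (default: true)
spark.streaming.driver.writeAheadLog.allowBatching=true
# How long (ms) a batched WAL write may take before timing out (default: 5000)
spark.streaming.driver.writeAheadLog.batchingTimeout=5000
```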

Re: Spark Streaming Data loss on failure to write BlockAdditionEvent failure to WAL

2016-11-08 Thread Arijit
am not familiar with the code yet, but is it possible to generate a new file whenever a conflict of this sort happens? Thanks again, Arijit From: Tathagata Das Sent: Monday, November 7, 2016 7:59:06 PM To: Arijit Cc: user@spark.apache.org Subject: Re: Spark St

Re: Spark Streaming Data loss on failure to write BlockAdditionEvent failure to WAL

2016-11-07 Thread Tathagata Das
For the WAL in Spark to work with HDFS, the HDFS version you are running must support file appends. Contact your HDFS package/installation provider to figure out whether this is supported by your HDFS installation. On Mon, Nov 7, 2016 at 2:04 PM, Arijit wrote: Hello All, We are using Spark 1
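On older Hadoop distributions, append support is toggled in hdfs-site.xml; a minimal sketch (note the flag is enabled by default in most modern Hadoop releases and removed entirely in recent ones, so this only applies to old or customized installations):

```xml
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```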

Spark Streaming Data loss on failure to write BlockAdditionEvent failure to WAL

2016-11-07 Thread Arijit
Hello All, We are using Spark 1.6.2 with WAL enabled and are encountering data loss when the following exception/warning happens. We are using HDFS as our checkpoint directory. Questions are: 1. Is this a bug in Spark or an issue with our configuration? The source looks like the following. Which file
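For reference, the WAL-plus-checkpointing setup discussed in this thread is typically wired up as in the sketch below. The checkpoint path, app name, and batch interval are placeholders (the thread does not show the poster's actual code), and the job must be built against Spark 1.6.x:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "hdfs:///user/example/checkpoints" // placeholder path

def createContext(): StreamingContext = {
  val conf = new SparkConf()
    .setAppName("wal-example") // placeholder app name
    // Enables the receiver-based write-ahead log discussed in this thread
    .set("spark.streaming.receiver.writeAheadLog.enable", "true")
  val ssc = new StreamingContext(conf, Seconds(10)) // placeholder interval
  ssc.checkpoint(checkpointDir)
  // ... define Kafka receiver input streams and transformations here ...
  ssc
}

// Recover from the checkpoint if one exists, otherwise build a new context
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()
```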