>
> Chris Nauroth
>
>
> On Wed, Oct 19, 2022 at 8:18 AM Martin Andersson <
> martin.anders...@kambi.com> wrote:
>
>> Is your spark job batch or streaming?
>> ------
>> *From:* Sandeep Vinayak
>> *Sent:* Tuesday, October 18, 2022 19:48
> *To:* dev@spark.apache.org
> *Subject:* Missing data in spark output
>
>
> EXTERNAL SENDER. Do not click links or open attachments unless you
> recognize the sender and know the content is safe. DO NOT provide your
> username or password.
>
> Hello Everyone,
>
Is your spark job batch or streaming?
From: Sandeep Vinayak
Sent: Tuesday, October 18, 2022 19:48
To: dev@spark.apache.org
Subject: Missing data in spark output
Hi,
We have observed similar behavior in older versions of Spark, but we
are currently using 3.3.0, where we have not seen such issues.
Which version of Spark and Hadoop are you using?
On 18/10/2022 19:48, Sandeep Vinayak wrote:
Hello Everyone,
We have recently been observing intermittent data loss in Spark when
writing output to GCS (Google Cloud Storage). When rows are missing, they are
accompanied by duplicate rows. A re-run of the job doesn't have any
duplicate or missing rows. Since it's hard to debug, we are first