Hmm... I guess this is meant to cc @Bingkun Pan?
On 2024/03/05 02:16:12 Hyukjin Kwon wrote:
> Is this related to https://github.com/apache/spark/pull/42428?
>
> cc @Yang,Jie(INF)
That sounds like a great suggestion.
From: Jungtaek Lim
Date: Tuesday, March 5, 2024, 10:46
To: Hyukjin Kwon
Cc: yangjie01, Dongjoon Hyun, dev, user
Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released
Yes, it's relevant to that PR. I wonder though: if we want to expose a
version switcher, it should be in the versionless doc (spark-website) rather
than a doc pinned to a specific version.
On Tue, Mar 5, 2024 at 11:18 AM Hyukjin Kwon wrote:
Is this related to https://github.com/apache/spark/pull/42428?
cc @Yang,Jie(INF)
On Mon, 4 Mar 2024 at 22:21, Jungtaek Lim wrote:
> Shall we revisit this functionality? The API doc is built with individual
> versions, and for each individual version we depend on other released
> versions. This
Thanks Jason for the detailed information and the bug associated with it.
Hopefully someone can provide more information about this pressing issue.
On Mon, Mar 4, 2024 at 1:26 PM Jason Xu wrote:
Hi Prem,
From the symptom of shuffle fetch failures plus a small amount of duplicate
and missing data, I think you might have run into this correctness bug:
https://issues.apache.org/jira/browse/SPARK-38388.
Node/shuffle failure is hard to avoid. I wonder if you have
non-deterministic logic and are calling repartition on its output?
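
For anyone else hitting this, here is a minimal local-mode sketch of the
risky pattern (names, values, and the output path are illustrative, not from
Prem's job). The point of SPARK-38388 is that when the shuffle key is
non-deterministic, a stage retry after a fetch failure can route the same row
to a different partition than the first attempt did, yielding duplicates
and/or missing rows:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.rand

object RepartitionRetrySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-38388-sketch") // illustrative app name
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = spark.range(0, 1000000).toDF("id")

    // Risky: the shuffle key comes from rand(), which is re-evaluated on
    // stage retry. After a FetchFailedException, retried map tasks can send
    // the same row to a different reducer than the first attempt did,
    // producing duplicate and/or missing rows (SPARK-38388).
    val risky = df.withColumn("bucket", rand() * 200).repartition($"bucket")

    // Safer: derive the shuffle key deterministically from the row itself,
    // so every task attempt routes each row to the same partition.
    val safer = df.withColumn("bucket", $"id" % 200).repartition($"bucket")

    safer.write.mode("overwrite").parquet("/tmp/out") // illustrative path
    spark.stop()
  }
}

If the data truly must be randomly redistributed, one commonly suggested
mitigation is to materialize the input (e.g. checkpoint or write it out)
before the shuffle, so retries re-read the same rows.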
super :(
On Mon, Mar 4, 2024 at 6:19 AM Mich Talebzadeh wrote:
"... in a nutshell if fetchFailedException occurs due to data node reboot
then it can create duplicate / missing data . so this is more of
hardware(env issue ) rather than spark issue ."
As an overall conclusion your point is correct, but again the answer is not
binary.
Spark core relies on a
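
To make the "not binary" part concrete: when the environment is flaky, Spark
has knobs that govern how aggressively it retries shuffle fetches and stage
re-attempts before giving up. A minimal sketch; the keys are genuine Spark
configuration properties, but the values are illustrative examples to tune
for your environment, not recommendations from this thread:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("shuffle-resilience-sketch") // illustrative name
  // Retry fetching shuffle blocks before surfacing a FetchFailedException
  // (defaults: 3 retries, 5s wait).
  .config("spark.shuffle.io.maxRetries", "6")
  .config("spark.shuffle.io.retryWait", "10s")
  // Allow more consecutive stage attempts after fetch failures before
  // aborting the stage (default 4).
  .config("spark.stage.maxConsecutiveAttempts", "8")
  .getOrCreate()

Note this does not fix the SPARK-38388 correctness issue discussed above; it
only reduces how often a transient node reboot escalates into a stage retry.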