Hi,

I have a streaming job, and quite often executors die during processing
(due to memory errors, "unable to find location for shuffle", etc.). I
started digging and found that some of the tasks are concentrated on one
executor, as shown below:
[image: image.png]

Can this be the reason?
Should I repartition the underlying data before I execute a groupby on
top of it?
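
To make the question concrete, here is a minimal sketch in plain Python (not
Spark code) of what I suspect is happening. It assumes the groupby shuffles
records by a hash of the key, so every record with the same key lands in the
same partition no matter how the data was partitioned beforehand. The keys,
the counts, the crc32-based partitioner and the "#n" salt suffix are all
made up for illustration; they are not taken from my job or from Spark
internals.

```python
import zlib
from collections import Counter


def partition_of(key: str, num_partitions: int) -> int:
    # Deterministic stand-in for a hash partitioner.
    # This is NOT Spark's actual hash function, only an illustration.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_sizes(keys, num_partitions):
    # Count how many records each partition would receive after the shuffle.
    counts = Counter(partition_of(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def salt_keys(keys, num_salts):
    # Append a rotating suffix so one hot key becomes up to num_salts keys.
    return [f"{k}#{i % num_salts}" for i, k in enumerate(keys)]


# Hypothetical skewed input: one hot key plus many rare ones.
keys = ["hot"] * 9000 + [f"rare-{i}" for i in range(1000)]

plain = partition_sizes(keys, 8)
salted = partition_sizes(salt_keys(keys, 16), 8)

print("plain :", plain, "max =", max(plain))
print("salted:", salted, "max =", max(salted))
```

If this picture is right, a plain repartition before the groupby would not
help when the skew comes from the key itself, and I would instead need
something like salting the key and aggregating in two steps (first per
salted key, then per original key).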

Any advice is welcome.

Thanks
Andras
