Re: Tasks are skewed to one executor

2021-04-12 Thread Gourav Sengupta
Hi, looks like you have answered some questions which I generally ask. Another thing, can you please let me know the environment? Is it AWS, GCP, Azure, Databricks, HDP, etc.? Regards, Gourav On Sun, Apr 11, 2021 at 8:39 AM András Kolbert wrote: > Hi, > > Sure! > > Application: > - Spark versio

Re: Tasks are skewed to one executor

2021-04-11 Thread Mich Talebzadeh
Hi Andra, Thanks for the detail. So basically you are doing an ETL on the incoming stream. As I understand it, you have account, product and metric in your streaming data. Is it likely that your data is skewed (non-uniform) because one account or product is over-represented? What key(s) is used in yo
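The skew question above can be checked on a sample of the data before touching the job itself. The following is a minimal sketch in plain Python, not Spark code: `key_shares`, `partition_sizes`, and the sample records are hypothetical names and data made up for illustration, and the CRC32-modulo assignment is only a stand-in for a hash partitioner, not the one Spark actually uses. It assumes records shaped like the account/product/metric data described in this thread and 7 partitions to mirror the 7 executors mentioned.

```python
from collections import Counter
import zlib


def key_shares(records, key_fields):
    """Return (key, share_of_records) pairs, heaviest key first."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    total = sum(counts.values())
    return [(k, c / total) for k, c in counts.most_common()]


def partition_sizes(records, key_fields, num_partitions):
    """Count records per partition under a simple hash-by-key assignment.

    CRC32 is used only because it is deterministic across runs; it is an
    illustrative stand-in, not Spark's partitioner.
    """
    sizes = [0] * num_partitions
    for r in records:
        key = "|".join(str(r[f]) for f in key_fields)
        sizes[zlib.crc32(key.encode("utf-8")) % num_partitions] += 1
    return sizes


# Hypothetical sample in which account "A" dominates the batch.
sample = (
    [{"account": "A", "product": "X123", "metric": 51}] * 80
    + [{"account": "B", "product": "X124", "metric": 7}] * 10
    + [{"account": "C", "product": "X125", "metric": 3}] * 10
)

shares = key_shares(sample, ["account"])
sizes = partition_sizes(sample, ["account"], 7)

print("heaviest key and its share:", shares[0])
print("records per partition:", sizes)
```

If one key holds most of the records, every record for that key lands in the same partition no matter how many partitions exist, so one task (and one executor) ends up with most of the work.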

Re: Tasks are skewed to one executor

2021-04-11 Thread András Kolbert
Hi, Sure! Application: - Spark version 2.4 - Kafka stream (DStream, from Kafka 0.8 brokers) - 7 executors, 2 cores, 3700M memory size Logic: - Process initialises a dataframe that contains metrics per account/product (e.g. {"account":A, "product": X123, "metric": 51}) - After initiali

Re: Tasks are skewed to one executor

2021-04-10 Thread Mich Talebzadeh
Hi, Can you provide a bit more info please? How are you running this job and what is the streaming framework (Kafka, files etc)? HTH Mich

Tasks are skewed to one executor

2021-04-10 Thread András Kolbert
Hi, I have a streaming job and quite often executors die (due to memory errors, "unable to find location for shuffle", etc.) during the processing. I started digging and found that some of the tasks are concentrated on one executor, just as below: [image: image.png] Can this be the reason? Should I