Re: Logistic Regression Iterations causing High GC in Spark 2.3

2019-07-29 Thread Dhrubajyoti Hati
Actually, I didn't have any GC tuning in the beginning, and adding it didn't make any difference either. As mentioned earlier, I tried a small number of executors with a larger configuration and vice versa. Nothing helps. As for the code, it's a simple logistic regression, nothing with explicit broadcast o

Re: Logistic Regression Iterations causing High GC in Spark 2.3

2019-07-29 Thread Jörn Franke
I would remove all the GC tuning and add it back later, once you have found the underlying root cause. Usually more GC means you need to provide more memory, because something has changed (your application, the Spark version, etc.). We don’t have your full code, so we can't give exact advice, but you may want to rethink
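
A minimal sketch of what a "no GC tuning" baseline could look like, assuming the job is launched with spark-submit style --conf arguments. The helper names (baseline_conf, to_submit_args) and all sizing values are hypothetical placeholders, not settings taken from this thread. The only JVM options kept are standard Java 8 GC logging flags, which report GC activity without changing collector behaviour, so memory can be varied on its own while looking for the root cause.

```python
# Hypothetical helpers; the default sizes are placeholders, not recommendations.
def baseline_conf(executor_memory="8g", executor_cores=4, num_executors=10):
    """Build a Spark conf dict with GC logging only and no GC tuning flags."""
    return {
        "spark.executor.memory": executor_memory,
        "spark.executor.cores": str(executor_cores),
        "spark.executor.instances": str(num_executors),
        # Logging only: these flags print GC activity, they do not tune the collector.
        "spark.executor.extraJavaOptions":
            "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps",
    }


def to_submit_args(conf):
    """Flatten a conf dict into a list of --conf key=value arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Compare a run with more executor memory against the default baseline.
    for memory in ("8g", "16g"):
        print(" ".join(to_submit_args(baseline_conf(executor_memory=memory))))
```

Changing one setting at a time (here, only executor memory) makes it easier to tell whether the extra GC time comes from memory pressure or from something else that changed between versions.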

Logistic Regression Iterations causing High GC in Spark 2.3

2019-07-28 Thread Dhrubajyoti Hati
Hi, we were running logistic regression in Spark 2.2.X and then tried to see how it does in Spark 2.3.X. Now we are facing an issue while running a logistic regression model in Spark 2.3.X on top of YARN (GCP Dataproc). The TreeAggregate method takes a very long time due to very high GC Acti