Hi Qian, does your dataset use the sparse vector format?
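To show why this matters, here is a rough back-of-the-envelope sketch of per-row storage. It assumes 8-byte doubles for values and 4-byte ints for indices, and it uses 300 nonzeros per row as a hypothetical stand-in for "hundreds of 1s". The helper names are only for illustration. In Spark ML the two layouts correspond to `Vectors.dense` and `Vectors.sparse` in `pyspark.ml.linalg`.

```python
# Rough per-row memory estimate: dense vs. sparse feature vectors.
# Object and JVM overhead are ignored; only the raw payload is counted.

NUM_FEATURES = 1_700_000  # columns per row, from the original post
NNZ_PER_ROW = 300         # ASSUMPTION: stands in for "hundreds of 1s"

BYTES_PER_DOUBLE = 8      # one stored value
BYTES_PER_INT = 4         # one stored index


def dense_row_bytes(num_features: int) -> int:
    """A dense vector stores one double per feature, zeros included."""
    return num_features * BYTES_PER_DOUBLE


def sparse_row_bytes(nnz: int) -> int:
    """A sparse vector stores one index and one value per nonzero."""
    return nnz * (BYTES_PER_INT + BYTES_PER_DOUBLE)


dense = dense_row_bytes(NUM_FEATURES)
sparse = sparse_row_bytes(NNZ_PER_ROW)

print(f"dense row:  {dense / 1e6:.1f} MB")
print(f"sparse row: {sparse / 1e3:.1f} KB")
```

Under these assumptions a dense row takes about 13.6 MB and a sparse row only a few KB, so rows that end up dense could by themselves explain heavy GC pressure on 8g executors.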
On Mon, Apr 22, 2019 at 5:03 PM Qian He <hq.ja...@gmail.com> wrote:
> Hi all,
>
> I'm using the Spark-provided LogisticRegression to fit a dataset. Each row of
> the data has 1.7 million columns, but it is sparse, with only hundreds of
> 1s. The Spark UI reported high GC time while the model was being trained, and
> my Spark application got stuck without any response. I have allocated 100
> executors and 8g for each executor.
>
> Is there anything I should do to make the training process complete
> successfully?