Hi Qian,

Does your dataset use the sparse vector format?
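
In case it helps, here is a rough back-of-envelope sketch in plain Python (not Spark code) of why the format matters. It assumes 500 non-zeros per row as a stand-in for your "hundreds of 1s", 8-byte doubles for values, and 4-byte ints for indices, and it ignores object and JVM overhead, so treat the numbers as an estimate only. In Spark itself, the sparse layout sketched here corresponds to what `Vectors.sparse(size, indices, values)` in `pyspark.ml.linalg` builds.

```python
# Rough per-row memory estimate: dense vs. sparse storage.
# All constants except NUM_COLUMNS are assumptions for illustration.

NUM_COLUMNS = 1_700_000    # from the original post
NONZEROS_PER_ROW = 500     # assumption: "hundreds of 1s" taken as 500
BYTES_PER_DOUBLE = 8       # assumption: values stored as 64-bit doubles
BYTES_PER_INT = 4          # assumption: indices stored as 32-bit ints


def dense_row_bytes(num_columns: int) -> int:
    """A dense row stores one double per column, zero or not."""
    return num_columns * BYTES_PER_DOUBLE


def sparse_row_bytes(nonzeros: int) -> int:
    """A sparse row stores one (index, value) pair per non-zero entry."""
    return nonzeros * (BYTES_PER_INT + BYTES_PER_DOUBLE)


dense = dense_row_bytes(NUM_COLUMNS)
sparse = sparse_row_bytes(NONZEROS_PER_ROW)

print(f"dense row:  {dense:,} bytes")
print(f"sparse row: {sparse:,} bytes")
print(f"ratio:      {dense / sparse:,.0f}x")
```

Under these assumptions a dense row takes about 13.6 MB against roughly 6 KB for a sparse one, so rows held as dense vectors could plausibly account for the GC pressure you are seeing.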



On Mon, Apr 22, 2019 at 5:03 PM Qian He <hq.ja...@gmail.com> wrote:

> Hi all,
>
> I'm using the Spark-provided LogisticRegression to fit a dataset. Each row
> of the data has 1.7 million columns, but it is sparse, with only hundreds
> of 1s. The Spark UI reported high GC time while the model was being
> trained, and my Spark application got stuck without any response. I have
> allocated 100 executors with 8g each.
>
> Is there anything I should do to make the training process complete
> successfully?
>
