Re: pyspark - memory leak leading to OOM after submitting 100 jobs?

2019-11-01 Thread Holden Karau
On Thu, Oct 31, 2019 at 10:04 PM Nicolas Paris wrote:
> have you deactivated the spark.ui ?
> I have read several threads explaining the ui can lead to OOM because it
> stores 1000 dags by default
>
> On Sun, Oct 20, 2019 at 03:18:20AM -0700, Paul Wais wrote:
> > Dear List,
> >
> > I've observed
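The retention behavior mentioned in the quote can be tuned at submit time. As a minimal sketch (the configuration keys `spark.ui.retainedJobs`, `spark.ui.retainedStages`, and `spark.sql.ui.retainedExecutions` are real Spark settings that default to 1000; the values and the `my_job.py` script name here are illustrative, and `spark.ui.enabled=false` would disable the UI entirely):

```python
# Sketch: cap how much completed-job state the driver's UI retains,
# since each retained job keeps DAG and task metadata in driver memory.
ui_conf = {
    "spark.ui.retainedJobs": "100",        # default is 1000
    "spark.ui.retainedStages": "100",      # default is 1000
    "spark.sql.ui.retainedExecutions": "50",
}

# Build a spark-submit invocation carrying the lowered limits.
submit_cmd = ["spark-submit"]
for key, value in ui_conf.items():
    submit_cmd += ["--conf", f"{key}={value}"]
submit_cmd.append("my_job.py")  # hypothetical application script

print(" ".join(submit_cmd))
```

Whether this resolves the OOM depends on whether UI retention is actually what is growing; a driver heap dump would confirm.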

Best practices for data like file storage

2019-11-01 Thread Patrick McCarthy
Hi List, I'm looking for resources to learn about how to store data on disk for later access. For a while my team has been using Spark on top of our existing HDFS/Hive cluster without much agency over what format is used to store the data. I'd like to learn more about how to re-stage my data
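Re-staging data for later access usually comes down to picking a columnar format (Parquet is the common choice with Spark) and a partition layout. Spark's `df.write.partitionBy(...).parquet(root)` produces Hive-style directory names that encode partition columns, which is what makes partition pruning on read possible. A pure-Python sketch of that path scheme (the dataset root and column names here are hypothetical, not from the thread):

```python
# Sketch of the Hive-style partition paths Spark writes for
# df.write.partitionBy("year", "month").parquet(root).
# Readers can then skip whole directories when filtering on year/month.
def partition_path(root, **partitions):
    parts = [f"{key}={value}" for key, value in partitions.items()]
    return "/".join([root] + parts)

print(partition_path("/warehouse/events", year=2019, month=11))
# /warehouse/events/year=2019/month=11
```

Choosing partition columns with moderate cardinality matters: one directory per distinct value means very high-cardinality columns produce many tiny files.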

XGBoost Spark One Model Per Worker Integration

2019-11-01 Thread grp
Hi There Spark Users, Been trying to follow along with this posted XGBoost Spark Databricks notebook (https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/1526931011080774/3624187670661048/6320440561800420/latest.html)
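The "one model per worker" pattern in the subject line means partitioning the rows by some key and fitting an independent model on each partition, rather than one distributed model over all the data. A pure-Python sketch of that pattern (the per-group mean predictor stands in for an XGBoost model, and the keys and values are hypothetical; on Spark this grouping step would typically be a `groupBy` over the key column):

```python
# Sketch of the one-model-per-group pattern: bucket rows by key,
# then fit a separate model on each bucket. A group-mean "model"
# substitutes for XGBoost training here.
from collections import defaultdict

rows = [("storeA", 10.0), ("storeA", 14.0), ("storeB", 3.0), ("storeB", 5.0)]

groups = defaultdict(list)
for key, y in rows:
    groups[key].append(y)

# One independent "model" per key.
models = {key: sum(ys) / len(ys) for key, ys in groups.items()}
print(models)  # {'storeA': 12.0, 'storeB': 4.0}
```

The appeal of the pattern is that each group's training is embarrassingly parallel, so each worker can run an ordinary single-node trainer on its own slice.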