So I think I have a better idea of the problem now. The environment is YARN client mode and, IIRC, PySpark doesn't run in YARN cluster mode.
So my client machine is heavily loaded, which causes it to lose a lot of executors, and that might be part of the problem. Btw, are there any plans to support PySpark in YARN cluster mode?

On Aug 7, 2014 3:04 PM, "Davies Liu" <dav...@databricks.com> wrote:
> What is the environment? YARN, Mesos, or Standalone?
>
> It would be more helpful if you could show more logs.
>
> On Wed, Aug 6, 2014 at 7:25 PM, Avishek Saha <avishek.s...@gmail.com> wrote:
> > Hi,
> >
> > I get a lot of executor lost errors for "saveAsTextFile" with PySpark
> > and Hadoop 2.4.
> >
> > For small datasets this error occurs, but since the dataset is small it
> > eventually gets written to the file.
> > For large datasets, it takes forever to write the final output.
> >
> > Any help is appreciated.
> > Avishek