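I think you could use `repartition` after the filter. It shuffles the records
roughly evenly across the new partitions, so empty partitions are unlikely as
long as there are many more records than partitions.
You could also try `coalesce` to combine partitions without a full shuffle,
but it can't guarantee that no empty partitions remain.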
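Something along these lines might work. It is only a minimal sketch: the helper name `save_compacted` and the partition count in the usage comment are made up for illustration, and it assumes `rdd` is an ordinary PySpark RDD such as the `cleanedData` from your snippet.

```python
def save_compacted(rdd, path, num_partitions):
    """Hypothetical helper: rebalance an RDD, then save it as text files.

    repartition() does a full shuffle and spreads the records roughly
    evenly over num_partitions, so each part file is likely to be
    non-empty when there are many more records than partitions.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    balanced = rdd.repartition(num_partitions)
    balanced.saveAsTextFile(path)
    return balanced

# Usage with the names from the original question (8 is an arbitrary choice):
#   save_compacted(cleanedData, sys.argv[3], 8)
#
# Cheaper variant without a full shuffle; it merges existing partitions,
# so some output files may still be empty:
#   cleanedData.coalesce(8).saveAsTextFile(sys.argv[3])
```

The trade-off is that `repartition` moves all the data across the cluster, while `coalesce` is cheaper but only merges the partitions that already exist.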
Best Regards,
Yi Tian
tianyi.asiai...@gmail.com
On Oct 18, 2014, at 20:30, jan.zi...@centrum
Hi,
I am developing a program using Spark in which I apply a filter such as:
cleanedData = distData.map(json_extractor.extract_json).filter(
    lambda x: x is not None and x != '')
cleanedData.saveAsTextFile(sys.argv[3])
This ends up saving a lot of empty files (probably from those
part