Hi Davies. Thank you for the quick answer. I have code like this:
....
sc = SparkContext(appName="TAD")
lines = sc.textFile(sys.argv[1], 1)
result = lines.map(doSplit).groupByKey().map(lambda kv: traffic_process_model(kv[0], kv[1]))
result.saveAsTextFile(sys.argv[2])

Can you please give a short example of what I should do?

Also, I found only saveAsTextFile. Does PySpark have a saveAsBinary option, or what is the way to change the output files from text format?

Thanks
Oleg.

On Fri, Nov 14, 2014 at 3:26 PM, Davies Liu <dav...@databricks.com> wrote:

> One option may be to call HDFS tools or a client to rename them after
> saveAsXXXFile().
>
> On Thu, Nov 13, 2014 at 9:39 PM, Oleg Ruchovets <oruchov...@gmail.com>
> wrote:
> > Hi,
> > I am running a pyspark job.
> > I need to serialize the final result to HDFS in binary files and to be
> > able to give a name to the output files.
> >
> > I found this post:
> > http://stackoverflow.com/questions/25293962/specifying-the-output-file-name-in-apache-spark
> >
> > but it explains how to do it using Scala.
> >
> > Question:
> > How to do it using pyspark?
> >
> > Thanks
> > Oleg.
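
A minimal sketch of the rename approach Davies suggests, assuming the job has already written part-NNNNN files into the output directory and that the `hdfs` command-line client is on the PATH. The helper names `plan_renames` and `rename_on_hdfs`, the `traffic` prefix, and the `/out/tad` path are hypothetical and only for illustration; they are not part of the original job.

```python
import posixpath
import subprocess


def plan_renames(part_names, output_dir, prefix, suffix=""):
    """Map Spark part files (part-00000, ...) to custom names in output_dir.

    Hypothetical helper: returns a list of (source, destination) HDFS paths.
    """
    plan = []
    for name in sorted(part_names):
        if not name.startswith("part-"):
            continue  # skip _SUCCESS and other marker files
        index = name[len("part-"):]
        src = posixpath.join(output_dir, name)
        dst = posixpath.join(output_dir, "%s-%s%s" % (prefix, index, suffix))
        plan.append((src, dst))
    return plan


def rename_on_hdfs(plan):
    """Apply the plan with the HDFS shell; assumes `hdfs` is installed."""
    for src, dst in plan:
        subprocess.check_call(["hdfs", "dfs", "-mv", src, dst])


# Example: names as they might appear after result.saveAsTextFile("/out/tad")
example_plan = plan_renames(
    ["_SUCCESS", "part-00001", "part-00000"], "/out/tad", "traffic", ".txt"
)
# rename_on_hdfs(example_plan)  # uncomment on a machine with HDFS access
```

The rename runs on the driver after the save call returns, so it does not change how Spark writes the data. On the binary-output question, PySpark RDDs also provide saveAsPickleFile and saveAsSequenceFile in addition to saveAsTextFile; whether either format suits the downstream reader is something to check for this job.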