On Fri, Nov 14, 2014 at 12:14 AM, Oleg Ruchovets wrote:
Hi Devies.
Thank you for the quick answer.
I have code like this:
sc = SparkContext(appName="TAD")
lines = sc.textFile(sys.argv[1], 1)
result = lines.map(doSplit).groupByKey().map(
    lambda (k, vc): traffic_process_model(k, vc))
result.saveAsTextFile(sys.argv[2])
Can you please give short e
One option may be to call the HDFS tools or an HDFS client to rename the output files after saveAsXXXFile() has finished.
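A minimal sketch of that rename step, assuming the `hdfs` command-line tool is on the driver's PATH and that the job wrote the usual `part-NNNNN` files. The helper names `build_rename_commands` and `rename_outputs`, and the `traffic` prefix in the usage note, are hypothetical and only for illustration; the list of part file names has to be supplied by the caller (for example from a directory listing).

```python
import posixpath
import subprocess


def build_rename_commands(output_dir, part_files, prefix):
    """Build one `hdfs dfs -mv` command per part file (hypothetical helper)."""
    commands = []
    for name in sorted(part_files):
        if not name.startswith("part-"):
            continue  # skip _SUCCESS and other non-data files
        suffix = name[len("part-"):]
        src = posixpath.join(output_dir, name)
        dst = posixpath.join(output_dir, "%s-%s" % (prefix, suffix))
        commands.append(["hdfs", "dfs", "-mv", src, dst])
    return commands


def rename_outputs(output_dir, part_files, prefix, run=subprocess.check_call):
    """Run the rename commands; `run` can be swapped out for a dry run."""
    for cmd in build_rename_commands(output_dir, part_files, prefix):
        run(cmd)
```

Called after `result.saveAsTextFile(sys.argv[2])`, something like `rename_outputs(sys.argv[2], names, "traffic")` would turn `part-00000` into `traffic-00000`. Passing `run=print` shows the commands without executing them.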
On Thu, Nov 13, 2014 at 9:39 PM, Oleg Ruchovets wrote:
Hi,
I am running a PySpark job.
I need to serialize the final result to *HDFS in binary files* and to be able
to give a *name for the output files*.
I found this post:
http://stackoverflow.com/questions/25293962/specifying-the-output-file-name-in-apache-spark
but it only explains how to do it in Scala.