Hi all,

I am saving some Hive query results to a local directory:

val hdfsFilePath = "hdfs://master:ip/tempFile"
val localFilePath = "file:///home/hduser/tempFile"
val res = hiveContext.sql(s"""my hql codes here""")
res.printSchema()  // working
res.show()         // working
res.map { x => tranRow2Str(x) }.coalesce(1).saveAsTextFile(hdfsFilePath)   // still working
res.map { x => tranRow2Str(x) }.coalesce(1).saveAsTextFile(localFilePath)  // wrong!

In the end I get the correct results in hdfsFilePath, but nothing in
localFilePath. The localFilePath directory does get created, but it contains
only a _SUCCESS file and no part-* files.
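As a possible workaround I am considering collecting the (small) result to the driver and writing it with plain Java I/O, roughly like the sketch below. This reuses tranRow2Str, res, and the paths from the snippet above, and assumes the result fits in driver memory:

```scala
import java.io.PrintWriter

// Bring the rows to the driver first, then write with local file I/O,
// so the output lands on the driver's filesystem rather than on whichever
// executor happens to run the save task.
// Assumes the result set is small enough to fit in driver memory.
val lines = res.map { x => tranRow2Str(x) }.collect()
val writer = new PrintWriter("/home/hduser/tempFile.txt")
try {
  lines.foreach(writer.println)
} finally {
  writer.close()
}
```

But I would still like to understand why saveAsTextFile with a file:// path behaves this way.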

Here is the relevant part of the log (any thoughts?):

15/11/04 09:57:41 INFO scheduler.DAGScheduler: Got job 4 (saveAsTextFile at 
myApp.scala:112) with 1 output partitions (allowLocal=false)
// line 112 is where I call saveAsTextFile to save the results locally.

15/11/04 09:57:41 INFO scheduler.DAGScheduler: Final stage: ResultStage 
42(saveAsTextFile at MyApp.scala:112)
15/11/04 09:57:41 INFO scheduler.DAGScheduler: Parents of final stage: 
List(ShuffleMapStage 41)
15/11/04 09:57:41 INFO scheduler.DAGScheduler: Missing parents: List()
15/11/04 09:57:41 INFO scheduler.DAGScheduler: Submitting ResultStage 42 
(MapPartitionsRDD[106] at saveAsTextFile at MyApp.scala:112), which has no 
missing parents
15/11/04 09:57:41 INFO storage.MemoryStore: ensureFreeSpace(160632) called with 
curMem=3889533, maxMem=280248975
15/11/04 09:57:41 INFO storage.MemoryStore: Block broadcast_28 stored as values 
in memory (estimated size 156.9 KB, free 263.4 MB)
15/11/04 09:57:41 INFO storage.MemoryStore: ensureFreeSpace(56065) called with 
curMem=4050165, maxMem=280248975
15/11/04 09:57:41 INFO storage.MemoryStore: Block broadcast_28_piece0 stored as 
bytes in memory (estimated size 54.8 KB, free 263.4 MB)
15/11/04 09:57:41 INFO storage.BlockManagerInfo: Added broadcast_28_piece0 in 
memory on 192.168.70.135:32836 (size: 54.8 KB, free: 266.8 MB)
15/11/04 09:57:41 INFO spark.SparkContext: Created broadcast 28 from broadcast 
at DAGScheduler.scala:874
15/11/04 09:57:41 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from 
ResultStage 42 (MapPartitionsRDD[106] at saveAsTextFile at MyApp.scala:112)
15/11/04 09:57:41 INFO scheduler.TaskSchedulerImpl: Adding task set 42.0 with 1 
tasks
15/11/04 09:57:41 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 
42.0 (TID 2018, 192.168.70.129, PROCESS_LOCAL, 5097 bytes)
15/11/04 09:57:41 INFO storage.BlockManagerInfo: Added broadcast_28_piece0 in 
memory on 192.168.70.129:54062 (size: 54.8 KB, free: 1068.8 MB)
15/11/04 09:57:47 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 
42.0 (TID 2018) in 6362 ms on 192.168.70.129 (1/1)
15/11/04 09:57:47 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 42.0, whose 
tasks have all completed, from pool
15/11/04 09:57:47 INFO scheduler.DAGScheduler: ResultStage 42 (saveAsTextFile 
at MyApp.scala:112) finished in 6.360 s
15/11/04 09:57:47 INFO scheduler.DAGScheduler: Job 4 finished: saveAsTextFile 
at MyApp.scala:112, took 6.588821 s
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/metrics/json,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/api,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/static,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/executors/threadDump,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/executors/json,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/executors,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/environment/json,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/environment,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/storage/rdd,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/storage/json,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/storage,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages/pool/json,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages/pool,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages/stage/json,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages/stage,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages/json,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/stages,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/jobs/job/json,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/jobs/job,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/jobs/json,null}
15/11/04 09:57:47 INFO handler.ContextHandler: stopped 
o.s.j.s.ServletContextHandler{/jobs,null}
15/11/04 09:57:47 INFO ui.SparkUI: Stopped Spark web UI at 
http://192.168.70.135:4040
15/11/04 09:57:47 INFO scheduler.DAGScheduler: Stopping DAGScheduler
15/11/04 09:57:47 INFO cluster.SparkDeploySchedulerBackend: Shutting down all 
executors
15/11/04 09:57:47 INFO cluster.SparkDeploySchedulerBackend: Asking each 
executor to shut down
15/11/04 09:57:47 INFO spark.MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
15/11/04 09:57:47 INFO util.Utils: path = 
/home/hduser/sparkTmp/spark-9b7a61ab-73a6-47af-87f6-fce4a5bbddb7/blockmgr-c5b7fdb9-f5ec-46b6-a1f0-d24287778c41,
 already present as root for deletion.
15/11/04 09:57:47 INFO storage.MemoryStore: MemoryStore cleared
15/11/04 09:57:47 INFO storage.BlockManager: BlockManager stopped
15/11/04 09:57:47 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
15/11/04 09:57:47 INFO 
scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
15/11/04 09:57:47 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Shutting down remote daemon.
15/11/04 09:57:47 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote 
daemon shut down; proceeding with flushing remote transports.
15/11/04 09:57:48 INFO spark.SparkContext: Successfully stopped SparkContext
15/11/04 09:57:48 INFO util.Utils: Shutdown hook called
15/11/04 09:57:48 INFO util.Utils: Deleting directory 
/tmp/spark-436a46ea-71fa-4b1b-ba39-06ed95a1af06
15/11/04 09:57:48 INFO util.Utils: Deleting directory 
/home/hduser/sparkTmp/spark-9b7a61ab-73a6-47af-87f6-fce4a5bbddb7

Best regards,
Jack
