Does the job work when your Spark master is configured as local in the Zeppelin 
Spark interpreter config?  If that works well, then your real issue may be in 
your Spark/YARN cluster.

Also, have you tried doing a spark-submit manually for a similar job on the 
command line, pointing at the cluster?
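
Something along these lines should work as a quick sanity check (the class 
name, jar path, and resource settings below are placeholders -- substitute 
your own job's values):

```shell
# Minimal spark-submit sketch against a YARN cluster. The app class, jar
# path, and input path are placeholders, not from your actual job.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 2 \
  --executor-memory 2g \
  --class com.example.SimpleTakeJob \
  /path/to/your-job.jar \
  /tmp/earthquake/GEM-GHEC-v1_2.txt
```

If this also sits in ACCEPTED/PENDING, the problem is almost certainly YARN 
resource allocation rather than Zeppelin itself.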

-Rahul

On 11/23/18, 2:28 AM, "Nabeel Imtiaz" <nimti...@gmail.com> wrote:

    Nothing much useful. Following are the interpreter logs I can see before 
the job just hangs:
    
    INFO [2018-11-23 14:25:58,517] ({pool-2-thread-3} 
SchedulerFactory.java[jobStarted]:109) - Job 20181123-112240_1827913615 started 
by scheduler interpreter_34089707
    INFO [2018-11-23 14:25:59,241] ({pool-2-thread-3} 
FileInputFormat.java[listStatus]:253) - Total input paths to process : 1
    INFO [2018-11-23 14:25:59,299] ({pool-2-thread-3} 
Logging.scala[logInfo]:54) - Starting job: take at <console>:28
    INFO [2018-11-23 14:25:59,317] ({dag-scheduler-event-loop} 
Logging.scala[logInfo]:54) - Got job 0 (take at <console>:28) with 1 output 
partitions
    INFO [2018-11-23 14:25:59,319] ({dag-scheduler-event-loop} 
Logging.scala[logInfo]:54) - Final stage: ResultStage 0 (take at <console>:28)
    INFO [2018-11-23 14:25:59,320] ({dag-scheduler-event-loop} 
Logging.scala[logInfo]:54) - Parents of final stage: List()
    INFO [2018-11-23 14:25:59,323] ({dag-scheduler-event-loop} 
Logging.scala[logInfo]:54) - Missing parents: List()
    INFO [2018-11-23 14:25:59,328] ({dag-scheduler-event-loop} 
Logging.scala[logInfo]:54) - Submitting ResultStage 0 
(/tmp/earthquake/GEM-GHEC-v1_2.txt MapPartitionsRDD[1] at textFile at 
<console>:25), which has no missing parents
    
    
    Nabeel
    
    > On Nov 23, 2018, at 12:33 PM, 王刚 <zjuwa...@gmail.com> wrote:
    > 
    > Is there any useful information in your local Spark process log?
    > 
    >> On Nov 23, 2018, at 4:20 PM, Nabeel Imtiaz <nimti...@gmail.com> wrote:
    >> 
    >> Hi,
    >> 
    >> 
    >> When I try to simply take the first 10 lines of a file (like 
```batchData.take(10).foreach(println _)```) from the SparkContext, the 
paragraph hangs. 
    >> 
    >> If I inspect the job in the Spark console, it shows the job in PENDING 
state. I checked that I have more than enough memory available in the system. 
    >> 
    >> Is it a known issue? Any fixes or workarounds?
    >> 
    >> 
    >> 
    >> Nabeel
    > 
    
    
