In your file /home/spark/real-estate/pullhttp/pull_apartments.py, replace

    import org.apache.spark.SparkContext

with

    from pyspark import SparkContext

The org.apache.spark.* form is the Scala/Java import; in a Python script you have to import from the pyspark package.
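
I have not seen the rest of pull_apartments.py, so this is only a minimal sketch of what the top of the file could look like after the change (the appName "pull_apartments" is my guess; when you run under pyspark or spark-submit on yarn a context already exists, and getOrCreate() will reuse it):

    # replace the Scala-style import with the PySpark one
    from pyspark import SparkContext
    from pyspark.sql import SparkSession

    # reuse the SparkSession/SparkContext that the pyspark shell or
    # spark-submit already created, instead of building a new one
    spark = SparkSession.builder.appName("pull_apartments").getOrCreate()
    sc = spark.sparkContext

    print(sc.version)   # should print 3.4.1 on your cluster

SparkSession.builder.getOrCreate() and spark.sparkContext are the standard Spark 3.x entry points, so the same file works both in the interactive shell and when submitted as a job.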

On Mon, Aug 21, 2023 at 15:13 Kal Stevens <kalgstev...@gmail.com> wrote:

> I am getting a class not found error
>     import org.apache.spark.SparkContext
>
> It sounds like this is because pyspark is not installed, but as far as I
> can tell it is.
> PySpark is installed for the correct Python version.
>
>
> root@namenode:/home/spark/# pip3.10 install pyspark
> Requirement already satisfied: pyspark in
> /usr/local/lib/python3.10/dist-packages (3.4.1)
> Requirement already satisfied: py4j==0.10.9.7 in
> /usr/local/lib/python3.10/dist-packages (from pyspark) (0.10.9.7)
>
>
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /__ / .__/\_,_/_/ /_/\_\   version 3.4.1
>       /_/
>
> Using Python version 3.10.12 (main, Jun 11 2023 05:26:28)
> Spark context Web UI available at http://namenode:4040
> Spark context available as 'sc' (master = yarn, app id =
> application_1692452853354_0008).
> SparkSession available as 'spark'.
> Traceback (most recent call last):
>   File "/home/spark/real-estate/pullhttp/pull_apartments.py", line 11, in
> <module>
>     import org.apache.spark.SparkContext
> ModuleNotFoundError: No module named 'org.apache.spark.SparkContext'
> 2023-08-20T19:45:19,242 INFO  [Thread-5] spark.SparkContext: SparkContext
> is stopping with exitCode 0.
> 2023-08-20T19:45:19,246 INFO  [Thread-5] server.AbstractConnector: Stopped
> Spark@467be156{HTTP/1.1, (http/1.1)}{0.0.0.0:4040}
> 2023-08-20T19:45:19,247 INFO  [Thread-5] ui.SparkUI: Stopped Spark web UI
> at http://namenode:4040
> 2023-08-20T19:45:19,251 INFO  [YARN application state monitor]
> cluster.YarnClientSchedulerBackend: Interrupting monitor thread
> 2023-08-20T19:45:19,260 INFO  [Thread-5]
> cluster.YarnClientSchedulerBackend: Shutting down all executors
> 2023-08-20T19:45:19,260 INFO  [dispatcher-CoarseGrainedScheduler]
> cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to
> shut down
> 2023-08-20T19:45:19,263 INFO  [Thread-5]
> cluster.YarnClientSchedulerBackend: YARN client scheduler backend Stopped
> 2023-08-20T19:45:19,267 INFO  [dispatcher-event-loop-29]
> spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint
> stopped!
> 2023-08-20T19:45:19,271 INFO  [Thread-5] memory.MemoryStore: MemoryStore
> cleared
> 2023-08-20T19:45:19,271 INFO  [Thread-5] storage.BlockManager:
> BlockManager stopped
> 2023-08-20T19:45:19,275 INFO  [Thread-5] storage.BlockManagerMaster:
> BlockManagerMaster stopped
> 2023-08-20T19:45:19,276 INFO  [dispatcher-event-loop-8]
> scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:
> OutputCommitCoordinator stopped!
> 2023-08-20T19:45:19,279 INFO  [Thread-5] spark.SparkContext: Successfully
> stopped SparkContext
> 2023-08-20T19:45:19,687 INFO  [shutdown-hook-0] util.ShutdownHookManager:
> Shutdown hook called
> 2023-08-20T19:45:19,688 INFO  [shutdown-hook-0] util.ShutdownHookManager:
> Deleting directory
> /tmp/spark-9375452d-1989-4df5-9d85-950f751ce034/pyspark-2fcfbc8e-fd40-41f5-bf8d-e4c460332895
> 2023-08-20T19:45:19,689 INFO  [shutdown-hook-0] util.ShutdownHookManager:
> Deleting directory /tmp/spark-bf6cbc46-ad8b-429a-9d7a-7d98b7d7912e
> 2023-08-20T19:45:19,690 INFO  [shutdown-hook-0] util.ShutdownHookManager:
> Deleting directory /tmp/spark-9375452d-1989-4df5-9d85-950f751ce034
> 2023-08-20T19:45:19,691 INFO  [shutdown-hook-0] util.ShutdownHookManager:
> Deleting directory /tmp/localPyFiles-6c113b2b-9ac3-45e3-9032-d1c83419aa64
>
>

-- 
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norway

+47 480 94 297
