Please keep in mind I'm fairly new to Spark.
I have some Spark code where I load two text files as Datasets and, after some
map and filter operations to bring the columns into a specific shape, I join
the datasets.
The join takes place on a common column (of type string).
Is there any way to avoid the
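For context, a minimal sketch of the kind of setup described above; the file
paths, the comma delimiter, and the column names are made up for illustration:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("JoinExample").getOrCreate()
import spark.implicits._

// Load the two text files and shape each into (key, value) columns.
val left = spark.read.textFile("data/left.txt")
  .map(_.split(","))
  .filter(_.length == 2)
  .map(a => (a(0), a(1)))
  .toDF("key", "leftValue")

val right = spark.read.textFile("data/right.txt")
  .map(_.split(","))
  .filter(_.length == 2)
  .map(a => (a(0), a(1)))
  .toDF("key", "rightValue")

// Join on the common string column.
val joined = left.join(right, "key")
joined.show()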
How can I check what exactly is stagnant? Do you mean the DAG
visualization in the Spark UI?
Sorry, I'm new to Spark.
--
I've built a Spark job in which an external program is called through the use
of pipe().
The job runs correctly on the cluster when the input is a small sample dataset,
but when the input is a real, large dataset it stays in the RUNNING state
forever.
I've tried different ways to tune executor memory, executor
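For reference, the executor knobs mentioned above can be set either through
spark-submit (--executor-memory, --executor-cores, --num-executors) or in
code; a sketch with placeholder values, not a recommendation:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("PipeJob")
  .config("spark.executor.memory", "8g")      // per-executor heap size
  .config("spark.executor.cores", "4")        // concurrent tasks per executor
  .config("spark.executor.instances", "10")   // executor count (e.g. on YARN)
  .getOrCreate()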
I'm trying to run a C++ program on a Spark cluster using the rdd.pipe()
operation, but the executors throw: java.lang.IllegalStateException:
Subprocess exited with status 132.
The Spark jar runs totally fine in standalone mode, and the C++ program runs
just fine on its own as well. I've tried with anoth
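For what it's worth, an exit status of 132 usually means the subprocess was
killed by signal 4 (SIGILL = 132 - 128), which can happen when a binary was
compiled for a CPU feature the worker nodes lack. A minimal pipe() sketch,
with a placeholder path that must exist on every worker:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("PipeCpp").getOrCreate()
val sc = spark.sparkContext

val input = sc.parallelize(Seq("1", "2", "3"))

// pipe() writes each element to the program's stdin and turns each line
// the program prints on stdout into an element of the result RDD.
val result = input.pipe("/opt/tools/my_program")  // placeholder path
result.collect().foreach(println)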
When using rdd.pipe(script), I get the following error:
"java.lang.IllegalStateException: Subprocess exited with status 132. Command
ran: ./script -h"
I'm getting this while trying to run my external script with a simple "-h"
argument, to test that it's running smoothly through my Spark code.
Whe
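A sketch of how such a "-h" probe can be reproduced, assuming "./script" is
reachable from each executor's working directory:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("PipeProbe").getOrCreate()
val sc = spark.sparkContext

// One dummy element is enough to trigger a single subprocess launch.
val probe = sc.parallelize(Seq(""), numSlices = 1)

// Passing the command as a Seq keeps "-h" as its own argv entry instead
// of relying on tokenization of a single command string.
val helpOutput = probe.pipe(Seq("./script", "-h"))
helpOutput.collect().foreach(println)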
Thanks a lot for the answer! It solved my problem.
--
Hi, I'm trying to run an external script on Spark using rdd.pipe(), and
although it runs successfully in standalone mode, it throws an error on the
cluster.
The error comes from the executors and it's: "Cannot run program
"path/to/program": error=2, No such file or directory".
Does the external script need to be present on every worker node?
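A common way to avoid that error is to ship the script with the job rather
than assuming it already exists on every worker. A sketch with a placeholder
local path; note that where the shipped copy lands is cluster-manager
dependent (on YARN it typically appears in each executor's working
directory), and the file must be executable on the workers:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("PipeShip").getOrCreate()
val sc = spark.sparkContext

// Distribute the script with the job so every executor receives a copy.
sc.addFile("/local/path/to/script.sh")

val data = sc.parallelize(Seq("a", "b"))

// Relative path into the executor's working directory (YARN behaviour).
val out = data.pipe("./script.sh")
out.collect().foreach(println)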