Thanks Marcelo. The code trying to read the file always runs in the driver. I
understand the problem with other master deployments, but will it work in
local, yarn-client & yarn-cluster deployments? That's all I care about for now
:-)
Also, what is the suggested way to do something like this? Put the fil
Hi chinchu,
Where does the code trying to read the file run? Is it running on the
driver or on some executor?
If it's running on the driver, in yarn-cluster mode, the file should
have been copied to the application's work directory before the driver
is started. So hopefully just doing "new FileIn
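
A minimal sketch of the driver-side read described above, assuming the file has already been copied into the driver's working directory so that its bare name resolves. The file name "my.file", the object name and the helper readShipped are hypothetical and not from this thread:

```scala
import java.io.{File, FileInputStream}
import scala.io.Source

object DriverFileRead {
  // Hypothetical helper: opens `name` relative to the current working
  // directory (an absolute path also works) and returns its lines.
  def readShipped(name: String): List[String] = {
    val in = new FileInputStream(new File(name))
    // Materialize the lines before the stream is closed.
    try Source.fromInputStream(in, "UTF-8").getLines().toList
    finally in.close()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: "my.file" sits in the driver's working directory.
    readShipped("my.file").foreach(println)
  }
}
```

If the file is not in the working directory, this fails with a FileNotFoundException rather than an NPE, which makes a missing file easier to diagnose.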
Thanks Andrew.
I understand the problem a little better now. There was a typo in my earlier
mail & a bug in the code (causing the NPE in SparkFiles). I am using
--master yarn-cluster (not local). And in this mode,
com.test.batch.modeltrainer.ModelTrainerMain (my main class) will run on the
Thanks Andrew, that helps.
On Fri, Sep 19, 2014 at 5:47 PM, Andrew Or-2 [via Apache Spark User List] <
ml-node+s1001560n14708...@n3.nabble.com> wrote:
> Hey just a minor clarification, you _can_ use SparkFiles.get in your
> application only if it runs on the executors, e.g. in the following way:
>
Hey just a minor clarification, you _can_ use SparkFiles.get in your
application only if it runs on the executors, e.g. in the following way:
sc.parallelize(1 to 100).map { i => SparkFiles.get("my.file") }.collect()
But not in general (otherwise NPE, as in your case). Perhaps this should be
docum
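
A slightly fuller sketch of the executor-side pattern above, assuming the file was first registered on the driver with sc.addFile. The path "/tmp/my.file", the app name, the object name and the helper countLines are hypothetical and not from this thread; the master is assumed to be supplied by spark-submit:

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkFiles}
import scala.io.Source

object SparkFilesSketch {
  // Hypothetical helper: counts the lines of a local file.
  def countLines(path: String): Int = {
    val src = Source.fromFile(path, "UTF-8")
    try src.getLines().size finally src.close()
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("sparkfiles-sketch"))
    // Assumption: the file exists at this hypothetical path on the driver.
    // addFile makes it available to the tasks of this application.
    sc.addFile("/tmp/my.file")

    val counts = sc.parallelize(1 to 4, 2).mapPartitions { iter =>
      // Runs inside a task, so SparkFiles.get resolves to the node-local copy.
      val n = countLines(SparkFiles.get("my.file"))
      iter.map(_ => n)
    }.collect()

    println(counts.mkString(","))
    sc.stop()
  }
}
```

Resolving the path inside mapPartitions keeps the SparkFiles.get call within a task, which is the case the message above says is supported.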
Hi Chinchu,
SparkEnv is an internal class that is only meant to be used within Spark.
Outside of Spark, it will be null because there are no executors or driver
to start an environment for. Similarly, SparkFiles is meant to be used
internally (though its privacy settings should be modified to ref