Hi,
once again, let's start with the requirement. Why are you trying to pass XML
and JSON files to Spark instead of reading them in Spark?
Generally, when people pass files along, they are Python or JAR files.
Regards,
Gourav
On Sat, May 15, 2021 at 5:03 AM Amit Joshi
wrote:
Hi KhajaAsmath,
Client vs Cluster: In client mode, the driver runs on the machine from which you
submit your job, whereas in cluster mode the driver runs on one of the worker
nodes.
I think you need to pass the conf file to your driver, as you are using it
in the driver code, which runs on one of the worker nodes.
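If the file goes along with --files, YARN localizes it into the container's
working directory under its base name, so in cluster mode the driver can open it
relative to the working directory instead of via the edge-node path. A minimal
sketch only (the "appName" property key is made up for illustration):

import json
import os
from pyspark.sql import SparkSession

# conf.json was passed via --files /appl/common/ftp/conf.json.
# In cluster mode the driver container only sees the localized copy, so try the
# base name first and fall back to the absolute path (useful in client mode,
# where the driver runs on the edge node that actually has that path).
conf_path = "conf.json" if os.path.exists("conf.json") else "/appl/common/ftp/conf.json"

with open(conf_path) as f:
    props = json.load(f)   # plain properties, read before the session exists

spark = (SparkSession.builder
         .appName(props.get("appName", "my-app"))   # hypothetical property key
         .getOrCreate())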
Here is my updated spark-submit, still without any luck:
spark-submit --master yarn --deploy-mode cluster --files
/appl/common/ftp/conf.json,/etc/hive/conf/hive-site.xml,/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
--num-executors 6 --executor-cores 3 --driver-cores 3 --driver-memory 7g
Sorry, my bad, it did not resolve the issue. I still have the same issue;
can anyone please guide me? I was still running in client mode instead of
cluster mode.
On Fri, May 14, 2021 at 5:05 PM KhajaAsmath Mohammed <
mdkhajaasm...@gmail.com> wrote:
You are right. It worked but I still don't understand why I need to pass
that to all executors.
On Fri, May 14, 2021 at 5:03 PM KhajaAsmath Mohammed <
mdkhajaasm...@gmail.com> wrote:
I am using the JSON only to read properties before creating the Spark session. I
don't know why we need to pass that to all executors.
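That is exactly the part that behaves differently between the two modes: a
hard-coded edge-node path only resolves on the machine where the driver actually
runs. A rough illustration (the path is the one from the submit command above;
"props" is just an illustrative name):

import json

# Works in client mode: the driver runs on the edge node where this path exists.
# Fails in cluster mode: the driver runs inside a YARN container on a worker node
# that has no /appl/common/ftp/conf.json on its local filesystem.
with open("/appl/common/ftp/conf.json") as f:
    props = json.load(f)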
On Fri, May 14, 2021 at 5:01 PM Longjiang.Yang
wrote:
> Could you check whether this file is accessible in executors? (is it in
> HDFS or in the client local FS)
> /appl
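Once a session exists, executor-side visibility of a file shipped with --files
can be probed with SparkFiles; a small sketch, assuming the file name conf.json
(the probe function and the way its output is printed are illustrative only):

from pyspark import SparkFiles
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

def probe(_):
    import os
    # SparkFiles.get resolves the localized copy inside each executor
    path = SparkFiles.get("conf.json")
    yield (path, os.path.exists(path))

# Run one task per default partition and report whether the file is visible there.
print(sc.parallelize(range(sc.defaultParallelism), sc.defaultParallelism)
        .mapPartitions(probe)
        .collect())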
Hi,
I am running into a weird situation where the command below works when the
deploy mode is client and fails when it is cluster.
spark-submit --master yarn --deploy-mode client --files
/etc/hive/conf/hive-site.xml,/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
--driver-memory 70g --n
I have a single source of data. The processing of records has to be directed
to multiple destinations (a rough sketch follows the list), i.e.:
1. read the source data
2. based on a condition, route records to the following destinations:
   1. Kafka for error records
   2. store success records matching a certain condition in an S3 bucket, bucket
      name: "A
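A minimal batch-style sketch of that routing, assuming a DataFrame with a
hypothetical is_error column and placeholder broker, topic, path, and bucket
names (the Kafka sink needs the spark-sql-kafka package and a string/binary
"value" column; adjust everything to the real schema):

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("route-records").getOrCreate()

# 1. read the source data (format and path are placeholders)
src = spark.read.json("s3a://source-bucket/input/")

# 2. route based on a condition -- is_error is a hypothetical flag column
errors  = src.filter(F.col("is_error"))
success = src.filter(~F.col("is_error"))

# 2.1 error records to Kafka: the Kafka writer expects a "value" column
(errors
 .select(F.to_json(F.struct("*")).alias("value"))
 .write.format("kafka")
 .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
 .option("topic", "error-records")                   # placeholder topic
 .save())

# 2.2 success records (filtered further if needed) to the S3 bucket;
# the real bucket name is truncated in the original mail, so this is a placeholder
success.write.mode("append").parquet("s3a://your-bucket/success/")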
Hi Meikel,
If you want to run Spark Thrift Server on Kubernetes, take a look at my blog
post: https://itnext.io/hive-on-spark-in-kubernetes-115c8e9fa5c1
Cheers,
- Kidong Lee.
Hi all,
We are migrating to k8s and I wonder whether there are already "good practices"
for running thrift2 on k8s?
Best,
Meikel