Hi,
I think you should write to HDFS first, then copy the files (Parquet or ORC) from HDFS
to MinIO.
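Roughly, in PySpark that could look like the sketch below; the bucket, paths and MinIO endpoint are made up, and the cluster is assumed to already have the s3a connector configured (fs.s3a.endpoint, access/secret key) for MinIO:
```
from pyspark.sql import SparkSession

# Minimal sketch of the suggestion above: write once to HDFS, then copy the
# finished files to MinIO through its S3-compatible (s3a://) interface
# instead of writing the dataset twice.
spark = SparkSession.builder.appName("hdfs-then-minio").getOrCreate()

df = spark.range(1000)
df.write.mode("overwrite").parquet("hdfs:///tmp/dual_write/out")

# One possible copy step: Hadoop's FileUtil through the JVM gateway.
# (hadoop distcp from the command line is the usual choice for large data.)
jvm = spark._jvm
hconf = spark._jsc.hadoopConfiguration()
src = jvm.org.apache.hadoop.fs.Path("hdfs:///tmp/dual_write/out")
dst = jvm.org.apache.hadoop.fs.Path("s3a://my-minio-bucket/dual_write/out")
jvm.org.apache.hadoop.fs.FileUtil.copy(
    src.getFileSystem(hconf), src, dst.getFileSystem(hconf), dst, False, hconf
)
```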
eabour
From: Prem Sahoo
Date: 2024-05-22 00:38
To: Vibhor Gupta; user
Subject: Re: EXT: Dual Write to HDFS and MinIO in faster way
On Tue, May 21, 2024 at 6:58 AM Prem Sahoo wrote:
Hello Vibhor,
Th
hdfs-site.xml, for instance,
fs.oss.impl, etc.
eabour
From: Eugene Miretsky
Date: 2023-11-16 09:58
To: eab...@163.com
CC: Eugene Miretsky; user @spark
Subject: Re: [EXTERNAL] Re: Spark-submit without access to HDFS
Hey!
Thanks for the response.
We are getting the error because there is no ne
Hi Eugene,
I think you should check whether the HDFS service is running properly. From the
logs, it appears that there are two datanodes in HDFS, but none of them are
healthy. Please investigate the reasons why the datanodes are not functioning
properly. It seems that the issue might be due t
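For reference, one quick way to confirm the datanode state is the sketch below; it assumes the hdfs CLI is on the PATH and configured for this cluster:
```
import subprocess

# dfsadmin -report lists live and dead datanodes and their capacity.
report = subprocess.run(
    ["hdfs", "dfsadmin", "-report"],
    capture_output=True, text=True, check=True,
).stdout
print(report)
```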
.jar
2023/09/09 10:08 513,968 jackson-module-scala_2.12-2.15.2.jar
eabour
From: Bjørn Jørgensen
Date: 2023-11-02 16:40
To: eab...@163.com
CC: user @spark; Saar Barhoom; moshik.vitas
Subject: Re: jackson-databind version mismatch
[SPARK-43225][BUILD][SQL] Remove jackson-core-asl and
Hi,
Please check the versions of jar files starting with "jackson-". Make sure
all versions are consistent. jackson jar list in spark-3.3.0:
2022/06/10 04:37 75,714 jackson-annotations-2.13.3.jar
2022/06/10 04:37 374,895 jackson-core-2.13.3.jar
2022/06/
d.
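A small helper like the sketch below can flag mismatched jackson versions in an installation; the /opt/spark/jars location is an assumption, adjust it to your layout:
```
import re
from pathlib import Path

# Group jackson-* jars by version string; more than one version in the output
# usually explains the kind of mismatch discussed in this thread.
versions = {}
for jar in sorted(Path("/opt/spark/jars").glob("jackson-*.jar")):
    m = re.search(r"-(\d+(?:\.\d+)+)\.jar$", jar.name)
    if m:
        versions.setdefault(m.group(1), []).append(jar.name)

for version, names in sorted(versions.items()):
    print(version, "->", ", ".join(names))
```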
eabour
From: eab...@163.com
Date: 2023-10-20 15:56
To: user @spark
Subject: spark.stop() cannot stop spark connect session
Hi,
my code:
from pyspark.sql import SparkSession
spark = SparkSession.builder.remote("sc://172.29.190.147").getOrCreate()
import pandas as pd
# create a pandas DataFrame
Hi Team,
I use Spark 3.5.0 to start a Spark cluster with start-master.sh and
start-worker.sh. When I run ./bin/spark-shell --master
spark://LAPTOP-TC4A0SCV.:7077 I get these error logs:
```
23/10/24 12:00:46 ERROR TaskSchedulerImpl: Lost an executor 1 (already
removed): Command exited with code
Hi,
my code:
from pyspark.sql import SparkSession
spark = SparkSession.builder.remote("sc://172.29.190.147").getOrCreate()
import pandas as pd
# create a pandas DataFrame
pdf = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "gender": ["F", "M", "M"]
})
# convert the pandas
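Continuing that snippet with the same names, a minimal sketch of the behaviour the subject line describes:
```
sdf = spark.createDataFrame(pdf)   # hand the pandas DataFrame to the remote session
sdf.show()

spark.stop()
# Against a Spark Connect (sc://) endpoint, stop() releases the client-side
# session only; the remote connect server itself keeps running.
```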
start the spark connect server as a service for client tests.
So I believe that by configuring spark.plugins and starting the Spark cluster
on Kubernetes, clients can use sc://ip:port to connect to the remote
server.
Let me give it a try.
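For a quick client-side check, something like the sketch below; the address is a placeholder, and it assumes the cluster was started with spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin so the driver also serves Spark Connect on its default port 15002:
```
from pyspark.sql import SparkSession

spark = SparkSession.builder.remote("sc://<driver-ip>:15002").getOrCreate()
spark.range(5).show()   # quick smoke test against the remote server
spark.stop()
```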
eabour
From: eab...@163.com
Date: 2023
Hi all,
Has the functionality to run the Spark Connect server on k8s been implemented?
From: Nagatomi Yasukazu
Date: 2023-09-05 17:51
To: user
Subject: Re: Running Spark Connect Server in Cluster Mode on Kubernetes
Dear Spark Community,
I've been exploring the capabilities of the Spark Conn
Hi All,
I have a CDH 5.16.2 Hadoop cluster with 1+3 nodes (64C/128G, 1 NN/RM + 3 DN/NM),
and YARN with 192C/240G. I used the following test scenario:
1. Spark app resources: 2G driver memory / 2 driver vcores / 1 executor / 2G
executor memory / 2 executor vcores.
2. One Spark app will use 5G4C on YARN
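As a rough sanity check of that 5G4C figure, assuming YARN's default per-container memory overhead of max(384 MB, 10% of the heap) plus allocation rounding on top:
```
# Back-of-the-envelope estimate under the assumptions stated above.
def yarn_container_mb(heap_mb, overhead_frac=0.10, min_overhead_mb=384):
    return heap_mb + max(min_overhead_mb, int(heap_mb * overhead_frac))

driver_mb = yarn_container_mb(2048)      # 2G driver memory
executor_mb = yarn_container_mb(2048)    # 2G executor memory, 1 executor
total_gb = (driver_mb + executor_mb) / 1024.0
total_vcores = 2 + 2                     # 2 driver vcores + 2 executor vcores
print(round(total_gb, 2), "GB before rounding;", total_vcores, "vcores")
# ~4.75 GB requested, which YARN's container rounding brings to about 5G4C.
```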
-ff7ed3cd4076
2021-08-12 09:21:22 INFO SessionState:641 - Created HDFS directory:
/tmp/hive/hdfs/8bc342dd-aa0b-407b-b9ad-ff7ed3cd4076/_tmp_space.db
===
eab...@163.com
From: igyu
Date: 2021-08-12 11:33
To: user
Subject: How can I config hive.metastore.warehouse.dir
this way requires absolute paths to the jaas.conf file and keyTab file; in other
words, these files must be placed at the same path on every node. Is there a
better way?
Please help.
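For illustration, a PySpark sketch of the setup described above; every path here is hypothetical:
```
from pyspark.sql import SparkSession

# The JAAS config and keytab are referenced by absolute path, so the same
# files must already exist at that path on the driver and on every node.
spark = (
    SparkSession.builder
    .appName("hive-metastore-kerberos")
    .config("spark.driver.extraJavaOptions",
            "-Djava.security.auth.login.config=/etc/security/jaas.conf")
    .config("spark.executor.extraJavaOptions",
            "-Djava.security.auth.login.config=/etc/security/jaas.conf")
    .enableHiveSupport()
    .getOrCreate()
)
```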
Regards
eab...@163.com