Hi community,

I am running hundreds of Spark jobs concurrently, which drives the number of Hive Metastore connections very high (> 1K). The jobs do not actually use HMS, so I would like to disable it. I tried setting the spark.sql.catalogImplementation config to in-memory, which is reported to help, but it did not work for me. Any suggestion would be appreciated!

code:

spark = SparkSession \
    .builder \
    .appName(“test") \
    .config("spark.sql.catalogImplementation", "in-memory") \
    .config("spark.executor.memory", "1g") \
    .getOrCreate()
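As a diagnostic, one way to see whether the setting actually took effect is to read it back from the running session. This is only a sketch, assuming PySpark 2.2's `spark.conf.get` API; if the config was applied too late (for example, after spark-submit already created the session), it should still report "hive" rather than "in-memory".

```python
# Diagnostic sketch (assumes PySpark 2.2): read back the effective
# catalog implementation after the session is created.
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("catalog-check") \
    .config("spark.sql.catalogImplementation", "in-memory") \
    .getOrCreate()

# "in-memory" means the Hive metastore is not in use;
# "hive" means the session is still backed by HMS.
print(spark.conf.get("spark.sql.catalogImplementation"))
```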

spark-submit command:

spark2-submit \
    --master yarn \
    --deploy-mode cluster \
    --name "test" \
    --conf spark.sql.catalogImplementation=in-memory \
    test.py
Spark version: 2.2.0
Hadoop version: 2.6.0
