All the parameters except spark.executor.instances are specified in spark-defaults.conf, located in Hive's conf folder. So I think the answer is yes.
I also checked the Spark web UI while a Hive on Spark job was running. The parameters shown there are exactly what I specified in the config file, including spark.shuffle.service.enabled and spark.dynamicAllocation.enabled.

Should I specify a fixed spark.executor.instances in the file? I would prefer not to, since a fixed number does not suit me.

By the way, the data source of my query is Parquet files. On the Hive side I just created an external table over the Parquet files.

Thanks,
Minghao Feng

________________________________
From: Mich Talebzadeh <mich.talebza...@gmail.com>
Sent: Friday, September 9, 2016 4:49:55 PM
To: user
Subject: Re: hive on spark job not start enough executors

When you start Hive on Spark, do you set any parameters for the submitted job (or read them from an init file)?

set spark.master=yarn;
set spark.deploy.mode=client;
set spark.executor.memory=3g;
set spark.driver.memory=3g;
set spark.executor.instances=2;
set spark.ui.port=7777;

Dr Mich Talebzadeh

LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On 9 September 2016 at 09:30, <qiuff...@hotmail.com> wrote:

Hi there,

I encountered a problem that makes Hive on Spark perform very poorly. I'm using Spark 1.6.2 and Hive 2.1.0. I specified

spark.shuffle.service.enabled true
spark.dynamicAllocation.enabled true

in my spark-defaults.conf file (the file is in both the Spark and Hive conf folders) so that Spark jobs acquire executors dynamically.
The configuration works correctly when I run Spark jobs, but when I use Hive on Spark, it starts only a few executors even though there are more than enough cores and memory to start more. For example, for the same SQL query, Spark SQL can start more than 20 executors, but Hive on Spark starts only 3.

How can I improve the performance of Hive on Spark? Any suggestions would be appreciated.

Thanks,
Minghao Feng
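
For reference, the dynamic-allocation setup described in this thread can be sketched as the spark-defaults.conf fragment below. This is a minimal sketch, not a confirmed fix for the problem reported here. The first two entries are the ones quoted in the thread; the minExecutors, initialExecutors and maxExecutors lines and their values are illustrative assumptions that do not appear in the original messages. spark.executor.instances is left unset, matching the setup described in the reply.

```
# spark-defaults.conf (kept in both the Spark and Hive conf folders, as in the thread)

# From the thread: external shuffle service plus dynamic allocation
spark.shuffle.service.enabled             true
spark.dynamicAllocation.enabled           true

# Assumed, illustrative bounds (not from the thread); tune to the cluster
spark.dynamicAllocation.minExecutors      1
spark.dynamicAllocation.initialExecutors  1
spark.dynamicAllocation.maxExecutors      20
```

Whether a Hive on Spark session actually picked up these values can be checked the same way the reply describes, by comparing them against the properties listed in the Spark web UI for the running job.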