Execution efficiency slows down as the number of CPU cores increases

2022-02-10 Thread 15927907...@163.com
Hello,
I recently used Spark 3.2 to run a test on the TPC-DS dataset; the total 
TPC-DS data size is 1TB (stored in HDFS). But I ran into a problem that I 
couldn't understand, and I hope to get your help.
The SQL statement tested is as follows:
select count(*) from (
    select distinct c_last_name, c_first_name, d_date
    from store_sales, date_dim, customer
    where store_sales.ss_sold_date_sk = date_dim.d_date_sk
      and store_sales.ss_customer_sk = customer.c_customer_sk
      and d_month_seq between 1200 and 1200 + 11
) hot_cust
limit 100;

Three tables are used here; their sizes are store_sales (387.97GB), 
date_dim (9.84GB), and customer (1.52GB).

When submitting the application, the driver and executor parameters are set as 
follows:
DRIVER_OPTIONS="--master spark://xx.xx.xx.xx:7078 --total-executor-cores 48 \
  --conf spark.driver.memory=100g --conf spark.default.parallelism=100 \
  --conf spark.sql.adaptive.enabled=false"
EXECUTOR_OPTIONS="--executor-cores 12 --conf spark.executor.memory=50g \
  --conf spark.task.cpus=1 --conf spark.sql.shuffle.partitions=100 \
  --conf spark.sql.crossJoin.enabled=true"
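
For reference, a minimal sketch of how these two option groups would typically 
be passed to the spark-sql CLI (or spark-submit); the file name query.sql is 
just a placeholder for the statement quoted above:

# Hypothetical launch command; DRIVER_OPTIONS and EXECUTOR_OPTIONS are the
# variables defined above, query.sql is a placeholder file holding the query.
$SPARK_HOME/bin/spark-sql $DRIVER_OPTIONS $EXECUTOR_OPTIONS -f query.sql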

The Spark cluster is built on a single physical node with only one worker, and 
the application is submitted in standalone mode; the node has 56 cores and 
376GB of memory in total. I kept the number of executors fixed at 4 and varied 
executor-cores over 2, 4, 8, and 12, which gave total application times of 
96s, 66s, 50s, and 49s. In other words, once total-executor-cores reaches 32 
or more, the runtime barely changes.
I monitored CPU utilization and IO and found no bottleneck. I then analyzed 
the task times of the longest stage of each application in the Spark web UI 
(the longest stage in all four test cases is stage4, with stage times of 48s, 
28s, 22s, and 23s). We can see that as the number of CPU cores increases, the 
task times in stage4 also increase: the average task times are 3s, 4s, 5s, and 
7s. In addition, each executor takes longer to run its first batch of tasks 
than the subsequent ones, and the more cores there are, the larger this gap 
becomes:

Test-Case-ID | total-executor-cores | executor-cores | total application time | stage4 time | average task time | first batch of tasks | remaining batches of tasks
1 | 8  | 2  | 96s | 48s | 7s | 5-6s   | 3-4s
2 | 16 | 4  | 66s | 28s | 5s | 7-8s   | ~4s
3 | 32 | 8  | 50s | 23s | 4s | 10-12s | ~5s
4 | 48 | 12 | 49s | 22s | 3s | 14-15s | ~6s
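
For reference, a rough back-of-the-envelope wave count, assuming stage4 runs 
exactly the 100 shuffle-partition tasks configured above 
(spark.sql.shuffle.partitions=100); it only illustrates how many task waves 
each configuration runs:

# Sketch: number of task waves per test case, assuming 100 tasks in stage4.
# waves = ceil(100 / total-executor-cores)
for cores in 8 16 32 48; do
  echo "total cores: $cores  waves: $(( (100 + cores - 1) / cores ))"
done
# Prints 13, 7, 4 and 3 waves; with 48 cores the slower first wave already
# covers 48 of the 100 tasks in stage4.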

This result confuses me: why do individual tasks take longer to execute once 
the number of cores is increased? I compared the amount of data processed by 
each task across the different cases, and it is the same. And why does 
execution efficiency stop improving once the number of cores grows beyond a 
certain point?
Looking forward to your reply.



15927907...@163.com


Which manufacturers' GPUs does Spark support?

2022-02-16 Thread 15927907...@163.com
Hello,
We have done some Spark GPU acceleration work using the spark-rapids 
component (https://github.com/NVIDIA/spark-rapids). However, we found that 
this component currently only supports NVIDIA GPUs, and on the official Spark 
website we did not see any statement about which GPU manufacturers Spark 
supports (https://spark.apache.org/docs/3.2.1/configuration.html#custom-resource-scheduling-and-configuration-overview).
So, can Spark also support GPUs from other manufacturers, such as AMD?
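
For reference, a minimal sketch of the vendor-neutral resource-scheduling 
settings from that configuration page, as far as we understand them; the 
discovery script path is a placeholder, and these flags only cover how Spark 
schedules GPUs as generic resources, not whether a SQL plugin can actually 
run computation on them:

# Sketch of Spark's generic GPU scheduling flags (standalone mode); the
# discovery script is user-provided and simply prints the available GPU
# addresses as JSON, so the scheduling layer is not tied to one GPU vendor.
GPU_OPTIONS="--conf spark.executor.resource.gpu.amount=1 \
  --conf spark.task.resource.gpu.amount=1 \
  --conf spark.executor.resource.gpu.discoveryScript=/path/to/getGpusResources.sh"
# In standalone mode the worker must also advertise the resource, e.g.
# spark.worker.resource.gpu.amount=1 in the worker's spark-defaults.conf.
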
Looking forward to your reply.



15927907...@163.com