Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21277 )
Change subject: IMPALA-12988: Calculate an unbounded version of CpuAsk ...................................................................... IMPALA-12988: Calculate an unbounded version of CpuAsk Planner calculates CpuAsk through a recursive call beginning at Planner.computeBlockingAwareCores(), which is called after Planner.computeEffectiveParallelism(). It does blocking operator analysis over the selected degree of parallelism that was decided during computeEffectiveParallelism() traversal. That selected degree of parallelism, however, is already bounded by min and max parallelism config, derived from PROCESSING_COST_MIN_THREADS and MAX_FRAGMENT_INSTANCES_PER_NODE options accordingly. This patch calculates an unbounded version of CpuAsk that is not bounded by min and max parallelism config. It is purely based on the fragment's ProcessingCost and query plan relationship constraint (for example, the number of JOIN BUILDER fragments should equal the number of destination JOIN fragments for partitioned join). Frontend will receive both bounded and unbounded CpuAsk values from TQueryExecRequest on each executor group set selection round. The unbounded CpuAsk is then scaled down once using a nth root based sublinear-function, controlled by the total cpu count of the smallest executor group set and the bounded CpuAsk number. Another linear scaling is then applied on both bounded and unbounded CpuAsk using QUERY_CPU_COUNT_DIVISOR option. Frontend then compare the unbounded CpuAsk after scaling against CpuMax to avoid assigning a query to a small executor group set too soon. The last executor group set stays as the "catch-all" executor group set. After this patch, setting COMPUTE_PROCESSING_COST=True will show following changes in query profile: - The "max-parallelism" fields in the query plan will all be set to maximum parallelism based on ProcessingCost. - The CpuAsk counter is changed to show the unbounded CpuAsk after scaling. - A new counter CpuAskBounded shows the bounded CpuAsk after scaling. If QUERY_CPU_COUNT_DIVISOR=1 and PLANNER_CPU_ASK slot counting strategy is selected, this CpuAskBounded is also the minimum total admission slots given to the query. - A new counter MaxParallelism shows the unbounded CpuAsk before scaling. - The EffectiveParallelism counter remains unchanged, showing bounded CpuAsk before scaling. Testing: - Update and pass FE test TpcdsCpuCostPlannerTest and PlannerTest#testProcessingCost. - Pass EE test tests/query_test/test_tpcds_queries.py - Pass custom cluster test tests/custom_cluster/test_executor_groups.py Change-Id: I5441e31088f90761062af35862be4ce09d116923 Reviewed-on: http://gerrit.cloudera.org:8080/21277 Reviewed-by: Kurt Deschler <[email protected]> Reviewed-by: Abhishek Rawat <[email protected]> Tested-by: Impala Public Jenkins <[email protected]> --- M be/src/scheduling/scheduler.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/planner/CostingSegment.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q22.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q24a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q24b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q28.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q29.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q31.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q32.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q36.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q37.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q38.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q40.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q43.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q44.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q45.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q46.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q47.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q48.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q50.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q51.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q57.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q58.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q59.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q62.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q64.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q65.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q67.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q70.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q72.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q74.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q75.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q76.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q77.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q78.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q80.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q82.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q83.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q84.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q85.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q86.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q87.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q92.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q93.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q95.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q97.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q99.test M testdata/workloads/functional-query/queries/QueryTest/processing-cost-admission-slots.test M tests/custom_cluster/test_executor_groups.py 72 files changed, 729 insertions(+), 522 deletions(-) Approvals: Kurt Deschler: Looks good to me, but someone else must approve Abhishek Rawat: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/21277 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I5441e31088f90761062af35862be4ce09d116923 Gerrit-Change-Number: 21277 Gerrit-PatchSet: 16 Gerrit-Owner: Riza Suminto <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: David Rorke <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]>
