Hello Kurt Deschler, Abhishek Rawat, David Rorke, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/21277

to look at the new patch set (#9).

Change subject: IMPALA-12988: Calculate an unbounded version of CpuAsk
......................................................................

IMPALA-12988: Calculate an unbounded version of CpuAsk

Planner calculates CpuAsk through a recursive call beginning at
Planner.computeBlockingAwareCores(), which is called after
Planner.computeEffectiveParallelism(). It does blocking operator
analysis over the selected degree of parallelism that was decided during
computeEffectiveParallelism() traversal. That selected degree of
parallelism, however, is already bounded by min and max parallelism
config, derived from PROCESSING_COST_MIN_THREADS and
MAX_FRAGMENT_INSTANCES_PER_NODE options accordingly.

This patch calculates an unbounded version of CpuAsk that is not bounded
by min and max parallelism config. It is purely based on the fragment's
ProcessingCost and query plan relationship constraint (for example, the
number of JOIN BUILDER fragments should equal the number of destination
JOIN fragments for partitioned join).

Frontend will receive both bounded and unbounded CpuAsk values from
TQueryExecRequest on each executor group set selection round. The
unbounded CpuAsk is then scaled down once using a square-root-based
sublinear-function, controlled by the total cpu count of the smallest
executor group set and the bounded CpuAsk number. Another linear scaling
is then applied on both bounded and unbounded CpuAsk using
QUERY_CPU_COUNT_DIVISOR option. Frontend then picks the maximum between
bounded CpuAsk and unbounded CpuAsk numbers to avoid assigning a query
to a small executor group set too soon. The last executor group set
stays as the "catch-all" executor group set.

After this patch, the "max-parallelism" field in the query plan will all
be set with maximum parallelism based on ProcessingCost. The CpuAsk
counter is changed to show the unbounded CpuAsk after scaling. A new
counter CpuAskBounded shows the bounded CpuAsk after scaling. The
EffectiveParallelism counter remains unchanged, showing bounded CpuAsk
before scaling.

Testing:
- Update and pass FE test TpcdsCpuCostPlannerTest and
  PlannerTest#testProcessingCost.
- Pass EE test tests/query_test/test_tpcds_queries.py
- Pass custom cluster test tests/custom_cluster/test_executor_groups.py

Change-Id: I5441e31088f90761062af35862be4ce09d116923
---
M be/src/scheduling/scheduler.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/CostingSegment.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q06.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q08.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q09.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q12.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q14b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q15.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q17.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q20.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q22.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q23b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q24a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q24b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q28.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q29.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q31.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q32.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q36.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q37.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q38.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q40.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q43.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q44.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q45.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q46.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q47.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q48.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q50.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q51.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q54.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q57.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q58.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q59.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q62.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q64.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q65.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q67.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q70.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q72.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q74.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q75.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q76.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q77.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q78.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q80.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q82.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q83.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q84.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q85.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q86.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q87.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q92.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q93.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q95.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q97.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q99.test
M 
testdata/workloads/functional-query/queries/QueryTest/processing-cost-admission-slots.test
M tests/custom_cluster/test_executor_groups.py
73 files changed, 710 insertions(+), 528 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/21277/9
--
To view, visit http://gerrit.cloudera.org:8080/21277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5441e31088f90761062af35862be4ce09d116923
Gerrit-Change-Number: 21277
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: David Rorke <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>

Reply via email to