[
https://issues.apache.org/jira/browse/IMPALA-14006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Riza Suminto reassigned IMPALA-14006:
-------------------------------------
Assignee: Riza Suminto
> Scheduler::CreateInputCollocatedInstances may overparallelize
> -------------------------------------------------------------
>
> Key: IMPALA-14006
> URL: https://issues.apache.org/jira/browse/IMPALA-14006
> Project: IMPALA
> Issue Type: Bug
> Components: Distributed Exec
> Affects Versions: Impala 5.0.0
> Reporter: Riza Suminto
> Assignee: Riza Suminto
> Priority: Critical
> Attachments: profile_614699efbc5e2ed9_cb9d6aca00000000.txt
>
>
> IMPALA-11604 (part 2) change how many instances to create in
> Scheduler::CreateInputCollocatedInstances.
> [https://github.com/apache/impala/blame/3c24706c72818a1668159a428d4f2afcadea9f27/be/src/scheduling/scheduler.cc#L915-L920]
> This work when left child fragment of a parent fragment is distributed across
> nodes. However, there is a possible corner case where the left child fragment
> instance is limited to only 1 node, like SELECT fragment in
> [^profile_614699efbc5e2ed9_cb9d6aca00000000.txt]
> {noformat}
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> Per-Instance Resources: mem-estimate=4.27MB mem-reservation=4.00MB
> thread-reservation=1
> max-parallelism=1 segment-costs=[17820]
> 04:SELECT
> | predicates: dense_rank() < CAST(10 AS BIGINT)
> | mem-estimate=0B mem-reservation=0B thread-reservation=0
> | tuple-ids=7,6 row-size=12B cardinality=730 cost=7300
> | in pipelines: 02(GETNEXT)
> |
> 03:ANALYTIC
> | functions: dense_rank()
> | order by: id ASC
> | window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
> | mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB
> thread-reservation=0
> | tuple-ids=7,6 row-size=12B cardinality=7.30K cost=7300
> | in pipelines: 02(GETNEXT)
> |
> 06:MERGING-EXCHANGE [UNPARTITIONED]
> | order by: id ASC
> | mem-estimate=33.50KB mem-reservation=0B thread-reservation=0
> | tuple-ids=7 row-size=4B cardinality=7.30K cost=1904
> | in pipelines: 02(GETNEXT){noformat}
> Due to existence of MERGING-EXCHANGE, this fragment is limited to just 1
> instance in 1 node. If its parent fragment has high cost, the parent fragment
> might schedule hundreds of instances, but
> Scheduler::CreateInputCollocatedInstances will force them to colocate in
> single executor node where the SELECT fragment scheduled.
> This, in turn, will hit sanity check in
> Scheduler::CheckEffectiveInstanceCount, which enforce that no fragment should
> be parallelized over 128 instances per executor node.
> [https://github.com/apache/impala/blob/3c24706c72818a1668159a428d4f2afcadea9f27/be/src/scheduling/scheduler.cc#L456-L463]
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]