[ 
https://issues.apache.org/jira/browse/IMPALA-10287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18012731#comment-18012731
 ] 

ASF subversion and git services commented on IMPALA-10287:
----------------------------------------------------------

Commit cc1cbb559a849bc75b49bc44f4869e8eb8d9e47f in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=cc1cbb559 ]

IMPALA-14263: Add broadcast_cost_scale_factor option

This commit enhances the distributed planner's costing model for
broadcast joins by introducing the `broadcast_cost_scale_factor` query
option. This option enables users to fine-tune the planner's decision
between broadcast and partitioned joins.

Key changes:
- The total broadcast cost is scaled by the new
  `broadcast_cost_scale_factor` query option, allowing users to favor or
  penalize broadcast joins as needed when setting query hint is not
  feasible.
- Updated the planner logic and test cases to reflect the new costing
  model and options.

This addresses scenarios where the default costing could lead to
suboptimal join distribution choices, particularly in a large-scale
cluster where the number of executors can increase broadcast cost, while
choosing a partitioned strategy can lead to data skew. Admin can set
`broadcast_cost_scale_factor` less than 1.0 to make DistributedPlanner
favor broadcast more than partitioned join (with possible downside of
higher memory usage per query and higher network transmission).

Existing query hints still take precedence over this option. Note that
this option is applied independent of `broadcast_to_partition_factor`
option (see IMPALA-10287). In MT_DOP>1 setup, it should be sufficient to
set `use_dop_for_costing=True` and tune `broadcast_to_partition_factor`
only.

Testing:
Added FE tests.

Change-Id: I475f8a26b2171e87952b69f66a5c18f77c2b3133
Reviewed-on: http://gerrit.cloudera.org:8080/23258
Reviewed-by: Wenzhe Zhou <[email protected]>
Reviewed-by: Aman Sinha <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Distribution strategy is sub-optimal for certain queries
> --------------------------------------------------------
>
>                 Key: IMPALA-10287
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10287
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 3.4.0
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>            Priority: Major
>             Fix For: Impala 4.0.0
>
>
> I ran a simplified query (extracted from q78 of TPC-DS) on a 600GB dataset on 
> an 8 node cluster. I forced the distribution strategy for the left outer join 
> and compared Broadcast vs Hash Partition for different values of mt_dop.  The 
> example query and results are shown below (elapsed times are in seconds):
> {noformat}
> Query (with shuffle or broadcast hint):
> select count(*)
>    from store_sales
>    left join [shuffle] store_returns on sr_ticket_number=ss_ticket_number 
>          and ss_item_sk=sr_item_sk
>    join date_dim on ss_sold_date_sk = d_date_sk
>    where sr_ticket_number is null
>    and d_year=2002;
> {noformat}
> ||mt_dop||Broadcast||Partition||
> |1|45|15|
> |2|37|9|
> |4|33|5|
> |8|31|4|
> |12|31|4|
> Given the nearly 7.5x speedup for partition distribution at mt_dop = 12 
> (which is the default), it indicates that the cost formula comparing the 
> broadcast vs partition needs to be modified to take into account the mt_dop.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to