[ 
https://issues.apache.org/jira/browse/IMPALA-13645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933430#comment-17933430
 ] 

David Rorke commented on IMPALA-13645:
--------------------------------------

After some discussion with [~joemcdonnell], another potential solution would be 
some form of work stealing of scan ranges across executors.
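
To make the idea concrete, here is a minimal simulation sketch of work stealing. It is not Impala scheduler code: the function name, the lockstep "rounds", and the policy of stealing one range from the tail of the longest queue are all assumptions made for illustration.

```python
from collections import deque


def run_with_stealing(assigned, survives):
    """Simulate executors scanning ranges in lockstep rounds (hypothetical model).

    assigned: dict mapping executor name -> list of scan range ids (initial schedule)
    survives: predicate returning True if a range passes the runtime filter
    Returns a dict mapping executor name -> list of ranges it actually scanned.
    """
    # Each executor prunes its own queue once the runtime filter has arrived.
    queues = {ex: deque(r for r in ranges if survives(r))
              for ex, ranges in assigned.items()}
    scanned = {ex: [] for ex in assigned}

    while any(queues.values()):
        for ex in queues:
            if not queues[ex]:
                # Idle executor: steal one range from the tail of the longest queue.
                victim = max(queues, key=lambda e: len(queues[e]))
                # Leave the victim its last range so it always keeps work for itself.
                if len(queues[victim]) > 1:
                    queues[ex].append(queues[victim].pop())
            if queues[ex]:
                scanned[ex].append(queues[ex].popleft())
    return scanned
```

For example, with `assigned = {"a": list(range(8)), "b": list(range(8, 16))}` and a filter that keeps only ranges below 8, executor "b" has nothing left after pruning. In this model it steals from "a", so the surviving work is split 5/3 rather than 8/0.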

> Account for impact of runtime filters when scheduling scan ranges
> -----------------------------------------------------------------
>
>                 Key: IMPALA-13645
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13645
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: David Rorke
>            Priority: Major
>
> Runtime filters can introduce significant skew in the number of scan ranges 
> scanned by a given executor or scan fragment.  Initial scan range scheduling 
> attempts to balance ranges across hosts, and with multithreading (e.g. 
> mt_dop>0) there's additional balancing done locally within a given executor, 
> but neither of these balancing steps accounts for the impact of runtime 
> filters.
> One option for remote, partition filters might be to wait for the filters to 
> arrive at the coordinator, apply the filters prior to scheduling, and then 
> balance the surviving scan ranges during scheduling.  Another more limited 
> approach would be to only use the filters for balancing across fragments 
> within a given executor (wait for the filter to arrive and then prune the 
> ranges assigned to that executor and hand them out to fragments in a 
> deterministic way).
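
The more limited approach in the description (prune the ranges on an executor, then hand them out to fragments deterministically) could be sketched as below. This is only an illustration under assumptions: the function name, the `(range_id, size_bytes)` tuples, and the greedy largest-first policy are hypothetical and are not taken from Impala.

```python
import heapq


def balance_surviving_ranges(ranges, passes_filter, num_fragments):
    """Assign filter-surviving scan ranges to fragments, largest first.

    ranges: list of (range_id, size_bytes) tuples assigned to one executor
    passes_filter: predicate on range_id, True if the range survives the filter
    num_fragments: number of scan fragment instances on the executor
    Returns a list of range-id lists, one per fragment.
    """
    # Sort by size descending, then by id, so the result is deterministic.
    survivors = sorted((r for r in ranges if passes_filter(r[0])),
                       key=lambda r: (-r[1], r[0]))
    # Min-heap of (assigned_bytes, fragment_index): pop gives the least-loaded fragment.
    heap = [(0, i) for i in range(num_fragments)]
    assignment = [[] for _ in range(num_fragments)]
    for range_id, size in survivors:
        load, idx = heapq.heappop(heap)
        assignment[idx].append(range_id)
        heapq.heappush(heap, (load + size, idx))
    return assignment
```

Because balancing happens after pruning, fragments divide only the ranges that will actually be scanned, instead of inheriting whatever skew the filter leaves in the original assignment.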



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

