[ 
https://issues.apache.org/jira/browse/IMPALA-13548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-13548.
------------------------------------
    Fix Version/s: Impala 5.0.0
       Resolution: Fixed

> Add a mode to schedule scan ranges in order of modification time
> ----------------------------------------------------------------
>
>                 Key: IMPALA-13548
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13548
>             Project: IMPALA
>          Issue Type: Task
>          Components: Backend
>    Affects Versions: Impala 4.5.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Major
>             Fix For: Impala 5.0.0
>
>
> When a file gets added to a table, the scheduler can have some instability in 
> how it assigns scan ranges. The scheduler is walking through the scan ranges 
> and handing them out in a single pass. If the new scan range is at the end of 
> the list, then there is minimal disruption. Every assignment would be the 
> same except the node that got the new scan range. However, if the new scan 
> range is early in the list, its assignment can change subsequent assignments 
> of other scan ranges. This can cascade and result in an entirely different 
> assignment.
> This is bad for the tuple cache, because it makes it difficult to get cache 
> hits for a table that is ingesting data.
> If the scan ranges were ordered by modification time (ascending), then new 
> scan ranges for an ingest would be at the end of the list and cause minimal 
> disruption.
> We should add a mode that does this.
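
The cascade described above can be illustrated with a toy model. This is only a sketch, not Impala's scheduler: the `assign` function, the least-loaded-node rule, the `ScanRange` fields, the file names, and the `(mtime, path)` tie-break are all assumptions made for illustration. It compares how many existing scan ranges change nodes when a new file arrives, first with ranges ordered by path and then ordered by ascending modification time.

```python
from typing import NamedTuple


class ScanRange(NamedTuple):
    path: str    # hypothetical file name
    length: int  # bytes covered by the range
    mtime: int   # file modification time, e.g. seconds since epoch


def assign(ranges, num_nodes):
    """Toy single-pass scheduler (assumption): each range goes to the
    least-loaded node so far, ties broken by lowest node index."""
    load = [0] * num_nodes
    assignment = {}
    for r in ranges:
        node = min(range(num_nodes), key=lambda n: (load[n], n))
        assignment[r.path] = node
        load[node] += r.length
    return assignment


def moved(before, after):
    """Paths of pre-existing ranges whose node changed."""
    return sorted(p for p in before if before[p] != after[p])


def by_path(r):
    return r.path


def by_mtime(r):
    # Path is used as a tie-break so the order is deterministic (assumption).
    return (r.mtime, r.path)


old = [
    ScanRange("b.parq", 64, 100),
    ScanRange("c.parq", 64, 200),
    ScanRange("d.parq", 64, 300),
    ScanRange("e.parq", 64, 400),
]
# Newly ingested file: newest mtime, but sorts first by path.
new_file = ScanRange("a.parq", 64, 500)

before_path = assign(sorted(old, key=by_path), 2)
after_path = assign(sorted(old + [new_file], key=by_path), 2)
before_mtime = assign(sorted(old, key=by_mtime), 2)
after_mtime = assign(sorted(old + [new_file], key=by_mtime), 2)

print(moved(before_path, after_path))    # ['b.parq', 'c.parq', 'd.parq', 'e.parq']
print(moved(before_mtime, after_mtime))  # []
```

In this toy setup, ordering by path puts the new range at the front, so every existing range shifts to a different node. Ordering by modification time puts it at the end, so the existing assignments are untouched and only the new range needs a placement.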



--
This message was sent by Atlassian Jira
(v8.20.10#820010)