[jira] [Updated] (CALCITE-4522) Sort cost should account for the number of columns in collation

hqx (Jira) Sun, 14 Mar 2021 03:44:05 -0700


     [ 
https://issues.apache.org/jira/browse/CALCITE-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


hqx updated CALCITE-4522:
-------------------------
    Description: 
The current method for computing the cost of a sort has two problems.
 # When the RelCollation is empty, there is no need to sort, but the CPU cost 
of a sort is still computed.
 # Estimating the CPU cost as n * log(n) * row_byte may be inaccurate, where 
n is the output row count of the sort operator and row_byte is the 
average number of bytes per row.

Instead, I suggest the following.
 # The CPU cost is zero if the RelCollation is empty.
 # Let heap_size be min(offset + fetch, input_count), and compute the CPU 
cost as input_count * max(1, log(heap_size)) * row_byte.
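
The two suggestions above can be sketched as a small standalone Java helper. This is only an illustration of the proposed formula, not Calcite's actual code: the class name SortCostSketch, the method sortCpuCost and its parameters are hypothetical. It also assumes a natural logarithm and that a missing fetch (passed here as a negative value) means heap_size equals input_count, neither of which is stated in the issue.

```java
public final class SortCostSketch {
  private SortCostSketch() {}

  /**
   * Hypothetical helper illustrating the proposed CPU cost formula.
   *
   * @param collationFieldCount number of fields in the collation (0 = empty)
   * @param inputCount          estimated number of input rows
   * @param offset              rows skipped before output (0 if none)
   * @param fetch               maximum rows returned; negative means "no fetch"
   *                            (an assumption made for this sketch)
   * @param rowBytes            average number of bytes per row
   */
  static double sortCpuCost(int collationFieldCount, double inputCount,
      double offset, double fetch, double rowBytes) {
    // Suggestion 1: an empty collation needs no sorting, so no CPU cost.
    if (collationFieldCount == 0) {
      return 0d;
    }
    // Suggestion 2: heap_size = min(offset + fetch, input_count).
    // Assumption: without a fetch, all input rows must be retained.
    double heapSize = fetch < 0
        ? inputCount
        : Math.min(offset + fetch, inputCount);
    // max(1, ...) keeps the per-row factor at least 1 when heap_size is
    // small; the natural logarithm is an assumption.
    return inputCount * Math.max(1d, Math.log(heapSize)) * rowBytes;
  }

  public static void main(String[] args) {
    // Empty collation, top-10 sort, and full sort over 1000 rows of 8 bytes.
    System.out.println(sortCpuCost(0, 1000, 0, 10, 8));
    System.out.println(sortCpuCost(1, 1000, 0, 10, 8));
    System.out.println(sortCpuCost(1, 1000, 0, -1, 8));
  }
}
```

Under these assumptions, a sort with a small fetch is costed lower than a full sort over the same input, because the logarithmic factor depends on heap_size rather than on the row count.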


> Sort cost should account for the number of columns in collation
> ---------------------------------------------------------------
>
>                 Key: CALCITE-4522
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4522
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: hqx
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 9h 20m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
