[
https://issues.apache.org/jira/browse/CALCITE-4522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301261#comment-17301261
]
Julian Hyde commented on CALCITE-4522:
--------------------------------------
[~871], thanks for hanging in there. It must be unpleasant to have reviewers
warring with each other, especially on your first contribution to a project.
Here is my proposed cost model:
* If fetch is zero, cpu cost is zero; otherwise,
* if there are no sort keys, cost is min(fetch + offset, inputRowCount) *
bytesPerRow; otherwise
* cost is inputRowCount * log(min(fetch + offset, inputRowCount)) * bytesPerRow.
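To make the three cases concrete, here is a minimal sketch of the model as a plain Java function. The class name {{SortCostSketch}}, the method {{cpuCost}} and its parameters are hypothetical and are not existing Calcite API. It assumes that a sort with no fetch passes {{Double.POSITIVE_INFINITY}} for {{fetch}}, and it applies the logarithm exactly as written in the bullets, without any clamping.

```java
public final class SortCostSketch {
  private SortCostSketch() {}

  /**
   * Hypothetical helper, not part of Calcite: CPU cost of a sort following
   * the three cases listed above.
   *
   * @param inputRowCount number of rows fed into the sort
   * @param offset        rows skipped before output (0 if none)
   * @param fetch         rows to return; assumed to be
   *                      Double.POSITIVE_INFINITY when there is no limit
   * @param hasSortKeys   false when the collation is empty
   * @param bytesPerRow   average row width in bytes
   */
  public static double cpuCost(double inputRowCount, double offset,
      double fetch, boolean hasSortKeys, double bytesPerRow) {
    if (fetch == 0) {
      // Case 1: nothing is returned, so nothing is charged.
      return 0d;
    }
    // Rows that are actually kept: never more than the input provides.
    final double activeRows = Math.min(fetch + offset, inputRowCount);
    if (!hasSortKeys) {
      // Case 2: no comparisons, only the kept rows are touched.
      return activeRows * bytesPerRow;
    }
    // Case 3: every input row is compared against the active rows.
    // Taken literally, the log is below 1 when activeRows is below e.
    return inputRowCount * Math.log(activeRows) * bytesPerRow;
  }
}
```

For example, {{cpuCost(1000, 5, 10, false, 8)}} charges only for the 15 rows that are kept, while the same call with sort keys charges all 1000 input rows, each weighted by log(15).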
I think a method {{Util.nLogM(n, m)}} would be useful, where {{n}} is the
number of input rows, and {{m}} is the number of active rows (and therefore
determines the number of times each row is compared and/or moved). We would
call it as follows: {{Util.nLogM(inputRowCount, fetch + offset)}}. It would
make sure that {{m}} is at least e (and therefore log is at least 1), and make
sure that {{m}} is no greater than {{n}}.
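A possible shape for such a helper is sketched below. It lives in a stand-alone class {{NLogMSketch}} rather than in {{Util}}, and the order of the two clamps is an assumption: when {{n}} is itself smaller than e the two bounds conflict, and this sketch lets the lower bound win so that the log factor never drops below 1.

```java
public final class NLogMSketch {
  private NLogMSketch() {}

  /**
   * Sketch of the proposed nLogM: returns n * ln(m), with m clamped so that
   * it is no greater than n and no less than e.
   *
   * @param n number of input rows
   * @param m number of active rows, e.g. fetch + offset
   */
  public static double nLogM(double n, double m) {
    // Upper bound first: there cannot be more active rows than input rows.
    final double capped = Math.min(m, n);
    // Lower bound second (an assumption of this sketch): ln(e) = 1, so each
    // input row is charged at least once even when very few rows are active.
    final double clamped = Math.max(Math.E, capped);
    return n * Math.log(clamped);
  }
}
```

With these clamps, {{nLogM(1000, 1)}} is approximately 1000, since the log factor bottoms out at 1, and {{nLogM(1000, 5000)}} equals 1000 * ln(1000), since {{m}} is capped at {{n}}.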
> Sort cost should account for the number of columns in collation
> ---------------------------------------------------------------
>
> Key: CALCITE-4522
> URL: https://issues.apache.org/jira/browse/CALCITE-4522
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Reporter: hqx
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 9h 20m
> Remaining Estimate: 0h
>
> The old method of computing the cost of a sort has some problems.
> # When the RelCollation is empty, there is no need to sort, but it still
> computes the CPU cost of a sort.
> # Using n * log(n) * row_byte to estimate the CPU cost may be inaccurate,
> where n is the output row count of the sort operator and row_byte is
> the average number of bytes in one row.
> Instead, I suggest the following:
> # The CPU cost is zero if the RelCollation is empty.
> # Let heap_size be min(offset + fetch, input_count), and use input_count *
> max(1, log(heap_size)) * row_byte to compute the CPU cost.