On Tue, Jul 20, 2021 at 05:35, David Rowley <dgrowle...@gmail.com> wrote:
> On Tue, 20 Jul 2021 at 01:10, James Coleman <jtc...@gmail.com> wrote:
>
> > To be clear up front: I'm in favor of the patch, and I don't want to put unnecessary stumbling blocks up for it getting committed. So if we decide to proceed as is, that's fine with me.
>
> Thanks for making that clear.
>
> > But I'm not sure that the "cost model, unfortunately, does not account for that" is entirely accurate. The end of cost_tuplesort contains a cost "per extracted tuple". It does, however, note that it doesn't charge cpu_tuple_cost, which maybe is what you'd want in order to fully incorporate this into the model. But given this run_cost isn't about accounting for comparison cost (that's been done earlier), which is the part that'd be the same between tuple and datum sort, it seems to me that we could lower the cpu_operator_cost here by something like 10% if it's byref and 30% if it's byval?
>
> I failed to notice that last part that adds the additional cpu_operator_cost.
>
> The default cpu_operator_cost is 0.0025, so with the 10k tuple benchmark that adds a charge of 25 to the total cost.
>
> If we take test 1 from my results on v5 as an example:
>
> > Test1 446.1 657.3 147.32%
>
> Looking at explain for that query:
>
> regression=# explain select two from tenk1 order by two offset 1000000;
>                               QUERY PLAN
> ----------------------------------------------------------------------
>  Limit  (cost=1133.95..1133.95 rows=1 width=4)
>    ->  Sort  (cost=1108.97..1133.95 rows=9995 width=4)
>          Sort Key: two
>          ->  Seq Scan on tenk1  (cost=0.00..444.95 rows=9995 width=4)
> (4 rows)
>
> If we want the costs to reflect reality again here then we'd have to reduce 1133.95 by something like 147.32% (the performance difference). That would bring the cost down to 769.72, which is far more of a reduction than the 25 that the cpu_operator_cost * tuples surcharge gives us to play with.
>
> If we reduced the 25 by 30% in this case, we'd get 17.5 and the total cost would become 1126.45. That's not great considering the actual performance indicates that 769.72 would be a better number.
>
> If we look at John's result for test 1: he saw 588 tps on master and 998 tps on v8. 1133.95 / 998.0 * 588.0 = 668.09, so we'd need an even bigger reduction to get close to reality on that machine.
>
> My thoughts are that the small surcharge added at the end of cost_tuplesort() is just not enough for us to play with. I think getting us closer to fixing this correctly would require a redesign of the tuplesort costing entirely. I think that would be about an order of magnitude more effort than this patch was, so I really feel like I don't want to do that here.

I understand that a redesign would require a lot of work, but why not do it step by step?

> I kinda feel that since the comparison_cost is always just 2.0 * cpu_operator_cost regardless of the number of columns in the sort, then if we add too many new smarts to try and properly adjust for this new optimization, then unless we do a completely new cost model for this we might as well be putting lipstick on a pig.

I think one first step is naming this 2.0. Doesn't this magic number deserve a proper name?

> It sounds like James mostly mentioned the sort costing just to ensure it was properly considered and does not really feel strongly that it needs to be adjusted. Does anyone else feel that we should be adjusting it?

I took a look at cost_tuplesort and I think that some small adjustments could be made as part of that improvement process.
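Just to illustrate the kind of step I mean, here is a minimal self-contained sketch that names the constant and also folds in James's 10%/30% discount idea from above. This is not the attached patch, and DEFAULT_COMPARISON_COST_FACTOR, datum_sort and byval are names invented for this sketch:

#include <stdbool.h>

typedef double Cost;

/* the default value of the cpu_operator_cost GUC */
static const Cost cpu_operator_cost = 0.0025;

/* a name for the bare 2.0 multiplier used in cost_tuplesort() */
#define DEFAULT_COMPARISON_COST_FACTOR 2.0

/*
 * Per-comparison cost as cost_tuplesort() computes it today, with
 * the magic number named instead of written as a bare 2.0.
 */
static Cost
per_comparison_cost(Cost extra_comparison_cost)
{
    return extra_comparison_cost +
           DEFAULT_COMPARISON_COST_FACTOR * cpu_operator_cost;
}

/*
 * The per-extracted-tuple surcharge at the end of cost_tuplesort(),
 * discounted as James suggested: 30% off when the datum sort applies
 * to a byval type, 10% off when it is byref.
 */
static Cost
per_tuple_run_cost(double tuples, bool datum_sort, bool byval)
{
    Cost per_tuple = cpu_operator_cost;

    if (datum_sort)
        per_tuple *= byval ? 0.70 : 0.90;

    return per_tuple * tuples;
}

Of course, as David's numbers above show, even the full 30% discount only moves the surcharge from 25 down to 17.5, nowhere near the roughly 364 that test 1 would need.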
The patch is attached. The adjustments are:

1. long is a very problematic type; wouldn't int64 be better?
2. Couldn't 1024 be an int rather than a long?
3. Shouldn't all the bare 2 literals be written as 2.0 (double)?
4. If the disk-based path is not needed, IMO can we avoid calling relation_byte_size?

Finally, it would be nice to at least document those conclusions with comments, wouldn't it?
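Concretely, points 1 to 3 amount to changes of roughly this shape. This is an illustrative sketch only, not the attached patch, and it only paraphrases the affected spots in cost_tuplesort(); PostgreSQL itself spells the type int64 rather than int64_t:

#include <stdint.h>

/*
 * Sketch of the local-variable changes: int64 instead of long (long
 * is only 32 bits on 64-bit Windows), an int literal instead of
 * 1024L, and 2.0 instead of a bare 2 in double arithmetic.
 */
static void
sort_mem_sketch(int sort_mem, double tuples, double output_tuples)
{
    /* before: long sort_mem_bytes = sort_mem * 1024L; */
    int64_t sort_mem_bytes = (int64_t) sort_mem * 1024;

    /* before: if (tuples > 2 * output_tuples) ... */
    if (tuples > 2.0 * output_tuples)
    {
        /* the bounded (top-N) sort costing would go here */
    }

    (void) sort_mem_bytes;   /* sketch only; silence unused warning */
}

regards,
Ranier Vilela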
costsize.patch
Description: Binary data