On 30/06/18 03:03, Tomas Vondra wrote:
On 06/29/2018 04:51 PM, Teodor Sigaev wrote:
I tried to attack the cost_sort() issues and hope on that basis we
can solve problems with 0002 patch and improve incremental sort patch.
OK, will do. Thanks for working on this!
I hope, now we have a better cost_sort(). The obvious way is a try
all combination of pathkeys in get_cheapest_group_keys_order() and
choose cheapest one by cost_sort().
But it requires N! operations and potentially could be very
expensive in case of large number of pathkeys and doesn't solve the
issue with user-knows-what-he-does pathkeys.
Not sure. There are N! combinations, but this seems like a good
candidate for backtracking [1]. You don't have to enumerate and
evaluate all N! combinations, just construct one and then abandon
whole classes of combinations as soon as they get more expensive than
the currently best one. That's thanks to additive nature of the
comparison costing, because appending a column to the sort key can
only make it more expensive. My guess is this will make this a non-issue.
[1] https://en.wikipedia.org/wiki/Backtracking
We could suggest an order of pathkeys as patch suggests now and if
cost_sort() estimates cost is less than 80% (arbitrary chosen) cost
of user-suggested pathkeys then it use our else user pathkeys.
I really despise such arbitrary thresholds. I'd much rather use a more
reliable heuristics by default, even if it gets it wrong in some cases
(which it will, but that's natural).
regards
Additionally put an upper limit threshold on the number of combinations
to check, fairly large by default?
If first threshold is exceeded, could consider checking out a few more
selected at random from paths not yet checked, to avoid any bias caused
by stopping a systematic search. This might prove important when N! is
fairly large.
Cheers,
Gavin