Getting back to improvements for small sort runs, it might make sense to
consider in-register sorting via sorting networks for very small runs.
This talk is specific to database sorting and illustrates how such an approach
can be vectorized:
https://youtu.be/HeFwVNHsDzc?list=PLSE8O
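To illustrate, here is a minimal, self-contained sketch of an optimal
5-comparator network for 4 elements (toy code, nothing from the Postgres
tree). Because it is straight-line compare-exchange code with no
data-dependent branches, compilers typically turn each cmp_swap into
conditional moves, and the same pattern can be vectorized across many runs:

#include <stdio.h>

/*
 * Sort 4 ints with the optimal 5-comparator network
 * (0,1) (2,3) (0,2) (1,3) (1,2).
 */
static inline void
cmp_swap(int *a, int *b)
{
    int lo = (*a < *b) ? *a : *b;
    int hi = (*a < *b) ? *b : *a;

    *a = lo;
    *b = hi;
}

static void
sort4(int v[4])
{
    cmp_swap(&v[0], &v[1]);
    cmp_swap(&v[2], &v[3]);
    cmp_swap(&v[0], &v[2]);
    cmp_swap(&v[1], &v[3]);
    cmp_swap(&v[1], &v[2]);
}

int
main(void)
{
    int v[4] = {3, 1, 4, 2};

    sort4(v);
    printf("%d %d %d %d\n", v[0], v[1], v[2], v[3]);
    return 0;
}

Optimal networks are known for all the sizes that would matter here, e.g. 19
comparators for 8 elements.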
> "Looks like"?
I cannot find the reference, but I read a while back that a well-known
company from Redwood uses it for their in-memory columnar storage. That might
have just been a rumor, or it might have been research only - I'm not sure. It
does not really matter anyway.
> SortTuples are currently
I'd like to revive this important discussion.
As is well described in this fairly recent paper
https://www.vldb.org/pvldb/vol9/p204-leis.pdf (which also looks at Postgres):
"estimation errors quickly grow as the number of joins increases, and that
these errors are usually the reason for bad plans".
> Right. But that seems fraught with difficulty. I suspect that the
> costs that the planner attributes to each plan often aren't very
> reliable in any absolute sense, even when everything is working very
> well by every available metric. Even a very noisy cost model with
> somewhat inaccurate selectivity estimates
> In general I suspect that we'd be better off focussing on mitigating
> the impact at execution time. There are at least a few things that we
> could do there, at least in theory. Mostly very ambitious, long term
> things.
I think these things are orthogonal. No matter how good the cost model
eventually gets, there will always be residual estimation errors, so
mitigating their impact at execution time is worthwhile in its own right.
> Agreed, but dealing with uncertainty in those numbers is an enormous
> task if you want to do it right. "Doing it right", IMV, would start
> out by extending all the selectivity estimation functions to include
> error bars; then we could have error bars on rowcount estimates and
> then costs; th
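Just to make the error-bar idea concrete, here is a purely hypothetical
sketch (none of these types or functions exist in Postgres) of propagating
[lo, hi] bounds through the independent-AND of two clause selectivities:

#include <stdio.h>

/* Hypothetical selectivity with error bars. */
typedef struct Sel
{
    double lo;
    double est;
    double hi;
} Sel;

/* AND of two clauses under the independence assumption, keeping bounds. */
static Sel
sel_and(Sel a, Sel b)
{
    Sel r = {a.lo * b.lo, a.est * b.est, a.hi * b.hi};

    return r;
}

int
main(void)
{
    Sel c1 = {0.05, 0.10, 0.20};    /* clause 1: 10%, known within ~2x */
    Sel c2 = {0.01, 0.02, 0.08};    /* clause 2 */
    Sel both = sel_and(c1, c2);
    double rows = 1000000.0;

    printf("estimate: %.0f rows [%.0f .. %.0f]\n",
           rows * both.est, rows * both.lo, rows * both.hi);
    return 0;
}

Even this toy version shows how quickly the interval widens after a single
multiplication, which is exactly the paper's point about errors growing with
the number of joins.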
> Actually, if we wanted to improve things in this area, we should have a
> set of queries that don't choose optimal plans we can test with. We used
> to see them a lot before we had extended statistics, but I don't
> remember seeing many recently, let alone a collection of them. I guess
> that is
> Sure, but the model isn't the problem here, really -- not to me. The
> problem is that the planner can in some cases choose a plan that is
> inherently unreasonable, at least in practical terms. You're talking
> about uncertainties. But I'm actually talking about the opposite thing
> -- certainty
--
Benjamin Coutu
Has anyone considered such an approach before? Please let me know your thoughts.
Cheers,
Benjamin
> convenient if you know the type at compile time. See the attached,
> which I had laying around when I was looking at PDQuicksort. I believe
> it's similar to what you have in mind.
That looks very promising.
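For anyone following along, this is roughly how such a specialization is
instantiated with the existing sort_template.h machinery (sketched from
memory, so details may differ; it only compiles inside the Postgres tree, in
a file that already pulls in postgres.h):

/*
 * Generates a static sort_int32(int32 *, size_t) with the comparator
 * inlined, avoiding generic qsort's indirect function call per comparison.
 */
#define ST_SORT sort_int32
#define ST_ELEMENT_TYPE int32
#define ST_COMPARE(a, b) ((*(a) > *(b)) - (*(a) < *(b)))
#define ST_SCOPE static
#define ST_DEFINE
#include "lib/sort_template.h"

/* usage: sort_int32(values, nvalues); */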
I also love your recent proposal of partitioning into null and non-null. I
suspect that would pair nicely here, since the non-null partition could then
be sorted with a comparator that skips null handling entirely.
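Here is a minimal sketch of that idea (hypothetical stand-in struct, not the
real SortTuple): swap tuples with a NULL leading key to the back, then sort
the front with a comparator that contains no null checks at all. For NULLS
LAST the null group needs no comparator ordering anyway:

#include <stdbool.h>
#include <stddef.h>

typedef struct Item
{
    int  key;       /* stand-in for datum1 */
    bool isnull;    /* stand-in for isnull1 */
} Item;

/*
 * Move non-null items to the front; returns their count.  The relative
 * order of non-null items is preserved; the nulls may be permuted, which
 * is fine because they form a single equivalence class here.
 */
static size_t
partition_nulls_last(Item *items, size_t n)
{
    size_t insert = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (!items[i].isnull)
        {
            Item tmp = items[insert];

            items[insert] = items[i];
            items[i] = tmp;
            insert++;
        }
    }
    return insert;  /* sort items[0..insert) with a null-free comparator */
}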
Greetings,
I would like to revisit this discussion, and I agree with John that
incremental progress through small, successive changes is the right way
forward. In that spirit, I'd like to propose two distinct ideas aimed at
improving the insertion sort path.
The first is to eliminate the additional check for n > 7 in the quicksort
code path (see /src/include/lib/sort_template.h#L322). Currently we check
n < 7, and a few lines further down we check for n > 7; if we instead check
n < 8 for the insertion sort cutoff, the second check becomes obsolete.
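To make that concrete, here is a toy, self-contained quicksort (emphatically
not the actual sort_template.h code) showing the shape of the change: with
the cutoff at n < 8, the partitioning code only ever runs with n >= 8, so it
needs no separate n > 7 guard before median-of-three pivot selection:

#include <stdio.h>

static void
insertion_sort(int *a, long n)
{
    for (long i = 1; i < n; i++)
        for (long j = i; j > 0 && a[j - 1] > a[j]; j--)
        {
            int tmp = a[j];

            a[j] = a[j - 1];
            a[j - 1] = tmp;
        }
}

/* Median value of three. */
static int
med3(int x, int y, int z)
{
    return x < y ? (y < z ? y : (x < z ? z : x))
                 : (y > z ? y : (x > z ? z : x));
}

static void
toy_qsort(int *a, long n)
{
    if (n < 8)              /* raised cutoff: was n < 7 */
    {
        insertion_sort(a, n);
        return;
    }

    /* n >= 8 is guaranteed here, so no extra "n > 7" test is needed. */
    int pivot = med3(a[0], a[n / 2], a[n - 1]);
    long i = 0, j = n - 1;

    while (i <= j)
    {
        while (a[i] < pivot)
            i++;
        while (a[j] > pivot)
            j--;
        if (i <= j)
        {
            int tmp = a[i];

            a[i] = a[j];
            a[j] = tmp;
            i++;
            j--;
        }
    }
    toy_qsort(a, j + 1);
    toy_qsort(a + i, n - i);
}

int
main(void)
{
    int v[] = {9, 3, 7, 1, 8, 2, 6, 5, 4, 0};

    toy_qsort(v, 10);
    for (int k = 0; k < 10; k++)
        printf("%d ", v[k]);
    printf("\n");
    return 0;
}

In today's code the n > 7 guard only exists because n == 7 falls between the
insertion sort cutoff and the median-of-three threshold; aligning both at 8
removes that special case.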
Benjamin Coutu
http://www.zeyos.com