Re: Changing SQL Inlining Behaviour (or...?)

Paul Ramsey Tue, 13 Nov 2018 08:32:19 -0800

On Fri, Nov 9, 2018 at 11:14 AM Paul Ramsey <pram...@cleverelephant.ca>
wrote:


> > Neglected to include the footnotes...
>
> As I promised at PgConf.eu, here's a follow-up email about PostGIS
> parallelism and function inlining (an odd combination, it is true).
>
> So, context:
>
> - We want PostGIS to parallelize more. In order to achieve that we need to
> mark our functions with more realistic COSTs. Much much higher COSTs.
> - When we do that, we hit a different problem. Our most commonly used
> functions, ST_Intersects(), ST_DWithin() are actually SQL wrapper functions
> are more complex combinations of index operators and exact computational
> geometry functions.
> - In the presence of high cost parameters that are used multiple times in
> SQL functions, PostgreSQL will stop inlining those functions, in an attempt
> to save the costs of double-calculating the parameters.
> - For us, that's the wrong choice, because we lose the index operators at
> the same time as we "save" the cost of double calculation.
> - We need our wrapper functions inlined, even when they are carrying a
> high COST.
>
> At pgconf.eu, I canvassed this problem and some potential solutions:
>
> * Solution #1 - Quick and dirty and visible: Add an 'INLINE' function
> decorator, which tells PostgreSQL to just ignore costs and inline the
> function regardless. Pros: it's not too hard to implement and I'm happy to
> contribute this. Cons: it adds very specific single-purpose syntax to
> CREATE FUNCTION.
>
> * Solution #2 - Quick and dirty and invisible. Tom suggested a hack that
> achieves the aims of #1 but without adding syntax to CREATE FUNCTION: have
> the inlining logic look at the cost of the wrapper and the cost of
> parameters, and if the cost of the wrapper "greatly exceeded" the cost of
> the parameters, then inline. So the PostGIS project would just set the cost
> of our wrappers very high, and we'd get the behaviour we want, while other
> users who want to use wrappers to force caching of calculations would have
> zero coded wrapper functions.  Pros: Solves the problem and easy to
> implement, I'm happy to contribute. Cons: it's so clearly a hack involving
> hidden (from users) magic.
>
> * Solution #3 - Correct and globally helpful. When first presented with
> this problem last year, both Andres and Tom said [1] "but the right fix is
> to avoid the double-calculation of identical entries in the target list"
> because then it would be safe to inline functions with duplicate expensive
> parameters. This would not only address the proximate PostGIS problem but
> make a whole class of queries faster. There was some discussion of this
> approach last week [2]. Pros: The right thing! Improves a whole pile of
> other performance cases. Cons: Hard! Only experienced PgSQL developers need
> apply.
>
> Naturally, I would love to see #3 implemented, but there's only so much
> experienced developer time to go around, and it's beyond my current skill
> set. I would like to be able to start to improve PostGIS parallelism with
> PgSQL 12, so in order to make that not impossible, I'd like to implement
> either #1 or #2 in case #3 doesn't happen for PgSQL 12.
>
> So my question to hackers is: which is less worse, #1 or #2, to implement
> and submit to commitfest, in case #3 does not materialize in time for PgSQL
> 12?
>

Absent any preferences, I would be inclined to go with #2, having a high
personal tolerance for ugly hacks... :)

P



>
> [1]
> https://www.postgresql.org/message-id/20171116182208.kcvf75nfaldv36uh%40alap3.anarazel.de
> [2] https://www.postgresql.org/message-id/10355.1540926295%40sss.pgh.pa.us
>
>

Re: Changing SQL Inlining Behaviour (or...?)

Reply via email to