> > >> At pgconf.eu, I canvassed this problem and some potential solutions: >> >
I wonder if there is a middle ground between #2 and #3. A proper mechanism for deduplicating entries might be hard, but on the inlining stage we already know they're going to get duplicated. Can we make a subplan/lateral join/whatever for arguments and deduplicate just them, as we know about that duplication from structure already? Another thing I see here is PostGIS using indexes. Can we just always inline if we see an index-accelerated operator (or just an operator) on top level AND inside inlined function? > * Solution #1 - Quick and dirty and visible: Add an 'INLINE' function >> decorator, which tells PostgreSQL to just ignore costs and inline the >> function regardless. Pros: it's not too hard to implement and I'm happy to >> contribute this. Cons: it adds very specific single-purpose syntax to >> CREATE FUNCTION. >> >> * Solution #2 - Quick and dirty and invisible. Tom suggested a hack that >> achieves the aims of #1 but without adding syntax to CREATE FUNCTION: have >> the inlining logic look at the cost of the wrapper and the cost of >> parameters, and if the cost of the wrapper "greatly exceeded" the cost of >> the parameters, then inline. So the PostGIS project would just set the cost >> of our wrappers very high, and we'd get the behaviour we want, while other >> users who want to use wrappers to force caching of calculations would have >> zero coded wrapper functions. Pros: Solves the problem and easy to >> implement, I'm happy to contribute. Cons: it's so clearly a hack involving >> hidden (from users) magic. >> >> * Solution #3 - Correct and globally helpful. When first presented with >> this problem last year, both Andres and Tom said [1] "but the right fix is >> to avoid the double-calculation of identical entries in the target list" >> because then it would be safe to inline functions with duplicate expensive >> parameters. This would not only address the proximate PostGIS problem but >> make a whole class of queries faster. There was some discussion of this >> approach last week [2]. Pros: The right thing! Improves a whole pile of >> other performance cases. Cons: Hard! Only experienced PgSQL developers need >> apply. >> >> Naturally, I would love to see #3 implemented, but there's only so much >> experienced developer time to go around, and it's beyond my current skill >> set. I would like to be able to start to improve PostGIS parallelism with >> PgSQL 12, so in order to make that not impossible, I'd like to implement >> either #1 or #2 in case #3 doesn't happen for PgSQL 12. >> >> So my question to hackers is: which is less worse, #1 or #2, to implement >> and submit to commitfest, in case #3 does not materialize in time for PgSQL >> 12? >> > > Absent any preferences, I would be inclined to go with #2, having a high > personal tolerance for ugly hacks... :) > > P > > > >> >> [1] >> https://www.postgresql.org/message-id/20171116182208.kcvf75nfaldv36uh%40alap3.anarazel.de >> [2] >> https://www.postgresql.org/message-id/10355.1540926295%40sss.pgh.pa.us >> >> -- Darafei Praliaskouski Support me: http://patreon.com/komzpa