On Thu, Nov 9, 2017 at 12:11 PM, Paul Ramsey <pram...@cleverelephant.ca>
wrote:

> All,
>
> As we try and make PostGIS more "parallel sensitive" we have been added
> costs to our functions, so that their relative CPU cost is more accurately
> reflected in parallel plans.
>
> This has resulted in an odd side effect: some of our "wrapper" functions
> stop giving index scans in plans [1]. This is a problem!
>
> An example of a "wrapper" is ST_Intersects(geom1, geom2). It combines an
> index operation (geom1 && geom2) with an exact spatial test
> (_ST_Intersects(geom1, geom2). This is primarily for user convenience, and
> has worked for us well for a decade and more. Having this construct stop
> working is definitely a problem.
>
> As we add costs to our functions, the odds increase that one of the
> parameters to a wrapper might be a costed function. It's not uncommon to
> see:
>
> ST_Interects(geom, ST_SetSRID('POLYGON(...)', 4326))
>
> It's fair to say that we really do depend on our wrappers getting inlined
> basically all the time. They are simple functions, they do nothing other
> than 'SELECT func1() AND func2() AND arg1 && arg2'.
>
> However, once costs are added to the parameters, the inlining can be
> turned off relatively quickly. Here's a PgSQL native example:
>
>     -- Create data table and index. Analyze.
>     DROP TABLE IF EXISTS boxen;
>     CREATE TABLE boxen AS
>         SELECT row_number() OVER() As gid,
>         box(point(x, y),point(x+1, y+1)) AS b, x, y
>         FROM generate_series(-100,100) As y, generate_series(-100,100) As
> x;
>     CREATE INDEX idx_b_geom_gist ON boxen USING gist(b);
>     ANALYZE boxen;
>
>     -- An inlined function
>     -- When set 'STRICT' it breaks index access
>     -- However 'IMMUTABLE' doesn't seem to bother it
>     CREATE OR REPLACE FUNCTION good_box(box, box)
>         RETURNS boolean
>         AS 'SELECT $1 OPERATOR(&&) $2 AND length(lseg(point($1),point($2)))
> < 3'
>         LANGUAGE 'sql';
>
>     -- Start with a low cost circle()
>     ALTER FUNCTION circle(point, double precision) COST 1;
>
>     -- [A] Query plan hits index
>     EXPLAIN SELECT gid
>     FROM boxen
>     WHERE good_box(
>         boxen.b,
>         box(circle(point(20.5, 20.5), 2))
>         );
>
>     -- [B] Query plan hits index
>     EXPLAIN SELECT gid
>     FROM boxen,
>         (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c
>     WHERE good_box(
>         boxen.b,
>         box(circle(point(c.x, c.y), 2))
>         );
>
>     -- Increase cost of circle
>     ALTER FUNCTION circle(point, double precision) COST 100;
>
>     -- [B] Query plan does not hit index!
>     EXPLAIN SELECT gid
>     FROM boxen,
>         (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c
>     WHERE good_box(
>         boxen.b,
>         box(circle(point(c.x, c.y), 2))
>         );
>
> The inlining is getting tossed out on a test of how expensive the function
> parameters are [2]. As a result, we lose what is really the correct plan,
> and get a sequence scan instead of an index scan.
>
> The test of parameter cost seems quite old (15+ years) and perhaps didn't
> anticipate highly variable individual function costs (or maybe it did). As
> it stands though, PostGIS is currently stuck choosing between having costs
> on our functions or having our inlined wrappers, because we cannot have
> both at the same time.
>

I personally find it hard to make the case for not inlining all the time,
but that's probably a lack of imagination. The functions in question pass
all the other tests of "inlinability" the function cost one seems arbitrary.



>
> Any thoughts?
>
> Thanks!
>
> P.
>
>
> [1] https://trac.osgeo.org/postgis/ticket/3675#comment:18
> [2] https://github.com/postgres/postgres/blob/
> ae20b23a9e7029f31ee902da08a464d968319f56/src/backend/
> optimizer/util/clauses.c#L4581-L4584
>

Reply via email to