Dear List,

I know how tedious mails with a subject of the type "I don't understand
what the planner does" are, but on this one I'm really stumped.
Regrettably, the situation is also a bit complex.  Hopefully, someone
will bear with me.

So, in a PostgreSQL 15.12 database I have a view over a single table with ~20
columns (the only relevant columns here are the ones that somehow
contain "pub[lisher]_did", the others are just there for context; I'm
going to call these "pubdids" from here on in the prose):

  CREATE OR REPLACE VIEW ivoa.obs_radio AS
   SELECT main.obs_publisher_did,
      main.s_resolution_min,
      main.s_resolution_max,
      NULL::real AS s_fov_min,
      [...]
     FROM emi.main

(emi.main is a physical table).

There is another view made up of about 20 tables, looking somewhat
like this:

CREATE OR REPLACE VIEW ivoa.obscore AS
 SELECT 'image'::text AS dataproduct_type,
    NULL::text AS dataproduct_subtype,
    2::smallint AS calib_level,
    'PPAKM31'::text AS obs_collection,
                [...]
    'ivo://org.gavo.dc/~?'::text || gavo_urlescape(maps.accref) AS obs_publisher_did,
   [...]
   FROM ppakm31.maps
UNION ALL
        [lots of similar definitions]
UNION ALL
 SELECT ssa.ssa_dstype AS dataproduct_type,
    NULL::text AS dataproduct_subtype,
    [...]
    ssa.ssa_pubdid AS obs_publisher_did,
    [...]
   FROM dfbsspec.ssa
UNION ALL
    [and still more]

The dfbsspec.ssa in this definition is another view:

CREATE OR REPLACE VIEW dfbsspec.ssa AS
 SELECT q.accref,
    q.owner,
    [...]
    q.ssa_pubdid,
    [...]
   FROM ( SELECT raw_spectra.accref,
           [...]
           raw_spectra.pub_did AS ssa_pubdid,
               [...]
           FROM dfbsspec.raw_spectra
             LEFT JOIN dfbsspec.platemeta ON platemeta.plateid = raw_spectra.plate) q

raw_spectra finally is a physical table that has an index:

    "raw_spectra_pub_did" btree (pub_did)

The first view, ivoa.obs_radio, is just a few hundred records,
dfbsspec.raw_spectra is about 23 Megarows, the total ivoa.obscore is
about 100 MRows which occasionally change, so materialising it is
*really* unattractive.  The pubdids are strings of about 40 characters.

You may argue that this whole system looks a bit insane, but of course
this is part of a large metadata handling suite, and all these views
are, in some sense, more or less automatic adaptations to different
metadata schemes, and dramatic simplifications are at least not entirely
trivial.  So, can you assume for the moment that I can't get rid of the
nested views?

Now, when I say

  EXPLAIN ANALYZE SELECT COUNT(*)
    FROM ivoa.obscore
    JOIN ivoa.obs_radio
    USING (obs_publisher_did);

I get:

         Finalize Aggregate  (cost=5114082.70..5114082.71 rows=1 width=8) (actual time=22595.715..22731.950 rows=1 loops=1)
        [...]
                         ->  Parallel Append  (cost=0.56..4800918.33 rows=19267799 width=40) (actual time=1.566..18985.964 rows=15410027 loops=5)
                                 ->  Parallel Index Only Scan using phot_r_pkey on phot_r  (cost=0.56..754384.72 rows=5118036 width=32) (actual time=0.854..7995.762 rows=10197024 loops=2)
                                         Heap Fetches: 0

        [...and a lot more of these that have simple pubdid indexes on plain
        tables, the point being: Postgres *does* use pubdid indexes...]

                                 ->  Subquery Scan on "*SELECT* 13"  (cost=0.00..2685028.32 rows=5803266 width=58) (actual time=0.142..7554.269 rows=4642657 loops=5)
                                         ->  Parallel Seq Scan on raw_spectra  (cost=0.00..2626995.66 rows=5803266 width=756) (actual time=0.137..6841.379 rows=4642657 loops=5)
        [... and a few more seqscans where there's no index on the pubdid
        because they are small, and one or two similar cases]

My problem is: I can't seem to figure out why Postgres chooses to ignore
the pubdid index on raw_spectra.pub_did and instead does the
time-consuming seqscan.

I thought maybe the genetic optimiser has kicked in because of the large
number of tables and SELECTs in there and chose a suboptimal plan.  But
switching off the genetic optimiser doesn't change the plan.
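
For reference, disabling the genetic optimiser for a session looks
roughly like the following.  This is only a sketch using the standard
geqo settings, not a transcript of the actual session; the query is the
one shown above.

```sql
-- Sketch only: session-level settings, not a transcript of the original session.
SHOW geqo_threshold;   -- number of FROM items at which GEQO is used
SET geqo = off;        -- disable the genetic query optimiser for this session

EXPLAIN ANALYZE SELECT COUNT(*)
  FROM ivoa.obscore
  JOIN ivoa.obs_radio
  USING (obs_publisher_did);

RESET geqo;            -- restore the configured value
```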

Trying to investigate more closely, I wanted to simplify the
situation and created a view like ivoa.obscore but only having the
evil table in it:

CREATE TEMPORARY VIEW bla AS (SELECT
                       [...]
                       CAST(ssa_pubdid AS text) AS obs_publisher_did,
                       [...]
FROM dfbsspec.ssa)

When I then say

EXPLAIN ANALYZE SELECT COUNT(*)
  FROM ivoa.obs_radio
  JOIN bla USING (obs_publisher_did);

the query plan looks like this:

 Aggregate  (cost=4873.00..4873.01 rows=1 width=8) (actual time=2.484..2.486 rows=1 loops=1)
   ->  Nested Loop  (cost=0.56..4871.60 rows=561 width=0) (actual time=2.478..2.479 rows=0 loops=1)
         ->  Seq Scan on main  (cost=0.00..52.61 rows=561 width=48) (actual time=0.011..0.317 rows=561 loops=1)
         ->  Index Scan using raw_spectra_pub_did on raw_spectra  (cost=0.56..8.58 rows=1 width=66) (actual time=0.003..0.003 rows=0 loops=561)
               Index Cond: (pub_did = main.obs_publisher_did)
 Planning Time: 5.386 ms
 Execution Time: 2.750 ms

-- exactly as it should.

So, when the SELECT statement on dfbsspec.ssa stands alone in the view
definition, Postgres does the right thing; when the exact same query
stands in a UNION ALL with other tables, Postgres doesn't use the
index.  Huh?

Is there anything that would explain that behaviour given I've switched
off the genetic optimiser and Postgres has hopefully exhaustively
searched the space of plans in both cases?

Thanks a lot!

         -- Markus


