Dear PostgreSQL Hacker Community,

I am facing a tricky bug which makes the Query Planner crashes when using 
COUNT(*) function.
Without any upgrade suddenly a table of a database instance could not be 
queried this way:

SELECT COUNT(*) FROM items;

-- ERROR:  variable not found in subplan target list
-- SQL state: XX000

Message and behaviour seem related to the Query Planner:


EXPLAIN SELECT COUNT(*) FROM item;

-- ERROR:  variable not found in subplan target list

-- SQL state: XX000

Looks like a column name could not be found (see 
https://github.com/postgres/postgres/blob/ce4f46fdc814eb1b704d81640f6d8f03625d0f53/src/backend/optimizer/plan/setrefs.c#L2967-L2972)
 in some specific context that is somehow hard to reproduce.

Interesting facts:


SELECT COUNT(id) FROM items;  -- 213



SELECT COUNT(*) FROM items WHERE id > 0; -- 213

Work as expected.

I can see that other people are recently facing a similar problem 
(https://www.postgresql.org/message-id/flat/4c347490-d734-5fdd-d613-1327601b4e7e%40mit.edu).
If it is the same bug then it is not related to the PGroonga extension as I 
don't use it all.

Anyway, the bug is difficult to reproduce on my application.
At the time of writing, I could just isolate it on a specific database but I 
could not draw a MCVE from it.
I am looking for help to make it reproducible and feed your knowledge database.

My first guess was to open a post of SO (see for details 
https://stackoverflow.com/questions/72498741/how-can-i-reproduce-a-database-context-to-debug-a-tricky-postgresql-error-vari),
 but digging deeper in the investigation it seems it will require people with 
strong insights on how PostgreSQL actually works under the hood.
Therefore, I chose this specific mailing list.

The bug is tricky to reproduce, I could not succeed to replicate elsewhere 
(dump/restore does not preserve it).
Anyway it makes my database unusable and looks like a potential bug for your 
product and applications relying on it.

Faulty setup is about:


SELECT version();

-- PostgreSQL 13.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 
7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit



SELECT extname, extversion FROM pg_extension;

-- "plpgsql"    "1.0"

-- "postgis"    "3.1.1"

By now, the only workarounds I have found are:


  *   Dump database and recreate a new instance (problem seems to vanish but 
there is no guarantee it is solved or it will not happened later on);
  *   Add dummy filter on all queries (more a trick than a solution).

I am writing to this mailing list to raise you attention on it.
I'll be happy to help you investigate it deeper.

Best regards,

Landercy Jean

Reply via email to