> On 23 Feb 2026, at 20:59, Laurenz Albe <[email protected]> wrote: > > On Mon, 2026-02-23 at 16:10 +0100, Attila Soki wrote: >>> On 23 Feb 2026, at 10:41, Laurenz Albe <[email protected]> wrote: >>> >>> On Mon, 2026-02-23 at 10:37 +0100, Attila Soki wrote: >>>> When upgrading from PostgreSQL 14.4, I noticed that one of my somewhat >>>> complex >>>> analytical queries sometimes gets an inefficient plan under PostgreSQL 16, >>>> 17, and 18. >>>> Under 14.4, the query runs with a stable plan and completes in 19 to 22 >>>> seconds. >>>> In newer versions, the plan seems to be unstable, sometimes the query >>>> completes >>>> in 17 to 20 seconds, sometimes it runs for 5 to 18 minutes with the >>>> inefficient plan. >>>> This also happens even if the data is not significantly changed. >>> >>> This is very likely owing to a bad estimate. >>> >>> Could you turn on "track_io_timing" and send us the EXPLAIN (ANALYZE, >>> BUFFERS) output >>> for both the good and the bad plan? >> >> Thank you for your reply. Here are the two explains. >> In order to be able to publish the plans here, I have obfuscated the table >> and field names, but this is reversible, so I can provide more info if >> needed. >> >> plan-ok: >> https://explain.depesz.com/s/hQvM >> >> plan-wrong: >> https://explain.depesz.com/s/uLvl > > Thanks. > > The difference in the plans is under the "Subquery Scan on odg", starting with > plan node 50 (everything under the "Sort"). I suspect that the mis-estimate > that is at the root of the problem is here: > > -> Index Scan using table_k_late_spec_dp_end_dat_key on schema1.table_k kal > (... rows=196053 ...) (... rows=471.00 ...) > Index Cond: (kal.dp_end_dat < ('now'::cstring)::date) > Index Searches: 1 > Buffers: shared hit=230 read=49 > I/O Timings: shared read=0.142 > > PostgreSQL overestimates the row count by a factor of over 400. > Try to fix that estimate and see if that gets PostgreSQL to do the right > thing. > > Perhaps a simple ANALYZE on the table can do the trick.
In the examples I used table_k to flip the plan with vacuumed -Upostgres -vZ -t schema1.tbl_used_in_query db1 in the explain output schema1.tbl_used_in_query is table_k > The right side of the comparison looks awkward, as if you wrote > 'now'::text::date > My experiments show that PostgreSQL v18 estimates well even with such a weird > condition, but perhaps if you write "current_date" instead, you'd get better > results. I didn't realize that made a difference. I will replace all occurrences. It also looks more clean with current_date. > > I'd play just with a query like > > EXPLAIN (ANALYZE) > SELECT * FROM schema1.table_k AS kal > WHERE dp_end_dat < current_date; > > until I get a good estimate. I will try to set custom statistics for dp_end_dat and the fields used by the table_k_late_spec_dp_end_dat_key index. Let’s see if that helps. I am on UTC+1. I will try all of this tomorrow and get back to you with the results later. Thank you regards, Attila
