Michael,

Many thanks for spending your time on this. Your alternative does not help, unfortunately (see execution plan below): still a sequential scan on the complete table. I have tried many alternatives, and somehow whenever I add a column that is not in the index (I64_01) the optimizer decides not to use the index. If I remove that column, the index is used. I guess it estimates that the extra indirection from index pages to the row pages is more costly than scanning the 168M records. Pretty sure it's not, but I cannot explain it to the stubborn thing :)

Btw, thanks for the >= tip (I was aware of it).

Wkr,
Peter

Hash Join  (cost=683.93..7270857.46 rows=458127 width=20)
  Hash Cond: (mv_inner.device_id = d.short_id)
  Join Filter: ((mv_inner."timestamp" > ('2020-11-06 00:00:00'::timestamp without time zone - pg_timezone_names.utc_offset)) AND (mv_inner."timestamp" < ('2020-11-07 00:00:00'::timestamp without time zone - pg_timezone_names.utc_offset)))
  ->  Seq Scan on measurement_value mv_inner  (cost=0.00..7166797.33 rows=1287989 width=1006)
        Filter: (("timestamp" > '2020-11-05 00:00:00'::timestamp without time zone) AND ("timestamp" < '2020-11-08 00:00:00'::timestamp without time zone))
  ->  Hash  (cost=656.61..656.61 rows=2186 width=20)
        ->  Hash Join  (cost=77.87..656.61 rows=2186 width=20)
              Hash Cond: (dc.timezone = pg_timezone_names.name)
              ->  Hash Join  (cost=55.37..533.83 rows=615 width=18)
                    Hash Cond: (dc.device_id = d.id)
                    ->  Seq Scan on device_configuration dc  (cost=0.00..470.01 rows=615 width=30)
                          Filter: latest
                    ->  Hash  (cost=46.83..46.83 rows=683 width=20)
                          ->  Seq Scan on device d  (cost=0.00..46.83 rows=683 width=20)
              ->  Hash  (cost=10.00..10.00 rows=1000 width=48)
                    ->  Function Scan on pg_timezone_names  (cost=0.00..10.00 rows=1000 width=48)

> On 10 Nov 2020, at 01:15, Michael Lewis <mle...@entrata.com> wrote:
>
> On Mon, Nov 9, 2020 at 1:11 PM Peter Coppens <peter.copp...@datylon.com> wrote:
>> Adding the tzn.utc_offset results in the fact that the execution plan no
>> longer considers to use the index on the measurement_value table.
>> Is there any way the SQL can be rewritten so that the index is used? Or any
>> other solution so that the query with the timezone offset returns in a
>> comparable time?
>
> I am not aware of a best practice to handle this. Your where condition on
> mv.timestamp now depends on several joins to do a filtering that used to be
> a static range that can be scanned into the index as a first node in the
> plan. I have sometimes used a sub-query on a broader condition that allows
> the use of the index, and then fully reducing the set later. Something like
> this (note: the subquery alias is mv_inner, so the inner references must use
> that alias as well)-
>
> select d.short_id, mv.timestamp, mv.I64_01
> from device d, device_configuration dc, (
>     select mv_inner.*
>     from measurement_value AS mv_inner
>     where mv_inner.timestamp > '2020-11-06'::timestamp - interval '1 day'
>       and mv_inner.timestamp < '2020-11-07'::timestamp + interval '1 day'
>     offset 0 /* to prevent in-lining the join to the outside set */
> ) mv, pg_timezone_names tzn
> where mv.device_id = d.short_id and dc.device_id = d.id and
>     dc.latest = true and dc.timezone = tzn.name and
>     mv.timestamp > '2020-11-06'::timestamp - tzn.utc_offset and
>     mv.timestamp < '2020-11-07'::timestamp - tzn.utc_offset
>
> By the way, it seems a little odd to be exclusive on both the begin and end.
> I'd usually expect timestamp >= start_date and timestamp < end_date +
> interval '1 day' to fully capture a 24-hour period. Right now, you are
> excluding any data that happens to have a timestamp value with .000000
> seconds (midnight exactly).
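[Editor's note: a small illustration of the half-open-interval convention Michael describes, using the literal dates from the query above.]

    -- Exclusive on both ends: a row stamped exactly 2020-11-06 00:00:00
    -- (midnight, .000000 seconds) is silently dropped.
    WHERE mv.timestamp >  '2020-11-06'::timestamp
      AND mv.timestamp <  '2020-11-07'::timestamp

    -- Half-open interval: captures the full 24-hour day, including
    -- midnight at the start, without double-counting the next day.
    WHERE mv.timestamp >= '2020-11-06'::timestamp
      AND mv.timestamp <  '2020-11-06'::timestamp + interval '1 day'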
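[Editor's note: on Peter's observation that adding I64_01 pushes the planner off the index, one avenue worth trying is a covering index. This is a sketch, not something from the thread: it assumes PostgreSQL 11+ (for INCLUDE) and guesses that the existing index I64-related query uses is keyed on (device_id, "timestamp"); the index name and column list are hypothetical, since the actual index definition is not shown in the thread.]

    -- Hypothetical covering index: INCLUDE stores i64_01 in the leaf
    -- pages without making it part of the key, so an index-only scan
    -- can answer the query and skip the heap fetches the planner
    -- currently estimates as more expensive than a 168M-row seq scan.
    CREATE INDEX CONCURRENTLY measurement_value_device_ts_covering
        ON measurement_value (device_id, "timestamp")
        INCLUDE (i64_01);

Note that index-only scans also depend on the visibility map being reasonably fresh, so the table may need a VACUUM before the planner will prefer the new index.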