There are indices: table:measurement_events "measurement_events_pkey" PRIMARY KEY, btree (measurement_source_id, measurement_time) table:measurement_sources "measurement_sources_pkey" PRIMARY KEY, btree (measurement_source_id)
-Spiros On 8 July 2014 18:10, Andy Colson <a...@squeakycode.net> wrote: > On 7/8/2014 4:47 AM, Spiros Ioannou wrote: > >> While executing the following query through psql : >> >> SELECT me.* FROM measurement_events me JOIN msrcs_timestamps mt ON >> me.measurement_source_id=mt.measurement_source_id WHERE >> measurement_time > last_update_time >> >> there are two behaviors observed by postgresql (8.4): >> 1) Either the query performs lots of reads on the database and completes >> in about 4 hours (that is the normal-expected behavior) >> 2) Either the query starts filling-up pgsql_tmp and this causes large >> write I/O on the server, and the query never actually completes on a >> reasonable time (we stop it after 10h). >> >> For some strange reason, behaviour 2 is always observed when running >> psql through a bash script, while behavior 1 is only observed while >> running psql interactively from command line (but not always). >> >> explain: >> # explain SELECT me.* FROM measurement_events me JOIN msrcs_timestamps >> mt ON me.measurement_source_id=mt.measurement_source_id WHERE >> measurement_time > last_update_time; >> QUERY PLAN >> ------------------------------------------------------------ >> ---------------------------------- >> Hash Join (cost=10111.78..422893652.69 rows=2958929695 width=103) >> Hash Cond: (me.measurement_source_id = mt.measurement_source_id) >> Join Filter: (me.measurement_time > mt.last_update_time) >> -> Seq Scan on measurement_events me (cost=0.00..234251772.85 >> rows=8876789085 width=103) >> -> Hash (cost=5733.57..5733.57 rows=350257 width=24) >> -> Seq Scan on msrcs_timestamps mt (cost=0.00..5733.57 >> rows=350257 width=24) >> (6 rows) >> >> >> We have tried so far fiddling with work_mem up to 512M - no difference. >> Any suggestions? >> >> >> >> Thanks for any help, >> -Spiros Ioannou >> inaccess >> >> > Is there any reason you don't have an index? > > One, or both, of these will help: > > create index measurement_events_pk on measurement_events( > measurement_source_id); > > create index msrcs_timestamps_pk on msrcs_timestamps(measurement_ > source_id); > > > > measurement_events has 8 billion rows, so expect it to take a while, but > its a one time cost and should _dramatically_ increase your query > performance. > > -Andy > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >