2009/1/21 Herouth Maoz <hero...@unicell.co.il> > Hello. > > I have a daily process that synchronizes our reports database from our > production databases. In the past few days, it happened a couple of times > that an update query took around 7-8 hours to complete, which seems a bit > excessive. This is the query: > > UPDATE rb > SET service = b.service, > status = b.status, > has_notification = gateway_id NOT IN (4,101,102), > operator = COALESCE( > b.actual_target_network_id, > b.requested_target_network_id > ) > FROM sms.billing b > WHERE b.time_arrived >= :date_start > AND rb.time_stamp >= :date_start > AND rb.delivered = 0 > AND rb.sms_user = b.user_id > AND rb.reference = b.user_reference > AND OVERLAY( rb.msisdn placing '972' from 1 for 1 ) = b.msisdn > AND NOT mo_billed > AND system_id <> 6 -- Exclude Corporate, as it aleady has > service/status > ; > > The variable ":date_start" is set to a date 3 days ago. > > I ran explain for this query and it gave me this: > > > > -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Nested Loop (cost=21567.12..854759.82 rows=1 width=210) > Join Filter: (((rb.sms_user)::text = (b.user_id)::text) AND > (rb.reference = b.user_reference)) > -> Bitmap Heap Scan on rb (cost=21546.02..23946.16 rows=819 width=198) > Recheck Cond: ((delivered = 0) AND (time_stamp >= '2009-01-18 > 00:00:00'::timestamp without time zone) AND (time_stamp < '2009-01-21 > 00:00:00'::timestamp without time zone)) > Filter: ((NOT mo_billed) AND (system_id <> 6)) > -> BitmapAnd (cost=21546.02..21546.02 rows=819 width=0) > -> Bitmap Index Scan on rb_delivered_ind > (cost=0.00..1419.99 rows=45768 width=0) > Index Cond: (delivered = 0) > -> Bitmap Index Scan on rb_timestamp_ind > (cost=0.00..20125.37 rows=188994 width=0) > Index Cond: ((time_stamp >= '2009-01-18 > 00:00:00'::timestamp without time zone) AND (time_stamp < '2009-01-21 > 00:00:00'::timestamp without time zone)) > -> Bitmap Heap Scan on billing b (cost=21.10..1004.77 rows=351 > width=49) > Recheck Cond: ((b.msisdn)::text = (("substring"((rb.msisdn)::text, > 1, 0) || '972'::text) || "substring"((rb.msisdn)::text, 2))) > Filter: ((b.time_arrived >= '2009-01-18 00:00:00'::timestamp > without time zone) AND (b.time_arrived < '2009-01-21 00:00:00'::timestamp > without time zone)) > -> Bitmap Index Scan on billing_msisdn_sme_reference > (cost=0.00..21.10 rows=351 width=0) > Index Cond: ((b.msisdn)::text = > (("substring"((rb.msisdn)::text, 1, 0) || '972'::text) || > "substring"((rb.msisdn)::text, 2))) > > I'm not an expert on reading plans, but it seems to me that it uses indices > on both tables that participate in this query, so it shouldn't take such a > long time. > > The number of records in the table rb for the past three days is 386833. On > the sms.billing table it seems to select the index on the msisdn and > sme_reference fields and use it partially (only using the msisdn field). > Looking at that table, the frequency of each value in the msisdn field is at > most 17678 for the current data, where mostly it's a couple of thousands. > How can this take so long? >
1. which postgres version? 2. can you post results of EXPLAIN ANALYZE (please note it actually executes the query)? -- Filip Rembiałkowski