[SQL] Egroupware infolog query slow (includes query plan)

2008-07-06 Thread Mark Stosberg

Hello,

I could use some help figuring out how to speed up a query. Below is the
SQL and query plan for a common SELECT done by the open source
eGroupware project. 

Now that there are about 16,000 rows in egw_infolog and 32,000 in
egw_infolog_extra, the below query takes about 6 minutes to finish!

I'm not skilled enough at reading the "Explain Analzyze" output to
understand what the primary problem is. 

Thanks!

Mark

###

SELECT DISTINCT main.* ,(
SELECT count(*) FROM egw_infolog sub WHERE
sub.info_id_parent=main.info_id AND  (info_owner=6 OR
((','||info_responsible||',' LIKE '%,-2,%' OR
','||info_responsible||',' LIKE '%,-1,%' OR
','||info_responsible||',' LIKE '%,6,%') AND
info_access='public') OR info_owner IN (6) OR
(info_access='public'
AND info_owner IN(6))) 
) AS info_anz_subs  FROM egw_infolog main
LEFT JOIN egw_infolog_extra ON main.info_id=egw_infolog_extra.info_id
WHERE (
(info_owner=6 OR ((','||info_responsible||',' LIKE '%,-2,%' OR
','||info_responsible||',' LIKE '%,-1,%' OR
','||info_responsible||',' LIKE '%,6,%') AND
info_access='public') OR info_owner IN (6) OR
(info_access='public'
AND info_owner IN(6)))  AND info_status <> 'deleted'  )
ORDER BY
info_datemodified DESC LIMIT 15 OFFSET 0

Query plan:

Limit  (cost=68624989.18..68624991.31 rows=15 width=1011) (actual
time=686260.735..686260.878 rows=15 loops=1)
  ->  Unique  (cost=68624989.18..68627288.59 rows=16212 width=1011)
(actual time=686260.733..686260.857 rows=15 loops=1)
->  Sort  (cost=68624989.18..68625068.47 rows=31716 width=1011)
(actual time=686260.730..686260.766 rows=29 loops=1)
  Sort Key: main.info_datemodified, main.info_id,
main.info_type, main.info_from, main.info_addr, main.info_subject,
main.info_des, main.info_owner, main.info_responsible, main.info_access,
main.info_cat, main.info_startdate, main.info_enddate,
main.info_id_parent, main.info_planned_time, main.info_used_time,
main.info_status, main.info_confirm, main.info_modifier,
main.info_link_id, main.info_priority, main.pl_id, main.info_price,
main.info_percent, main.info_datecompleted, main.info_location,
main.info_custom_from, (subplan)
  ->  Merge Left Join  (cost=0.00..68594428.95 rows=31716
width=1011) (actual time=21.358..684226.134 rows=32424 loops=1)
Merge Cond: (main.info_id =
egw_infolog_extra.info_id)
->  Index Scan using egw_infolog_pkey on egw_infolog
main  (cost=0.00..3025.84 rows=16212 width=1011) (actual
time=0.060..135.766 rows=16212 loops=1)
  Filter: (((info_owner = 6) OR (','::text
|| (info_responsible)::text) || ','::text) ~~ '%,-2,%'::text) OR
(((','::text || (info_responsible)::text) || ','::text) ~~ '%,-1,
%'::text) OR (((','::text || (info_responsible)::text) || ','::text) ~~
'%,6,%'::text)) AND ((info_access)::text = 'public'::text)) OR
(info_owner = 6) OR (((info_access)::text = 'public'::text) AND
(info_owner = 6))) AND ((info_status)::text <> 'deleted'::text))
->  Index Scan using egw_infolog_extra_pkey on
egw_infolog_extra  (cost=0.00..1546.30 rows=32424 width=4) (actual
time=0.025..317.272 rows=32424 loops=1)
SubPlan
  ->  Aggregate  (cost=2162.60..2162.61 rows=1
width=0) (actual time=21.073..21.073 rows=1 loops=32424)
->  Seq Scan on egw_infolog sub
(cost=0.00..2122.07 rows=16212 width=0) (actual time=21.065..21.065
rows=0 loops=32424)
  Filter: ((info_id_parent = $0) AND
((info_owner = 6) OR (','::text || (info_responsible)::text) ||
','::text) ~~ '%,-2,%'::text) OR (((','::text ||
(info_responsible)::text) || ','::text) ~~ '%,-1,%'::text) OR
(((','::text || (info_responsible)::text) || ','::text) ~~ '%,6,
%'::text)) AND ((info_access)::text = 'public'::text)) OR (info_owner =
6) OR (((info_access)::text = 'public'::text) AND (info_owner = 6
Total runtime: 686278.730 ms





-- 
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Egroupware infolog query slow (includes query plan)

2008-07-06 Thread Mark Stosberg

I should have mentioned in the last post that PostgreSQL 8.2.9 is in
use. I could upgrade to 8.3.x if that is expected to help performance in
this case. 

   Mark

On Sun, 2008-07-06 at 16:23 -0400, Mark Stosberg wrote:
> Hello,
> 
> I could use some help figuring out how to speed up a query. Below is the
> SQL and query plan for a common SELECT done by the open source
> eGroupware project. 
> 
> Now that there are about 16,000 rows in egw_infolog and 32,000 in
> egw_infolog_extra, the below query takes about 6 minutes to finish!
> 
> I'm not skilled enough at reading the "Explain Analzyze" output to
> understand what the primary problem is. 
> 
> Thanks!
> 
> Mark
> 
> ###
> 
> SELECT DISTINCT main.* ,(
> SELECT count(*) FROM egw_infolog sub WHERE
> sub.info_id_parent=main.info_id AND  (info_owner=6 OR
> ((','||info_responsible||',' LIKE '%,-2,%' OR
> ','||info_responsible||',' LIKE '%,-1,%' OR
> ','||info_responsible||',' LIKE '%,6,%') AND
> info_access='public') OR info_owner IN (6) OR
> (info_access='public'
> AND info_owner IN(6))) 
> ) AS info_anz_subs  FROM egw_infolog main
> LEFT JOIN egw_infolog_extra ON main.info_id=egw_infolog_extra.info_id
> WHERE (
> (info_owner=6 OR ((','||info_responsible||',' LIKE '%,-2,%' OR
> ','||info_responsible||',' LIKE '%,-1,%' OR
> ','||info_responsible||',' LIKE '%,6,%') AND
> info_access='public') OR info_owner IN (6) OR
> (info_access='public'
> AND info_owner IN(6)))  AND info_status <> 'deleted'  )
> ORDER BY
> info_datemodified DESC LIMIT 15 OFFSET 0
> 
> Query plan:
> 
> Limit  (cost=68624989.18..68624991.31 rows=15 width=1011) (actual
> time=686260.735..686260.878 rows=15 loops=1)
>   ->  Unique  (cost=68624989.18..68627288.59 rows=16212 width=1011)
> (actual time=686260.733..686260.857 rows=15 loops=1)
> ->  Sort  (cost=68624989.18..68625068.47 rows=31716 width=1011)
> (actual time=686260.730..686260.766 rows=29 loops=1)
>   Sort Key: main.info_datemodified, main.info_id,
> main.info_type, main.info_from, main.info_addr, main.info_subject,
> main.info_des, main.info_owner, main.info_responsible, main.info_access,
> main.info_cat, main.info_startdate, main.info_enddate,
> main.info_id_parent, main.info_planned_time, main.info_used_time,
> main.info_status, main.info_confirm, main.info_modifier,
> main.info_link_id, main.info_priority, main.pl_id, main.info_price,
> main.info_percent, main.info_datecompleted, main.info_location,
> main.info_custom_from, (subplan)
>   ->  Merge Left Join  (cost=0.00..68594428.95 rows=31716
> width=1011) (actual time=21.358..684226.134 rows=32424 loops=1)
> Merge Cond: (main.info_id =
> egw_infolog_extra.info_id)
> ->  Index Scan using egw_infolog_pkey on egw_infolog
> main  (cost=0.00..3025.84 rows=16212 width=1011) (actual
> time=0.060..135.766 rows=16212 loops=1)
>   Filter: (((info_owner = 6) OR (','::text
> || (info_responsible)::text) || ','::text) ~~ '%,-2,%'::text) OR
> (((','::text || (info_responsible)::text) || ','::text) ~~ '%,-1,
> %'::text) OR (((','::text || (info_responsible)::text) || ','::text) ~~
> '%,6,%'::text)) AND ((info_access)::text = 'public'::text)) OR
> (info_owner = 6) OR (((info_access)::text = 'public'::text) AND
> (info_owner = 6))) AND ((info_status)::text <> 'deleted'::text))
> ->  Index Scan using egw_infolog_extra_pkey on
> egw_infolog_extra  (cost=0.00..1546.30 rows=32424 width=4) (actual
> time=0.025..317.272 rows=32424 loops=1)
> SubPlan
>   ->  Aggregate  (cost=2162.60..2162.61 rows=1
> width=0) (actual time=21.073..21.073 rows=1 loops=32424)
> ->  Seq Scan on egw_infolog sub
> (cost=0.00..2122.07 rows=16212 width=0) (actual time=21.065..21.065
> rows=0 loops=32424)
>   Filter: ((info_id_parent = $0) AND
> ((info_owner = 6) OR (','::text || (info_responsible)::text) ||
> ','::text) ~~ '%,-2,%'::text) OR (((','::text ||
> (info_responsible)::text) || ','::text) ~~ '%,-1,%'::text) OR
> (((','::text || (info_responsible)::text) || ','::text) ~~ '%,6,
> %'::text)) AND ((info_access)::text = 'public'::text)) OR (info_owner =
> 6) OR (((info_access)::text = 'public'::text) AND (info_owner = 6
> Total runtime: 686278.730 ms
> 
> 
> 
> 
> 



-- 
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Egroupware infolog query slow (includes query plan)

2008-07-06 Thread Tom Lane
Mark Stosberg <[EMAIL PROTECTED]> writes:
> I'm not skilled enough at reading the "Explain Analzyze" output to
> understand what the primary problem is. 

The problem is the repeated execution of the subquery in the SELECT
list; that's taking over 683 of the 686 seconds:

> SubPlan
>   ->  Aggregate  (cost=2162.60..2162.61 rows=1
> width=0) (actual time=21.073..21.073 rows=1 loops=32424)
^^  ^

The current formulation of the query guarantees that you can't do better
than a nestloop join with "sub" on the inside, and that nestloop isn't
even indexed.  See if you can convert it to a regular join instead of a
sub-select (probably with GROUP BY instead of DISTINCT).

Also, those LIKE conditions are just horrid: slow *and* unreadable.
Consider redesigning your data representation.  Perhaps converting
info_responsible to an int array would be reasonable.

regards, tom lane

-- 
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql