[PERFORM] Slow Query

2015-08-11 Thread robbyc
Hi,

I am new to optimizing queries and i'm getting a slow running time
(~1.5secs) with the following SQL:

SELECT distinct("Vacancy"."ID"), "Vacancy"."JobTitle",
"Vacancy"."DateCreated", "Vacancy"."CustomAccess"
, "Department"."Name" as "Department", list("Occupation"."Name") as
"Occupation", "Vacancy"."PositionNo"
, "Vacancy"."Template" from 
   "Vacancy"
LEFT JOIN "CategoryOption_TableRow" as "c_22" 
 ON ("c_22"."RowID" = "Vacancy"."ID" 
 and "c_22"."Category_TableID" = 22)
LEFT JOIN "CategoryOption" as "Occupation"
 ON ("Occupation"."ID" = "c_22"."CategoryOptionID")
LEFT JOIN "TableRow_TableRow" as "t_33" 
  ON ("t_33"."Table1RowID" = "Vacancy"."ID" 
  and "t_33"."Table_TableID" = 33 )
LEFT JOIN "Department"
 ON ("Department"."ID" = "t_33"."Table2RowID" and
"Department"."Active" = 't' and "Department"
."ClientID" = 263)
JOIN "CategoryOption_TableRow" as "c_50" ON ("c_50"."Category_TableID"= 50
and "c_50"."RowID" = "Vacancy"
."ID" and "c_50"."CategoryOptionID"=19205)
WHERE "Vacancy"."ClientID" = 263 AND "Vacancy"."ID" NOT IN(SELECT
DISTINCT("Vacancy"."ID") 
FROM "Vacancy" join "CategoryOption_TableRow" "ct126" on
("ct126"."Category_TableID" = 126
 and "RowID" = "Vacancy"."ID") 
left join "Workflow" on ("Workflow"."VacancyID" = "Vacancy"."ID"
and "Workflow"."Level" 
= 1) 
left join "CategoryOption_TableRow" "c30" on ("c30"."RowID" =
"Workflow"."ID" and "c30"."Category_TableID"
 = 30 and "c30"."CategoryOptionID" = 21923) 
WHERE "Template" AND "ct126"."CategoryOptionID"
IN(34024,35254,35255,35256)) and "Vacancy"
."Template" = 't'
GROUP BY "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated",
"Vacancy"."CustomAccess", "Department"
."Name", "Vacancy"."PositionNo", "Vacancy"."Template"
UNION
SELECT distinct("Vacancy"."ID"), "Vacancy"."JobTitle",
"Vacancy"."DateCreated", "Vacancy"."CustomAccess"
, "Department"."Name" as "Department", list("Occupation"."Name") as
"Occupation", "Vacancy"."PositionNo"
, "Vacancy"."Template" from 
   "Vacancy"
LEFT JOIN "CategoryOption_TableRow" as "c_22" 
 ON ("c_22"."RowID" = "Vacancy"."ID" 
 and "c_22"."Category_TableID" = 22)
LEFT JOIN "CategoryOption" as "Occupation"
 ON ("Occupation"."ID" = "c_22"."CategoryOptionID")
LEFT JOIN "TableRow_TableRow" as "t_33" 
  ON ("t_33"."Table1RowID" = "Vacancy"."ID" 
  and "t_33"."Table_TableID" = 33 )
LEFT JOIN "Department"
 ON ("Department"."ID" = "t_33"."Table2RowID" and
"Department"."Active" = 't' and "Department"
."ClientID" = 263)
JOIN "CategoryOption_TableRow" as "c_50" ON ("c_50"."Category_TableID"= 50
and "c_50"."RowID" = "Vacancy"
."ID" and "c_50"."CategoryOptionID"=19205)
WHERE "Vacancy"."ClientID" = 263 AND "Vacancy"."ID" NOT IN(SELECT
DISTINCT("Vacancy"."ID") 
FROM "Vacancy" join "CategoryOption_TableRow" "ct126" on
("ct126"."Category_TableID" = 126
 and "RowID" = "Vacancy"."ID") 
left join "Workflow" on ("Workflow"."VacancyID" = "Vacancy"."ID"
and "Workflow"."Level" 
= 1) 
left join "CategoryOption_TableRow" "c30" on ("c30"."RowID" =
"Workflow"."ID" and "c30"."Category_TableID"
 = 30 and "c30"."CategoryOptionID" = 21923) 
WHERE "Template" AND "ct126"."CategoryOptionID"
IN(34024,35254,35255,35256))  and "Vacancy"
."Template" <> 't' AND "Vacancy"."Level" = 1 
GROUP BY "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated",
"Vacancy"."CustomAccess", "Department"
."Name", "Vacancy"."PositionNo", "Vacancy"."Template"
 ORDER BY "JobTitle"

Running explain analyze gives me the following information: 
http://explain.depesz.com/s/pdC   

For a total runtime: 2877.157 ms

If i remove the left joins on Department and TableRow_TableRow this reduces
the run time by about a third.
Additionally removing CategoryOption and CategoryOption_TableRow joins
further reduces by a about a third.

Given that i need both these joins for the information retrieved by them,
what would be the best way to re-factor this query so it runs faster?

Looking at the output of explain analyze the hash aggregates and sort seem
to be the primary issue.

Thanks in advance  










--
View this message in context: 
http://postgresql.nabble.com/Slow-Query-tp5861835.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow Query

2015-08-11 Thread robbyc
Hi Venkata,

work_mem was set to 72MB, increased to 144MB, no change.

Added an index of type varchar_pattern_ops to Vacancy.JobTitle, this did
not help either.

On Wed, Aug 12, 2015 at 2:09 PM, Venkata Balaji N [via PostgreSQL] <
ml-node+s1045698n5861839...@n5.nabble.com> wrote:

> On Wed, Aug 12, 2015 at 12:34 PM, robbyc <[hidden email]
> <http:///user/SendEmail.jtp?type=node&node=5861839&i=0>> wrote:
>
>> Hi,
>>
>> I am new to optimizing queries and i'm getting a slow running time
>> (~1.5secs) with the following SQL:
>>
>> SELECT distinct("Vacancy"."ID"), "Vacancy"."JobTitle",
>> "Vacancy"."DateCreated", "Vacancy"."CustomAccess"
>> , "Department"."Name" as "Department", list("Occupation"."Name") as
>> "Occupation", "Vacancy"."PositionNo"
>> , "Vacancy"."Template" from
>>"Vacancy"
>> LEFT JOIN "CategoryOption_TableRow" as "c_22"
>>  ON ("c_22"."RowID" = "Vacancy"."ID"
>>  and "c_22"."Category_TableID" = 22)
>> LEFT JOIN "CategoryOption" as "Occupation"
>>  ON ("Occupation"."ID" = "c_22"."CategoryOptionID")
>> LEFT JOIN "TableRow_TableRow" as "t_33"
>>   ON ("t_33"."Table1RowID" = "Vacancy"."ID"
>>   and "t_33"."Table_TableID" = 33 )
>> LEFT JOIN "Department"
>>  ON ("Department"."ID" = "t_33"."Table2RowID" and
>> "Department"."Active" = 't' and "Department"
>> ."ClientID" = 263)
>> JOIN "CategoryOption_TableRow" as "c_50" ON ("c_50"."Category_TableID"= 50
>> and "c_50"."RowID" = "Vacancy"
>> ."ID" and "c_50"."CategoryOptionID"=19205)
>> WHERE "Vacancy"."ClientID" = 263 AND "Vacancy"."ID" NOT IN(SELECT
>> DISTINCT("Vacancy"."ID")
>> FROM "Vacancy" join "CategoryOption_TableRow" "ct126" on
>> ("ct126"."Category_TableID" = 126
>>  and "RowID" = "Vacancy"."ID")
>> left join "Workflow" on ("Workflow"."VacancyID" =
>> "Vacancy"."ID"
>> and "Workflow"."Level"
>> = 1)
>> left join "CategoryOption_TableRow" "c30" on ("c30"."RowID" =
>> "Workflow"."ID" and "c30"."Category_TableID"
>>  = 30 and "c30"."CategoryOptionID" = 21923)
>> WHERE "Template" AND "ct126"."CategoryOptionID"
>> IN(34024,35254,35255,35256)) and "Vacancy"
>> ."Template" = 't'
>> GROUP BY "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated",
>> "Vacancy"."CustomAccess", "Department"
>> ."Name", "Vacancy"."PositionNo", "Vacancy"."Template"
>> UNION
>> SELECT distinct("Vacancy"."ID"), "Vacancy"."JobTitle",
>> "Vacancy"."DateCreated", "Vacancy"."CustomAccess"
>> , "Department"."Name" as "Department", list("Occupation"."Name") as
>> "Occupation", "Vacancy"."PositionNo"
>> , "Vacancy"."Template" from
>>"Vacancy"
>> LEFT JOIN "CategoryOption_TableRow" as "c_22"
>>  ON ("c_22"."RowID" = "Vacancy"."ID"
>>  and "c_22"."Category_TableID" = 22)
>> LEFT JOIN "CategoryOption" as "Occupation"
>>  ON ("Occupation"."ID" = "c_22"."CategoryOptionID")
>> LEFT JOIN "TableRow_TableRow" as "t_33"
>>   ON ("t_33"."Table1RowID" = "Vacancy"."ID"
>>   and "t_33"."Table_TableID" = 33 )
>> LEFT JOIN "Department"
>>  ON (&qu

Re: [PERFORM] Slow Query

2015-08-12 Thread robbyc
Hi Vik,

Thanks for your feedback, very helpful.

I modified your query slightly, this will return all vacancy templates and
all level 1 vacancies which arent templates, and does so in about
~800-900ms less, an great improvement on the original query.

SELECT "Vacancy"."ID",
   "Vacancy"."JobTitle",
   "Vacancy"."DateCreated",
   "Vacancy"."CustomAccess",
   "Department"."Name" as "Department",
   list("Occupation"."Name") as "Occupation",
   "Vacancy"."PositionNo",
   "Vacancy"."Template"
FROM "Vacancy"
JOIN "CategoryOption_TableRow" as "c_50" ON (
"c_50"."Category_TableID"= 50
AND "c_50"."RowID" = "Vacancy"."ID"
AND "c_50"."CategoryOptionID"=19205)
LEFT JOIN "CategoryOption_TableRow" as "c_22" ON (
"c_22"."RowID" = "Vacancy"."ID"
AND "c_22"."Category_TableID" = 22)
LEFT JOIN "CategoryOption" as "Occupation" ON (
"Occupation"."ID" = "c_22"."CategoryOptionID")
LEFT JOIN "TableRow_TableRow" as "t_33" ON (
"t_33"."Table1RowID" = "Vacancy"."ID"
AND "t_33"."Table_TableID" = 33)
LEFT JOIN "Department" ON (
"Department"."ID" = "t_33"."Table2RowID"
AND "Department"."Active" = 't'
AND "Department"."ClientID" = 263)
WHERE "Vacancy"."ClientID" = 263
  AND NOT EXISTS (
SELECT 1
FROM "Vacancy" as "v"
JOIN "CategoryOption_TableRow" "ct126" on (
"ct126"."Category_TableID" = 126
AND "RowID" = "v"."ID")
WHERE "v"."Template"
  AND "ct126"."CategoryOptionID" IN (34024,35254,35255,35256)
  AND "v"."ID" = "Vacancy"."ID")
  AND ("Vacancy"."Template" OR ("Vacancy"."Template" = 'f' AND
"Vacancy"."Level" = 1))
  GROUP BY "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated",
 "Vacancy"."CustomAccess", "Department"."Name",
 "Vacancy"."PositionNo", "Vacancy"."Template"








On Wed, Aug 12, 2015 at 9:35 PM, Vik Fearing-3 [via PostgreSQL] <
ml-node+s1045698n5861873...@n5.nabble.com> wrote:

> On 08/12/2015 04:34 AM, robbyc wrote:
> > Hi,
> >
> > I am new to optimizing queries and i'm getting a slow running time
> > (~1.5secs) with the following SQL:
>
> Before mucking about with work_mem and indexes, the first thing to do is
> rewrite this query correctly.  Here are just some of the things wrong
> with the query as written:
>
> * You're doing a DISTINCT on the same set of columns also in a GROUP BY.
>   This is redundant and causes needless deduplication.
>
> * You're joining two GROUPed BY then DISTINCTed queries using the UNION
>   operator which will do yet another pass for deduplication.
>
> * You've got the entire query repeated for just a simple difference in
>   the global WHERE clause.  These can be merged.
>
> * You've kept LEFT JOINs in the subquery but you don't use any values
>   from them.  These can be safely removed altogether.
>
> * You're using a NOT IN clause which is almost never what you want.  Use
>   NOT EXISTS instead.
>
> What is this list() function?  How is it defined?  Can it be replaced
> with string_agg()?
>
> You're not doing yourself any favors at all with all this quoting and
> mixed case stuff.
>
> Here is a rewritten version, please let me know how it performs:
>
> SELECT "Vacancy"."ID",
>"Vacancy"."JobTitle",
>"Vacancy"."DateCreated",
>"Vacancy"."CustomAccess",
>"Department"."Name" as "Department",
>list("Occupation"."Name") as "Occupation",
>"Vacancy"."PositionNo",
>"Vacancy"."Template"
> FROM "Vacancy"
> JOIN "CategoryOption_TableRow" as "c_50" ON (
>   

Re: [PERFORM] Slow Query

2015-08-13 Thread robbyc
Hi,

Doing this returns 0 records

On Thu, Aug 13, 2015 at 7:22 PM, 林士博 [via PostgreSQL] <
ml-node+s1045698n5862008...@n5.nabble.com> wrote:

> In the 'not exists' cluster, you do not have to search table "Vacancy as
> v" again.
> I think it would be faster to use the outer Vacancy table as below.
> In your case, that do the same work.
>
> NOT EXISTS (
> SELECT 1
> FROM "CategoryOption_TableRow" "ct126"
> WHERE "Vacancy"."Template"
>   AND "ct126"."CategoryOptionID" IN (34024,35254,35255,35256)
>   AND "ct126"."Category_TableID" = 126
>   AND "ct126"."RowID" = "Vacancy"."ID"
>   )
>
>
> --
> If you reply to this email, your message will be added to the discussion
> below:
> http://postgresql.nabble.com/Slow-Query-tp5861835p5862008.html
> To unsubscribe from Slow Query, click here
> 
> .
> NAML
> 
>



-- 
Regards

Robert Campbell
+61412062971
robcampbel...@gmail.com




--
View this message in context: 
http://postgresql.nabble.com/Slow-Query-tp5861835p5862122.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.