Jeroen or Stuart are probably the best folk to ask about this topic :) However on performance Tuesday this week I ran into a really clear example...
The subject refers to how to get fast queries out of postgres: ask queries that consult a lot of data from one table, or consult little data from many tables, but never a lot of data from many tables. In bug 787294 I ran into a case of this. I also had a full-day headache which led to some interesting analysis mistakes ;). For instance, this humdinger candidate query plan: 16:47 #launchpad-dev: < lifeless> Nested Loop (cost=2.76..17359385328490.43 rows=247357031677542 width=4) Anyhow, in this bug we're querying for all the bugs that the team is related to (assigned, subscribed, created, commented or structurally subscribed) to which have patches. The count of patches needed is 102. The current live plan has an estimated cost of 800K: Aggregate (cost=816866.73..816866.74 rows=1 width=0) -> Nested Loop (cost=705926.62..816816.29 rows=20177 width=0) -> HashAggregate (cost=705926.62..706128.39 rows=20177 width=280) -> Append (cost=0.00..704564.67 rows=20177 width=280) with the bulk of the cost coming in from the structural subscription subplan: -> Nested Loop (cost=43392.43..702903.59 rows=20168 width=280) Join Filter: ((public.bugtask.product = structuralsubscription.product) OR (public.bugtask.productseries = structuralsubscription.productseries) OR ((public.product.project = structuralsubscription.project) AND (public.bugtask.product = public.product.id)) OR ((public.bugtask.distribution = structuralsubscription.distribution) AND ((public.bugtask.sourcepackagename = structuralsubscription.sourcepackagename) OR (structuralsubscription.sourcepackagename IS NULL))) OR (public.bugtask.distroseries = structuralsubscription.distroseries) OR (public.bugtask.milestone = structuralsubscription.milestone)) ... I ran the original query on qastaging, and stopped it after 45 minutes. Now, qastaging is missing an index that production and staging has, so I moved over to staging to test. I successively refined the queries I ran to exercise smaller and smaller variables, trying to find what triggered the bad plan. (45 minutes is -much- longer than just doing a raw read of all the data could explain). This query:SELECT distinct BugTask.bug FROM BugTask inner join bug on bug.id=bugtask.bug LEFT JOIN Product ON BugTask.product = Product.id AND Product.active inner join structuralsubscription ss on ( ss.subscriber = 343381 and( BugTask.product = ss.product OR BugTask.productseries = ss.productseries OR Product.project = ss.project OR BugTask.distribution = ss.distribution AND (BugTask.sourcepackagename = ss.sourcepackagename OR ss.sourcepackagename IS NULL) OR BugTask.distroseries = ss.distroseries OR BugTask.milestone = ss.milestone) ) WHERE ((BugTask.status = 10) OR (BugTask.status = 15) OR (BugTask.status = 20) OR (BugTask.status = 21) OR (BugTask.status = 22) OR (BugTask.status = 25)) AND Bug.duplicateof IS NULL AND Bug.latest_patch_uploaded IS NOT NULL AND Bug.private = FALSE; which checks all the structural subscriptions has an estimated cost of 100K: HashAggregate (cost=112166.83..112397.51 rows=23068 width=4) and runs in 88 seconds This query: SELECT distinct BugTask.bug FROM BugTask inner join bug on bug.id=bugtask.bug LEFT JOIN Product ON BugTask.product = Product.id AND Product.active inner join structuralsubscription ss on ( ss.subscriber = 343381 and( BugTask.distribution = ss.distribution AND (BugTask.sourcepackagename = ss.sourcepackagename OR ss.sourcepackagename IS NULL) ) ) WHERE ((BugTask.status = 10) OR (BugTask.status = 15) OR (BugTask.status = 20) OR (BugTask.status = 21) OR (BugTask.status = 22) OR (BugTask.status = 25)) AND Bug.duplicateof IS NULL AND Bug.latest_patch_uploaded IS NOT NULL AND Bug.private = FALSE; has an estimated cost of 130K: HashAggregate (cost=131351.38..133944.19 rows=259281 width=4) _______________________________________________ Mailing list: https://launchpad.net/~launchpad-dev Post to : launchpad-dev@lists.launchpad.net Unsubscribe : https://launchpad.net/~launchpad-dev More help : https://help.launchpad.net/ListHelp