Re: [PERFORM] Query only slow on first run

2007-11-29 Thread Tom Lane
cluster <[EMAIL PROTECTED]> writes: >> You're essentially asking for a random sample of data that is not >> currently in memory. You're not going to get that without some I/O. > No, that sounds reasonable enough. But do you agree with the statement > that my query will just get slower and slower

Re: [PERFORM] Query only slow on first run

2007-11-29 Thread cluster
You're essentially asking for a random sample of data that is not currently in memory. You're not going to get that without some I/O. No, that sounds reasonable enough. But do you agree with the statement that my query will just get slower and slower over time as the number of posts increases

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Scott Marlowe
On Nov 28, 2007 3:15 PM, cluster <[EMAIL PROTECTED]> wrote: > > The indexes don't contain visibility information, so Postgres has to look up > > the row on disk to verify it isn't dead. > > I guess this fact drastically decreases the performance. :-( > The number of rows with a random_number will j

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Tom Lane
cluster <[EMAIL PROTECTED]> writes: > I could really use any kind of suggestion on how to improve the query in > order to make it scale better for large data sets The 6-7000 ms for a > clean run is really a showstopper. Need to get it below 70 ms somehow. Buy a faster disk? You're essentially a

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Jean-David Beyer
cluster wrote: >> The indexes don't contain visibility information, so Postgres has to >> look up the row on disk to verify it isn't dead. > > I guess this fact drastically decreases the performance. :-( The number > of rows with a random_number will just grow over time while the number of > ques

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread cluster
The indexes don't contain visibility information, so Postgres has to look up the row on disk to verify it isn't dead. I guess this fact drastically decreases the performance. :-( The number of rows with a random_number will just grow over time while the number of questions with status = 1 will

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Steinar H. Gunderson
On Wed, Nov 28, 2007 at 09:16:08PM +0100, cluster wrote: > Hmm, actually I still don't understand why it takes 6400 ms to fetch the > rows. As far as I can see the index used is "covering" so that real row > lookups shouldn't be necessary. The indexes don't contain visibility information, so Pos

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread cluster
I'm wondering why --- doesn't seem like it should take 6400msec to fetch 646 rows, unless perhaps the data is just horribly misordered relative to the index. Which may in fact be the case ... Hmm, actually I still don't understand why it takes 6400 ms to fetch the rows. As far as I can see th

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Tom Lane
"Dave Dutcher" <[EMAIL PROTECTED]> writes: > ... According to the explain analyze > there are only 646 rows in posts which match your criteria, so it does seem > like scanning posts first might be the right thing to do. No, that's not right. What the output actually shows is that only 646 posts

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Dave Dutcher
> -Original Message- > From: tmp > We have primarily two tables of interest here: questions > (~100k rows) and posts (~400k rows). Each post refers to a > question, but only the "posts" rows for which the > corresponding "question.status = 1" are relevant. This > reduces the number of r

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread Craig James
tmp wrote: what exactly is that "random_number" column A random float that is initialized when the row is created and never modified afterwards. The physical row ordering will clearly not match the random_number ordering. However, other queries uses a row ordering by the primary key so I don

Re: [PERFORM] Query only slow on first run

2007-11-28 Thread tmp
The query's spending nearly all its time in the scan of "posts", and I'm wondering why --- doesn't seem like it should take 6400msec to fetch 646 rows, unless perhaps the data is just horribly misordered relative to the index. Which may in fact be the case ... Yes, they probably are. I use the

Re: [PERFORM] Query only slow on first run

2007-11-27 Thread Steinar H. Gunderson
On Tue, Nov 27, 2007 at 07:25:54PM -0500, Tom Lane wrote: >> You could make an index on (question_id,status) (or a partial index on >> question id, with status=1 as the filter), but I'm not sure how much it would >> help you unless the questions table is extremely big. It doesn't appear to >> be; i

Re: [PERFORM] Query only slow on first run

2007-11-27 Thread Tom Lane
"Steinar H. Gunderson" <[EMAIL PROTECTED]> writes: > You could make an index on (question_id,status) (or a partial index on > question id, with status=1 as the filter), but I'm not sure how much it would > help you unless the questions table is extremely big. It doesn't appear to > be; in fact, it

Re: [PERFORM] Query only slow on first run

2007-11-27 Thread Dave Dutcher
> -Original Message- > From: cluster > > If I disable the nested loops, the query becomes *much* slower. > > A thing that strikes me is the following. As you can see I have the > constraint: q.status = 1. Only a small subset of the data set > has this status. I have an index on q.status

Re: [PERFORM] Query only slow on first run

2007-11-27 Thread Steinar H. Gunderson
On Tue, Nov 27, 2007 at 11:51:40PM +0100, cluster wrote: > A thing that strikes me is the following. As you can see I have the > constraint: q.status = 1. Only a small subset of the data set has this > status. I have an index on q.status but for some reason this is not used. > Instead the constr

Re: [PERFORM] Query only slow on first run

2007-11-27 Thread cluster
As for optimizing the query, I noticed that all three joins are done by nested loops. I wonder if another join method would be faster. Have you analyzed all the tables? Yes. I did a VACUUM FULL ANALYZE before running the test queries. Also I have just performed an ANALYZE just to be sure ever

Re: [PERFORM] Query only slow on first run

2007-11-27 Thread Dave Dutcher
> -Original Message- > From: cluster > > >> Probably by buying much faster disk hardware. > > Or buy more RAM, so that the data can stay cached. > > So the only problem here is lack of RAM and/or disk speed? I don't think you can reach that conclusion yet. Like everybody said the reason

Re: [PERFORM] Query only slow on first run

2007-11-27 Thread Bill Moran
In response to cluster <[EMAIL PROTECTED]>: > >> Probably by buying much faster disk hardware. > > Or buy more RAM, so that the data can stay cached. > > So the only problem here is lack of RAM and/or disk speed? Not automatically, but the chances that more RAM and/or faster disks will improve t

Re: [PERFORM] Query only slow on first run

2007-11-27 Thread cluster
Probably by buying much faster disk hardware. Or buy more RAM, so that the data can stay cached. So the only problem here is lack of RAM and/or disk speed? ---(end of broadcast)--- TIP 4: Have you searched our list archives? http:/

Re: [PERFORM] Query only slow on first run

2007-11-27 Thread Tom Lane
Andrew Sullivan <[EMAIL PROTECTED]> writes: > On Tue, Nov 27, 2007 at 05:33:36PM +0100, cluster wrote: >> I have a query that takes about 7000 ms in average to complete the first >> time it runs. Subsequent runs complete in only 50 ms. That is more than >> a factor 100 faster! How can I make the

Re: [PERFORM] Query only slow on first run

2007-11-27 Thread Kevin Kempter
On Tuesday 27 November 2007 09:33:36 cluster wrote: > I have a query that takes about 7000 ms in average to complete the first > time it runs. Subsequent runs complete in only 50 ms. That is more than > a factor 100 faster! How can I make the query perform good in the first > run too? > > Query and

Re: [PERFORM] Query only slow on first run

2007-11-27 Thread Andrew Sullivan
On Tue, Nov 27, 2007 at 05:33:36PM +0100, cluster wrote: > I have a query that takes about 7000 ms in average to complete the first > time it runs. Subsequent runs complete in only 50 ms. That is more than > a factor 100 faster! How can I make the query perform good in the first > run too? Prob

[PERFORM] Query only slow on first run

2007-11-27 Thread cluster
I have a query that takes about 7000 ms in average to complete the first time it runs. Subsequent runs complete in only 50 ms. That is more than a factor 100 faster! How can I make the query perform good in the first run too? Query and output from both first and second run of Explain Analyze i