cluster <[EMAIL PROTECTED]> writes:
>> You're essentially asking for a random sample of data that is not
>> currently in memory. You're not going to get that without some I/O.
> No, that sounds reasonable enough. But do you agree with the statement
> that my query will just get slower and slower over time as the number
> of posts increases?
> You're essentially asking for a random sample of data that is not
> currently in memory. You're not going to get that without some I/O.
No, that sounds reasonable enough. But do you agree with the statement
that my query will just get slower and slower over time as the number of
posts increases?
On Nov 28, 2007 3:15 PM, cluster <[EMAIL PROTECTED]> wrote:
> > The indexes don't contain visibility information, so Postgres has to look up
> > the row on disk to verify it isn't dead.
>
> I guess this fact drastically decreases the performance. :-(
> The number of rows with a random_number will just grow over time while
> the number of questions with status = 1 will
cluster <[EMAIL PROTECTED]> writes:
> I could really use any kind of suggestion on how to improve the query in
> order to make it scale better for large data sets. The 6-7000 ms for a
> clean run is really a showstopper. Need to get it below 70 ms somehow.
Buy a faster disk?
You're essentially asking for a random sample of data that is not
currently in memory. You're not going to get that without some I/O.
cluster wrote:
>> The indexes don't contain visibility information, so Postgres has to
>> look up the row on disk to verify it isn't dead.
>
> I guess this fact drastically decreases the performance. :-( The number
> of rows with a random_number will just grow over time while the number of
> questions with status = 1 will
> The indexes don't contain visibility information, so Postgres has to look up
> the row on disk to verify it isn't dead.
I guess this fact drastically decreases the performance. :-(
The number of rows with a random_number will just grow over time while
the number of questions with status = 1 will
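For readers skimming the thread: this is why a "covering" index did not avoid
heap I/O here. A hedged illustration follows; the table and column names are
assumptions based on the thread, and the EXPLAIN options shown are from modern
PostgreSQL, not the release discussed above.

```sql
-- At the time of this thread, PostgreSQL had no index-only scans: every
-- index hit still required a heap fetch to check tuple visibility.
-- Since 9.2, the planner can emit an "Index Only Scan" when the
-- visibility map marks a page as all-visible.
EXPLAIN (ANALYZE, BUFFERS)
SELECT question_id
FROM posts
WHERE random_number >= 0.5
ORDER BY random_number
LIMIT 10;
-- An "Index Scan" node (rather than "Index Only Scan") in the output
-- means each matching row still cost a visit to the table itself.
```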
On Wed, Nov 28, 2007 at 09:16:08PM +0100, cluster wrote:
> Hmm, actually I still don't understand why it takes 6400 ms to fetch the
> rows. As far as I can see the index used is "covering" so that real row
> lookups shouldn't be necessary.
The indexes don't contain visibility information, so Postgres has to look up
the row on disk to verify it isn't dead.
> The query's spending nearly all its time in the scan of "posts", and
> I'm wondering why --- doesn't seem like it should take 6400msec to fetch
> 646 rows, unless perhaps the data is just horribly misordered relative
> to the index. Which may in fact be the case ...
Hmm, actually I still don't understand why it takes 6400 ms to fetch the
rows. As far as I can see the index used is "covering" so that real row
lookups shouldn't be necessary.
"Dave Dutcher" <[EMAIL PROTECTED]> writes:
> ... According to the explain analyze
> there are only 646 rows in posts which match your criteria, so it does seem
> like scanning posts first might be the right thing to do.
No, that's not right. What the output actually shows is that only 646
posts
> -Original Message-
> From: tmp
> We have primarily two tables of interest here: questions
> (~100k rows) and posts (~400k rows). Each post refers to a
> question, but only the "posts" rows for which the
> corresponding "question.status = 1" are relevant. This
> reduces the number of r
tmp wrote:
> what exactly is that "random_number" column?
A random float that is initialized when the row is created and never
modified afterwards. The physical row ordering will clearly not match
the random_number ordering. However, other queries use a row ordering
by the primary key so I don
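A minimal sketch of the random_number pattern described above, assuming the
table and column names used elsewhere in the thread (posts.random_number,
questions.status, and a questions.id key, which is a guess):

```sql
-- Each post stores a uniform random float, fixed at insert time:
--   ALTER TABLE posts ADD COLUMN random_number double precision
--       DEFAULT random();
-- A "random sample" query then starts from a fresh random value picked
-- by the client (0.42 here stands in for it) and walks the
-- random_number index upward:
SELECT p.*
FROM posts p
JOIN questions q ON q.id = p.question_id
WHERE q.status = 1
  AND p.random_number >= 0.42
ORDER BY p.random_number
LIMIT 4;
```

Because random_number is uncorrelated with the physical row order, this index
walk touches scattered heap pages, which is exactly the I/O cost being
discussed in this thread.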
> The query's spending nearly all its time in the scan of "posts", and
> I'm wondering why --- doesn't seem like it should take 6400msec to fetch
> 646 rows, unless perhaps the data is just horribly misordered relative
> to the index. Which may in fact be the case ...
Yes, they probably are. I use the
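If the physical ordering really is the culprit, one hedged remedy is to
rewrite the table in index order; the index name below is an assumption.

```sql
-- Rewrites "posts" physically sorted by the named index. Takes an
-- exclusive lock for the duration, and must be repeated periodically
-- since new rows are not kept in order.
-- (Syntax shown is PostgreSQL 8.3+; older releases used
--  CLUSTER indexname ON tablename.)
CLUSTER posts USING posts_random_number_idx;
ANALYZE posts;
```

Note the trade-off raised earlier in the thread: clustering on random_number
would de-optimize the other queries that rely on primary-key ordering.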
On Tue, Nov 27, 2007 at 07:25:54PM -0500, Tom Lane wrote:
>> You could make an index on (question_id,status) (or a partial index on
>> question id, with status=1 as the filter), but I'm not sure how much it would
>> help you unless the questions table is extremely big. It doesn't appear to
>> be; i
"Steinar H. Gunderson" <[EMAIL PROTECTED]> writes:
> You could make an index on (question_id,status) (or a partial index on
> question id, with status=1 as the filter), but I'm not sure how much it would
> help you unless the questions table is extremely big. It doesn't appear to
> be; in fact, it
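The two suggestions above would look roughly like this; the exact table and
column names are assumptions, since the schema isn't shown in the thread:

```sql
-- Composite index: lets the lookup probe on question_id and check
-- status from the same index entry.
CREATE INDEX questions_id_status_idx
    ON questions (question_id, status);

-- Partial-index variant: smaller, and only indexes the rows the
-- status = 1 queries can actually use.
CREATE INDEX questions_status1_idx
    ON questions (question_id)
    WHERE status = 1;
```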
> -Original Message-
> From: cluster
>
> If I disable the nested loops, the query becomes *much* slower.
>
> A thing that strikes me is the following. As you can see I have the
> constraint: q.status = 1. Only a small subset of the data set
> has this status. I have an index on q.status but for some reason this is
> not used.
On Tue, Nov 27, 2007 at 11:51:40PM +0100, cluster wrote:
> A thing that strikes me is the following. As you can see I have the
> constraint: q.status = 1. Only a small subset of the data set has this
> status. I have an index on q.status but for some reason this is not used.
> Instead the constr
> As for optimizing the query, I noticed that all three joins are done by
> nested loops. I wonder if another join method would be faster. Have you
> analyzed all the tables?
Yes. I did a VACUUM FULL ANALYZE before running the test queries. Also I
have just performed an ANALYZE just to be sure ever
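For completeness, the statistics-refresh step discussed above, with table
names taken from the thread. A plain ANALYZE is usually sufficient for this
purpose; VACUUM FULL rewrites the table and is rarely needed just to refresh
statistics.

```sql
-- Refresh planner statistics so join-method choices are based on
-- current row counts and value distributions:
ANALYZE questions;
ANALYZE posts;
-- Then re-check which join methods the planner picks:
--   EXPLAIN ANALYZE <the query under discussion>;
```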
> -Original Message-
> From: cluster
>
> >> Probably by buying much faster disk hardware.
> > Or buy more RAM, so that the data can stay cached.
>
> So the only problem here is lack of RAM and/or disk speed?
I don't think you can reach that conclusion yet. Like everybody said the
reason
In response to cluster <[EMAIL PROTECTED]>:
> >> Probably by buying much faster disk hardware.
> > Or buy more RAM, so that the data can stay cached.
>
> So the only problem here is lack of RAM and/or disk speed?
Not automatically, but the chances that more RAM and/or faster disks will
improve t
>> Probably by buying much faster disk hardware.
> Or buy more RAM, so that the data can stay cached.
So the only problem here is lack of RAM and/or disk speed?
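The caching side of "buy more RAM" maps to two postgresql.conf knobs, sketched
here with purely illustrative values that would need to be sized to the
actual machine:

```ini
# Memory PostgreSQL manages itself as a shared page cache:
shared_buffers = 512MB
# Planner's estimate of total cache (PostgreSQL + OS) available,
# used when costing index scans; not an allocation:
effective_cache_size = 2GB
```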
Andrew Sullivan <[EMAIL PROTECTED]> writes:
> On Tue, Nov 27, 2007 at 05:33:36PM +0100, cluster wrote:
>> I have a query that takes about 7000 ms on average to complete the first
>> time it runs. Subsequent runs complete in only 50 ms. That is more than
>> a factor of 100 faster! How can I make the query perform well in the
>> first run too?
On Tuesday 27 November 2007 09:33:36 cluster wrote:
> I have a query that takes about 7000 ms on average to complete the first
> time it runs. Subsequent runs complete in only 50 ms. That is more than
> a factor of 100 faster! How can I make the query perform well in the first
> run too?
>
> Query and output from both first and second run of Explain Analyze
On Tue, Nov 27, 2007 at 05:33:36PM +0100, cluster wrote:
> I have a query that takes about 7000 ms on average to complete the first
> time it runs. Subsequent runs complete in only 50 ms. That is more than
> a factor of 100 faster! How can I make the query perform well in the first
> run too?
Probably by buying much faster disk hardware.
I have a query that takes about 7000 ms on average to complete the first
time it runs. Subsequent runs complete in only 50 ms. That is more than
a factor of 100 faster! How can I make the query perform well in the first
run too?
Query and output from both first and second run of Explain Analyze i
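A modern, hedged answer to the cold-cache question that opens this thread: on
PostgreSQL 9.4 and later, the contrib module pg_prewarm can load a table or
index into the buffer cache ahead of the first query (the table name is
assumed from the thread):

```sql
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
-- Load the heap of "posts" into shared_buffers;
-- returns the number of blocks read.
SELECT pg_prewarm('posts');
```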