[PERFORM] Tracking down performance issue

2004-04-12 Thread Ken Geis
I'm running 7.4.2 on an older Linux box (450MHz AMD K6-III, 450MB RAM) running kernel 2.6.5. My client is a Java/JDBC program on Windows. I'm having trouble seeing where the bottleneck in my performance is. The client uses about 30% CPU. The server uses 70% CPU plus 1.5% I/O wait. The I/O wait…

Re: [PERFORM] plan problem

2004-04-07 Thread Ken Geis
Richard Huxton wrote: On Tuesday 06 April 2004 21:25, Ken Geis wrote: I am trying to find an efficient way to draw a random sample from a complex query. I also want it to be easy to use within my application. So I've defined a view that encapsulates the query. The id in the "driving"…

[PERFORM] plan problem

2004-04-06 Thread Ken Geis
I am trying to find an efficient way to draw a random sample from a complex query. I also want it to be easy to use within my application. So I've defined a view that encapsulates the query. The id in the "driving" table is exposed, and I run a query like: select * from stats_record_view where…
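A minimal sketch of the setup described above, assuming a hypothetical driving table and view (the thread shows only the view name stats_record_view and an exposed id column; everything else here is an illustrative stand-in):

  -- Hypothetical schema; the real view wraps a complex query not shown here.
  CREATE TABLE stats_record (
      id      integer PRIMARY KEY,
      payload text
  );

  CREATE VIEW stats_record_view AS
      SELECT id, payload FROM stats_record;

  -- Draw a 100-row random sample through the view.
  SELECT v.*
  FROM stats_record_view v
  WHERE v.id IN (SELECT id FROM stats_record ORDER BY random() LIMIT 100);

Note that ORDER BY random() forces a scan of the whole driving table just to pick the sample, which is exactly the sort of inefficiency the thread is trying to avoid.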

Re: [PERFORM] bad estimates

2003-08-30 Thread Ken Geis
Bruno Wolff III wrote: I haven't come up with any great ideas for this one. It might be interesting to compare the explain analyze output from the distinct on query with and without seqscans enabled. After digging through planner code, I found that bumping up the sort_mem will make the planner pre…
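A sketch of the sort_mem experiment, reusing the day_ends query from the same thread (sort_mem is the 7.x-era parameter name, measured in kilobytes; it became work_mem in 8.0):

  SET sort_mem = 65536;   -- 64MB, for this session only
  EXPLAIN ANALYZE
  SELECT DISTINCT ON (stock_id) stock_id, price_date
  FROM day_ends
  ORDER BY stock_id, price_date;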

Re: [PERFORM] bad estimates

2003-08-30 Thread Ken Geis
Bruno Wolff III wrote: I haven't come up with any great ideas for this one. It might be interesting to compare the explain analyze output from the distinct on query with and without seqscans enabled. Can't do that comparison. Remember, with seqscan it fails. (Oh, and that nested loops solution I…
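The comparison Bruno asked for would look roughly like this; per the message above, the seqscan variant could not actually complete because it ran out of temporary space:

  SET enable_seqscan = off;
  EXPLAIN ANALYZE
  SELECT DISTINCT ON (stock_id) stock_id, price_date
  FROM day_ends ORDER BY stock_id, price_date;

  SET enable_seqscan = on;    -- this variant reportedly exhausts temp space
  EXPLAIN ANALYZE
  SELECT DISTINCT ON (stock_id) stock_id, price_date
  FROM day_ends ORDER BY stock_id, price_date;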

Re: [PERFORM] bad estimates

2003-08-29 Thread Ken Geis
Sean Chittenden wrote: I found I'm suffering from an effect detailed in a previous thread titled Does "correlation" mislead the optimizer on large tables? I don't know about large tables, but this is a big problem and something I'm going to spend some time validating later today. I think Manfr…

Re: [PERFORM] bad estimates

2003-08-29 Thread Ken Geis
Bruno Wolff III wrote: If you want both the max and the min, then things are going to be a bit more work. You are either going to want to do two separate selects or join two selects or use subselects. If there aren't enough prices per stock, the sequential scan might be fastest since you only need…
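One plausible reading of the join-two-selects suggestion, sketched against the day_ends schema used elsewhere in the thread (the exact query Bruno had in mind is not shown):

  -- Min and max price_date per stock via two DISTINCT ON subselects.
  SELECT stock_id, mn.price_date AS first_day, mx.price_date AS last_day
  FROM  (SELECT DISTINCT ON (stock_id) stock_id, price_date
         FROM day_ends ORDER BY stock_id, price_date ASC)  AS mn
  JOIN  (SELECT DISTINCT ON (stock_id) stock_id, price_date
         FROM day_ends ORDER BY stock_id, price_date DESC) AS mx
  USING (stock_id);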

Re: [PERFORM] bad estimates

2003-08-29 Thread Ken Geis
Ken Geis wrote: When run on 7.3.3, forcing an index scan by setting enable_seqscan=false, the query took 55 minutes to run. The index is about 660M in size, and the table is 1G. As I mentioned before, with table scans enabled, it bombs, running out of temporary space. Man, I should wait a…

Re: [PERFORM] bad estimates

2003-08-29 Thread Ken Geis
Ken Geis wrote: I went through the code (7.4 beta2) that estimates the cost of an index scan path. What I need to be sure of is that when running a query in pgsql that uses only the columns that are in an index, the underlying table need not be accessed. I know that Oracle does this. Thinking…
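For the record, the answer for this era of PostgreSQL is no: 7.4 always visits the heap even when every referenced column is in the index, because tuple visibility is stored only in the table (true index-only scans did not arrive until 9.2). So even a query covered entirely by the primary-key index still reads day_ends itself:

  -- stock_id and price_date are both in the (stock_id, price_date) index,
  -- but 7.4 must still fetch heap tuples to check visibility.
  EXPLAIN
  SELECT stock_id, min(price_date)
  FROM day_ends
  GROUP BY stock_id;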

Re: [PERFORM] bad estimates

2003-08-29 Thread Ken Geis
Sorry, all, to wipe out the context, but it was getting a little long. Bruno Wolff III wrote: I am calling it quits for tonight, but will check back tomorrow to see how things turned out. I went through the code (7.4 beta2) that estimates the cost of an index scan path. What I need to be sure of…

Re: [PERFORM] bad estimates

2003-08-29 Thread Ken Geis
Bruno Wolff III wrote: Can you do a \d on the real table or is that too sensitive? It was silly of me to think of this as particularly sensitive.

  stocks=> \d day_ends
       Table "public.day_ends"
    Column   |  Type   | Modifiers
  -----------+---------+-----------
   stock_id  | integer…
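Pieced together from fragments in this thread, the table presumably looks something like the following; only stock_id, price_date, and the (stock_id, price_date) primary key are actually confirmed, and the price column is an assumption for illustration:

  CREATE TABLE day_ends (
      stock_id   integer NOT NULL,
      price_date date    NOT NULL,
      price      numeric,            -- assumed; not visible in the preview
      PRIMARY KEY (stock_id, price_date)
  );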

Re: [PERFORM] bad estimates

2003-08-29 Thread Ken Geis
Bruno Wolff III wrote: On Thu, Aug 28, 2003 at 20:46:00 -0700, Ken Geis <[EMAIL PROTECTED]> wrote: It is not the table or the query that is wrong. It is either the db parameters or the optimizer itself. ... It is still odd that you didn't get a big speed up for just the min though.

Re: [PERFORM] bad estimates

2003-08-29 Thread Ken Geis
Bruno Wolff III wrote: On Thu, Aug 28, 2003 at 20:00:32 -0700, Ken Geis <[EMAIL PROTECTED]> wrote: Bruno Wolff III wrote: Not according to the optimizer! Plus, this is not guaranteed to return the correct results. For it to be fast you need an index on (stock_id, price_date) so that you can use an index scan…

Re: [PERFORM] bad estimates / non-scanning aggregates

2003-08-29 Thread Ken Geis
Bruno Wolff III wrote: Not according to the optimizer! Plus, this is not guaranteed to return the correct results. For it to be fast you need an index on (stock_id, price_date) so that you can use an index scan. I already said that such an index existed. In fact, it is the primary key of the table…

Re: [PERFORM] bad estimates / non-scanning aggregates

2003-08-29 Thread Ken Geis
Bruno Wolff III wrote: On Thu, Aug 28, 2003 at 17:10:31 -0700, Ken Geis <[EMAIL PROTECTED]> wrote: The query I want to run is select stock_id, min(price_date) from day_ends group by stock_id; The fast way to do this is: select distinct on (stock_id) stock_id, price_date order by st…
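Bruno's rewrite, completed with the FROM clause and full sort key the preview cuts off (this completion follows the standard DISTINCT ON pattern rather than the exact text of his message):

  SELECT DISTINCT ON (stock_id) stock_id, price_date
  FROM day_ends
  ORDER BY stock_id, price_date;

DISTINCT ON keeps the first row per stock_id in the given ordering, so sorting by price_date ascending yields the earliest date per stock in a single ordered pass instead of a full aggregate.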

[PERFORM] bad estimates / non-scanning aggregates

2003-08-29 Thread Ken Geis
…e same way? Are MIN and MAX used often enough to justify special treatment, and could that be cleanly implemented? Perhaps the aggregate function can request the data in a certain order, be told that it is being passed data in a certain order, and return before seeing the entire set of data.
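The era's standard workaround expresses the aggregate as an ordered LIMIT 1 so the planner can answer it from one end of the index; later releases (8.1 onward) perform this min/max rewrite automatically. A sketch for a single stock, using a hypothetical id:

  -- min(price_date) for one stock, answerable by reading a single entry
  -- of the (stock_id, price_date) index.
  SELECT price_date
  FROM day_ends
  WHERE stock_id = 42          -- hypothetical stock id
  ORDER BY price_date ASC
  LIMIT 1;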