Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-17 Thread Claudio Freire
On Thu, Nov 17, 2011 at 11:17 AM, Aidan Van Dyk wrote: > But remember, you're doing all that in a single query.  So your disk > subsystem might even be able to perform even more *througput* if it > was given many more concurrent request.  A big raid10 is really good > at handling multiple concurre

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-17 Thread Tomas Vondra
On 17 Listopad 2011, 15:17, Aidan Van Dyk wrote: > With a single query, the query can only run as fast as the single > stream of requests can be satisfied. And as the next read is issued > as soon as the previous is done (the kernel readahead/buffering the > seq scan helps here), your iostat is go

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-17 Thread Aidan Van Dyk
On Thu, Nov 17, 2011 at 12:23 AM, Tory M Blue wrote: >> What do you mean by "nothing"? There are 3060 reads/s, servicing each one >> takes 0.33 ms - that means the drive is 100% utilized. >> >> The problem with the iostat results you've posted earlier is that they >> either use "-xd" or none of t

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tory M Blue
On Wed, Nov 16, 2011 at 9:19 PM, Josh Berkus wrote: > Tory, > > A seq scan across 83GB in 4 minutes is pretty good.   That's over > 300MB/s.  Even if you assume that 1/3 of the table was already cached, > that's still over 240mb/s.  Good disk array. > > Either you need an index, or you need to not

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tory M Blue
On Wed, Nov 16, 2011 at 7:47 PM, Tomas Vondra wrote: > On 17 Listopad 2011, 4:16, Tory M Blue wrote: >> On Wed, Nov 16, 2011 at 7:02 PM, Scott Marlowe >> wrote: >>> On Wed, Nov 16, 2011 at 7:42 PM, Tory M Blue wrote: On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra wrote: > On 17 Listopad

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Josh Berkus
Tory, A seq scan across 83GB in 4 minutes is pretty good. That's over 300MB/s. Even if you assume that 1/3 of the table was already cached, that's still over 240mb/s. Good disk array. Either you need an index, or you need to not do this query at user request time. Or a LOT more RAM. -- Jos

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tomas Vondra
On 17 Listopad 2011, 4:16, Tory M Blue wrote: > On Wed, Nov 16, 2011 at 7:02 PM, Scott Marlowe > wrote: >> On Wed, Nov 16, 2011 at 7:42 PM, Tory M Blue wrote: >>> On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra wrote: On 17 Listopad 2011, 2:57, Scott Marlowe wrote: > On Wed, Nov 16, 2011

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tory M Blue
On Wed, Nov 16, 2011 at 7:02 PM, Scott Marlowe wrote: > On Wed, Nov 16, 2011 at 7:42 PM, Tory M Blue wrote: >> On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra wrote: >>> On 17 Listopad 2011, 2:57, Scott Marlowe wrote: On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra wrote: > But you're

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Scott Marlowe
On Wed, Nov 16, 2011 at 8:02 PM, Scott Marlowe wrote: > On Wed, Nov 16, 2011 at 7:42 PM, Tory M Blue wrote: >> On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra wrote: >>> On 17 Listopad 2011, 2:57, Scott Marlowe wrote: On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra wrote: > But you're

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Scott Marlowe
On Wed, Nov 16, 2011 at 7:42 PM, Tory M Blue wrote: > On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra wrote: >> On 17 Listopad 2011, 2:57, Scott Marlowe wrote: >>> On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra wrote: >>> But you're right - you're not bound by I/O (although I don't know what >

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tory M Blue
On Wed, Nov 16, 2011 at 6:27 PM, Tomas Vondra wrote: > On 17 Listopad 2011, 2:57, Scott Marlowe wrote: >> On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra wrote: >> >>> But you're right - you're not bound by I/O (although I don't know what >>> are >>> those 15% - iowait, util or what?). The COUNT(DI

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tomas Vondra
On 17 Listopad 2011, 2:57, Scott Marlowe wrote: > On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra wrote: > >> But you're right - you're not bound by I/O (although I don't know what >> are >> those 15% - iowait, util or what?). The COUNT(DISTINCT) has to actually >> keep all the distinct values to de

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Andy Colson
On 11/16/2011 04:53 PM, Tory M Blue wrote: Linux F12 64bit Postgres 8.4.4 16 proc / 32GB 8 disk 15KRPM SAS/Raid 5 (I know!) shared_buffers = 6000MB #temp_buffers = 8MB max_prepared_transactions = 0 work_mem = 250MB

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Scott Marlowe
On Wed, Nov 16, 2011 at 4:52 PM, Samuel Gendler wrote: > Could you elaborate on this a bit, or point me at some docs?  I manage a > 600GB db which does almost nothing but aggregates on partitioned tables - > the largest of which has approx 600 million rows across all partitions. >  grouping in the

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Scott Marlowe
On Wed, Nov 16, 2011 at 4:59 PM, Tomas Vondra wrote: > But you're right - you're not bound by I/O (although I don't know what are > those 15% - iowait, util or what?). The COUNT(DISTINCT) has to actually > keep all the distinct values to determine which are actually distinct. Actually I meant to

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tory M Blue
Thanks all, I misspoke on our use of the index. We do have an index on log_date and it is being used here is the explain analyze plan. 'Aggregate (cost=7266186.16..7266186.17 rows=1 width=8) (actual time=127575.030..127575.030 rows=1 loops=1)' ' -> Bitmap Heap Scan on userstats (cost=13518

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tomas Vondra
Hi. On 16 Listopad 2011, 23:53, Tory M Blue wrote: > > We now have about 180mill records in that table. The database size is > about 580GB and the userstats table which is the biggest one and the > one we query the most is 83GB. > > Just a basic query takes 4 minutes: > > For e.g. select count(dis

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Samuel Gendler
On Wed, Nov 16, 2011 at 3:32 PM, Scott Marlowe wrote: > > If the OP's considering partitioning, they should really consider > upgrading to 9.1 which has much better performance of things like > aggregates against partition tables. > > Could you elaborate on this a bit, or point me at some docs? I

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Scott Marlowe
On Wed, Nov 16, 2011 at 4:27 PM, Alan Hodgson wrote: > On November 16, 2011 02:53:17 PM Tory M Blue wrote: >> We now have about 180mill records in that table. The database size is >> about 580GB and the userstats table which is the biggest one and the >> one we query the most is 83GB. >> >> Just a

Re: [PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Alan Hodgson
On November 16, 2011 02:53:17 PM Tory M Blue wrote: > We now have about 180mill records in that table. The database size is > about 580GB and the userstats table which is the biggest one and the > one we query the most is 83GB. > > Just a basic query takes 4 minutes: > > For e.g. select count(dis

[PERFORM] Performance question 83 GB Table 150 million rows, distinct select

2011-11-16 Thread Tory M Blue
Linux F12 64bit Postgres 8.4.4 16 proc / 32GB 8 disk 15KRPM SAS/Raid 5 (I know!) shared_buffers = 6000MB #temp_buffers = 8MB max_prepared_transactions = 0 work_mem = 250MB maintenance_work_mem = 1000MB

Re: [PERFORM] performance question on VACUUM FULL (Postgres 8.4.2)

2010-01-21 Thread PG User 2010
Hi Jeff, Are you running VACUUM (without FULL) regularly? And if so, is that > insufficient? > Unfortunately, we have not run vacuumlo as often as we would like, and that has caused a lot of garbage blobs to get generated by our application. You can always expect some degree of bloat. Can you gi

Re: [PERFORM] performance question on VACUUM FULL (Postgres 8.4.2)

2010-01-19 Thread Jeff Davis
On Tue, 2010-01-19 at 12:19 -0800, PG User 2010 wrote: > Hello, > > We are running into some performance issues with running VACUUM FULL > on the pg_largeobject table in Postgres (8.4.2 under Linux), and I'm > wondering if anybody here might be able to suggest anything to help > address the issue.

[PERFORM] performance question on VACUUM FULL (Postgres 8.4.2)

2010-01-19 Thread PG User 2010
Hello, We are running into some performance issues with running VACUUM FULL on the pg_largeobject table in Postgres (8.4.2 under Linux), and I'm wondering if anybody here might be able to suggest anything to help address the issue. Specifically, when running VACUUM FULL on the pg_largeobject table

Re: [PERFORM] Performance Question

2008-11-16 Thread PFC
I've been searching for performance metrics and tweaks for a few weeks now. I'm trying to determine if the length of time to process my queries is accurate or not and I'm having a difficult time determining that. I know postgres performance is very dependent on hardware and settings and I

Re: [PERFORM] Performance Question

2008-11-13 Thread Merlin Moncure
On Wed, Nov 12, 2008 at 11:27 AM, - - <[EMAIL PROTECTED]> wrote: > I've been searching for performance metrics and tweaks for a few weeks now. > I'm trying to determine if the length of time to process my queries is > accurate or not and I'm having a difficult time determining that. I know > postgr

Re: [PERFORM] Performance Question

2008-11-13 Thread Dave Page
On Wed, Nov 12, 2008 at 8:57 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > - - <[EMAIL PROTECTED]> writes: >> One of my tables has 660,000 records and doing a SELECT * from that table >> (without any joins or sorts) takes 72 seconds. Ordering the table based on 3 >> columns almost doubles that time t

Re: [PERFORM] Performance Question

2008-11-12 Thread Scott Marlowe
On Wed, Nov 12, 2008 at 9:27 AM, - - <[EMAIL PROTECTED]> wrote: > I've been searching for performance metrics and tweaks for a few weeks now. > I'm trying to determine if the length of time to process my queries is > accurate or not and I'm having a difficult time determining that. I know > postgre

Re: [PERFORM] Performance Question

2008-11-12 Thread Tom Lane
- - <[EMAIL PROTECTED]> writes: > One of my tables has 660,000 records and doing a SELECT * from that table > (without any joins or sorts) takes 72 seconds. Ordering the table based on 3 > columns almost doubles that time to an average of 123 seconds. To me, those > numbers are crazy slow and I

Re: [PERFORM] Performance Question

2008-11-12 Thread Fernando Hevia
- - Enviado el: Miércoles, 12 de Noviembre de 2008 14:28 Para: pgsql-performance@postgresql.org Asunto: [PERFORM] Performance Question I've been searching for performance metrics and tweaks for a few weeks now. I'm trying to determine if the length of time to process my queries is accur

Re: [PERFORM] Performance Question

2008-11-12 Thread tv
> max_connections = 100 > shared_buffers = 16MB > work_mem = 64MB > everything else is set to the default OK, but what about effective_cache_size for example? Anyway, we need more information about the table itself - the number of rows is nice, but it does not say how large the table is. The rows

Re: [PERFORM] Performance Question

2008-11-12 Thread J Sisson
There are a few things you didn't mention... First off, what is the context this database is being used in? Is it the backend for a web server? Data warehouse? Etc? Second, you didn't mention the use of indexes. Do you have any indexes on the table in question, and if so, does EXPLAIN ANALYZE

[PERFORM] Performance Question

2008-11-12 Thread - -
I've been searching for performance metrics and tweaks for a few weeks now. I'm trying to determine if the length of time to process my queries is accurate or not and I'm having a difficult time determining that. I know postgres performance is very dependent on hardware and settings and I unders

Re: [PERFORM] performance question (something to do w/

2006-05-09 Thread Kenneth Marshall
On Mon, May 08, 2006 at 12:50:13PM -0500, Jim C. Nasby wrote: > On Mon, May 08, 2006 at 10:42:21AM -0700, Mark Lewis wrote: > > Doing a SELECT with a large list of variables inside an IN runs slowly > > on every database we've tested. We've tested mostly in Oracle and > > PostgreSQL, and both get

Re: [PERFORM] performance question (something to do w/ parameterized

2006-05-08 Thread Klint Gore
On Mon, 08 May 2006 19:37:37 -0400, Tom Lane <[EMAIL PROTECTED]> wrote: > Jeffrey Tenny <[EMAIL PROTECTED]> writes: > > The server was already running with random_page_cost=2 today for all tests, > > because of > > the mods I've made to improve other problem queries in the past (my > > settings n

Re: [PERFORM] performance question (something to do w/ parameterized

2006-05-08 Thread Tom Lane
Jeffrey Tenny <[EMAIL PROTECTED]> writes: > The server was already running with random_page_cost=2 today for all tests, > because of > the mods I've made to improve other problem queries in the past (my settings > noted below, and > before in another msg on this topic). > So to nail this particu

Re: [PERFORM] performance question (something to do w/ parameterized

2006-05-08 Thread Jeffrey Tenny
Tom Lane wrote: > Jeffrey Tenny <[EMAIL PROTECTED]> writes: >> I tried the seqscan disabling and got what sounds like the desired plan: >> Sort (cost=54900.62..54940.29 rows=1587 width=16) (actual time=20.208..22.138 rows=677 loops=1) >> Sort Key: f, c >> -> Index Scan using x_f_idx, x_

Re: [PERFORM] performance question (something to do w/ parameterized

2006-05-08 Thread Tom Lane
Jeffrey Tenny <[EMAIL PROTECTED]> writes: > I tried the seqscan disabling and got what sounds like the desired plan: > Sort (cost=54900.62..54940.29 rows=1587 width=16) (actual > time=20.208..22.138 rows=677 loops=1) > Sort Key: f, c > -> Index Scan using x_f_idx, x_f_idx, ... > (cos

Re: [PERFORM] performance question (something to do w/ parameterized

2006-05-08 Thread Jeffrey Tenny
re my question here: what would be the JDBC-proper technique, my app is all jdbc. Jeffrey Tenny wrote: 1) is there a way to enable that for a single query in a multi-query transaction? ---(end of broadcast)--- TIP 3: Have you checked our extensi

Re: [PERFORM] performance question (something to do w/ parameterized

2006-05-08 Thread Jeffrey Tenny
I tried the seqscan disabling and got what sounds like the desired plan: Sort (cost=54900.62..54940.29 rows=1587 width=16) (actual time=20.208..22.138 rows=677 loops=1) Sort Key: f, c -> Index Scan using x_f_idx, x_f_idx, ... (cost=0.00..54056.96 rows=1587 width=16) (actual time=1.048

Re: [PERFORM] performance question (something to do w/ parameterized

2006-05-08 Thread Tom Lane
Jeffrey Tenny <[EMAIL PROTECTED]> writes: > I dropped the multicolumn index 'testindex2', and a new explain analyze > looks like this: > Sort (cost=35730.71..35768.28 rows=1503 width=16) (actual > time=962.555..964.467 rows=677 loops=1) > Sort Key: f, c > -> Seq Scan on x (cost=0.0

Re: [PERFORM] performance question (something to do w/ parameterized

2006-05-08 Thread Jeffrey Tenny
The original set of indexes were: Indexes: "x_c_idx" btree (c) "x_f_idx" btree (f) "testindex2" btree (f, c) I dropped the multicolumn index 'testindex2', and a new explain analyze looks like this: Sort (cost=35730.71..35768.28 rows=1503 width=16) (actual time=962.555..964.467

Re: [PERFORM] performance question (something to do w/ parameterized

2006-05-08 Thread Tom Lane
Jeffrey Tenny <[EMAIL PROTECTED]> writes: > Well, since I don't know the exact parameter values, just substituting > 1-650 for $1-$650, I get: > Index Scan using testindex2 on x (cost=0.00..34964.52 rows=1503 > width=16) (actual time=0.201..968.252 rows=677 loops=1) > Filter: ((f = 1) OR

Re: [PERFORM] performance question (something to do w/ parameterized

2006-05-08 Thread Jeffrey Tenny
Mark Lewis wrote: Doing a SELECT with a large list of variables inside an IN runs slowly on every database we've tested. We've tested mostly in Oracle and PostgreSQL, and both get very slow very quickly (actually Oracle refuses to process the query at all after it gets too many bind parameters).

Re: [PERFORM] performance question (something to do w/ parameterized

2006-05-08 Thread Jeffrey Tenny
Well, since I don't know the exact parameter values, just substituting 1-650 for $1-$650, I get: Index Scan using testindex2 on x (cost=0.00..34964.52 rows=1503 width=16) (actual time=0.201..968.252 rows=677 loops=1) Filter: ((f = 1) OR (f = 2) OR (f = 3) OR (f = 4) ... So index usage is

Re: [PERFORM] performance question (something to do w/

2006-05-08 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > On Mon, May 08, 2006 at 10:42:21AM -0700, Mark Lewis wrote: >> Doing a SELECT with a large list of variables inside an IN runs slowly >> on every database we've tested. We've tested mostly in Oracle and >> PostgreSQL, and both get very slow very quickly

Re: [PERFORM] performance question (something to do w/

2006-05-08 Thread Jim C. Nasby
On Mon, May 08, 2006 at 10:42:21AM -0700, Mark Lewis wrote: > Doing a SELECT with a large list of variables inside an IN runs slowly > on every database we've tested. We've tested mostly in Oracle and > PostgreSQL, and both get very slow very quickly (actually Oracle refuses > to process the query

Re: [PERFORM] performance question (something to do w/

2006-05-08 Thread Mark Lewis
Doing a SELECT with a large list of variables inside an IN runs slowly on every database we've tested. We've tested mostly in Oracle and PostgreSQL, and both get very slow very quickly (actually Oracle refuses to process the query at all after it gets too many bind parameters). In our case, we ha

Re: [PERFORM] performance question (something to do w/ parameterized stmts?, wrong index types?)

2006-05-08 Thread Jim C. Nasby
What's EXPLAIN ANALYZE show? On Mon, May 08, 2006 at 01:29:28PM -0400, Jeffrey Tenny wrote: > Why does this query take so long? (PostgreSQL 8.0.3, FC4) > Hopefully I have provided enough information below. > > LOG: statement: SELECT * FROM x WHERE f IN > ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12

[PERFORM] performance question (something to do w/ parameterized stmts?, wrong index types?)

2006-05-08 Thread Jeffrey Tenny
Why does this query take so long? (PostgreSQL 8.0.3, FC4) Hopefully I have provided enough information below. LOG: statement: SELECT * FROM x WHERE f IN ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,\ $25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$3