Re: [PERFORM] Postgres scalability and performance on windows

2006-11-24 Thread Guido Neitzer

On 23.11.2006, at 23:37, Gopal wrote:
> hared_buffers = 2# min 16 or max_connections*2, 8KB each

If this is not a copy & paste error, you should add the "s" at the
beginning of the line.


Also you might want to set this to a higher number. You are setting
only about 160 MB (at 8 kB per buffer), which might be a bit too small
if you do a lot of queries spread over the whole dataset. I don't know
whether the memory management on Windows handles this well, but you can
give it a try.
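
For example, something along these lines in postgresql.conf (the value
below is only an illustration, not a recommendation - how far you can
go depends on how much RAM you can spare and on how well Windows copes
with a large shared memory segment):

  # postgresql.conf -- illustrative value only
  shared_buffers = 40000        # 40000 buffers * 8kB = ~320MB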

> effective_cache_size = 82728  # typically 8KB each

Hmm. I don't know what the real effect of this might be as the doc  
states:


"This parameter has no effect on the size of shared memory allocated  
by PostgreSQL, nor does it reserve kernel disk cache; it is used only  
for estimation purposes."


You should try optimizing your shared_buffers to cache more of the data.

> But postgres has everything spread across 10-15 processes, with each
> process using about 10-30MB, not nearly enough to cache all the data,
> and ends up doing a lot of disk reads.

It's not so easy. PostgreSQL maintains a shared buffer cache which is
accessible by all processes. On a Unix system you can see this in the
output of top - I don't know how this works on Windows.

> In any case I cannot believe that having 15-20 processes running on
> Windows helps. Why not spawn threads instead of processes, which might
> be far less expensive and more efficient. Is there any way of doing
> this?

Because it brings you a whole lot of other problems? And because
PostgreSQL is not "made for Windows". PostgreSQL runs very well on
Linux, BSD, Mac OS X and others. The Windows version is quite young.


But before you blame stuff on PostgreSQL you should give more  
information about the query itself.
> My question is, should I just accept the performance I am getting as
> the limit on Windows or should I be looking at some other params that
> I might have missed?

Post the "explain analyse" output of your select here. That might help
us understand why you get such a high CPU load.


cug



Re: [PERFORM] Postgres scalability and performance on windows

2006-11-24 Thread Gopal
Hi,

Thanks for your suggestions. Here's the output of explain analyse.
I'll change shared_buffers and look at the behaviour again.

"Limit  (cost=59.53..59.53 rows=1 width=28) (actual time=15.681..15.681
rows=1 loops=1)"
"  ->  Sort  (cost=59.53..59.53 rows=1 width=28) (actual
time=15.678..15.678 rows=1 loops=1)"
"Sort Key: ceil(subq.percentcover), (1::double precision /
ds.maxresolution)"
"->  Hash Join  (cost=58.19..59.52 rows=1 width=28) (actual
time=15.630..15.663 rows=2 loops=1)"
"  Hash Cond: ("outer".datasetid = "inner".did)"
"  ->  Seq Scan on tbl_metadata_dataset ds  (cost=0.00..1.21
rows=21 width=24) (actual time=0.006..0.021 rows=21 loops=1)"
"  ->  Hash  (cost=58.18..58.18 rows=1 width=12) (actual
time=15.591..15.591 rows=2 loops=1)"
"->  Sort  (cost=58.17..58.17 rows=1 width=117)
(actual time=15.585..15.586 rows=2 loops=1)"
"  Sort Key: tbl_metadata_chunks.datasetid"
"  ->  HashAggregate  (cost=58.13..58.16 rows=1
width=117) (actual time=15.572..15.573 rows=2 loops=1)"
"->  Hash IN Join  (cost=3.34..58.10
rows=7 width=117) (actual time=0.261..0.544 rows=50 loops=1)"
"  Hash Cond: ("outer".datasetid =
"inner".datasetid)"
"  ->  Bitmap Heap Scan on
tbl_metadata_chunks  (cost=2.05..56.67 rows=14 width=117) (actual
time=0.204..0.384 rows=60 loops=1)"
"Filter: (chunkgeometry &&
'010320E6100100050058631EDF87ECC1BF608F3D1911694940A0958
A8763C9C1BF535069BA846C494026B5F1284FABB8BFAB1577356E6C494094E1170D33F3B
8BF7700CC99FA68494058631EDF87ECC1BF608F3D1 (..)"
"->  Bitmap Index Scan on
tbl_metadata_chunks_idx3  (cost=0.00..2.05 rows=14 width=0) (actual
time=0.192..0.192 rows=60 loops=1)"
"  Index Cond:
(chunkgeometry &&
'010320E6100100050058631EDF87ECC1BF608F3D1911694940A0958
A8763C9C1BF535069BA846C494026B5F1284FABB8BFAB1577356E6C494094E1170D33F3B
8BF7700CC99FA68494058631EDF87ECC (..)"
"  ->  Hash  (cost=1.26..1.26
rows=10 width=4) (actual time=0.037..0.037 rows=10 loops=1)"
"->  Seq Scan on
tbl_metadata_dataset  (cost=0.00..1.26 rows=10 width=4) (actual
time=0.005..0.024 rows=10 loops=1)"
"  Filter: (typeofdataid
= 1)"
"Total runtime: 15.871 ms"



Gopal


[PERFORM] TPC-H Benchmark

2006-11-24 Thread Felipe Rondon Rocha
Hi everyone,

Does anyone have the TPC-H benchmark for PostgreSQL? Can you tell me where
I can find the database and queries?

Thks,
Felipe

Re: [PERFORM] TPC-H Benchmark

2006-11-24 Thread Luke Lonergan
http://www.tpc.org/tpch/spec/tpch_20060831.tar.gz

- Luke

On 11/24/06 8:47 AM, "Felipe Rondon Rocha" <[EMAIL PROTECTED]> wrote:

> Hi everyone,
>  
> does anyone have the TPC-H benchmark for PostgreSQL? Can you tell me where can
> i find the database and queries?
>  
> Thks,
> Felipe
> 




Re: [PERFORM] Postgres scalability and performance on windows

2006-11-24 Thread Frank Wiles
On Fri, 24 Nov 2006 09:22:45 +0100
Guido Neitzer <[EMAIL PROTECTED]> wrote:

> > effective_cache_size = 82728  # typically 8KB each
> Hmm. I don't know what the real effect of this might be as the doc  
> states:
> 
> "This parameter has no effect on the size of shared memory allocated  
> by PostgreSQL, nor does it reserve kernel disk cache; it is used
> only for estimation purposes."

   This is a hint to the optimizer about how much of the database may
   be in the OS level cache.  
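
   As a rough rule of thumb (my own habit, not something from the docs),
   people often set it to roughly the amount of RAM the OS usually has
   available as disk cache, expressed in 8kB pages, e.g.:

   # machine with ~2GB RAM, of which ~1GB is normally filesystem cache:
   # 1GB / 8kB = 131072
   effective_cache_size = 131072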

 -
   Frank Wiles <[EMAIL PROTECTED]>
   http://www.wiles.org
 -




[PERFORM] Massive delete of rows, how to proceed?

2006-11-24 Thread Arnau

Hi all,

  I have a statistics table with more than 15 million rows. I'd like to
delete the oldest statistics, which would be about 7 million rows. Which
method would you recommend for doing this? I'd also be interested in
calculating some statistics about the deleted rows, such as how many rows
were deleted per date. I was thinking of creating a function; any
recommendations?
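
For illustration, a rough sketch of the kind of thing I have in mind
(all table and column names below are made up):

  -- 1. Record how many rows will be removed, per day:
  CREATE TABLE deleted_stats AS
      SELECT date_trunc('day', stat_date) AS day, count(*) AS n_deleted
      FROM   statistics
      WHERE  stat_date < '2006-01-01'
      GROUP  BY 1;

  -- 2. Delete the old rows (perhaps in smaller batches to keep the
  --    transaction size manageable):
  DELETE FROM statistics WHERE stat_date < '2006-01-01';

  -- 3. Reclaim space and refresh planner statistics afterwards:
  VACUUM ANALYZE statistics;

Or would it be better, given that about half of the table goes away, to
copy the rows I want to keep into a new table and swap the tables instead?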

Thank you very much
--
Arnau



Re: [PERFORM] Postgres scalability and performance on windows

2006-11-24 Thread Tom Lane
"Gopal" <[EMAIL PROTECTED]> writes:
> Thanks for your suggestions. Here's an output of the explain analyse.

What's the query exactly, and what are the schemas of the tables it
uses (psql \d descriptions would do)?
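For example, in psql, for the two tables that appear in the plan:

	\d tbl_metadata_dataset
	\d tbl_metadata_chunks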

The actual runtime seems to be almost all spent in the hash aggregation
step:

>  ->  HashAggregate  (cost=58.13..58.16 rows=1 width=117) (actual time=15.572..15.573 rows=2 loops=1)
>  ->  Hash IN Join  (cost=3.34..58.10 rows=7 width=117) (actual time=0.261..0.544 rows=50 loops=1)

15 msec seems like a long time to aggregate only 50 rows, so I'm
wondering what aggregates are being calculated and over what
datatypes...

regards, tom lane
