[PERFORM] Help how to tune-up my Database

2004-05-07 Thread Michael Ryan S. Puncia

Hi,

I am a newbie here and just starting to use postgresql. My problem is how
to tune up my server, because it is too slow.

We just ported from DBF to postgresql.

These are my PC specs: P4, 512 MB RAM, Linux 9

Because I work in a statistical organization, we have a very large data
volume.

These are my data:

Table 1 with 60 million records but only 10 fields

Table 2 with 30 million records with 15 fields

Table 3 with 30 million records with 10 fields

I will only use this server for querying. I have already read and applied
the articles found in the archives section, but the performance is still
not good.

I am planning to add another 512 MB of RAM. Another question is how to
calculate the shared_buffers size.

Thanks a lot, and hoping for your kind answers.

Michael Puncia

Philippines

Re: [PERFORM] Help how to tune-up my Database

2004-05-07 Thread scott.marlowe
On Fri, 7 May 2004, Michael Ryan S. Puncia wrote:

> Hi,
> 
> I am a newbie here and just starting to use postgresql. My
> problem is how to tune up my server, because it is too slow.

First, read this:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

> These are my PC specs: P4, 512 MB RAM, Linux 9

Get more RAM.

Hard drives: what interface, how many, any RAID?

For a mostly read database IDEs are pretty good.  Having multiple drives 
in a RAID-5 or RAID 1+0 works well on a mostly read database too.  Keep the 
stripe size small when setting up a RAID array for a database.

> Because I work in a statistical organization, we have a very large data
> volume.
> 
> These are my data:
> 
>  
> 
> Table 1 with 60 million records but only 10 fields
> 
> Table 2 with 30 million records with 15 fields
> 
> Table 3 with 30 million records with 10 fields

That's not really that big, but it's big enough you have to make sure your 
server is tuned properly.

> I will only use this server for querying. I have already read and applied
> the articles found in the archives section, but the performance is still
> not good.
> 
> I am planning to add another 512 MB of RAM. Another question is how to
> calculate the shared_buffers size.

I'm assuming you've recently vacuumed full and analyzed your database...

Shared buffers should probably be between 1000 and 10000 on about 98% of 
all installations.  Setting it higher than 25% of memory is usually a bad 
idea.  Since they're in 8k blocks (unless you compiled with a custom 
block size, you'd know if you did, it's not something you can accidentally 
do by leaning on the wrong switch...) you probably want about 10000 blocks 
or so to start, which will give you about 80 megs of shared buffers.
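To make that concrete, the matching postgresql.conf line would look 
something like this (the value is just the starting point above, not a 
tuned number):

  # postgresql.conf
  shared_buffers = 10000        # 8k blocks, roughly 80 megs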

PostgreSQL doesn't really cache as well as the kernel, so it's better to 
leave more memory available for kernel cache than you allocate to buffer 
cache.  On a machine with only 512 megs, I'm guessing you'll get about 128 
to 200 megs of kernel cache if you're only running postgresql and you have 
it set to 10000 buffers.

The important things to check / set are things like effective_cache_size.  
It too is measured in 8k blocks, and reflects the approximate amount of 
kernel cache being dedicated to postgresql.  Assuming a single-service, 
postgresql-only box, that will be the number that a server that's been up 
for a while shows under top, like so:

  9:50am  up 12:16,  4 users,  load average: 0.00, 0.00, 0.00  
 
104 processes: 102 sleeping, 2 running, 0 zombie, 0 stopped
CPU states:  0.7% user,  0.3% system,  0.0% nice,  1.7% idle
Mem:   512924K av,  499248K used,   13676K free,   0K shrd,   54856K buff
Swap: 2048248K av,5860K used, 2042388K free  229572K cached

The 229572K cached entry shows about 230 megs.  Divided by 8K per block, 
we get about 28000.
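In postgresql.conf that works out to something like this (the number comes 
from this particular top output; yours will differ):

  # postgresql.conf
  effective_cache_size = 28000  # 8k blocks, ~230 megs of kernel cache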

sort_mem might do with a small bump, especially if you're only handling a 
few connections at a time.  Be careful, it's per sort, and measured in 
kilobytes, so it's easy for folks to set it too high and make their machine 
start flushing too much kernel cache, which will slow down the other 
backends that have to go to disk for data.

A good starting point for testing is anywhere from 8192 to 32768.  32768 
is 32 megs, which can starve a machine as small as yours if there are a 
couple of queries each running a couple of sorts on large sets at the same 
time.
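As a sketch, the low end of that range would look like this in 
postgresql.conf (pick a value based on how many concurrent sorts you 
expect):

  # postgresql.conf
  sort_mem = 8192               # kilobytes per sort, i.e. 8 megs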

Lastly, using explain analyze on your queries you can see if postgresql 
is making a bad plan choice.  Compare estimated rows to actual rows.  
Look for things like nested loops being run on what the planner thinks 
will be 80 rows but is, in fact, 8000 rows.
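A misestimate like that shows up in explain analyze output roughly like 
this (the query and the numbers are made up, just to show where to look):

  explain analyze select ... ;

  Nested Loop  (cost=0.00..1240.00 rows=80 width=8) (actual time=0.21..1500.34 rows=8000 loops=1)

The rows=80 in the estimate versus rows=8000 in the actual timings is the 
red flag.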

You can change random_page_cost to change the tendency of the server to 
favor seq scans over index scans.  Lower = greater tendency towards index 
scans.  The default is 4, but most production servers with enough memory 
to cache most of their data will run well at a setting of 1.2 to 1.4.  My 
dual 2800 with 2 gigs of RAM runs quite well at 1.3 to 1.4.
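To try it in your own session (1.3 is just an example value, not a 
recommendation for a 512 meg box):

  set random_page_cost = 1.3;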

You can also change random_page_cost on the fly, as well as turn off 
options to the planner, with the following settings:

enable_hashagg
enable_hashjoin
enable_indexscan
enable_mergejoin
enable_nestloop
enable_seqscan
enable_sort
enable_tidscan

They are all on by default, and for the most part shouldn't really be 
turned off permanently.  But for an individual session, to figure out if 
the query planner is making the right plan, you can set them to off and 
see if another plan works better.

So, if you've got a nested loop running over 8000 rows that the planner 
thought was going to be 80 rows, you can force it to stop using the nested 
loop for your session with:

set enable_nestloop=off;

and use explain analyze to see if it runs faster.
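Put together, a test session might look like this (the query is 
hypothetical; substitute one of your slow ones):

  explain analyze select ... ;   -- slow plan, uses the nested loop
  set enable_nestloop = off;
  explain analyze select ... ;   -- same query, planner picks another join
  set enable_nestloop = on;      -- turn it back on when you're done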

You can set effective_cache_size and sort_mem on the fly for a single 
connection, or set them in postgresql.conf and restart or reload to change 
the default.
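For example, for just the current connection (the values simply echo the 
earlier examples):

  set effective_cache_size = 28000;
  set sort_mem = 16384;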

shared_buffers is set only at server start, so changing that one means a 
restart.

Re: [PERFORM] [OFF-TOPIC] - Known maximum size of the

2004-05-07 Thread Gaetano Mendola
Don Vaillancourt wrote:

> Here's a question a little off-topic.
> 
> What would a 32TB database hardware configuration look like?  I'm 
> assuming 200GB hard-drives which would total 160 of them.  Double that 
> if you mirror them.
> 
> Am I correct?
Why do you have to mirror them?  Usually a SAN makes data redundancy 
using RAID 4 or 5; it depends on whether you need read performance or 
write performance.  In the case of Red Sherif I guess those guys are 
using RAID 50 ( 0 + 5 ) sets, so what you "waste" is one disk for each 
set.
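Rough numbers, just to illustrate (the 8-disk set size here is an 
arbitrary assumption):

  mirroring:  32TB / 200GB = 160 drives, doubled = 320 drives
  RAID 5:     an 8-disk set gives 7 x 200GB = 1.4TB usable, so
              32TB / 1.4TB = ~23 sets = ~184 drives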


Regards
Gaetano Mendola



