[PERFORM] Write performance
Hi,

at the moment we encounter some performance problems with our database
server.

We have a 12 GB RAM machine with an Intel i7-975, using
3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)":
one disk for the system and WAL etc., and one SW RAID-0 with two disks for
the postgresql data. Our database is about 24 GB.

Our munin graph reports, at 9 o'clock, writes of 3000 blocks per second and
reads of about 1000 blocks per second on the disk which holds the data
directories of postgresql (the WAL is on a different disk):

3000 blocks ~ about 3 MB/s write
1000 blocks ~ about 1 MB/s read

At the same time we have nearly 50% CPU I/O wait and only 12% user CPU load
(so 4 of 8 CPU cores are tied up in I/O wait).

We know it's a poor man's disk setup (but we cannot find a hoster with a
more advanced disk configuration at an affordable price). Anyway, we ran
some tests on it:

# time sh -c "dd if=/dev/zero of=bigfile bs=8k count=3000000 && sync"
3000000+0 records in
3000000+0 records out
24576000000 bytes (25 GB) copied, 276.03 s, 89.0 MB/s

real    4m48.658s
user    0m0.580s
sys     0m51.579s

# time dd if=bigfile of=/dev/null bs=8k
3000000+0 records in
3000000+0 records out
24576000000 bytes (25 GB) copied, 222.841 s, 110 MB/s

real    3m42.879s
user    0m0.468s
sys     0m18.721s

Of course, writing large chunks is quite a different usage pattern. But I am
wondering why 3 MB/s write and 1 MB/s read seems to be the limit when a test
can reach 89 MB/s writing and 110 MB/s reading.

Can you give some hints on whether these numbers seem reasonable?

kind regards
Janning
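A rough way to approximate PostgreSQL's random 8 kB access pattern, for
contrast with the sequential dd test above, is to read single 8 kB blocks at
scattered offsets of the same file. A minimal sketch, assuming bash on Linux
and the 25 GB bigfile from the test above (the 3000000 block count matches
the dd run; the 1000-read loop is an arbitrary illustrative sample size):

    # drop the page cache first so reads actually hit the disk (needs root)
    sync && echo 3 > /proc/sys/vm/drop_caches

    # read 1000 random 8 kB blocks; throughput should land far below the
    # 110 MB/s sequential figure, much closer to the 1-3 MB/s munin reports
    time for i in $(seq 1 1000); do
        dd if=bigfile of=/dev/null bs=8k count=1 \
           skip=$(( (RANDOM * RANDOM) % 3000000 )) 2>/dev/null
    done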
Re: [PERFORM] Write performance
On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote:
> On Thu, 24 Jun 2010, Janning wrote:
> > We have a 12 GB RAM machine with intel i7-975 and using
> > 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)"
>
> Those discs are 1.5TB, not 1.5GB.

sorry, my fault.

> > One disk for the system and WAL etc. and one SW RAID-0 with two disks
> > for postgresql data. Our database is about 24GB.
>
> Beware of RAID-0 - make sure you can recover the data when (not if) a
> disc fails.

oh sorry again, it's a RAID-1 of course. shame on me.

> > Our munin graph reports at 9 o'clock writes of 3000 blocks per second
> > and reads of about 1000 blocks per second on our disk which holds the
> > data directories of postgresql (WAL are on a different disk)
> >
> > 3000 blocks ~ about 3 MB/s write
> > 1000 blocks ~ about 1 MB/s read
> >
> > At the same time we have nearly 50% CPU I/O wait and only 12% user CPU
> > load (so 4 of 8 cpu cores are in use for io wait)
>
> Not quite sure what situation you are measuring these figures under.
> However, as a typical figure, let's say you are doing random access with
> 8kB blocks (as in Postgres), and the access time on your drive is 8.5ms
> (as with these drives).
>
> For each drive, you will be able to read/write approximately 8kB /
> 0.0085s, giving 941kB per second. If you have multiple processes all
> doing random access, then you may be able to utilise both discs and get
> double that.

So by your calculation I have a maximum of 2 MB/s random access. So I
really need to upgrade my disk configuration!

> > Of course, writing large chunks is quite a different usage pattern.
> > But I am wondering that writing 3MB/s and reading 1 MB/s seems to be a
> > limit if I can run a test with 89 MB/s writing and 110 MB/s reading.
>
> That's quite right, and typical performance figures for a drive like
> that.

thanks for your help.

kind regards
Janning

> Matthew
>
> --
> Don't criticise a man until you have walked a mile in his shoes; and if
> you do at least he will be a mile behind you and bare footed.
Re: [PERFORM] Write performance
thanks for your quick response, Kenneth

On Thursday 24 June 2010 14:47:34 you wrote:
> On Thu, Jun 24, 2010 at 02:43:33PM +0200, Janning wrote:
> > Hi,
> >
> > at the moment we encounter some performance problems with our database
> > server.
> >
> > We have a 12 GB RAM machine with intel i7-975 and using
> > 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 GB)"
> > One disk for the system and WAL etc. and one SW RAID-0 with two disks
> > for postgresql data. Our database is about 24GB.
[...]
> Your I/O is extremely under-powered relative to your
> CPU/memory. For DB servers, many times you need much more I/O
> instead.

So at the moment we are using this machine as our primary database server:
http://www.hetzner.de/en/hosting/produkte_rootserver/eq9/

Sadly, our hoster does not offer an advanced disk setup, so now we have two
options:

1. Buying a server of our own and renting a co-location. I fear we do not
know enough about hardware to choose this option; for co-locating your own
server you should know more about hardware than we do.

2. Renting a server from a hoster with an advanced disk setup.

Can anybody recommend a good hosting solution in Germany with a good disk
setup for postgresql?

kind regards
Janning
Re: [PERFORM] Write performance
On Thursday 24 June 2010 15:16:05 Janning wrote:
> On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote:
> > On Thu, 24 Jun 2010, Janning wrote:
> > > We have a 12 GB RAM machine with intel i7-975 and using
> > > 3 disks "Seagate Barracuda 7200.11, ST31500341AS (1.5 TB)"
> >
> > For each drive, you will be able to read/write approximately 8kB /
> > 0.0085s, giving 941kB per second. If you have multiple processes all
> > doing random access, then you may be able to utilise both discs and
> > get double that.
>
> So by your calculation I have a maximum of 2 MB/s random access. So I
> really need to upgrade my disk configuration!

I was looking at tomshardware.com, and the fastest disk there is the Maxtor
Atlas 15K II (8K147S0, SAS, 147 GB, 16 MB cache, 15000 rpm) with a 5.5 ms
random access time. So even if I switch to those disks, I can only reach a
performance gain of about 1.5, right?

To get disk performance better by a factor of ten, I would need a RAID-10
setup with 12 disks (so I have 6 RAID-1 pairs). Or are there other factors
with high-end disks?

kind regards
Janning
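The 1.5x figure and the 12-disk estimate follow from the access-time
arithmetic in Matthew's earlier post; a rough worked check, assuming random
8 kB accesses dominate and ignoring controller and RAID overhead:

    7200 rpm SATA:  8 kB / 0.0085 s ≈ 0.94 MB/s per disk
    15000 rpm SAS:  8 kB / 0.0055 s ≈ 1.45 MB/s per disk  ->  ~1.5x gain

    RAID-10 with 12 disks = 6 RAID-1 pairs:
      random reads can be spread over all 12 spindles: ~12 x 0.94 ≈ 11 MB/s
      random writes must hit both halves of each pair:  ~6 x 0.94 ≈ 5.6 MB/s

So "a factor of ten" is roughly right for reads, and somewhat optimistic for
writes, on that 12-disk layout.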
Re: [PERFORM] The black art of postgresql.conf tweaking
On Wednesday, 4 August 2004 14:45, Paul Serby wrote:
> Apache on the Web server can take up to 300 connections and PHP is using
> pg_pconnect
>
> Postgres is set with the following.
>
> max_connections = 300
> shared_buffers = 38400
> sort_mem = 12000
>
> But Apache is still maxing out the non-super user connection limit.

For most websites 300 connections is far too many (even 10 requests per
second for 10 hours a day only adds up to 10.8 million pages a month). But
anyway: you should first focus on closing the HTTP connection to the user
as fast as possible. Then you don't need so many concurrent connections
that keep DB connections open and use memory.

I did the following:
- apache: keepalive off
- apache patch: lingerd (google for it)
- apache mod_gzip
- pg_pconnect

This keeps your HTTP connection as short as possible, so the Apache child
is ready to serve the next client. Imagine 5 seconds of keepalive, 1 second
of lingering on half-closed TCP connections, and 4 more seconds for
transport of uncompressed content: in this scenario your Apache child uses
memory and your pooled DB connection for 10 seconds while doing nothing!

In my experience Apache in its standard configuration can be the main
bottleneck, and the combination of keepalive off, lingerd and mod_gzip is
GREAT. I haven't found many sites advocating a configuration like this.

kind regards,
janning

p.s.: sorry for being slightly off topic and talking about Apache, but when
it comes to performance it is always important to look at the complete
system.
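The configuration described above might look roughly like this in an Apache
1.3-era httpd.conf (a sketch only: the mod_gzip_* directives come from the
third-party mod_gzip module, and lingerd is applied as a patch to Apache
itself rather than as a config directive):

    # close the connection right after each request so the child is
    # free for the next client
    KeepAlive Off

    # compress text responses so the transfer finishes sooner
    <IfModule mod_gzip.c>
        mod_gzip_on Yes
        mod_gzip_item_include mime ^text/.*
    </IfModule>

pg_pconnect then lives on the PHP side; the point of the Apache settings is
just to stop idle HTTP clients from pinning the pooled DB connections.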
Re: [PERFORM] EXPLAIN ANALYZE much slower than running query normally
On Tuesday, 5 October 2004 08:49, Chris Hutchinson wrote:
> Running a trivial query in v7.4.2 (installed with fedora core2) using
> EXPLAIN ANALYZE is taking considerably longer than just running the query
> (2mins vs 6 secs). I was using this query to quickly compare a couple of
> systems after installing a faster disk.
>
> Is this sort of slowdown to be expected?

no. did you run VACUUM ANALYZE before? You should do that after
pg_restore-ing your db to a new filesystem.

Also, in which order did you run the queries? If you start your server and
run two equal queries, the second one will be much faster, because some or
even all of the data needed to answer the query is still in the shared
buffers.

janning
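The suggestion above, as a minimal psql sketch (mytable is an illustrative
table name, not one from the original post):

    -- refresh planner statistics after a restore
    VACUUM ANALYZE;

    -- run the same query twice against a freshly started server; the
    -- second run is usually faster because the data is already cached
    -- in shared buffers
    EXPLAIN ANALYZE SELECT count(*) FROM mytable;   -- cold run
    EXPLAIN ANALYZE SELECT count(*) FROM mytable;   -- warm run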
Re: [PERFORM] why my query is not using index??
On Wednesday, 6 October 2004 09:31, you wrote:
> postgres=# explain ANALYZE select * from test where today < '2004-01-01';
>                               QUERY PLAN
> ------------------------------------------------------------------------
>  Seq Scan on test  (cost=0.00..19.51 rows=334 width=44)
>    (actual time=0.545..2.429 rows=721 loops=1)
>    Filter: (today < '2004-01-01 00:00:00'::timestamp without time zone)
>  Total runtime: 3.072 ms
> (3 rows)
>
> postgres=# explain ANALYZE select * from test where today > '2003-01-01'
> and today < '2004-01-01';
>                               QUERY PLAN
> ------------------------------------------------------------------------
>  Index Scan using idx_today on test  (cost=0.00..18.89 rows=6 width=44)
>    (actual time=0.055..1.098 rows=365 loops=1)
>    Index Cond: ((today > '2003-01-01 00:00:00'::timestamp without time
>      zone) AND (today < '2004-01-01 00:00:00'::timestamp without time
>      zone))
>  Total runtime: 1.471 ms
> (3 rows)
>
> hello
>
> I expected the 1st query to use the index, but it doesn't.
> The 2nd query is doing perfectly, as you see.

postgres uses a seq scan if it's faster. In your case postgres seems to
know that most of your rows have a date < 2004-01-01 and so doesn't need to
consult the index if it has to read every page anyway. A seq scan can be
faster on small tables.

Try (in psql) "SET enable_seqscan TO off;" before running your query and
see how postgres plans it without using a seq scan.

janning
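The suggested experiment, spelled out (enable_seqscan only affects the
current session, so it is safe to toggle for a test):

    -- forbid sequential scans for this session
    SET enable_seqscan TO off;
    EXPLAIN ANALYZE SELECT * FROM test WHERE today < '2004-01-01';

    -- restore the default so later queries plan normally
    SET enable_seqscan TO on;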
Re: [PERFORM] why my query is not using index??
On Monday, 11 October 2004 22:49, Francisco Reyes wrote:
> On Mon, 11 Oct 2004, Janning Vygen wrote:
> > postgres uses a seq scan if it's faster. In your case postgres seems
> > to know that most of your rows have a date < 2004-01-01 and so doesn't
> > need to consult the index if it has to read every page anyway. A seq
> > scan can be faster on small tables. Try (in psql) "SET enable_seqscan
> > TO off;" before running your query and see how postgres plans it
> > without using a seq scan.
>
> I was about to post and saw this message.
> I have a query that was using sequential scans. Upon turning seqscan to
> off it changed to using the index. What does that mean?

enable_seqscan = off means that postgres is not allowed to use a seq scan.
The default is on, and postgres decides for each table lookup which method
is faster: seq scan or index scan. That's what the planner does: deciding
which access method is likely to be the fastest.

> The tables are under 5k records so I wonder if that is why the optimizer
> is opting, in its default state, to do sequential scans.

If you have small tables, postgres uses a seq scan to reduce disk lookups.
PostgreSQL reads disk pages in 8k blocks. If your whole table is under 8k,
there is no reason for postgres to load an index from another disk page,
because it has to load the table's single page anyway.

I am not sure, but I think postgres also analyzes the table to see which
values are in there. If you have a huge table with a column of integers,
and postgres knows that 99% of them have the value 1, and you are looking
for rows with the value 1, why should it use an index just to find out that
it has to load nearly the whole table anyway to fetch the matching rows?

And that's why you can't do performance tests with small tables: you need
test data which is as close as possible to real data.

> I was also wondering if there is a relation between the sequential scans
> and the fact that my entire query is a series of left joins:

no.

janning
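The per-column statistics referred to above are gathered by ANALYZE and can
be inspected through the pg_stats system view; a short sketch, reusing the
test table from the earlier posts:

    -- gather statistics for the planner
    ANALYZE test;

    -- see what the planner knows about the value distribution
    SELECT attname, n_distinct, most_common_vals, most_common_freqs
    FROM pg_stats
    WHERE tablename = 'test';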
[PERFORM] slow rule on update
Hi,

(pg_version 7.4.2; I run VACUUM ANALYZE on the whole database frequently,
and did so just before executing the statements below)

I don't know if anyone can help me, because I don't really know where the
problem is, but I'll try. If any further information is needed, I'll be
glad to send it.

My real rule is much longer (more calculation instead of "+ 1"), but this
shortcut has the same disadvantages in performance:

CREATE RULE ru_sp_update AS ON UPDATE TO Spiele
DO
  UPDATE punktecache
  SET    pc_punkte = pc_punkte + 1
  FROM   Spieletipps AS stip
         NATURAL JOIN tippspieltage2spiele AS tspt2sp
  WHERE  punktecache.tr_kurzname = stip.tr_kurzname
  AND    punktecache.mg_name     = stip.mg_name
  AND    punktecache.tspt_name   = tspt2sp.tspt_name
  AND    stip.sp_id              = OLD.sp_id
;

punktecache is a materialized view which should be updated by this rule:

# \d punktecache
      Table "public.punktecache"
   Column    |   Type   | Modifiers
-------------+----------+-----------
 tr_kurzname | text     | not null
 mg_name     | text     | not null
 tspt_name   | text     | not null
 pc_punkte   | smallint | not null
Indexes:
    "pk_punktecache" primary key, btree (tr_kurzname, mg_name, tspt_name)
Foreign-key constraints:
    "fk_mitglieder" FOREIGN KEY (tr_kurzname, mg_name)
        REFERENCES mitglieder(tr_kurzname, mg_name)
        ON UPDATE CASCADE ON DELETE CASCADE
    "fk_tippspieltage" FOREIGN KEY (tr_kurzname, tspt_name)
        REFERENCES tippspieltage(tr_kurzname, tspt_name)
        ON UPDATE CASCADE ON DELETE CASCADE

my update statement:

explain analyze
UPDATE spiele
SET    sp_heimtore = spup.spup_heimtore,
       sp_gasttore = spup.spup_gasttore,
       sp_abpfiff  = spup.spup_abpfiff
FROM   spieleupdates AS spup
WHERE  spiele.sp_id = spup.sp_id;

and the output from EXPLAIN (is there a prettier way to post EXPLAIN output
than just copying it?):

 Nested Loop  (cost=201.85..126524.78 rows=1 width=45) (actual time=349.694..290491.442 rows=100990 loops=1)
   ->  Nested Loop  (cost=201.85..126518.97 rows=1 width=57) (actual time=349.623..288222.145 rows=100990 loops=1)
         ->  Hash Join  (cost=201.85..103166.61 rows=4095 width=64) (actual time=131.376..8890.220 rows=102472 loops=1)
               Hash Cond: (("outer".tspt_name = "inner".tspt_name) AND ("outer".tr_kurzname = "inner".tr_kurzname))
               ->  Seq Scan on punktecache  (cost=0.00..40970.20 rows=2065120 width=45) (actual time=0.054..4356.321 rows=2065120 loops=1)
               ->  Hash  (cost=178.16..178.16 rows=4738 width=35) (actual time=102.259..102.259 rows=0 loops=1)
                     ->  Nested Loop  (cost=0.00..178.16 rows=4738 width=35) (actual time=17.262..88.076 rows=10519 loops=1)
                           ->  Seq Scan on spieleupdates spup  (cost=0.00..0.00 rows=1 width=4) (actual time=0.015..0.024 rows=1 loops=1)
                           ->  Index Scan using ix_tspt2sp_fk_spiele on tippspieltage2spiele tspt2sp  (cost=0.00..118.95 rows=4737 width=31) (actual time=17.223..69.486 rows=10519 loops=1)
                                 Index Cond: ("outer".sp_id = tspt2sp.sp_id)
         ->  Index Scan using pk_spieletipps on spieletipps stip  (cost=0.00..5.69 rows=1 width=25) (actual time=2.715..2.717 rows=1 loops=102472)
               Index Cond: (("outer".tr_kurzname = stip.tr_kurzname) AND ("outer".mg_name = stip.mg_name) AND ("outer".sp_id = stip.sp_id))
   ->  Index Scan using pk_spiele on spiele  (cost=0.00..5.78 rows=1 width=4) (actual time=0.012..0.014 rows=1 loops=100990)
         Index Cond: (spiele.sp_id = "outer".sp_id)
 Total runtime: 537319.321 ms

Can this be made any faster? Can you give me a hint where to start
researching? My guess is that the UPDATE statement inside the rule doesn't
really use the index on punktecache, but I don't know why, and I don't know
how to change it.

Any hint or help is very appreciated.
kind regards
janning
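One way to start researching is to run the rule's inner UPDATE by hand with
a concrete value substituted for OLD.sp_id, and look at its plan in
isolation; a sketch using the tables from the post above (4711 is an
illustrative sp_id, not a value from the original message):

    EXPLAIN ANALYZE
    UPDATE punktecache
    SET    pc_punkte = pc_punkte + 1
    FROM   Spieletipps AS stip
           NATURAL JOIN tippspieltage2spiele AS tspt2sp
    WHERE  punktecache.tr_kurzname = stip.tr_kurzname
    AND    punktecache.mg_name     = stip.mg_name
    AND    punktecache.tspt_name   = tspt2sp.tspt_name
    AND    stip.sp_id              = 4711;  -- hypothetical sp_id

If this still shows a hash join with a seq scan over all 2,065,120 rows of
punktecache, the slowness lies in the inner UPDATE's plan rather than in
the rule mechanism itself.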