[PERFORM] Page Miss Hits

2004-08-02 Thread Ioannis Theoharis


Hi, I would like to ask whether there is any way in Postgres to find the
page cache misses incurred during a query execution.


Is there something like EXPLAIN ANALYZE that also reports page misses?



[PERFORM] No index usage with "left join"

2004-08-02 Thread mailing
We have a "companies" and a "contacts" table with about 3000 records
each.

We run the following SQL command, which runs for about 2 MINUTES:

SELECT count(*) FROM contacts LEFT JOIN companies ON contacts.sid =
companies.intfield01

contacts.sid (type text, b-tree index on it)
companies.intfield01 (type bigint, b-tree index on it)

comfire=> explain analyze SELECT count(*) FROM prg_contacts LEFT JOIN
prg_addresses ON prg_contacts.sid=prg_addresses.intfield01;
NOTICE:  QUERY PLAN:

Aggregate  (cost=495261.02..495261.02 rows=1 width=15) (actual time=40939.38..40939.38 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..495253.81 rows=2885 width=15) (actual time=0.05..40930.14 rows=2866 loops=1)
        ->  Seq Scan on prg_contacts  (cost=0.00..80.66 rows=2866 width=7) (actual time=0.01..18.10 rows=2866 loops=1)
        ->  Seq Scan on prg_addresses  (cost=0.00..131.51 rows=2751 width=8) (actual time=0.03..6.25 rows=2751 loops=2866)
Total runtime: 40939.52 msec

EXPLAIN

Note:
- We need the left join because we need all contacts, even those not
assigned to a company.
- We are not able to change the datatypes of the joined fields,
because we use standard software (for whoever cares: SuSE Open Exchange
Server).
- When we use a normal join (without LEFT, and no where clause) the SQL
runs immediately, using the indexes.

How can I force the usage of the indexes when using "left join"? Or is
there any other SQL construct that does the same? Can anybody please
give us a hint?

Thanks in advance.

Greetings
Achim



Re: [PERFORM] No index usage with "left join"

2004-08-02 Thread Leeuw van der, Tim
Can you do a cast in your query? Does that help with using the indexes?
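
Something along these lines, as a sketch only (column names taken from
your post; whether the planner can then use the index still depends on
your Postgres version):

-- Cast the text column so both sides of the join have the same type.
SELECT count(*)
FROM contacts
LEFT JOIN companies
  ON CAST(contacts.sid AS bigint) = companies.intfield01;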

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of
[EMAIL PROTECTED]
Sent: Monday, 2 August 2004 14:09
To: [EMAIL PROTECTED]
Subject: [PERFORM] No index usage with "left join"


> We have a "companies" and a "contacts" table with about 3000 records
> each. [rest of quoted message snipped]


[PERFORM] What kind of performance can I expect and how to measure?

2004-08-02 Thread Joost Kraaijeveld
Hi all,

My system is PostgreSQL 7.4.1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC)
20020903 (Red Hat Linux 8.0 3.2-7). It has a Pentium III-733 MHz with 512 MB RAM. It
is connected to my workstation (dual XEON 1700 with 1 GB RAM) over a 100 Mb switched
network.

I have a table with 31 columns, all fixed-size datatypes. It contains 88393 rows.
Doing a "select * from table" with pgAdmin III in its SQL window takes a total of
9206 ms query runtime and a 40638 ms data retrieval runtime.

Is this a reasonable time to get 88393 rows from the database?

If not, what can I do to find the bottleneck (and eventually make it faster)?

Regards,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
e-mail: [EMAIL PROTECTED]
web: www.askesis.nl 



Re: [PERFORM] What kind of performance can I expect and how to measure?

2004-08-02 Thread Merlin Moncure
Joost wrote:
> My system is PostgreSQL 7.4.1 on i686-pc-linux-gnu, compiled by GCC gcc
> (GCC) 20020903 (Red Hat Linux 8.0 3.2-7). It has a Pentium III-733 MHz
> with 512 MB RAM. It is connected to my workstation (dual XEON 1700 with
> 1 GB RAM) over a 100 Mb switched network.
>
> I have a table with 31 columns, all fixed-size datatypes. It contains
> 88393 rows. Doing a "select * from table" with pgAdmin III in its SQL
> window takes a total of 9206 ms query runtime and a 40638 ms data
> retrieval runtime.
>
> Is this a reasonable time to get 88393 rows from the database?
>
> If not, what can I do to find the bottleneck (and eventually make it
> faster)?

The 9206 ms time is what the database actually spent gathering the data
and sending it to you.  This is non-negotiable unless you bump up
hardware, etc., or fetch less data.  This time usually scales linearly
(or close to it) with the size of the dataset you fetch.

The 40638 ms time is pgAdmin putting the data in the grid.  The time
spent here is dependent on your client and starts to get really nasty
with large tables.  Future versions of pgAdmin might be able to deal
better with large datasets (cursor-based fetch is one proposed
solution; see the sketch below).  In the meantime, I would suggest using
queries to refine your terms a little bit... (do you really need to view
all 80k records at once?)

Merlin




Re: [PERFORM] No index usage with "left join"

2004-08-02 Thread Guido Barosio
 TIP 9: the planner will ignore your desire to choose an index scan if your
   joining column's datatypes do not match

Greetz,
Guido

> Can you do a cast in your query? Does that help with using the indexes?
> [rest of quoted message snipped]




Re: [PERFORM] What kind of performance can I expect and how to measure?

2004-08-02 Thread Joost Kraaijeveld
Hi Merlin,

> The 9206 ms time is what the database actually spent
> gathering the data and sending it to you.  This is non-negotiable
> unless you bump up hardware, etc, or fetch less data.  This time
> usually scales linearly (or close to it) with the size of the dataset
> you fetch.
>
> The 40638 ms time is pgAdmin putting the data in the grid.

So it takes PostgreSQL 9206 ms to get the data AND send it to the client. It then
takes pgAdmin 40638 ms to display the data?

> solution).  In the meantime, I would suggest using queries to refine
> your terms a little bit... (do you really need to view all 80k
> records at once?)

The application is built in Clarion, a 4GL environment. We do not have any influence
over the query it generates and executes.


Regards,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
e-mail: [EMAIL PROTECTED]
web: www.askesis.nl 



Re: [PERFORM] No index usage with "left join"

2004-08-02 Thread Bruce Momjian
Guido Barosio wrote:
>  TIP 9: the planner will ignore your desire to choose an index scan if your
>joining column's datatypes do not match

And this is fixed in 7.5/8.0.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] No index usage with "left join"

2004-08-02 Thread Rod Taylor
> SELECT count(*) FROM contacts LEFT JOIN companies ON contacts.sid =
> companies.intfield01
> 
> contacts.sid (type text, b-tree index on it)
> companies.intfield01 (type bigint, b-tree index on it)

> How can I force the usage of the indexes when using "left join". Or
> any other SQL construct that does the same !? Can anybody please give
> us a hint !?

You really don't need to use indexes since you're fetching all
information from both tables.

Anyway, we can be fairly sure this isn't PostgreSQL 7.4 (which would
likely choose a far better plan -- hash join rather than nested loop) as
it won't join a bigint to a text field without a cast.

Try this:

set enable_nestloop = false;  -- discourage the nested-loop plan for this session
SELECT count(*) FROM contacts LEFT JOIN companies ON
cast(contacts.sid as bigint) = companies.intfield01;
set enable_nestloop = true;   -- restore the default afterwards





Re: [PERFORM] No index usage with "left join"

2004-08-02 Thread Tom Lane
Rod Taylor <[EMAIL PROTECTED]> writes:
>> How can I force the usage of the indexes when using "left join".

> Anyway, we can be fairly sure this isn't PostgreSQL 7.4 (which would
> likely choose a far better plan -- hash join rather than nested loop)

Indeed, the lack of any join-condition line in the EXPLAIN output
implies it's 7.2 or older.  IIRC 7.4 is the first release that is
capable of using merge or hash join with a condition more complicated
than plain "Var = Var".  In this case, since the two fields are of
different datatypes, the planner sees something like "Var = Var::text"
(ie, there's an inserted cast function).  7.2 will just say "duh, too
complicated for me" and generate a nestloop.  With the columns being
of different datatypes, you don't even have a chance for an inner
indexscan in the nestloop.

In short: change the column datatypes to be the same, or update to
7.4.something.  There are no other solutions.

(Well, if you were really desperate you could create a set of
mergejoinable "text op bigint" comparison operators, and then 7.2
would be able to cope; but I should think that updating to 7.4 would
be much less work.) 
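
(For illustration only, here is a sketch of the "change the column
datatypes" route on releases without ALTER COLUMN ... TYPE; it assumes
7.3 or later for DROP COLUMN, that every sid holds numeric text, and
invented names for the new column and index:

BEGIN;
ALTER TABLE contacts ADD COLUMN sid_big bigint;
-- If the direct text-to-bigint cast is rejected on your version,
-- go through to_number(sid, '999999999999')::bigint instead.
UPDATE contacts SET sid_big = sid::bigint;
ALTER TABLE contacts DROP COLUMN sid;
ALTER TABLE contacts RENAME COLUMN sid_big TO sid;
CREATE INDEX contacts_sid_idx ON contacts (sid);
COMMIT;

Dump, edit, and reload is the other classic way to do it.)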

regards, tom lane



Re: [PERFORM] What kind of performance can I expect and how to measure?

2004-08-02 Thread Merlin Moncure
> Hi Merlin,
>
> > The 9206 ms time is what the database actually spent
> > gathering the data and sending it to you.  This is non-negotiable
> > unless you bump up hardware, etc, or fetch less data.  This time
> > usually scales linearly (or close to it) with the size of the
> > dataset you fetch.
> >
> > The 40638 ms time is pgAdmin putting the data in the grid.
>
> So it takes PostgreSQL 9206 ms to get the data AND send it to the
> client. It then takes pgAdmin 40638 ms to display the data?

That is correct.  This is not a problem with pgAdmin, or postgres, but a
problem with grids.  Conceptually, SQL tables live in an unordered,
effectively infinite space, while grids require an ordered, finite one.
All 4GLs and data managers have this problem.  The real solution is to
refine your query in a meaningful way (80k rows is more than a human
being can deal with in a practical sense).  If you can't do that, impose
an arbitrary limit on the result set at the point where performance
breaks down; that could be 10k-100k rows depending on various factors.
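
For instance (a sketch; the table, column, and limit are placeholders):

-- Cap what the client has to render; adjust the limit to taste.
SELECT * FROM bigtable ORDER BY id LIMIT 10000;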

To simulate a finite, ordered dataset, pgAdmin takes all the result
data and puts it into GUI controls that are not designed to hold 100k
rows of data... this is a design compromise to allow editing.

Merlin








Re: [PERFORM] SSD Drives

2004-08-02 Thread jason.servetar
Thanks. I found the same info on the Tigi and liked what I saw. I also
spoke with a consulting firm that has used them and also says good
things, but they have not tried them with Postgres. I will post an
analysis of performance once we have the equipment ordered and
installed.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Sunday, August 01, 2004 5:43 AM
To: [EMAIL PROTECTED]
Subject: [PERFORM] SSD Drives


To the person who was looking for a $5k midlevel SSD drive (sorry, I hit
'd' too fast):

  http://www.tigicorp.com/tigijet_exp_s.htm

I found this via this interesting survey of SSD products:

  http://www.storagesearch.com/ssd-buyers-guide.html

Incidentally, it seems the popular Platypus SSD PCI cards are no more;
Platypus appears to have gone out of business.

-- 
greg




Re: [PERFORM] No index usage with "left join"

2004-08-02 Thread Scott Marlowe
On Mon, 2004-08-02 at 06:08, [EMAIL PROTECTED] wrote:
> We have a "companies" and a "contacts" table with about 3000 records
> each.
> 
> We run the following SQL command, which runs for about 2 MINUTES:
> 
> SELECT count(*) FROM contacts LEFT JOIN companies ON contacts.sid =
> companies.intfield01
> 
> contacts.sid (type text, b-tree index on it)
> companies.intfield01 (type bigint, b-tree index on it)
> 
> [EXPLAIN ANALYZE output and notes snipped]
> 
> How can I force the usage of the indexes when using "left join"? Or is
> there any other SQL construct that does the same? Can anybody please
> give us a hint?

Why in the world would the database use the index in this case?  You're
retrieving every single row, so it may as well hit the data store
directly.  By the way, unlike many other databases that can answer from
the index alone, PostgreSQL always has to go back to the data store
anyway to get the real value, so if it's going to hit more than some
small percentage of rows, it's usually a win to just seq scan it.  Try
restricting your query with a where clause to one or two rows and see
what you get.
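
For example (a sketch; the literal value is just a placeholder), a
selective where clause gives the planner a reason to consider the
index:

EXPLAIN ANALYZE
SELECT count(*)
FROM contacts
LEFT JOIN companies ON contacts.sid = companies.intfield01
WHERE contacts.sid = '42';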




Re: [PERFORM] What kind of performance can I expect and how to measure?

2004-08-02 Thread Scott Marlowe
On Mon, 2004-08-02 at 06:21, Joost Kraaijeveld wrote:
> Hi all,
> 
> My system is PostgreSQL 7.4.1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC)
> 20020903 (Red Hat Linux 8.0 3.2-7). It has a Pentium III-733 MHz with 512 MB RAM. It
> is connected to my workstation (dual XEON 1700 with 1 GB RAM) over a 100 Mb
> switched network.
>
> I have a table with 31 columns, all fixed-size datatypes. It contains 88393 rows.
> Doing a "select * from table" with pgAdmin III in its SQL window takes a total
> of 9206 ms query runtime and a 40638 ms data retrieval runtime.

This means it took the backend about 9 seconds to prepare the data, and
40 or so seconds total (including the 9 I believe) for the client to
retrieve and then display it.

> Is this a reasonable time to get 88393 rows from the database?

Depends on your row size really.  I'm certain you're not CPU bound if
you've only got one hard drive.  Put that data on a 20 way RAID5 array
and I'm sure it would come back a little quicker.

> If not, what can I do to find the bottleneck (and eventually make it faster)?

The bottleneck is almost always IO to start with.  First, add another
drive and mirror it.  Then go to RAID 1+0, then add more and more
drives.

Read this document about performance tuning:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html






Re: [PERFORM] Page Miss Hits

2004-08-02 Thread Scott Marlowe
On Mon, 2004-08-02 at 02:11, Ioannis Theoharis wrote:
> Hi, I would like to ask whether there is any way in Postgres to find the
> page cache misses incurred during a query execution.
>
>
> Is there something like EXPLAIN ANALYZE that also reports page misses?

You're making a basic assumption that is (at least currently) untrue,
and that is that PostgreSQL has its own cache.  It doesn't.  It has a
buffer pool whose buffers drop back into the free pool when the last
referencing backend concludes and shuts down.  So PostgreSQL currently
relies on the kernel to cache for it, and what you need is a tool that
monitors kernel cache usage and its hit rate.  I'm not familiar with
any, but I'm sure something out there likely does that.
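
That said, the shared buffers do expose block-level hit/read counters
through the statistics views. A sketch (on 7.x this needs
stats_block_level = true in postgresql.conf, and it says nothing about
the kernel cache):

SELECT relname,
       heap_blks_read,  -- blocks requested from the kernel (misses here)
       heap_blks_hit    -- blocks found already in shared buffers
FROM pg_statio_user_tables
ORDER BY heap_blks_read DESC;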




Re: [PERFORM] Page Miss Hits

2004-08-02 Thread Gaetano Mendola
Scott Marlowe wrote:
> On Mon, 2004-08-02 at 02:11, Ioannis Theoharis wrote:
>> Hi, I would like to ask whether there is any way in Postgres to find the
>> page cache misses incurred during a query execution.
>> Is there something like EXPLAIN ANALYZE that also reports page misses?
>
> You're making a basic assumption that is (at least currently) untrue,
> and that is that PostgreSQL has its own cache.

Are you sure of this? What is the meaning of the ARC recently introduced
then?

Regards
Gaetano Mendola


Re: [PERFORM] Page Miss Hits

2004-08-02 Thread Gaetano Mendola
Scott Marlowe wrote:
| On Mon, 2004-08-02 at 10:43, Gaetano Mendola wrote:
|
|>Scott Marlowe wrote:
|>
|>>On Mon, 2004-08-02 at 02:11, Ioannis Theoharis wrote:
|>>
|>>
|>>>Hi, i would like to answer if there is any way in postgres to find the
|>>>page miss hits caused during a query execution.
|>>>
|>>>
|>>>Is there something like explain analyze with the page miss hits???
|>>
|>>
|>>You're making a basic assumption that is (at least currently) untrue,
|>>and that is that PostgreSQL has its own cache.
|>
|>Are you sure of this ? What is the meaning of the ARC recently introduced
|>then ?
|
|
| Yes I am.  Test it yourself, setup a couple of backends, select * from
| some big tables, then, one at a time, shut down the psql clients and
| when the last one closes, the shared mem goes away.  Run another client,
| do select * from the big table, and watch the client size grow from a
| few meg to a size large enough to hold the whole table (or however much
| your shared_buffers will hold.)
|
| While someone may make ARC and the shared buffers act like a cache some
| day (can't be that hard, most of the work is done really) right now it's
| not how it works.
|
| ARC still helps, since it makes sure the shared_buffers don't all get
| flushed from the useful small datasets when a seq scan gets executed.
I'm still not convinced. Why does the last backend alive have to throw
away the bunch of memory copied into the SHM? And again, ARC is a
replacement policy for a cache; which one?
Regards
Gaetano Mendola








Re: [PERFORM] my boss want to migrate to ORACLE

2004-08-02 Thread Stephane Tessier
I checked, and we have 128 megs of battery-backed cache on the RAID
controller...

-Original Message-
From: Scott Marlowe [mailto:[EMAIL PROTECTED]
Sent: 30 July 2004 11:15
To: Stephane Tessier
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [PERFORM] my boss want to migrate to ORACLE


On Fri, 2004-07-30 at 07:56, Stephane Tessier wrote:
> I think with your help, guys, I'll do it!
>
> I'm working on it!
>
> I'll work on these issues:
>
> we have space for more RAM (we use 2 gigs of a possible 3 gigs)
> iowait is very high, 98% --> looks like postgresql waits for io access
> raid5 --> raid0: if I'm right, raid5 uses 4 writes (parity, data, etc.)
> for each write to disk

Just get battery backed cache on your RAID controller.  RAID0 is way too
unreliable for a production environment.  One disk dies and all your
data is just gone.

> use more transactions (we have a lot of inserts/updates without a
> transaction).
> cpu does not look like it is running very hard
>
> *php is not running on the same machine
> *redhat enterprise 3.0 ES
> *the version of postgresql is 7.3.4 (using RHDB from redhat)
> *pg_autovacuum running at 12:00 and 24:00 each day
>
>
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of
> [EMAIL PROTECTED]
> Sent: 29 July 2004 23:00
> To: Stephane Tessier
> Cc: [EMAIL PROTECTED]
> Subject: Re: [PERFORM] my boss want to migrate to ORACLE
>
>
> A further thought or two:
>
> - you are *sure* that it is Postgres that is slow? (could be Php...or your
>   machine could be out of some resource - see next 2 points)
> - is your machine running out of cpu or memory?
> - is your machine seeing huge io transfers or long io waits?
> - are you running Php on this machine as well as Postgres?
> - what os (and what release) are you running? (guessing Linux but...)
>
> As an aside, they always say this but: Postgres 7.4 generally performs
> better than 7.3... so an upgrade could be worth it - *after* you have
> solved/identified the other issues.
>
> best wishes
>
> Mark
>
> Quoting Stephane Tessier <[EMAIL PROTECTED]>:
>
> > Hi everyone,
> >
> > Can somebody help me??? My boss wants to migrate to
> > ORACLE.
> >
> > We have a BIG performance problem: it's slow.
> > We use postgres 7.3 for a php security application with approximately
> > 4 million insertions per day and 4 million deletes and updates,
> > and an archive db with 40 million archived records...
> >
> > We have 10 databases for our clients and a centralized database for the
> > general stuff.
> >
> > database specs:
> >
> > double XEON 2.4 on DELL PowerEdge2650
> > 2 gigs of RAM
> > 5 SCSI drives, RAID 5, 15k rpm
> >
> > tasks:
> >
> > 4 million transactions per day
> > 160 open connections, 24 hours a day, 7 days a week
> > pg_autovacuum running 24/7
> > reindex at midnight




Re: [PERFORM] Page Miss Hits

2004-08-02 Thread Rod Taylor
> | ARC still helps, since it makes sure the shared_buffers don't all get
> | flushed from the useful small datasets when a seq scan gets executed.
>
> I'm still not convinced. Why does the last backend alive have to throw
> away the bunch of memory copied into the SHM? And again, ARC is a
> replacement policy for a cache; which one?

As you know, ARC is a recent addition. I've not seen any benchmarks
demonstrating that the optimal SHARED_BUFFERS setting is different today
than it was in the past.

We know it's changed, but the old buffer strategy had as hard a time
with a small buffer as it did with a large one. Does that mean the
middle of the curve is still at 15k buffers but the extremes are handled
better? Or something completely different?

Please feel free to benchmark 7.5 (OSDL folks should be able to help us
as well) and report back.




Re: [PERFORM] my boss want to migrate to ORACLE

2004-08-02 Thread James Thornton
Stephane Tessier wrote:
> I checked, and we have 128 megs of battery-backed cache on the RAID
> controller...

>> we have space for more RAM (we use 2 gigs of a possible 3 gigs)
>> iowait is very high, 98% --> looks like postgresql waits for io access
>> raid5 --> raid0: if I'm right, raid5 uses 4 writes (parity, data, etc.)
>> for each write to disk
>
> Just get battery backed cache on your RAID controller.  RAID0 is way too
> unreliable for a production environment.  One disk dies and all your
> data is just gone.

I'm the one who sent the e-mail about RAID 5's 4 writes, but I suggested
you look at RAID 10, not RAID 0.

--
 James Thornton
__
Internet Business Consultant, http://jamesthornton.com


Re: [PERFORM] my boss want to migrate to ORACLE

2004-08-02 Thread Stephane Tessier
Oops,

I changed to RAID 10 (stripe and mirror).

-Original Message-
From: James Thornton [mailto:[EMAIL PROTECTED]
Sent: 2 August 2004 17:32
To: Stephane Tessier
Cc: 'Scott Marlowe'; [EMAIL PROTECTED];
[EMAIL PROTECTED]
Subject: Re: [PERFORM] my boss want to migrate to ORACLE


> I'm the one who sent the e-mail about RAID 5's 4 writes, but I suggested
> you look at RAID 10, not RAID 0.




Re: [PERFORM] Page Miss Hits

2004-08-02 Thread Gaetano Mendola
Rod Taylor wrote:
>>> | ARC still helps, since it makes sure the shared_buffers don't all get
>>> | flushed from the useful small datasets when a seq scan gets executed.
>>
>> I'm still not convinced. Why does the last backend alive have to throw
>> away the bunch of memory copied into the SHM? And again, ARC is a
>> replacement policy for a cache; which one?
>
> As you know, ARC is a recent addition. I've not seen any benchmarks
> demonstrating that the optimal SHARED_BUFFERS setting is different today
> than it was in the past.
>
> We know it's changed, but the old buffer strategy had as hard a time
> with a small buffer as it did with a large one. Does that mean the
> middle of the curve is still at 15k buffers but the extremes are handled
> better? Or something completely different?
>
> Please feel free to benchmark 7.5 (OSDL folks should be able to help us
> as well) and report back.

I know, I know.
We were discussing whether postgres uses its own cache or not,
and, for the OP's benefit, whether it is possible to retrieve hit and
miss information from that cache.
For benchmarks it may be better to look not at the particular
implementation done in postgresql but at the general improvements that
the ARC replacement policy introduces. If I'm not wrong, until now
postgres was using an LRU; around the web you can find articles like
these:
http://www.almaden.ibm.com/StorageSystems/autonomic_storage/ARC/rj10284.pdf
http://www.almaden.ibm.com/cs/people/dmodha/arcfast.pdf
where the improvements are shown.
As you wrote, no one has done benchmarks demonstrating by "brute force"
that ARC is better, but on paper it should be.
Regards
Gaetano Mendola







Re: [PERFORM] my boss want to migrate to ORACLE

2004-08-02 Thread markir
It may be worth pricing up expansion options, e.g. 256M or more.
The other path to consider is changing RAID5 -> RAID10, if your card supports it.

However, I would recommend reducing that shared_buffers setting and doing your
performance measurements *again* - before changing anything else. This is
because you want to ensure that all your io hammering is not just because you
are making the machine swap (by giving postgres too much memory, as Scott
mentioned)!
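
For reference (a sketch; shared_buffers lives in postgresql.conf and
changing it needs a restart on 7.x), you can check the running value
from any session:

SHOW shared_buffers;  -- number of 8 kB shared buffer pages in use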

Quoting Stephane Tessier <[EMAIL PROTECTED]>:

> I checked and we have a 128 megs battery backed cache on the raid
> controller...





Re: [PERFORM] my boss want to migrate to ORACLE

2004-08-02 Thread Scott Marlowe
Is it set to write back or write through?  Also, you may want to look at
lowering the stripe size.  The default on many RAID controllers is 128k,
but for PostgreSQL 8k to 32k seems a good choice.  But that's not nearly
as important as the cache setting being write back.

On Mon, 2004-08-02 at 15:18, Stephane Tessier wrote:
> I checked and we have a 128 megs battery backed cache on the raid
> controller...
> [rest of quoted thread snipped]