Re: [PERFORM] Help me please!

2005-02-23 Thread Richard Huxton
Asatryan, Anahit wrote:
I am running PostgreSQL 8.0.1 under Windows 2000. I want to use the COPY
FROM STDIN command from a Java application, but it doesn't work; it
throws an
"org.postgresql.util.PSQLException: Unknown Response Type G" error.
I don't think that there is a "STDIN" if you are executing via JDBC. The 
only workaround I know of is to create a file and copy from that, which 
you already have working.
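
For example, once the data is in a file the backend can read, a plain
COPY statement works through JDBC (table name and path are illustrative):

  -- server-side COPY; the file must be readable by the PostgreSQL service account
  COPY mytable FROM 'C:/temp/mytable.dat';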

--
  Richard Huxton
  Archonet Ltd


Re: [PERFORM] Joins, Deletes and Indexes

2005-02-23 Thread Richard Huxton
[EMAIL PROTECTED] wrote:
I've got 2 tables defined as follows:
CREATE TABLE "cluster"
(
  id int8 NOT NULL DEFAULT nextval('serial'::text),
  clusterid varchar(255) NOT NULL,
  ...
  CONSTRAINT pk_cluster PRIMARY KEY (id)
) 

CREATE TABLE sensorreport
(
  id int8 NOT NULL DEFAULT nextval('serial'::text),
  clusterid int8 NOT NULL,
  ...
  CONSTRAINT pk_sensorreport PRIMARY KEY (id),
  CONSTRAINT fk_sensorreport_clusterid FOREIGN KEY (clusterid) REFERENCES
"cluster" (id) ON UPDATE RESTRICT ON DELETE RESTRICT
) 

I've defined an Index on the clusterid field of sensorreport.
Looking further down, perhaps an index on cluster.clusterid too.
So I've run into 2 issues, one a SELECT, the other a DELETE;
SELECT issue:
So the following query:
EXPLAIN ANALYZE select * from sensorreport where clusterid = 25000114;
Yields:
"Index Scan using idx_sensorreport_clusterid on sensorreport
(cost=0.00..2.01 rows=1 width=129) (actual time=0.000..0.000 rows=38
loops=1)"
"  Index Cond: (clusterid = 25000114)"
"Total runtime: 0.000 ms"
However, when using a join as follows (in the cluster table id=25000114
clusterid='clusterid1'):
EXPLAIN ANALYZE select * from sensorreport as a join cluster as c on c.id =
a.clusterid where c.clusterid = 'clusterid1';
You don't say what version you're using, but older versions of PG took an 
explicit JOIN as a request to plan the query in that order. Try rewriting 
it without the "join" keyword and see if the plan alters.
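
For example (a sketch of the same query written as an implicit join):

EXPLAIN ANALYZE select * from sensorreport as a, cluster as c
where c.id = a.clusterid and c.clusterid = 'clusterid1';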

Yields:
Hash Join  (cost=1.18..566211.51 rows=1071429 width=287) (actual
time=150025.000..150025.000 rows=38 loops=1)
  Hash Cond: ("outer".clusterid = "inner".id)
  ->  Seq Scan on sensorreport a  (cost=0.00..480496.03 rows=1503
width=129) (actual time=10.000..126751.000 rows=1539 loops=1)
  ->  Hash  (cost=1.18..1.18 rows=1 width=158) (actual time=0.000..0.000
rows=0 loops=1)
->  Seq Scan on "cluster" c  (cost=0.00..1.18 rows=1 width=158)
(actual time=0.000..0.000 rows=1 loops=1)
  Filter: ((clusterid)::text = 'clusterid1'::text)
Total runtime: 150025.000 ms
My question is can I get the join query to use the
idx_sensorreport_clusterid index on the sensorreport table?
The only reason to use the index on sensorreport is if it isn't going to 
match many rows. That means we want to run the restriction on 
"clusterid1" first, which suggests you want that index on table cluster.

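Something like this, for example (the index name is illustrative):

CREATE INDEX idx_cluster_clusterid ON "cluster" (clusterid);
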
DELETE issue:
The statement:
EXPLAIN ANALYZE delete from cluster where clusterid='clusterid99'
Yields:
 Seq Scan on "cluster"  (cost=0.00..1.18 rows=1 width=6) (actual
time=0.000..0.000 rows=1 loops=1)
   Filter: ((clusterid)::text = 'clusterid99'::text)
 Total runtime: 275988.000 ms
I'm assuming that the duration of the delete is because of the "DELETE RESTRICT"
on the foreign key from sensorreport.
Again, is there any way to get the delete to use the
idx_sensorreport_clusterid index?
No, because this is the cluster table, not sensorreport :-)
--
  Richard Huxton
  Archonet Ltd


Re: [PERFORM] PostgreSQL is extremely slow on Windows

2005-02-23 Thread Vig, Sandor (G/FI-2)
Hi,

I changed fsync to false. It took 8 minutes to restore the full database.
That is 26 times faster than before. :-/ (approx. 200 tps)
With background writer it took 12 minutes. :-(

The funny thing is, I had a VMware emulation running Red Hat on the same
Windows machine, with fsync turned on. It also took 8 minutes to finish.
Probably the Linux code is better, and/or VMware optimises (physical) disk
access(?).

It seems to me I need 2 types of operating modes:
- For bulk loading (database restore): fsync=false
- Normal operation: fsync=true

Am I right? How can I do it "elegantly"?

I think there should be a "performance tuning guide" in the documentation
(not just an explanation of the settings). Playing with the settings can be
quite annoying.

Anyway, thanks for the tips.

Bye,
Vig Sándor



-Original Message-
From: Magnus Hagander [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 22, 2005 7:15 PM
To: Vig, Sandor (G/FI-2); pgsql-performance@postgresql.org
Subject: RE: [PERFORM] PostgreSQL is extremely slow on Windows



>I've downloaded the latest release (PostgreSQL 8.0) for windows.
>Installation was OK, but I have tried to restore a database.
>It had more than ~100,000 records. Usually I use PostgreSQL
>under Linux, and it used to be done under 10 minutes.
>
>Under W2k and XP it took 3 hours(!) Why is it so slow?
>
>The commands I used:
>
>Under Linux: (duration: 1 minute)
>   pg_dump -D databasename > databasename.db
>
>Under Windows: (duration: 3 - 3.5 hours(!))
>   psql databasename < databasename.db >nul
>
>It seemed to me that only 20-30 transactions/sec were
>written to the database.

20-30 transactions is about what you'll get on a single disk on Windows
today.
We have a patch in testing that will bring this up to about 80.
You can *never* get above 80 without using write cache, regardless of
your OS, if you have a single disk. You might want to look into whether
write caching is enabled on your Linux box, and disable it (unless you
are using RAID). A lot points towards the write cache being enabled on
your system.

If you need performance equal to running with the write cache on, you
can set fsync=off. But then you will lose the guarantee that your
database will survive an unclean shutdown or crash. I would strongly
advise against it on a production system - the same goes for running
with the write cache!

//Magnus




Re: [PERFORM] Joins, Deletes and Indexes

2005-02-23 Thread Butkus_Charles
 

> -Original Message-
> From: Richard Huxton [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, February 23, 2005 3:40 AM
> To: [EMAIL PROTECTED]
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Joins, Deletes and Indexes
> 
> [EMAIL PROTECTED] wrote:
> > I've got 2 tables defined as follows:
> > 
> > CREATE TABLE "cluster"
> > (
> >   id int8 NOT NULL DEFAULT nextval('serial'::text),
> >   clusterid varchar(255) NOT NULL,
> >   ...
> >   CONSTRAINT pk_cluster PRIMARY KEY (id)
> > ) 
> > 
> > CREATE TABLE sensorreport
> > (
> >   id int8 NOT NULL DEFAULT nextval('serial'::text),
> >   clusterid int8 NOT NULL,
> >   ...
> >   CONSTRAINT pk_sensorreport PRIMARY KEY (id),
> >   CONSTRAINT fk_sensorreport_clusterid FOREIGN KEY 
> (clusterid) REFERENCES
> > "cluster" (id) ON UPDATE RESTRICT ON DELETE RESTRICT
> > ) 
> > 
> > I've defined an Index on the clusterid field of sensorreport.
> 
> Looking further down, perhaps an index on cluster.clusterid too.
> 
> > So I've run into 2 issues, one a SELECT, the other a DELETE;
> > 
> > SELECT issue:
> > So the following query:
> > EXPLAIN ANALYZE select * from sensorreport where clusterid 
> = 25000114;
> > 
> > Yields:
> > "Index Scan using idx_sensorreport_clusterid on sensorreport
> > (cost=0.00..2.01 rows=1 width=129) (actual time=0.000..0.000 rows=38
> > loops=1)"
> > "  Index Cond: (clusterid = 25000114)"
> > "Total runtime: 0.000 ms"
> > 
> > However, when using a join as follows (in the cluster table 
> id=25000114
> > clusterid='clusterid1'):
> > EXPLAIN ANALYZE select * from sensorreport as a join 
> cluster as c on c.id =
> > a.clusterid where c.clusterid = 'clusterid1';
> 
> You don't say what version you're using, but older versions 
> of PG took a 
> literal join as a request to plan a query in that order. Try 
> rewriting 
> it without the "join" keyword and see if the plan alters.

I'm using version 8.0 on Windows.

> 
> > Yields:
> > Hash Join  (cost=1.18..566211.51 rows=1071429 width=287) (actual
> > time=150025.000..150025.000 rows=38 loops=1)
> >   Hash Cond: ("outer".clusterid = "inner".id)
> >   ->  Seq Scan on sensorreport a  (cost=0.00..480496.03 
> rows=1503
> > width=129) (actual time=10.000..126751.000 rows=1539 loops=1)
> >   ->  Hash  (cost=1.18..1.18 rows=1 width=158) (actual 
> time=0.000..0.000
> > rows=0 loops=1)
> > ->  Seq Scan on "cluster" c  (cost=0.00..1.18 
> rows=1 width=158)
> > (actual time=0.000..0.000 rows=1 loops=1)
> >   Filter: ((clusterid)::text = 'clusterid1'::text)
> > Total runtime: 150025.000 ms
> > 
> > My question is can I get the join query to use the
> > idx_sensorreport_clusterid index on the sensorreport table?
> 
> The only reason to use the index on sensorreport is if it 
> isn't going to 
> match many rows. That means we want to run the restriction on 
> "clusterid1" first, which suggests you want that index on 
> table cluster.

The cluster table only has 11 rows, so I'm not sure an index would
help.  The sensorreport table has 15,000,000 rows so that's why I've
got the index there.

> 
> > DELETE issue:
> > The statement:
> > EXPLAIN ANALYZE delete from cluster where clusterid='clusterid99'
> > 
> > Yields:
> >  Seq Scan on "cluster"  (cost=0.00..1.18 rows=1 width=6) (actual
> > time=0.000..0.000 rows=1 loops=1)
> >Filter: ((clusterid)::text = 'clusterid99'::text)
> >  Total runtime: 275988.000 ms
> > 
> > I'm assuming that the duration of the delete is because of the 
> "DELETE RESTRICT"
> > on the foreign key from sensorreport.
> > Again, is there any way to get the delete to use the
> > idx_sensorreport_clusterid index?
> 
> No, because this is the cluster table, not sensorreport :-)

True, but the foreign key constraint on the sensorreport table forces
Postgres to check whether any sensorreport rows are currently
using this cluster before allowing the cluster to be deleted.

> 
> -- 
>Richard Huxton
>Archonet Ltd
> 

Thanks a lot for the reply.

Chuck Butkus
EMC



Re: [PERFORM] Joins, Deletes and Indexes

2005-02-23 Thread Richard Huxton
[EMAIL PROTECTED] wrote:
The cluster table only has 11 rows, so I'm not sure an index would
help.  The sensorreport table has 15,000,000 rows so that's why I've
got the index there.
Ah - only 11?
on the foreign key from sensorreport.
Again, is there any way to get the delete to use the
idx_sensorreport_clusterid index?
No, because this is the cluster table, not sensorreport :-)
True, but the foreign key constraint on the sensorreport table forces
Postgres to check whether any sensorreport rows are currently
using this cluster before allowing the cluster to be deleted.
If you only have 11 distinct values in the large table then it's 
debatable whether it's always quicker to use the index. Since your first 
example (clusterid = 25000114) returned so few rows, I'm guessing that 
some other values represent a sizeable percentage of the table. That'd 
explain the difference between PG's estimates and the actual number of 
matching rows.

You can try "SET enable_seqscan = false;" before running the query and 
see whether using the index helps things.
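
For example (a sketch; the ROLLBACK undoes the test delete):

BEGIN;
SET enable_seqscan = false;
EXPLAIN ANALYZE delete from cluster where clusterid='clusterid99';
ROLLBACK;
RESET enable_seqscan;  -- restore the planner default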

--
  Richard Huxton
  Archonet Ltd


Re: [PERFORM] PostgreSQL is extremely slow on Windows

2005-02-23 Thread Markus Schaber
Hi, Magnus & all,

Magnus Hagander schrieb:
> 20-30 transactions is about what you'll get on a single disk on Windows
> today.
> We have a patch in testing that will bring this up to about 80.
> You can *never* get above 80 without using write cache, regardless of
> your OS, if you have a single disk. You might want to look into whether
> write caching is enabled on your Linux box, and disable it (unless you
> are using RAID). A lot points towards the write cache being enabled on your system.

Note that you can get higher rates for the server as a whole when using
concurrent transactions (that is, several independent connections
committing at the same time). The commit delay settings should be tuned
accordingly.
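
For reference, the relevant postgresql.conf settings (values are only
illustrative):

commit_delay = 100      # microseconds to wait for additional committers
commit_siblings = 5     # only delay if this many transactions are active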


Markus
-- 
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:[EMAIL PROTECTED] | www.logi-track.com





Re: [PERFORM] Help me please !

2005-02-23 Thread Markus Schaber
Hi, Asatryan,

Asatryan, Anahit schrieb:
> I am running PostgreSQL 8.0.1 under Windows 2000. I want to use the COPY
> FROM STDIN command from a Java application, but it doesn’t work, it throws:
> 
> “org.postgresql.util.PSQLException: Unknown Response Type G” error.

Currently, there is no COPY support in the postgresql JDBC driver. There
were some patches enabling COPY support floating around on the
pgsql-jdbc@postgresql.org mailing list. You can search the archives and
see whether one of them fits your needs.

AFAIR, COPY support is on the TODO list, but they wait for some other
driver reworking to be finished.

You can discuss this issue on the pgsql-jdbc list or search the archives
if you need more info.


Markus
-- 
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:[EMAIL PROTECTED] | www.logi-track.com





Re: [PERFORM] PostgreSQL is extremely slow on Windows

2005-02-23 Thread Magnus Hagander
> Hi,
> 
> I changed fsync to false. It took 8 minutes to restore the 
> full database.
> That is 26 times faster than before. :-/ (approx. 200 tps) 
> With background writer it took 12 minutes. :-(

That seems reasonable.


> The funny thing is, I had a VMware emulation on the same 
> Windows machine, running Red Hat, with fsync turned on. It 
> also took 8 minutes to finish.
> Probably the Linux code is better + VMWARE optimises (physical) disk
> access.(?)

VMware makes fsync() into a no-op. It will always cache the disk.
(This is VMware Workstation; their server products behave differently,
of course.)


> It seems to me I need 2 types of operating modes:
> - For bulk loading (database restore): fsync=false
> - Normal operation: fsync=true

Yes, fsync=false is very good for bulk loading *IFF* you can live with
data loss in case you get a crash during load.


> Am I right? How can I do it "elegantly"?

You'll need to edit postgresql.conf and restart the server for this.
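
That is, something like this (a sketch; the data directory is illustrative):

# in postgresql.conf -- bulk-load mode; set back to true afterwards
fsync = false

and then e.g. "pg_ctl restart -D /var/lib/pgsql/data".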


> I think there should be a "performance tuning guide" in the 
> documentation.
> (not just an explanation of the settings) Playing with the settings 
> could be quite annoying. 

There is some information on techdocs.postgresql.org you might be
interested in.

//Magnus



Re: [PERFORM] PostgreSQL is extremely slow on Windows

2005-02-23 Thread Magnus Hagander
> > You can *never* get above 80 without using write cache, 
> regardless of 
> > your OS, if you have a single disk.
> 
> Why? Even with, say, a 15K RPM disk? Or the ability to 
> fsync() multiple concurrently-committing transactions at once?

Uh. What I meant was a single *IDE* disk. Sorry. I've been too deep into
helping people with IDE disks lately to remember that SCSI can be a lot
faster :-) And we're talking about restore of a dump, so it's a single
session.

(Strictly, that should be a 7200rpm IDE disk. I don't know if any others
are common, though)

//mha



Re: [PERFORM] Problem with 7.4.5 and webmin 1.8 in grant function

2005-02-23 Thread amrit
> Well, sure looks like you only have one running. Your data directory is
> /var/lib/pgsql/data so let's see the files:
>
> /var/lib/pgsql/data/pg_hba.conf
> /var/lib/pgsql/data/pg_ident.conf
> /var/lib/pgsql/data/postmaster.opts
>
> Might also be useful to know any nondefault settings in postgresql.conf too.
>
> As I understand it, these vendor shipped rpms have ident *enabled*.
> I will download FC3 Pg and check this out... I'm a compile it from
> source guy :-)
>
> Mark

I found the answer: in the module config of postgresql-webmin there is a
checkbox for "Use DBI to connect if available? (yes/no)". The default is
yes, but when I chose no, everything went fine.

I also tested it on the desktop machine and got the same error and the same
solution. Could you explain what happens with FC3 + postgresql and webmin 1.8?
Thanks
Amrit ,Thailand



Re: [PERFORM] is pg_autovacuum so effective?

2005-02-23 Thread Gaetano Mendola
Christopher Browne wrote:
> Gaetano Mendola <[EMAIL PROTECTED]> writes:
> 
> 
>>Tom Lane wrote:
>>
>>>Gaetano Mendola <[EMAIL PROTECTED]> writes:
>>>
>>>
I'm using only pg_autovacuum. I expected that disk usage would reach
a steady state, but it is not. PG engine: 7.4.5
>>>
>>>
>>>One data point doesn't prove that you're not at a steady state.
>>
>>I graphed my disk usage and it has been a ramp for a week now;
>>I'll continue to wait in order to see if it will decrease.
>>I was expecting a steady state at something like 4 GB
>>( after a full vacuum and reindex ) + 10 % = 4.4 GB
>>I'm at 4.6 GB and increasing. I'll see how it continues.
> 
> 
> You probably want for the "experiment" to last more than a week.
> 
> After all, it might actually be that with your usage patterns, that
> table would stabilize at 15% "overhead," and that might take a couple
> or three weeks.
> 
> Unless it's clear that it's growing perilously quickly, just leave it
> alone so that there's actually some possibility of reaching an
> equilibrium.  Any time you "VACUUM FULL" it, that _destroys_ any
> experimental results or any noticeable patterns, and it guarantees
> that you'll see "seemingly perilous growth" for a while.
> 
> And if the table is _TRULY_ growing "perilously quickly," then it is
> likely that you should add in some scheduled vacuums on the table.
> Not VACUUM FULLs; just plain VACUUMs.
> 
> I revised cron scripts yet again today to do hourly and "4x/day"
> vacuums of certain tables in some of our systems where we know they
> need the attention.  I didn't schedule any VACUUM FULLs; it's
> unnecessary, and would lead directly to system outages, which is
> totally unacceptable.

Yes, I'm heading in this direction too.


Regards
Gaetano Mendola




[PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread John Allgood
Hello All
   I am setting up a hardware clustering solution. My hardware is a dual 
Opteron 550 with 8GB RAM. My external storage is a Kingston Fibre 
Channel Infostation with 14 15,000 RPM 36GB drives. The OS we are running 
is Red Hat ES 3.0, clustering with the Red Hat Cluster Suite. The Postgres 
version is 7.4.7. We will be setting up about 9 databases, ranging 
in size from 100MB to 2.5GB, on this configuration. The Postgres 
application is mostly read intensive. What would be the best way to 
set up the hard drives on this server? Currently I have it partitioned as 2 
separate RAID 5 arrays with 1 hot spare per array. I have two database 
clusters configured with a separate postmaster running for each. Running 
two postmasters seems like a pain, but that is the only way I knew to 
separate the load. I am mostly concerned about disk IO and performance. 
Is my current setup a good way to accomplish the best performance, or 
would it be better to use all the drives in one huge RAID 5 with a 
couple of hot spares? I have looked around in the archives and found some 
info, but I would like to hear about some of the configs other people are 
running and how they have them set up.

Thanks
John Allgood - ESC
Systems Admin


Re: [PERFORM] Inefficient Query Plans

2005-02-23 Thread Tom Lane
"Luke Chambers" <[EMAIL PROTECTED]> writes:
> The following query plans both result from the very same query run on
> different servers. They obviously differ drastically, but I don't know why,
> as one db is a Slony replica of the other with identical postgresql.conf
> files.

There's an order-of-magnitude difference in the estimated row counts for
some of the joins, so it's hardly surprising that different plans would
be chosen.  Assuming that these are exactly the same Postgres version,
the only explanation would be considerably different ANALYZE statistics
stored in the two databases.

> Both databases are vacuum analyzed nightly.

Maybe you should double-check that.
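
One quick check is to compare the stored statistics on the two servers (a
sketch; substitute one of the tables from your query):

SELECT attname, null_frac, n_distinct
FROM pg_stats
WHERE tablename = 'your_table';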

regards, tom lane



Re: [PERFORM] subquery vs join on 7.4.5

2005-02-23 Thread Tom Lane
David Haas <[EMAIL PROTECTED]> writes:
> I'm comparing the speeds of the following two queries on 7.4.5.  I was  
> curious why query 1 was faster than query 2:

> query 1:
> Select layer_number
> FROM batch_report_index
> WHERE device_id = (SELECT device_id FROM device_index WHERE device_name  
> ='CP8M')
> AND technology_id = (SELECT technology_id FROM technology_index WHERE  
> technology_name = 'G12');

> query 2:
> Select b.layer_number
> FROM batch_report_index b, device_index d, technology_index t
> WHERE b.device_id = d.device_id
> AND b.technology_id = t.technology_id
> AND d.device_name = 'CP8M'
> AND t.technology_name = 'G12';

Why didn't you try a two-column index on batch_report_index(device_id,
technology_id) ?

Whether this would actually be better than a seqscan I'm not sure, given
the large number of matching rows.  But the planner would surely try it
given that it's drastically underestimating that number :-(
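
That is, something like (the index name is illustrative):

CREATE INDEX batch_report_dev_tech_idx
ON batch_report_index (device_id, technology_id);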

regards, tom lane



Re: [PERFORM] Problem with 7.4.5 and webmin 1.8 in grant function

2005-02-23 Thread G u i d o B a r o s i o
Sorry, just a foolish tip - I hadn't seen that you had already done the
pg_ctl stop && pg_ctl start ...

(I mean, did you reload your conf settings?)

Regards,
Guido

> > > I used your perl script and found the error =>
> > > [EMAIL PROTECTED] tmp]# perl relacl.pl
> > > DBI connect('dbname=template1;port=5432','postgres',...) failed: FATAL:
> > IDENT
> > > authentication failed for user "postgres" at relacl.pl line 21
> > > Error in connect to DBI:Pg:dbname=template1;port=5432:
> > >
> > >
> > Excellent - we know what is going on now!
> >
> >
> > > And my pg_hba.conf is
> > >
> > > # IPv4-style local connections:
> > > hostall all 127.0.0.1 255.255.255.255   trust
> > > hostall all 192.168.0.0 255.255.0.0 trust
> > >
> > > trusted for every user.
> >
> > Ok, what I think has happened is that there is another Pg installation
> > (or another initdb'ed cluster) on this machine that you are accidentally
> > talking to. Try
> >
> > $ rpm -qa|grep -i postgres
> >
> > which will spot another software installation, you may just have to
> > search for files called pg_hba.conf to find another initdb'ed cluster
> >
> > This other installation should have a pg_hba.conf that looks something
> > like :
> >
> > local   all allident
> > hostall all   127.0.0.1  255.255.255.255   ident
> >
> > So a bit of detective work is in order :-)
> >
> > Mark
> After doing some detective work, I found that
> [EMAIL PROTECTED] ~]# rpm -qa|grep -i postgres
> postgresql-7.4.5-3.1.tlc
> postgresql-python-7.4.5-3.1.tlc
> postgresql-jdbc-7.4.5-3.1.tlc
> postgresql-tcl-7.4.5-3.1.tlc
> postgresql-server-7.4.5-3.1.tlc
> postgresql-libs-7.4.5-3.1.tlc
> postgresql-docs-7.4.5-3.1.tlc
> postgresql-odbc-7.3-8.1.tlc
> postgresql-pl-7.4.5-3.1.tlc
> postgresql-test-7.4.5-3.1.tlc
> postgresql-contrib-7.4.5-3.1.tlc
> [EMAIL PROTECTED] ~]#
> 
> no other pg installation except the pgsql for Windows in a Samba folder, which I
> think doesn't matter, does it?
> No other pg is being run.
> [EMAIL PROTECTED] ~]# ps ax|grep postmaster
>  2228 ?S  0:00 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
>  3308 pts/0S+ 0:00 grep postmaster
> [EMAIL PROTECTED] ~]#
> 
> Is it possible that it is related to pg_ident.conf ?
> 
> Any comment please.
> Amrit,Thailand




Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread Michael Adler
On Wed, Feb 23, 2005 at 11:39:27AM -0500, John Allgood wrote:
> Hello All
> 
>I am setting up a hardware clustering solution. My hardware is a dual 
> Opteron 550 with 8GB RAM. My external storage is a Kingston Fibre 
> Channel Infostation with 14 15,000 RPM 36GB drives. The OS we are running 
> is Red Hat ES 3.0, clustering with the Red Hat Cluster Suite. The Postgres 
> version is 7.4.7. We will be setting up about 9 databases, ranging 
> in size from 100MB to 2.5GB, on this configuration. The Postgres 
> application is mostly read intensive. What would be the best way to 
> set up the hard drives on this server? Currently I have it partitioned as 2 
> separate RAID 5 arrays with 1 hot spare per array. I have two database 
> clusters configured with a separate postmaster running for each. Running 
> two postmasters seems like a pain, but that is the only way I knew to 
> separate the load. I am mostly concerned about disk IO and performance. 
> Is my current setup a good way to accomplish the best performance, or 
> would it be better to use all the drives in one huge RAID 5 with a 
> couple of hot spares? I have looked around in the archives and found some 
> info, but I would like to hear about some of the configs other people are 
> running and how they have them set up.

http://www.powerpostgresql.com/PerfList/

Consider a separate array for pg_xlog. 

With tablespaces in 8.0, you can isolate much of the IO in a single
cluster.
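
For example (an 8.0 sketch; the tablespace name, path, and table are
illustrative):

CREATE TABLESPACE fastdisk LOCATION '/mnt/array2/pgdata';
ALTER TABLE busy_table SET TABLESPACE fastdisk;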

 -Mike Adler



Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread Bruno Almeida do Lago
Is there a real limit for max_connections? Here we have an Oracle server with
up to 1200 simultaneous connections on it!

"max_connections: exactly like previous versions, this needs to be set to
the actual number of simultaneous connections you expect to need. High
settings will require more shared memory (shared_buffers). As the
per-connection overhead, both from PostgreSQL and the host OS, can be quite
high, it's important to use connection pooling if you need to service a
large number of users. For example, 150 active connections on a medium-end
32-bit Linux server will consume significant system resources, and 600 is
about the limit."


C ya,
Bruno

 


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Michael Adler
Sent: Wednesday, February 23, 2005 3:04 PM
To: John Allgood
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

On Wed, Feb 23, 2005 at 11:39:27AM -0500, John Allgood wrote:
> Hello All
> 
>I am setting up a hardware clustering solution. My hardware is a dual 
> Opteron 550 with 8GB RAM. My external storage is a Kingston Fibre 
> Channel Infostation with 14 15,000 RPM 36GB drives. The OS we are running 
> is Red Hat ES 3.0, clustering with the Red Hat Cluster Suite. The Postgres 
> version is 7.4.7. We will be setting up about 9 databases, ranging 
> in size from 100MB to 2.5GB, on this configuration. The Postgres 
> application is mostly read intensive. What would be the best way to 
> set up the hard drives on this server? Currently I have it partitioned as 2 
> separate RAID 5 arrays with 1 hot spare per array. I have two database 
> clusters configured with a separate postmaster running for each. Running 
> two postmasters seems like a pain, but that is the only way I knew to 
> separate the load. I am mostly concerned about disk IO and performance. 
> Is my current setup a good way to accomplish the best performance, or 
> would it be better to use all the drives in one huge RAID 5 with a 
> couple of hot spares? I have looked around in the archives and found some 
> info, but I would like to hear about some of the configs other people are 
> running and how they have them set up.

http://www.powerpostgresql.com/PerfList/

Consider a separate array for pg_xlog. 

With tablespaces in 8.0, you can isolate much of the IO in a single
cluster.

 -Mike Adler



Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread Tom Lane
"Bruno Almeida do Lago" <[EMAIL PROTECTED]> writes:
> Is there a real limit for max_connections? Here we have an Oracle server with
> up to 1200 simultaneous connections on it!

[ shrug... ] If your machine has the beef to run 1200 simultaneous
queries, you can set max_connections to 1200.

The point of what you were quoting is that if you want to service
1200 concurrent users but you only expect maybe 100 simultaneously
active queries from them (and you have a database box that can only
service that many) then you want to put a connection pooler in
front of 100 backends, not try to start a backend for every user.
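
In postgresql.conf terms, a sketch:

max_connections = 100   # sized for simultaneously active queries,
                        # with a pooler multiplexing the real users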

Oracle may handle this sort of thing differently, I dunno.

regards, tom lane



Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread Rod Taylor
On Wed, 2005-02-23 at 15:26 -0300, Bruno Almeida do Lago wrote:
> Is there a real limit for max_connections? Here we have an Oracle server with
> up to 1200 simultaneous connections on it!

If you can reduce them by using something like pgpool between PostgreSQL
and the client, you'll save some headache. PostgreSQL did not perform as
well with a large number of idle connections as it does otherwise (the last
time I tested was 7.4, though -- perhaps it's better now).

The kernel also starts to play a significant role with a high number of
connections. Some operating systems don't perform as well with a high
number of processes (process handling, scheduling, file handles, etc.).

I think you can do it without any technical issues, but you will
probably be happier with the result if you can hide idle connections
from the database machine.
-- 




Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread Tom Lane
Rod Taylor <[EMAIL PROTECTED]> writes:
> The kernel also starts to play a significant role with a high number of
> connections. Some operating systems don't perform as well with a high
> number of processes (process handling, scheduling, file handles, etc.).

Right; the main problem with having lots more backends than you need is
that the idle ones still eat their share of RAM and open file handles.

A connection pooler uses relatively few resources per idle connection,
so it's a much better impedance match if you want to service lots of
connections that are mostly idle.

regards, tom lane



Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread John Allgood
I think maybe I didn't explain myself well enough. At most we will 
service 200-250 connections across all 9 databases mentioned. The 
database we are building is for a trucking company. Each of the 
databases represents a different division, with one master database that 
everything is updated to. Most of the access to the database is by 
simple queries. Most of the IO-intensive work is done when revenue 
reports are generated and when we have our month/year-end processing. 
All the trucking loads that are marked as delivered are transferred to our 
master database during nighttime processing; all that will be handled 
using custom scripts. Maybe that gives a better explanation of the 
application. My biggest concern is how to partition the shared storage 
for maximum performance. Is there a real benefit to having more than one 
RAID 5 partition, or am I wasting my time?

Thanks
John Allgood - ESC
Tom Lane wrote:
"Bruno Almeida do Lago" <[EMAIL PROTECTED]> writes:
 
Is there a real limit for max_connections? Here we have an Oracle server with
up to 1200 simultaneous connections on it!
   
[ shrug... ] If your machine has the beef to run 1200 simultaneous
queries, you can set max_connections to 1200.
The point of what you were quoting is that if you want to service
1200 concurrent users but you only expect maybe 100 simultaneously
active queries from them (and you have a database box that can only
service that many) then you want to put a connection pooler in
front of 100 backends, not try to start a backend for every user.
Oracle may handle this sort of thing differently, I dunno.
regards, tom lane


Re: [PERFORM] is pg_autovacuum so effective?

2005-02-23 Thread Matthew T. O'Connor
Christopher Browne wrote:
Gaetano Mendola <[EMAIL PROTECTED]> writes:
 

I graphed my disk usage and it has been a ramp for a week now;
I'll continue to wait in order to see if it will decrease.
I was expecting a steady state at something like 4 GB
( after a full vacuum and reindex ) + 10 % = 4.4 GB
I'm at 4.6 GB and increasing. I'll see how it continues.
   

You probably want for the "experiment" to last more than a week.
After all, it might actually be that with your usage patterns, that
table would stabilize at 15% "overhead," and that might take a couple
or three weeks.
Unless it's clear that it's growing perilously quickly, just leave it
alone so that there's actually some possibility of reaching an
equilibrium.  Any time you "VACUUM FULL" it, that _destroys_ any
experimental results or any noticeable patterns, and it guarantees
that you'll see "seemingly perilous growth" for a while.
And if the table is _TRULY_ growing "perilously quickly," then it is
likely that you should add in some scheduled vacuums on the table.
Not VACUUM FULLs; just plain VACUUMs.
I revised cron scripts yet again today to do hourly and "4x/day"
vacuums of certain tables in some of our systems where we know they
need the attention.  I didn't schedule any VACUUM FULLs; it's
unnecessary, and would lead directly to system outages, which is
totally unacceptable.
 

Chris, is this in addition to pg_autovacuum?  Or do you not use 
pg_autovacuum at all? And if so, why not?




Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread John Arbash Meinel
John Allgood wrote:
I think maybe I didn't explain myself well enough. At most we will
service 200-250 connections across all 9 databases mentioned. The
database we are building is for a trucking company. Each of the
databases represents a different division, with one master database
that everything is updated to. Most of the access to the database is
by simple queries. Most of the IO-intensive work is done when revenue
reports are generated and when we have our month/year-end processing.
All the trucking loads that are marked as delivered are transferred to
our master database during nighttime processing; all that will be
handled using custom scripts. Maybe that gives a better explanation
of the application. My biggest concern is how to partition the shared
storage for maximum performance. Is there a real benefit to having
more than one RAID 5 partition, or am I wasting my time?
Thanks
John Allgood - ESC
If you read the general advice statements, it's actually better not to
use raid5, but to use raid10 (striping plus mirroring), simply because
raid5 write performance is quite poor.
Also, if you have the disks, the next best improvement is to move
pg_xlog onto its own set of disks. I think that gets as much as 25%
improvement by itself. pg_xlog is append-only, and a commit must complete
there before the actual data gets updated, so giving it its own set of
spindles reduces seek time, and lets the log be written quickly.
I think there is even some benefit to making pg_xlog be a solid state
disk, as it doesn't have to be huge, but having high I/O rates can
remove it as a bottleneck. (I'm not positive how large pg_xlog gets, but
it is probably small compared with the total db size, and I think it can
be flushed periodically as transactions are completed.)
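The usual way to give pg_xlog its own spindles on 7.4/8.0 is a symlink,
with the postmaster stopped (paths are illustrative):
  mv /var/lib/pgsql/data/pg_xlog /mnt/logdisks/pg_xlog
  ln -s /mnt/logdisks/pg_xlog /var/lib/pgsql/data/pg_xlog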
I'm not sure what you are considering "shared storage". Are you thinking
that all the machines will be mounting a remote drive for writing the
DB? They should all have their own local copy (multiple masters on the
same database is not supported).
I think it is possible to get better performance by having more raid
systems. But it is application dependent. If you know that you have 2
tables that are being updated often and independently, then having each
one on its own raid would allow better concurrency.
But it sounds like in your app you get concurrency by having a bunch of
remote databases, which then do bulk updates on the master database. I
think if you are careful to have each of the remote dbs update the
master at a slightly different time, you could probably get very good
transaction rates.
John
=:->




Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread Michael Adler
On Wed, Feb 23, 2005 at 02:15:52PM -0500, John Allgood wrote:
> using custom scripts. Maybe that gives a better explanation of the 
> application. My biggest concern is how to partition the shared storage 
> for maximum performance. Is there a real benefit to having more than one 
> RAID 5 partition, or am I wasting my time?

I think the simplest and most generic solution would be to put the OS
and pg_xlog on a RAID 1 pair and dedicate the rest of the drives to
RAID 5 or RAID 1+0 (striped set of mirrors) array.

Depending on the nature of your work, you may get better performance
by placing individual tables/indices on dedicated spindles for
parallel access.

 -Mike Adler



Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread John Allgood
This is some good info. The attached storage is a Kingston 14-bay 
Fibre Channel Infostation with 14 36GB 15,000 RPM drives. The way it is 
being explained, I think I should build a mirror with two disks for 
pg_xlog, stripe and mirror the rest, and put all my 
databases into one cluster. I might also mention that I am running 
clustering using the Red Hat Cluster Suite.

John Arbash Meinel wrote:
John Allgood wrote:
I think maybe I didn't explain myself well enough. At most we will
service 200-250 connections across all 9 databases mentioned. The
database we are building is for a trucking company. Each of the
databases represents a different division, with one master database
that everything is updated to. Most of the access to the database is
by simple queries. Most of the IO-intensive work is done when revenue
reports are generated and when we have our month/year-end processing.
All the trucking loads that are marked as delivered are transferred to
our master database during nighttime processing; all that will be
handled using custom scripts. Maybe that gives a better explanation
of the application. My biggest concern is how to partition the shared
storage for maximum performance. Is there a real benefit to having
more than one RAID 5 partition, or am I wasting my time?
Thanks
John Allgood - ESC
If you read the general advice statements, it's actually better not to
use raid5, but to use raid10 (striping plus mirroring), simply because
raid5 write performance is quite poor.
Also, if you have the disks, the next best improvement is to move
pg_xlog onto its own set of disks. I think that gets as much as 25%
improvement by itself. pg_xlog is append-only, and a commit must complete
there before the actual data gets updated, so giving it its own set of
spindles reduces seek time, and lets the log be written quickly.
I think there is even some benefit to making pg_xlog be a solid state
disk, as it doesn't have to be huge, but having high I/O rates can
remove it as a bottleneck. (I'm not positive how large pg_xlog gets, but
it is probably small compared with the total db size, and I think it can
be flushed periodically as transactions are completed.)
I'm not sure what you are considering "shared storage". Are you thinking
that all the machines will be mounting a remote drive for writing the
DB? They should all have their own local copy (multiple masters on the
same database is not supported).
I think it is possible to get better performance by having more raid
systems. But it is application dependent. If you know that you have 2
tables that are being updated often and independently, then having each
one on its own raid would allow better concurrency.
But it sounds like in your app you get concurrency by having a bunch of
remote databases, which then do bulk updates on the master database. I
think if you are careful to have each of the remote dbs update the
master at a slightly different time, you could probably get very good
transaction rates.
John
=:->


Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread John Arbash Meinel
John Allgood wrote:
This is some good info. The attached storage is a Kingston 14-bay
Fibre Channel Infostation with 14 36GB 15,000 RPM drives. The way it
is being explained, I think I should build a mirror with two disks
for pg_xlog, stripe and mirror the rest, and put all my databases
into one cluster. I might also mention that I am running clustering
using the Red Hat Cluster Suite.

So are these 14 disks supposed to be shared across all of your 9 databases?
It seems to me that you have a few architectural issues here.
First, you can't really have 2 masters writing to the same disk array.
I'm not sure if Redhat Clustering gets around this. But second is that
you can't run 2 postgres engines on the same database. Postgres doesn't
support a clustered setup. There are too many issues with concurrency and
keeping everyone in sync.
Since you seem to be okay with having a bunch of smaller localized
databases, which update a master database 1/day, I would think you would
want hardware to go something like this.
1 master server, at least dual opteron with access to lots of disks
(likely the whole 14 if you can get away with it). Put 2 as a RAID1 for
the OS, 4 as a RAID10 for pg_xlog, and then the other 8 as RAID10 for
the rest of the database.
8-9 other servers, these don't need to be as powerful, since they are
local domains. Probably a 4-disk RAID10 for the OS and pg_xlog is plenty
good, and whatever extra disks you can get for the local database.
The master database holds all information for all domains, but the other
databases only hold whatever is the local information. Every night your
script sequences through the domain databases one-by-one, updating the
master database, and synchronizing whatever data is necessary back to the
local domain. I would guess that this script could actually just
continually run, going to each local db in turn, but you may want
nighttime-only updating depending on what kind of load they have.
John
=:->




Re: [PERFORM] Problem with 7.4.5 and webmin 1.8 in grant function

2005-02-23 Thread Mark Kirkwood
[EMAIL PROTECTED] wrote:
I found the answer: in the module config of postgresql-webmin there is a
checkbox for "Use DBI to connect if available? (yes/no)". The default is
yes, but when I chose no, everything went fine.
I also tested it on the desktop machine and got the same error and the same
solution. Could you explain what happens with FC3 + postgresql and webmin 1.8?
Well, given the error was coming from the postmaster, I don't believe
that DBI or webmin have anything to do with it. What I can believe is
that DBI=yes and DBI=no are using different parameters for connecting,
therefore hitting different parts of your old (see below) pg_hba.conf
settings.
I concur with the other poster, and suspect that the files *were* using
some form of ident authentication, but have been subsequently edited to
use trust - but the postmaster has not been restarted to know this! Try
$ pg_ctl reload
to get running with 'trust'.



Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread John Allgood
Here is a summary about the cluster suite from Red Hat. All 9 databases 
will be on the primary server; the secondary server I have is the 
failover. They don't actually share the partitions at the same time: 
when you have some type of failure, the backup server takes over. Once 
you set up the hardware and install the clustering software, you then 
set up a service (i.e. postgres) and tell it what hard drive it 
will be using (e.g. /dev/sde1), and the clustering software takes care of 
starting and stopping the postgres database.

 Cluster Manager
The Cluster Manager feature of Red Hat Cluster Suite provides an 
application failover infrastructure that can be used by a wide range of 
applications, including:

   * Most custom and mainstream commercial applications
   * File and print serving
   * Databases and database applications
   * Messaging applications
   * Internet and open source applications
With Cluster Manager, these applications can be deployed in high 
availability configurations so that they are always operational—bringing 
"scale-out" capabilities to enterprise Linux deployments.

For high-volume open source applications, such as NFS, Samba, and 
Apache, Cluster Manager provides a complete ready-to-use failover 
solution. For most other applications, customers can create custom 
failover scripts using provided templates. Red Hat Professional Services 
can provide custom Cluster Manager deployment services where required.

   Features
   * Support for up to eight nodes: Allows high availability to be
 provided for multiple applications simultaneously.
   * NFS/CIFS Failover: Supports highly available file serving in Unix
 and Windows environments.
   * Fully shared storage subsystem: All cluster members have access to
 the same storage.
   * Comprehensive Data Integrity guarantees: Uses the latest I/O
 barrier technology, such as programmable power switches and
 watchdog timers.
   * SCSI and Fibre Channel support: Cluster Manager configurations can
 be deployed using latest SCSI and Fibre Channel technology.
 Multi-terabyte configurations can readily be made highly available.
   * Service failover: Cluster Manager not only ensures hardware
 shutdowns or failures are detected and recovered from
 automatically, but also will monitor your applications to ensure
 they are running correctly, and will restart them automatically if
 they fail.

John Arbash Meinel wrote:
John Allgood wrote:
This is some good info. The attached storage is a Kingston 14-bay
Fibre Channel Infostation with 14 36GB 15,000 RPM drives. The way it
is being explained, I think I should build a mirror with two disks
for pg_xlog, stripe and mirror the rest, and put all my databases
into one cluster. I might also mention that I am running clustering
using the Red Hat Cluster Suite.

So are these 14 disks supposed to be shared across all of your 9 
databases?
It seems to me that you have a few architectural issues here.

First, you can't really have 2 masters writing to the same disk array.
I'm not sure if Redhat Clustering gets around this. But second is that
you can't run 2 postgres engines on the same database. Postgres doesn't
support a clustered setup. There are too many issues with concurrency and
keeping everyone in sync.
Since you seem to be okay with having a bunch of smaller localized
databases, which update a master database 1/day, I would think you would
want hardware to go something like this.
1 master server, at least dual opteron with access to lots of disks
(likely the whole 14 if you can get away with it). Put 2 as a RAID1 for
the OS, 4 as a RAID10 for pg_xlog, and then the other 8 as RAID10 for
the rest of the database.
8-9 other servers, these don't need to be as powerful, since they are
local domains. Probably a 4-disk RAID10 for the OS and pg_xlog is plenty
good, and whatever extra disks you can get for the local database.
The master database holds all information for all domains, but the other
databases only hold whatever is the local information. Every night your
script sequences through the domain databases one-by-one, updating the
master database, and synchronizing whatever data is necessary back to the
local domain. I would guess that this script could actually just
continually run, going to each local db in turn, but you may want
nighttime-only updating depending on what kind of load they have.
John
=:->


Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread Josh Berkus
Bruno,

> For example, 150 active connections on a medium-end
> 32-bit Linux server will consume significant system resources, and 600 is
> about the limit."

That is, "is about the limit for a medium-end 32-bit Linux server". Sorry 
if the implication didn't translate well. If you use beefier hardware, of 
course, you can manage more connections; personally I've never needed more 
than 1000, even on a web site that gets 100,000 daily unique visitors.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread John Arbash Meinel
John Allgood wrote:
Here is a summary about the cluster suite from Red Hat. All 9 databases
will be on the primary server; the secondary server I have is the
failover. They don't actually share the partitions at the same time:
when you have some type of failure, the backup server takes over. Once
you set up the hardware and install the clustering software, you then
set up a service (i.e. postgres) and tell it what hard drive it
will be using (e.g. /dev/sde1), and the clustering software takes care of
starting and stopping the postgres database.
Okay, I misunderstood your hardware. So you actually only have 1
machine, with a second machine as a potential failover. But all
transactions occur on the same hardware, even if it is a separate
"database". I was thinking these were alternate machines.
So my first question is why are you partitioning into a separate
database, and then updating the master one at night. Since everything is
restricted to the same machine, why not just have everything performed
on the master?
However, sticking with your arrangement, it would seem that you might be
able to get some extra performance if each database is on its own raid,
since you are fairly likely to have 2 transactions occurring at the same
time that don't affect each other (since you wouldn't have any foreign
keys, etc. on 2 separate databases).
But I think the basic OS RAID1, pg_xlog RAID10, database RAID10 is still
a good separation of disks. And probably would help you maximize your
throughput.
I can't say too much about how the Cluster failover stuff will work with
postgres. But as long as one is completely shutdown before the next is
started, and they are both running binary compatible versions of
postgres, it seems like it would be fine. Not much different from having
a second machine that is sitting turned off, which you turn on when the
first one fails.
John
=:->




Re: [PERFORM] PostgreSQL is extremely slow on Windows

2005-02-23 Thread Neil Conway
Magnus Hagander wrote:
Yes, fsync=false is very good for bulk loading *IFF* you can live with
data loss in case you get a crash during load.
It's not merely data loss -- you could encounter potentially 
unrecoverable database corruption.

There is a TODO item about allowing the delaying of WAL writes. If we 
maintain the WAL invariant (that is, a WAL record describing a change 
must hit disk before the change itself does) but simply don't flush the 
WAL at transaction commit, we should be able to get better performance 
without the risk of database corruption (so we would need to keep pages 
modified by the committed transaction pinned in memory until the WAL has 
been flushed, which might be done on a periodic basis).

Naturally, there is a risk of losing data in the period between 
transaction commit and syncing the WAL, but no risk of database 
corruption. This seems a reasonable approach to providing better 
performance for people who don't need the strict guarantees provided by 
fsync=true.

-Neil


Re: [PERFORM] PostgreSQL is extremely slow on Windows

2005-02-23 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> There is a TODO item about allowing the delaying of WAL writes. If we 
> maintain the WAL invariant (that is, a WAL record describing a change 
> must hit disk before the change itself does) but simply don't flush the 
> WAL at transaction commit, we should be able to get better performance 
> without the risk of database corruption (so we would need to keep pages 
> modified by the committed transaction pinned in memory until the WAL has 
> been flushed, which might be done on a periodic basis).

That interlock already exists, in the form of the bufmgr LSN logic.

I think this "feature" might be as simple as

XLogFlush(recptr);

becomes

	/* Must flush if we are deleting files... */
	if (PerCommitFlush || nrels > 0)
		XLogFlush(recptr);

in RecordTransactionCommit.

regards, tom lane



Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread Christopher Browne
[EMAIL PROTECTED] ("Bruno Almeida do Lago") wrote:
> Is there a real limit for max_connections? Here we have an Oracle server with
> up to 1200 simultaneous connections on it!
>
> "max_connections: exactly like previous versions, this needs to be set to
> the actual number of simultaneous connections you expect to need. High
> settings will require more shared memory (shared_buffers). As the
> per-connection overhead, both from PostgreSQL and the host OS, can be quite
> high, it's important to use connection pooling if you need to service a
> large number of users. For example, 150 active connections on a medium-end
> 32-bit Linux server will consume significant system resources, and 600 is
> about the limit."

Right now, I have an Opteron box with:
 a) A load average of about 0.1, possibly less ;-), and
 b) 570 concurrent connections.

Having so many connections is something of a "fool's errand," as it really
is ludicrously unnecessary, but I wouldn't be too afraid of having
1000 connections on that box, as long as they're being used for
relatively small transactions.

You can, of course, kill performance on any not-outrageously-large
system if a few of those users are doing big queries...
-- 
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','gmail.com').
http://cbbrowne.com/info/slony.html
I've had a perfectly wonderful evening. But this wasn't it.
-- Groucho Marx
