[PERFORM] how to partition disks

2006-06-14 Thread hubert depesz lubaczewski
hi
i'm waiting for new server to arrive.
for the server i will get hp msa1000, with 14 discs (72g, 15000 rpm).
what kind of partitioning you suggest, to get maximum performance?
for system things i will have separate discs, so whole array is only for postgresql.
data processing is oltp, but with large amounts of write operations.
any hints?
should i go with 3 partitions, or just 1? or 2?

depesz
-- 
http://www.depesz.com/ - nowy, lepszy depesz


Re: [PERFORM] how to partition disks

2006-06-14 Thread Sven Geisler

Hi Hubert,

hubert depesz lubaczewski schrieb:

hi
i'm waiting for new server to arrive.
for the server i will get hp msa1000, with 14 discs (72g, 15000 rpm).
what kind of partitioning you suggest, to get maximum performance?
for system things i will have separate discs, so whole array is only for 
postgresql.


data processing is oltp, but with large amounts of write operations.

any hints?

should i go with 3 partitions, or just 1? or 2?



You should configure your discs to RAID 10 volumes.
You should set up a separate volume for WAL.
A volume for an additional table space may also be useful.

In your case I would do 2 partitions:

1. RAID 10 with 8 discs for general data
2. RAID 10 with 4 discs for WAL
(two discs as spares)

You may split the first volume into two volumes for a second table space 
if your server doesn't have enough RAM and you have high disk read I/O.
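To sketch what that looks like on the PostgreSQL side (mount points below are
only examples): WAL can be moved to its own volume with a symlink, and the
additional table space is created with CREATE TABLESPACE (8.0 or later).

    # stop the server first; /vol_wal is an example mount point
    mv $PGDATA/pg_xlog /vol_wal/pg_xlog
    ln -s /vol_wal/pg_xlog $PGDATA/pg_xlog

    -- an additional table space on its own volume (path is an example)
    CREATE TABLESPACE data2 LOCATION '/vol_data2/pgdata';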



Cheers
Sven.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[PERFORM] Precomputed constants?

2006-06-14 Thread Böszörményi Zoltán
Hi,

here's my problem:

# explain analyze select * from mxstrpartsbg where szam =
round(80*random())::integer;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on mxstrpartsbg  (cost=0.00..56875.04 rows=1 width=322) (actual
time=190.748..1271.664 rows=1 loops=1)
   Filter: (szam = (round((80::double precision * random())))::integer)
 Total runtime: 1271.785 ms
(3 rows)

# explain analyze select * from mxstrpartsbg where szam = 671478; 
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using mxstrpartsbg_pkey on mxstrpartsbg  (cost=0.00..5.87
rows=1 width=322) (actual time=71.642..71.644 rows=1 loops=1)
   Index Cond: (szam = 671478)
 Total runtime: 71.706 ms
(3 rows)

Is there a way to have PostgreSQL pre-compute all the constants in the
WHERE clause? It would be a huge performance gain. Thanks in advance.

Best regards,
Zoltán Böszörményi


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Precomputed constants?

2006-06-14 Thread Volkan YAZICI
On Jun 14 12:53, Böszörményi Zoltán wrote:
> # explain analyze select * from mxstrpartsbg where szam =
> round(80*random())::integer;

AFAIK, you can use something like this:

SELECT * FROM mxstrpartsbg
  WHERE szam = (SELECT round(80*random())::integer OFFSET 0);

This will prevent calculation of round() for every row.


Regards.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] how to partition disks

2006-06-14 Thread hubert depesz lubaczewski
On 6/14/06, Sven Geisler <[EMAIL PROTECTED]> wrote:
> You should configure your discs to RAID 10 volumes.
> You should set up a separate volume for WAL.
> A volume for an additional table space may also be useful.
> In your case I would do 2 partitions:
> 1. RAID 10 with 8 discs for general data

raid 10 is of course not questionable. but are you sure that it will work
faster than for example:

2 discs (raid 1) for xlog
6 discs (raid 10) for tables
6 discs (raid 10) for indices?

depesz
-- 
http://www.depesz.com/ - nowy, lepszy depesz


Re: [PERFORM] Precomputed constants?

2006-06-14 Thread Böszörményi Zoltán
> On Jun 14 12:53, Böszörményi Zoltán wrote:
>> # explain analyze select * from mxstrpartsbg where szam =
>> round(80*random())::integer;
>
> AFAIK, you can use sth like that:
>
> SELECT * FROM mxstrpartsbg
>   WHERE szam = (SELECT round(80*random())::integer OFFSET 0);
>
> This will prevent calculation of round() for every row.
>
> Regards.

Thanks, it worked.

Oh, I see now. It makes sense: random() isn't a constant and
it was computed for every row. Actually running the query produces
different result sets with 0, 1 or 2 rows.

Replacing random() with a true constant gives me an index scan
even if it's hidden inside other function calls. E.g.:

# explain analyze select * from mxstrpartsbg where szam =
round('80.71'::decimal(10,2))::integer;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Index Scan using mxstrpartsbg_pkey on mxstrpartsbg  (cost=0.00..5.87
rows=1 width=322) (actual time=0.020..0.022 rows=1 loops=1)
   Index Cond: (szam = 81)
 Total runtime: 0.082 ms
(3 rows)

Best regards,
Zoltán Böszörményi


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] how to partition disks

2006-06-14 Thread Sven Geisler

Hi Hubert,

hubert depesz lubaczewski schrieb:
On 6/14/06, *Sven Geisler* <[EMAIL PROTECTED]> wrote:

You should configure your discs to RAID 10 volumes.
You should set up a separate volume for WAL.
A volume for an additional table space may also be useful.
In your case I would do 2 partitions:
1. RAID 10 with 8 discs for general data


raid 10 is of course not questionable. but are you sure that it will 
work faster than for example:

2 discs (raid 1) for xlog
6 discs (raid 10) for tables
6 discs (raid 10) for indices?



This depends on your application. Do you have a lot of disc reads?
Anyhow, I would always put the xlog on a RAID 10 volume because most of 
the I/O for updates and inserts goes to the xlog.


4 discs xlog
6 discs tables
4 discs tables2

This should be better. You should distribute indices on separate spindle 
stacks to share the I/O. But again this depends on your application and 
your server. How are the indices used? How large is your file system 
cache? What does PostgreSQL effectively read from disc?


Don't forget to tune your postgresql.conf:
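For a write-heavy load the usual suspects are the memory and checkpoint
settings; the values below are only illustrative and have to be sized to the
machine's actual RAM:

    shared_buffers = 50000          # example value
    effective_cache_size = 200000   # example value
    wal_buffers = 64
    checkpoint_segments = 64        # larger values smooth out write bursts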



Cheers
Sven.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Confirmation of bad query plan generated by 7.4

2006-06-14 Thread Jim C. Nasby
On Tue, Jun 13, 2006 at 09:50:49PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> > On Tue, Jun 13, 2006 at 06:04:42PM -0400, Tom Lane wrote:
> >> It'd depend on the context, possibly, but it's easy to show that the
> >> current planner does fold "now() - interval_constant" when making
> >> estimates.  Simple example:
> 
> > Turns out the difference is between feeding a date vs a timestamp into the
> > query... I would have thought that since date is a date that the WHERE 
> > clause
> > would be casted to a date if it was a timestamptz, but I guess not...
> 
> Hmm ... worksforme.  Could you provide a complete test case?

I can't provide the data I used for that, but I'll try and come up with
something else.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] OT - select + must have from - sql standard syntax?

2006-06-14 Thread Dave Dutcher
> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Greg Stark
> Sent: Tuesday, June 13, 2006 11:16 PM
> Subject: Re: [PERFORM] OT - select + must have from - sql 
> standard syntax?
[SNIP]
> 
> Well you could always create a "dual", it was always just a 
> regular table. We
> used to joke about what would happen to Oracle if you 
> inserted an extra row in
> it...


I've never used Oracle, so I don't understand why it's called dual when
it only has one row.  Shouldn't it be called single?  :\


Dave


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] OT - select + must have from - sql standard syntax?

2006-06-14 Thread Tom Lane
Antoine <[EMAIL PROTECTED]> writes:
> ps. shame the standard isn't "freely" consultable to save you guys
> silly OT questions!

You can get free "draft" versions that are close-enough-to-final to be
perfectly usable.  See our developers' FAQ for some links.  I like the
drafts partly because they're plain ASCII, and so far easier to search
than PDFs ...

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] how to partition disks

2006-06-14 Thread Richard Broersma Jr
> > raid 10 is of course not questionable. but are you sure that it will 
> > work faster than for example:
> > 2 discs (raid 1) for xlog
> > 6 discs (raid 10) for tables
> > 6 discs (raid 10) for indices?
> > 
> 
> This depends on your application. Do you have a lot of disc reads?
> Anyhow, I would put the xlog always to a RAID 10 volume because most of 
> the I/O for update and inserts is going to the xlog.
> 
> 4 discs xlog
> 6 discs tables
> 4 discs tables2

I have a question in regard to the I/O bandwidth of various RAID
configurations.  Primarily, do the above suggested RAID partitions imply
that multiple (smaller) disk arrays have the potential for more I/O
bandwidth than one larger RAID 10 array?

Regards,

Richard



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] how to partition disks

2006-06-14 Thread Sven Geisler

Hi Richard,

Richard Broersma Jr schrieb:

This depends on your application. Do you have a lot of disc reads?
Anyhow, I would put the xlog always to a RAID 10 volume because most of 
the I/O for update and inserts is going to the xlog.


4 discs xlog
6 discs tables
4 discs tables2


I have a question in regards to I/O bandwidths of various raid configuration.  
Primary, does the
above suggested raid partitions imply that multiple (smaller) disk arrays have 
a potential for
more I/O bandwidth than a larger raid 10 array?


Yes.
Because the disc arms don't need to reposition as much as they would 
with one large volume.


For example, you run two queries with two clients and each query needs 
to read some indices from disk. In this case it is more efficient to read 
from different volumes than from one large volume where the disc arms 
have to jump.


Sven.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Confirmation of bad query plan generated by 7.4

2006-06-14 Thread Shaun Thomas
>>> On 6/13/2006 at 9:13 PM, Tom Lane <[EMAIL PROTECTED]> wrote:

> Read what you wrote, and rethink...

Hah.  Yes, I understand the irony of that statement, but the point is
that the value of the variable won't change during query execution.

> If you're desperate you can construct a query string with the variable
> value embedded as a literal, and then EXECUTE that.  This isn't a great
> solution since it forces a re-plan on every execution.

That's so gross... but it might work.  I'm not really desperate, just
frustrated.  I really can't wait until we can upgrade; 7.4 is driving
me nuts.  I'm not really worried about a re-plan, since this SP just
updates a fact table, so it only gets called twice a day.  Cutting the
execution time of the SP down to < 20 seconds from 15 minutes would be
nice, but not absolutely required.  I was just surprised at the large
difference in manual execution as opposed to the SP with the same
query.
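For reference, the workaround described above looks roughly like this in a
7.4-era plpgsql function; the table, column and variable names here are
invented for illustration:

    CREATE OR REPLACE FUNCTION refresh_fact(date) RETURNS void AS '
    DECLARE
        v_cutoff ALIAS FOR $1;
    BEGIN
        -- embed the value as a literal so the planner sees a real constant
        EXECUTE ''UPDATE daily_fact SET processed = true ''
             || ''WHERE load_date >= '' || quote_literal(v_cutoff);
        RETURN;
    END;
    ' LANGUAGE plpgsql;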

> We've occasionally debated ways to do it better, but no such
> improvement will ever appear in 7.4 ;-)

Agreed!  When we finally upgrade, I fully plan on putting a symbolic
bullet into our old installation. ;)

Thanks!


-- 

Shaun Thomas
Database Administrator

Leapfrog Online 
807 Greenwood Street 
Evanston, IL 60201 
Tel. 847-440-8253
Fax. 847-570-5750
www.leapfrogonline.com


Confidentiality Note:

The document(s) accompanying this e-mail transmission, if any, and the
e-mail transmittal message contain information from Leapfrog Online
Customer Acquisition, LLC is confidential or privileged. The information
is intended to be for the use of the individual(s) or entity(ies) named
on this e-mail transmission message. If you are not the intended
recipient, be aware that any disclosure, copying, distribution or use of
the contents of this e-mail is prohibited. If you have received this
e-mail in error, please immediately delete this e-mail and notify us by
telephone of the error

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Which processor runs better for Postgresql?

2006-06-14 Thread Steve Poe
Dave, Joshua, Scott (and all),

Thanks for your feedback; while I do appreciate it, I did not intend to
make this discussion "buy this instead"... I whole-heartedly agree with
you. Joshua, you made the best comment: it is a business decision for
the client. I don't agree with it, but I understand it. I've recommended
Sun or Penguin Computing, which I've had no technical issues with. They
did not dispute my recommendation but they ignored it. I have not liked
Dell, on the server side, since the 1998 - 2000 time period.

Excluding Dell's issues, has anyone seen performance differences between
AMD's Opterons and Intel's new Xeons (dual or quad CPU or dual-core)? If
anyone has done benchmark comparisons between them, any summary
information would be appreciated.

For now, I am asking the client to hold-off and wait for the AMD Opteron
availability on the Dell servers.

Thanks again.

Steve




---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread John E. Vincent
-- this is the third time I've tried sending this and I never saw it get 
through to the list. Sorry if multiple copies show up.


Hi all,

I've been lurking using the web archives for a while and haven't found 
anything that answers my questions about pg_dump.


We have a 206GB data warehouse running on version 8.0.3. The server is 
somewhat underpowered in terms of CPU: (1) 2.8 GHz Xeon 4GB Ram and a 
single HBA to our SAN (IBM DS4300). We are in the process of migrating to a 
new server that we've repurposed from our production OLTP database (8) 
2.0 GHz Xeon, 16GB Ram and dual HBAs to the same SAN running version 8.1.


Independent of that move, we still need to get by on the old system and 
I'm concerned that even on the new system, pg_dump will still perform 
poorly. I can't do a full test because we're also taking advantage of 
the table partitioning in 8.1 so we're not doing a dump and restore.


We backup the database using:

pg_dump -Fc -cv ${CURDB} > ${BACKDIR}/${CURDB}-${DATE}.bak

There are three different LUNs allocated to the old warehouse on the SAN - 
data, wal and a dump area for the backups. The SAN has two controllers 
(only 128MB of cache per) and the data is on one controller while the 
WAL and dump area are on the other. Still a single HBA though.


Creating the compressed backup of this database takes 12 hours. We start 
at 6PM and it's done a little after 1AM, just in time for the next day's 
load. The load itself takes about 5 hours.


I've watched the backup process and I/O is not a problem. Memory isn't a 
problem either. It seems that we're CPU bound but NOT in I/O wait. The 
server is a dedicated PGSQL box.


Here are our settings from the conf file:

maintenance_work_mem = 524288
work_mem = 1048576 ( I know this is high but you should see some of our 
sorts and aggregates)

shared_buffers = 5
effective_cache_size = 45
wal_buffers = 64
checkpoint_segments = 256
checkpoint_timeout = 3600

We're inserting around 3mil rows a night if you count staging, info, dim 
and fact tables. The vacuum issue is a whole other problem but right now 
I'm concerned about just the backup on the current hardware.


I've got some space to burn so I could go to an uncompressed backup and 
compress it later during the day.


If there are any tips anyone can provide I would greatly appreciate it. 
I know that the COPY performance was bumped up in 8.1 but I'm stuck on 
this 8.0 box for a while longer.


Thanks,
John E. Vincent

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] how to partition disks

2006-06-14 Thread Jim C. Nasby
On Wed, Jun 14, 2006 at 04:32:23PM +0200, Sven Geisler wrote:
> Hi Richard,
> 
> Richard Broersma Jr schrieb:
> >>This depends on your application. Do you have a lot of disc reads?
> >>Anyhow, I would put the xlog always to a RAID 10 volume because most of 
> >>the I/O for update and inserts is going to the xlog.
> >>
> >>4 discs xlog
> >>6 discs tables
> >>4 discs tables2
> >
> >I have a question in regards to I/O bandwidths of various raid 
> >configuration.  Primary, does the
> >above suggested raid partitions imply that multiple (smaller) disk arrays 
> >have a potential for
> >more I/O bandwidth than a larger raid 10 array?
> 
> Yes.
> Because the disc arms don't need to reposition as much as they would 
> with one large volume.
> 
> For example, You run two queries with two clients and each queries needs 
> to read some indices from disk. In this case it more efficient to read 
> from different volumes than to read from one large volume where the disc 
> arms has to jump.

But keep in mind that all of that is only true if you have very good
knowledge of how your data will be accessed. If you don't know that,
you'll almost certainly be better off just piling everything into one
RAID array and letting the controller deal with it.

Also, if you have a good RAID controller that's battery-backed,
separating pg_xlog onto its own array is much less likely to be a win.
The reason you normally put pg_xlog on its own partition is because the
database has to fsync pg_xlog *at every single commit*. This means you
absolutely want that fsync to be as fast as possible. But with a good,
battery-backed controller, this no longer matters. The fsync is only
going to push the data into the controller, and the controller will take
things from there. That means it's far less important to put pg_xlog on
its own array. I actually asked about this recently and one person did
reply that they'd done testing and found it was better to just put all
their drives into one array so they weren't wasting bandwidth on the
pg_xlog drives.

Even if you do decide to keep pg_xlog separate, a 4 drive RAID10 for
that is overkill. It will be next to impossible for you to generate
enough WAL traffic to warrant it.

Your best bet is to perform testing with your application. That's the
only way you'll truly find out what's going to work best. Short of
that, your best bet is to just pile all the drives together. If you do
testing, I'd start first with the effect of a separate pg_xlog. Only
after you have those results would I consider trying to do things like
split indexes from tables, etc. 

BTW, you should consider reserving some of the drives in the array as
hot spares.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Precomputed constants?

2006-06-14 Thread Jim C. Nasby
On Wed, Jun 14, 2006 at 01:30:10PM +0200, Böszörményi Zoltán wrote:
> Replacing random() with a true constant gives me index scan
> even if it's hidden inside other function calls. E.g.:

The database has no choice but to compute random() for every row; it's
marked VOLATILE.
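The volatility classification of any function, built-in or user-defined, can
be looked up in the pg_proc system catalog; for example:

    -- 'i' = immutable, 's' = stable, 'v' = volatile
    SELECT proname, provolatile
      FROM pg_proc
     WHERE proname IN ('random', 'now', 'timeofday');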
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread Tom Lane
"John E. Vincent" <[EMAIL PROTECTED]> writes:
> I've watched the backup process and I/O is not a problem. Memory isn't a 
> problem either. It seems that we're CPU bound but NOT in I/O wait.

Is it the pg_dump process, or the connected backend, that's chewing the
bulk of the CPU time?  (This should be pretty obvious in "top".)

If it's the pg_dump process, the bulk of the CPU time is likely going
into compression --- you might consider backing off the compression
level, perhaps --compress=1 or even 0 if size of the dump file isn't
a big concern.
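For example, mirroring the original command with a lighter compression level
(paths and variables taken from the original script):

    pg_dump -Fc --compress=1 -cv ${CURDB} > ${BACKDIR}/${CURDB}-${DATE}.bak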

Another possibility if your LAN is reasonably fast is to run pg_dump on
a different machine, so that you can put two CPUs to work.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread Scott Marlowe
On Wed, 2006-06-14 at 09:47, John E. Vincent wrote:
> -- this is the third time I've tried sending this and I never saw it get 
> through to the list. Sorry if multiple copies show up.
> 
> Hi all,

BUNCHES SNIPPED

> work_mem = 1048576 ( I know this is high but you should see some of our 
> sorts and aggregates)

Ummm.  That's REALLY high.  You might want to consider lowering the
global value here, and then crank it up on a case by case basis, like
during nighttime report generation.  Just one or two queries could
theoretically run your machine out of memory right now.  Just put a "set
work_mem=100" in your script before the big query runs.

> We're inserting around 3mil rows a night if you count staging, info, dim 
> and fact tables. The vacuum issue is a whole other problem but right now 
> I'm concerned about just the backup on the current hardware.
> 
> I've got some space to burn so I could go to an uncompressed backup and 
> compress it later during the day.

That's exactly what we do.  We just do a normal backup, and have a
script that gzips anything in the backup directory that doesn't end in
.gz...  If you've got space to burn, as you say, then use it at least a
few days to see how it affects backup speeds.
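A script along these lines would do it (the backup directory is an example
path):

    #!/bin/sh
    # compress anything in the backup directory that isn't gzipped yet
    BACKDIR=/mnt/backups
    find "$BACKDIR" -type f ! -name '*.gz' -exec gzip {} \;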

Seeing as how you're CPU bound, most likely the problem is just the
compressed backup.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Solaris shared_buffers anomaly?

2006-06-14 Thread Mischa Sandberg

Tom Lane wrote:

Mischa Sandberg <[EMAIL PROTECTED]> writes:



Tom Lane wrote:

Does Solaris have any call that allows locking a shmem segment in RAM?

Yes, mlock(). But want to understand what's going on before patching.


Sure, but testing it with mlock() might help you understand what's going
on, by eliminating one variable: we don't really know if the shmem is
getting swapped, or something else.



For a dedicated DB server machine, Solaris has a feature:
create "intimate" shared memory with shmat(..., SHM_SHARE_MMU).
All backends share the same TLB entries (!). 


We use that already.  (Hmm, might be interesting for you to turn it
*off* and see if anything changes.  See src/backend/port/sysv_shmem.c.)


Gah. Always must remember to RTFSource.
And reproduce the problem on a machine I control :-)

--
Engineers think that equations approximate reality.
Physicists think that reality approximates the equations.
Mathematicians never make the connection.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Which processor runs better for Postgresql?

2006-06-14 Thread Scott Marlowe
On Tue, 2006-06-13 at 15:11, Dave Page wrote:
>  
> > -Original Message-

> And how old are the 2600's now?
> 
> Anyhoo, I'm not saying the current machines are excellent performers or
> anything, but there are good business reasons to run them if you don't
> need to squeeze out every last pony.

Just thought I'd point you to Dell's forums.

http://forums.us.dell.com/supportforums/board?board.id=pes_linux&page=1

wherein you'll find plenty of folks who have problems with freezing RAID
controllers with 28xx and 18xx machines.  

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread John Vincent
On 6/14/06, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> On Wed, 2006-06-14 at 09:47, John E. Vincent wrote:
> > -- this is the third time I've tried sending this and I never saw it get
> > through to the list. Sorry if multiple copies show up.
> >
> > Hi all,
>
> BUNCHES SNIPPED
>
> > work_mem = 1048576 ( I know this is high but you should see some of our
> > sorts and aggregates)
>
> Ummm.  That's REALLY high.  You might want to consider lowering the
> global value here, and then crank it up on a case by case basis, like
> during nighttime report generation.  Just one or two queries could
> theoretically run your machine out of memory right now.  Just put a "set
> work_mem=100" in your script before the big query runs.

I know it is but that's what we need for some of our queries. Our ETL tool
(informatica) and BI tool (actuate) won't let us set those things as part of
our jobs. We need it for those purposes. We have some really nasty queries
that will be fixed in our new server.

E.g. we have a table called loan_account_agg_fact that has 200+ million rows
and it contains every possible combination of late status for a customer
account (i.e. 1 day late, 2 days late, 3 days late) so it gets inserted for
new customers but updated for existing records as part of our warehouse load.
Part of the new layout is combining late ranges, so instead of number of days
we have a range of days (i.e. 1-15, 16-30). Even with work_mem that large,
the load of that loan_account_agg_fact table creates over 3GB of temp tables!

> That's exactly what we do.  We just do a normal backup, and have a
> script that gzips anything in the backup directory that doesn't end in
> .gz...  If you've got space to burn, as you say, then use it at least a
> few days to see how it affects backup speeds.
>
> Seeing as how you're CPU bound, most likely the problem is just the
> compressed backup.

I'm starting to think the same thing. I'll see how this COPY I'm doing of the
single largest table does right now and make some judgement based on that.

-- 
John E. Vincent


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread A.M.
On Wed, June 14, 2006 1:04 pm, John Vincent wrote:

> I know it is but that's what we need for some of our queries. Our ETL
> tool (informatica) and BI tool (actuate) won't let us set those things as
> part of our jobs. We need it for those purposes. We have some really nasty
> queries that will be fixed in our new server.

You could modify pgpool to insert the necessary set commands and point the
tools at pgpool.

-M


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread John Vincent
Out of curiosity, does anyone have any idea what the ratio of actual
datasize to backup size is if I use the custom format with -Z 0 compression
or the tar format? Thanks.

On 6/14/06, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> On Wed, 2006-06-14 at 09:47, John E. Vincent wrote:
> > -- this is the third time I've tried sending this and I never saw it get
> > through to the list. Sorry if multiple copies show up.
> >
> > Hi all,
>
> BUNCHES SNIPPED
>
> > work_mem = 1048576 ( I know this is high but you should see some of our
> > sorts and aggregates)
>
> Ummm.  That's REALLY high.  You might want to consider lowering the
> global value here, and then crank it up on a case by case basis, like
> during nighttime report generation.  Just one or two queries could
> theoretically run your machine out of memory right now.  Just put a "set
> work_mem=100" in your script before the big query runs.
>
> > We're inserting around 3mil rows a night if you count staging, info, dim
> > and fact tables. The vacuum issue is a whole other problem but right now
> > I'm concerned about just the backup on the current hardware.
> >
> > I've got some space to burn so I could go to an uncompressed backup and
> > compress it later during the day.
>
> That's exactly what we do.  We just do a normal backup, and have a
> script that gzips anything in the backup directory that doesn't end in
> .gz...  If you've got space to burn, as you say, then use it at least a
> few days to see how it affects backup speeds.
>
> Seeing as how you're CPU bound, most likely the problem is just the
> compressed backup.

-- 
John E. Vincent
[EMAIL PROTECTED]


Re: [PERFORM] Which processor runs better for Postgresql?

2006-06-14 Thread Dave Page
 

  
  
> From: Scott Marlowe [mailto:[EMAIL PROTECTED] 
> Sent: 14 June 2006 18:04
> To: Dave Page
> Cc: Joshua D. Drake; [EMAIL PROTECTED]; pgsql-performance@postgresql.org
> Subject: RE: [PERFORM] Which processor runs better for Postgresql?
>
> On Tue, 2006-06-13 at 15:11, Dave Page wrote:
> > > -Original Message-
> > And how old are the 2600's now?
> >
> > Anyhoo, I'm not saying the current machines are excellent performers or
> > anything, but there are good business reasons to run them if you don't
> > need to squeeze out every last pony.
>
> Just thought I'd point you to Dell's forums.
>
> http://forums.us.dell.com/supportforums/board?board.id=pes_linux&page=1
>
> wherein you'll find plenty of folks who have problems with freezing RAID
> controllers with 28xx and 18xx machines.

Never had any such problems in the dozen or so machines we run (about a
50-50 split of Linux to Windows).

Regards, Dave


Re: [PERFORM] OT - select + must have from - sql standard syntax?

2006-06-14 Thread Antoine

You can get free "draft" versions that are close-enough-to-final to be
perfectly usable.  See our developers' FAQ for some links.  I like the
drafts partly because they're plain ASCII, and so far easier to search
than PDFs ...


Great to know - thanks!
Cheers
Antoine

--
This is where I should put some witty comment.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[PERFORM] Postgres consuming way too much memory???

2006-06-14 Thread jody brownell
I have a box with an app and postgresql on it. Hardware includes 2 2.8 GHz 
Xeons with 512KB cache, 4 GB of memory, 6 scsi disks in a software 
raid 5 on a trustix 2.2 with a 2.6.15.3 kernel. The data and indexes are on the 
raid array while the tx log is on disk 
with the OS. All is well.

The one application executes one transaction every 60 seconds or so. The 
transaction can range from tiny 
to relatively large. Maybe 30-70k inserts, 60-100k updates... nothing too 
heavy, taking about 8-12 seconds 
to finish the entire update in the worst case. The application is using the 
latest jdbc I am using 
preparedStatements with addBatch/executebatch/clearBatch to send statements in 
batches of 10 thousand... 
(is that high?)

The box itself is a little over subscribed for memory which is causing us to 
swap a bit... As the 
application runs, I notice the postgres process which handles this particular 
app connection grows in memory seemingly 
uncontrollably until kaboom. Once the kernel kills off enough processes and 
the system settles, I see the postgres process is at 1.9GB 
of res memory and 77MB of shared memory. This challenges a number of 
assumptions I have made in the last while and raises a 
few questions... BTW, I am assuming this is not a memory leak b/c the same 
install of our software on a box 
with 8GB of memory and no swap being used has no unexplained growth in the 
memory... it is perfectly healthy 
and quite performant.

Anyway, due to errors in the transaction, it is rolled back, after which the 
postgres process remains at 901MB of 
resident memory and 91MB of shared memory.

27116 postgres  15   0 1515m 901m  91m S  0.0 22.9  18:33.96 postgres: qradar 
qradar :::x.x.x.x(51149) idle

There are a few things I would like to understand. 

- What in postgres will grow at an uncontrolled rate when the system is 
under heavy load or the transaction 
  is larger... there must be something not governed by the shared memory or 
other configuration in postgresql.conf. 
  It seems like, once we start hitting swap, postgres grows in memory resulting 
in more swapping... until applications 
  start getting killed.
- when the transaction was rolled back why did the process hold onto the 901MB 
of memory? 
- when is a transaction too big? is this determined by the configuration and 
performance of wal_buffers and wal log or is there 
  house cleaning which MUST be done at commit/rollback to avoid situations like 
this thus indicating there is an upper bound.

I have been configuring postgres from tidbits I collected reading this list in 
the last few months; 
not sure if what I have is totally right for the workload, but when I have 
adequate memory and avoid swap, we are more than 
happy with performance. Configuration which is not below is just the default.

shared_buffers = 32767
work_mem = 20480
maintenance_work_mem = 32768
max_fsm_pages = 4024000
max_fsm_relations = 2000
fsync = false
wal_sync_method = fsync
wal_buffers = 4096
checkpoint_segments = 32
checkpoint_timeout = 1200
checkpoint_warning = 60
commit_delay = 5000
commit_siblings = 5
effective_cache_size = 175000
random_page_cost = 2
autovacuum = true
autovacuum_naptime = 60
autovacuum_vacuum_threshold = 500
autovacuum_analyze_threshold = 250
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay=100
autovacuum_vacuum_cost_limit=100
default_statistics_target = 40

Is there anything here which looks weird or misconfigured? I am just starting 
to play with the bg writer configuration so I did not include it.
Typically, there is little or no iowait... and no reason to think there is 
something misconfigured... from what I have seen.

In one transaction I have seen as many as 5 checkpoint_segments be created/used 
so I was considering increasing wal_buffers to 8192 from 4096 
given as many as 4 segments in memory/cache at once... need to test this though 


Anyone have any thoughts on what could have caused the bloat? 

thanks

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Which processor runs better for Postgresql?

2006-06-14 Thread Scott Marlowe
On Wed, 2006-06-14 at 13:43, Dave Page wrote:
>  

> 
> Just thought I'd point you to Dell's forums.
> 
> 
> http://forums.us.dell.com/supportforums/board?board.id=pes_linux&page=1
> 
> wherein you'll find plenty of folks who have problems with
> freezing RAID
> controllers with 28xx and 18xx machines. 
>  
> 
> Never had any such problems in the dozen or so machines we run
> (about a 50-50 split of Linux to Windows). 
> 
> Regards, Dave
> 

Yeah, We've got a mix of 2650 and 2850s, and our 2850s have been rock
solid stable, unlike the 2650s.  I was actually kinda surprised to see
how many people have problems with the 2850s.

Apparently, the 2850 mobos have a built-in RAID that's pretty stable
(it's got a PERC number I can't remember), but ordering them with an add-on
Perc RAID controller appears to make them somewhat unstable as well.

One recommendation I've seen repeatedly is to use the --noapic option at
boot time.

Just FYI

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Which processor runs better for Postgresql?

2006-06-14 Thread Dave Page
 

> -Original Message-
> From: Scott Marlowe [mailto:[EMAIL PROTECTED] 
> Sent: 14 June 2006 20:52
> To: Dave Page
> Cc: Joshua D. Drake; [EMAIL PROTECTED]; 
> pgsql-performance@postgresql.org
> Subject: RE: [PERFORM] Which processor runs better for Postgresql?
> 
> 
> Yeah, We've got a mix of 2650 and 2850s, and our 2850s have been rock
> solid stable, unlike the 2650s.  I was actually kinda surprised to see
> how many people have problems with the 2850s.
> 
> Apparently, the 2850 mobos have a built in RAID that's pretty stable
> (it's got a PERC number I can't remembeR), but ordering them 
> with an add
> on Perc RAID controller appears to make them somewhat 
> unstable as well.

That might be it - we always chose the onboard PERC because it has twice
the cache of the other options. 

Regards, Dave.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Postgres consuming way too much memory???

2006-06-14 Thread Tom Lane
"jody brownell" <[EMAIL PROTECTED]> writes:
> 27116 postgres  15   0 1515m 901m  91m S  0.0 22.9  18:33.96 postgres: qradar 
> qradar :::x.x.x.x(51149) idle

This looks like a memory leak, but you haven't provided enough info to
let someone else reproduce it.  Can you log what your application is
doing and extract a test case?  What PG version is this, anyway?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread Scott Marlowe
On Wed, 2006-06-14 at 12:04, John Vincent wrote:
> 
> On 6/14/06, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> On Wed, 2006-06-14 at 09:47, John E. Vincent wrote:
> > -- this is the third time I've tried sending this and I
> never saw it get
> > through to the list. Sorry if multiple copies show up.
> >
> > Hi all,
> 
> BUNCHES SNIPPED
> 
> > work_mem = 1048576 ( I know this is high but you should see
> some of our
> > sorts and aggregates)
> 
> Ummm.  That's REALLY high.  You might want to consider
> lowering the
> global value here, and then crank it up on a case by case
> basis, like 
> during nighttime report generation.  Just one or two queries
> could
> theoretically run your machine out of memory right now.  Just
> put a "set
> work_mem=100" in your script before the big query runs.
> 
> 
> I know it is but that's what we need for some of our queries. Our ETL
> tool (informatica) and BI tool (actuate) won't let us set those things
> as part of our jobs. We need it for those purposes. We have some
> really nasty queries that will be fixed in our new server. 

Description of "Queries gone wild" redacted.  hehe.

Yeah, I've seen those kinds of queries before too.  you might be able to
limit your exposure by using alter user:

alter user userwhoneedslotsofworkmem set work_mem=100;

and then only that user will have that big of a default.  You could even
make it so that only queries that need that much log in as that user,
and all other queries log in as other folks.  Just a thought.  I just
get REAL nervous seeing a production machine with a work_mem set that
high.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread John Vincent
On 6/14/06, Scott Marlowe <[EMAIL PROTECTED]> wrote:
Description of "Queries gone wild" redacted.  hehe.Yeah, I've seen those kinds of queries before too.  you might be able tolimit your exposure by using alter user:alter user userwhoneedslotsofworkmem set work_mem=100;
Is this applicable on  8.0? We were actually LOOKING for a governor of some sort for these queries.  And something that is not explicitly stated, is that allocated up front or is that just a ceiling?
and then only that user will have that big of a default.  You could evenmake it so that only queries that need that much log in as that user,
and all other queries log in as other folks.  Just a thought.  I justget REAL nervous seeing a production machine with a work_mem set thathigh.Which is actually how it's configured. We have a dedicated user connecting from  Actuate. The reports developers use thier own logins when developing new reports. Only when they get published do they convert to the Actuate user.
-- John E. Vincent[EMAIL PROTECTED]


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread Scott Marlowe
How long does gzip take to compress this backup?

On Wed, 2006-06-14 at 15:59, John Vincent wrote:
> Okay I did another test dumping using the uncompressed backup on the
> system unloaded and the time dropped down to 8m for the backup.
> There's still the size issue to contend with but as I said, I've got a
> fair bit of space left on the SAN to work with. 
> 
> On 6/14/06, John Vincent <[EMAIL PROTECTED]> wrote:
> Well I did a test to answer my own question:
> 
> -rw-r--r--  1 postgres postgres 167M Jun 14 01:43
> claDW_PGSQL-20060613170001.bak
> -rw-r--r--  1 root root 2.4G Jun 14 14:45
> claDW_PGSQL.test.bak 
> 
> the claDW_PGSQL database is a subset of the data in the main
> schema that I'm dealing with. 
> 
> I did several tests using -Fc -Z0 and a straight pg_dump with
> no format option.
> 
> The file size is about 1300% larger and takes just as long to
> dump even for that small database. 
> 
> Interestingly enough gzip compresses about 1M smaller with no
> gzip options.
> 
> I don't know that the uncompressed is really helping much. I'm
> going to run another query when there's no other users on the
> system and see how it goes. 
> 
> 
> 
> -- 
> John E. Vincent

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread Jim C. Nasby
On Wed, Jun 14, 2006 at 02:11:19PM -0400, John Vincent wrote:
> Out of curiosity, does anyone have any idea what the ratio of actual
> datasize to backup size is if I use the custom format with -Z 0 compression
> or the tar format?

-Z 0 should mean no compression.

Something you can try is piping the output of pg_dump to gzip/bzip2. On
some OSes, that will let you utilize 1 CPU for just the compression. If
you wanted to get even fancier, there is a parallelized version of bzip2
out there, which should let you use all your CPUs.
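For example, something along these lines (database name and path are
placeholders):

    pg_dump -Fc -Z0 mydb | gzip -1 > /backups/mydb-$(date +%Y%m%d).dump.gz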

Or if you don't care about disk IO bandwidth, just compress after the
fact (though, that could just put you in a situation where pg_dump
becomes bandwidth constrained).
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread John Vincent
time gzip -6 claDW_PGSQL.test.bak

real    3m4.360s
user    1m22.090s
sys     0m6.050s

Which is still less time than it would take to do a compressed pg_dump.

On 6/14/06, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> How long does gzip take to compress this backup?
>
> On Wed, 2006-06-14 at 15:59, John Vincent wrote:
> > Okay I did another test dumping using the uncompressed backup on the
> > system unloaded and the time dropped down to 8m for the backup.
> > There's still the size issue to contend with but as I said, I've got a
> > fair bit of space left on the SAN to work with.


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread John Vincent
On 6/14/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> On Wed, Jun 14, 2006 at 02:11:19PM -0400, John Vincent wrote:
> > Out of curiosity, does anyone have any idea what the ratio of actual
> > datasize to backup size is if I use the custom format with -Z 0 compression
> > or the tar format?
>
> -Z 0 should mean no compression.

But the custom format is still a binary backup, no?

> Something you can try is piping the output of pg_dump to gzip/bzip2. On
> some OSes, that will let you utilize 1 CPU for just the compression. If
> you wanted to get even fancier, there is a parallelized version of bzip2
> out there, which should let you use all your CPUs.
>
> Or if you don't care about disk IO bandwidth, just compress after the
> fact (though, that could just put you in a situation where pg_dump
> becomes bandwidth constrained).

Unfortunately if we are working with our current source box, the 1 CPU is
already the bottleneck in regards to compression. If I run the pg_dump
from the remote server though, I might be okay.

> --
> Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread Jim C. Nasby
On Wed, Jun 14, 2006 at 05:18:14PM -0400, John Vincent wrote:
> On 6/14/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> >
> >On Wed, Jun 14, 2006 at 02:11:19PM -0400, John Vincent wrote:
> >> Out of curiosity, does anyone have any idea what the ratio of actual
> >> datasize to backup size is if I use the custom format with -Z 0
> >compression
> >> or the tar format?
> >
> >-Z 0 should mean no compression.
> 
> 
> But the custom format is still a binary backup, no?
 
I fail to see what that has to do with anything...

> Something you can try is piping the output of pg_dump to gzip/bzip2. On
> >some OSes, that will let you utilize 1 CPU for just the compression. If
> >you wanted to get even fancier, there is a parallelized version of bzip2
> >out there, which should let you use all your CPUs.
> >
> >Or if you don't care about disk IO bandwidth, just compress after the
> >fact (though, that could just put you in a situation where pg_dump
> >becomes bandwidth constrained).
> 
> 
> Unfortunately if we working with our current source box, the 1 CPU is
> already the bottleneck in regards to compression. If I run the pg_dump from
> the remote server though, I might be okay.

Oh, right, forgot about that. Yeah, your best bet could be to use an
external machine for the dump.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Solaris shared_buffers anomaly?

2006-06-14 Thread Josh Berkus
Folks,

First off, you'll be glad to know that I've persuaded two of the Sun 
performance engineers to join this list soon.   So you should be able to 
get more definitive answers to these questions.

Second, 7.4 still did linear scanning of shared_buffers as part of LRU and 
for other activities.   I don't know how that would cause swapping, but it 
certainly could cause dramatic slowdowns (like 2-5x) if you overallocated 
shared_buffers.   

Possibly this is also triggering a bug in Solaris 2.6.   2.6 is pretty 
darned old (1997); maybe you should upgrade?   We're testing with s_b set 
to 300,000 on Solaris 10 (Niagara) so this is obviously not a current 
Solaris issue.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Dan Gorman

All,
  So I thought I'd pose this question:

If I have a pg database attached to a powervault (PV) with just an  
off-the-shelf SCSI card, I generally want fsync on to prevent data  
corruption in case the PV should lose power.
However, if I have it attached to a NetApp that ensures data writes  
via the NVRAM, can I safely turn fsync off to gain additional  
performance?


Best Regards,
Dan Gorman



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread D'Arcy J.M. Cain
On Wed, 14 Jun 2006 14:48:04 -0700
Dan Gorman <[EMAIL PROTECTED]> wrote:
> If I have a pg database attached to a powervault (PV) with just an  
> off-the-shelf SCSI card I generally want fsync on to prevent data  
> corruption in case the PV should loose power.
> However, if I have it attached to a NetApp that ensures data writes  
> to via the NVRAM can I safely turn fsync off to gain additional  
> performance?

I wouldn't.  Remember, you still have to get the data to the NetApp.
You don't want things sitting in the computer's buffers when its power
goes down.

-- 
D'Arcy J.M. Cain  |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Mark Lewis
No.  You need fsync on in order to force the data to get TO the NetApp
at the right time.  With fsync off, the data gets cached in the
operating system.

-- Mark Lewis

On Wed, 2006-06-14 at 14:48 -0700, Dan Gorman wrote:
> All,
>So I thought I'd pose this question:
> 
> If I have a pg database attached to a powervault (PV) with just an  
> off-the-shelf SCSI card I generally want fsync on to prevent data  
> corruption in case the PV should loose power.
> However, if I have it attached to a NetApp that ensures data writes  
> to via the NVRAM can I safely turn fsync off to gain additional  
> performance?
> 
> Best Regards,
> Dan Gorman
> 
> 
> 
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Effects of cascading references in foreign keys

2006-06-14 Thread Bruce Momjian

Would someone please find the answer to Tom's last question?

---

Tom Lane wrote:
> I wrote:
> > Looking at this, I wonder if there isn't a bug or at least an
> > inefficiency in 8.1.  The KeysEqual short circuit tests are still there
> > in ri_triggers.c; aren't they now redundant with the test in triggers.c?
> > And don't they need to account for the special case mentioned in the
> > comment in triggers.c, that the RI check must still be done if we are
> > looking at a row updated by the same transaction that created it?
> 
> OK, I take back the possible-bug comment: the special case only applies
> to the FK-side triggers, which is to say RI_FKey_check, and that routine
> doesn't attempt to skip the check on equal old/new keys.  I'm still
> wondering though if the KeysEqual tests in the other RI triggers aren't
> now a waste of cycles.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq
> 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Confirmation of bad query plan generated by 7.4

2006-06-14 Thread Jim Nasby

On Jun 13, 2006, at 8:50 PM, Tom Lane wrote:


"Jim C. Nasby" <[EMAIL PROTECTED]> writes:

On Tue, Jun 13, 2006 at 06:04:42PM -0400, Tom Lane wrote:

It'd depend on the context, possibly, but it's easy to show that the
current planner does fold "now() - interval_constant" when making
estimates.  Simple example:


Turns out the difference is between feeding a date vs a timestamp  
into the
query... I would have thought that since date is a date that the  
WHERE clause

would be casted to a date if it was a timestamptz, but I guess not...


Hmm ... worksforme.  Could you provide a complete test case?


decibel=# create table date_test(d date not null, i int not null);
CREATE TABLE
decibel=# insert into date_test select now()-x*'1 day'::interval, i  
from generate_series(0,3000) x, generate_series(1,10) i;

INSERT 0 30010
decibel=# analyze verbose date_test;
INFO:  analyzing "decibel.date_test"
INFO:  "date_test": scanned 3 of 1622163 pages, containing  
555 live rows and 0 dead rows; 3 rows in sample, 300100155  
estimated total rows

ANALYZE
decibel=# explain select * from date_test where d >= now()-'15  
days'::interval;

                              QUERY PLAN
-------------------------------------------------------------------------
Seq Scan on date_test  (cost=0.00..6873915.80 rows=1228164 width=8)
   Filter: (d >= (now() - '15 days'::interval))
(2 rows)

decibel=# explain select * from date_test where d >= (now()-'15  
days'::interval)::date;

                              QUERY PLAN
-------------------------------------------------------------------------
Seq Scan on date_test  (cost=0.00..7624166.20 rows=1306467 width=8)
   Filter: (d >= ((now() - '15 days'::interval))::date)
(2 rows)

decibel=# select version();
                                 version
---------------------------------------------------------------------------
PostgreSQL 8.1.4 on amd64-portbld-freebsd6.0, compiled by GCC cc  
(GCC) 3.4.4 [FreeBSD] 20050518

(1 row)

decibel=#

--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Confirmation of bad query plan generated by 7.4

2006-06-14 Thread Tom Lane
Jim Nasby <[EMAIL PROTECTED]> writes:
> On Jun 13, 2006, at 8:50 PM, Tom Lane wrote:
>> Hmm ... worksforme.  Could you provide a complete test case?

> decibel=# create table date_test(d date not null, i int not null);
> [etc]

Not sure what you are driving at.  The estimates are clearly not
defaults (the default estimate would be 1/3rd of the table, or
about 100mil rows).  Are you expecting them to be the same?  If so why?
The comparison values are slightly different after all.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Greg Stark

Mark Lewis <[EMAIL PROTECTED]> writes:

> On Wed, 2006-06-14 at 14:48 -0700, Dan Gorman wrote:
> >
> > However, if I have it attached to a NetApp that ensures data writes  
> > to via the NVRAM can I safely turn fsync off to gain additional  
> > performance?
>
> No.  You need fsync on in order to force the data to get TO the NetApp
> at the right time.  With fsync off, the data gets cached in the
> operating system.

In fact the benefit of the NVRAM is precisely that it makes sure you *don't*
have any reason to turn fsync off. It should make the fsync essentially free.

-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Precomputed constants?

2006-06-14 Thread Zoltan Boszormenyi

Jim C. Nasby wrote:

On Wed, Jun 14, 2006 at 01:30:10PM +0200, Böszörményi Zoltán wrote:
  

Replacing random() with a true constant gives me index scan
even if it's hidden inside other function calls. E.g.:



The database has no choice but to compute random() for every row; it's
marked VOLATILE.
  


I see now, the docs about CREATE FUNCTION mention random(),
currval() and timeofday() as examples for VOLATILE.
But where in the documentation can I find this info about all
built-in functions? Thanks.

Best regards,
Zoltán Böszörményi


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Jonah H. Harris

On 14 Jun 2006 23:33:53 -0400, Greg Stark <[EMAIL PROTECTED]> wrote:

In fact the benefit of the NVRAM is precisely that it makes sure you *don't*
have any reason to turn fsync off. It should make the fsync essentially free.


Having run PostgreSQL on a NetApp with input from NetApp, this is
correct.  fsync should be turned on, but you will not incur the *real*
direct-to-disk cost of the sync, it will be direct-to-NVRAM.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Dan Gorman
That makes sense. Speaking of NetApp, we're using the 3050C with 4 FC  
shelves. Any generic advice, other than NetApp's own recommendations (their  
NFS Oracle tuning options), that might be useful? (e.g. turning off snapshots)

Regards,
Dan Gorman

On Jun 14, 2006, at 10:14 PM, Jonah H. Harris wrote:


On 14 Jun 2006 23:33:53 -0400, Greg Stark <[EMAIL PROTECTED]> wrote:
In fact the benefit of the NVRAM is precisely that it makes sure  
you *don't*
have any reason to turn fsync off. It should make the fsync  
essentially free.


Having run PostgreSQL on a NetApp with input from NetApp, this is
correct.  fsync should be turned on, but you will not incur the *real*
direct-to-disk cost of the sync, it will be direct-to-NVRAM.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Jonah H. Harris

On 6/15/06, Dan Gorman <[EMAIL PROTECTED]> wrote:

shelfs. Any generic advice other than the NetApp (their NFS oracle
tuning options) that might be useful? (e.g. turning off snapshots)


I was using PostgreSQL on a 980c, but feature-wise they're probably
pretty close.

What type of application are you running?  OLTP?  If so, what type of
transaction volume?  Are you planning to use any Flex* or Snap*
features?  What type of volume layouts are you using?

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Jonah H. Harris

On 6/15/06, Jonah H. Harris <[EMAIL PROTECTED]> wrote:

On 6/15/06, Dan Gorman <[EMAIL PROTECTED]> wrote:
> shelfs. Any generic advice other than the NetApp (their NFS oracle
> tuning options) that might be useful? (e.g. turning off snapshots)

I was using PostgreSQL on a 980c, but feature-wise they're probably
pretty close.

What type of application are you running?  OLTP?  If so, what type of
transaction volume?  Are you planning to use any Flex* or Snap*
features?  What type of volume layouts are you using?


Also, you mentioned NFS... is that what you were planning?  If you
licensed iSCSI, it's a bit better for the database from a performance
angle.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Joe Conway

Dan Gorman wrote:
That makes sense. Speaking of NetApp, we're using the 3050C with 4 FC  
shelfs. Any generic advice other than the NetApp (their NFS oracle  
tuning options)

that might be useful? (e.g. turning off snapshots)


I'm not sure if this is in the tuning advice you already have, but we 
use a dedicated gigabit interface to the NetApp, with jumbo (9K) frames, 
and an 8K NFS blocksize. We use this for both Oracle and Postgres when 
the database resides on NetApp.
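On Linux that ends up looking something like the following fstab entry (filer 
name and mount point are invented; the 8K block size is the rsize/wsize):

    netapp-db:/vol/pgdata  /pgdata  nfs  rw,hard,intr,tcp,rsize=8192,wsize=8192  0 0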


Joe

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[PERFORM]Is it possible to start two instances of postgresql?

2006-06-14 Thread kah_hang_ang




Hi,

Is it possible to start two instances of postgresql, with different ports and
directories, which run simultaneously?
If so, will this cause any problems or a performance drop?

Thanks.


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Dan Gorman
Currently I have jumbo frames enabled on the NA and the switches and  
am also using the 32K R/W NFS options. Everything is gigE.


Regards,
Dan Gorman


On Jun 14, 2006, at 10:51 PM, Joe Conway wrote:


Dan Gorman wrote:
That makes sense. Speaking of NetApp, we're using the 3050C with 4  
FC  shelfs. Any generic advice other than the NetApp (their NFS  
oracle  tuning options)

that might be useful? (e.g. turning off snapshots)


I'm not sure if this is in the tuning advice you already have, but  
we use a dedicated gigabit interface to the NetApp, with jumbo (9K)  
frames, and an 8K NFS blocksize. We use this for both Oracle and  
Postgres when the database resides on NetApp.


Joe




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM]Is it possible to start two instances of postgresql?

2006-06-14 Thread A. Kretschmer
On 15.06.2006 at 13:58:20 +0800, [EMAIL PROTECTED] wrote the following:
> 
> 
> 
> 
> Hi,
> 
> Is it possible to start two instances of postgresql with different port and
> directory which run simultaneously?

Yes, this is possible, and this is the Debian way for updates.
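A minimal sketch of what that looks like (data directory and port below are
arbitrary examples):

    # second cluster in its own directory, listening on its own port
    initdb -D /var/lib/postgresql/data2
    pg_ctl -D /var/lib/postgresql/data2 -o "-p 5433" -l data2.log start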


> If can then will this cause any problem or performance drop down?

Of course, if you have high load in one database ... you have only one
machine.


HTH, Andreas
-- 
Andreas Kretschmer(Kontakt: siehe Header)
Heynitz:  035242/47215,  D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
 ===Schollglas Unternehmensgruppe=== 

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM]Is it possible to start two instances of postgresql?

2006-06-14 Thread kah_hang_ang




so what is the best way to implement two databases on one machine?
implement two postgresql instances with separate directories, or
implement both under one instance?

if I implement two databases in one instance and one of the databases
crashes, will it affect the other?




 
  "A. Kretschmer"   
 
  <[EMAIL PROTECTED]To:   
pgsql-performance@postgresql.org  
  las.com>   cc:
 
  Sent by:   Subject:  Re: 
[PERFORM]Is it possible to start two instances of postgresql? 
  [EMAIL PROTECTED] 
   
  tgresql.org   
 

 

 
  06/15/2006 02:07 PM   
 

 

 




On 15.06.2006 at 13:58:20 +0800, [EMAIL PROTECTED]
wrote the following:
>
>
>
>
> Hi,
>
> Is it possible to start two instances of postgresql with different port
and
> directory which run simultaneously?

Yes, this is possible, and this is the Debian way for updates.


> If can then will this cause any problem or performance drop down?

Of course, if you have high load in one database ... you have only one
machine.


HTH, Andreas
--
Andreas Kretschmer(Kontakt: siehe Header)
Heynitz:  035242/47215,  D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
 ===Schollglas Unternehmensgruppe===

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq