[PERFORM] how to partition disks

2006-06-14 Thread hubert depesz lubaczewski
hi
i'm waiting for new server to arrive.
for the server i will get hp msa1000, with 14 discs (72g, 15000 rpm).
what kind of partitioning you suggest, to get maximum performance?
for system things i will have separate discs, so whole array is only for postgresql.
data processing is oltp, but with large amounts of write operations.
any hints?
should i go with 3 partitions, or just 1? or 2?

depesz
-- 
http://www.depesz.com/ - nowy, lepszy depesz


Re: [PERFORM] how to partition disks

2006-06-14 Thread Sven Geisler

Hi Hubert,

hubert depesz lubaczewski schrieb:

hi
i'm waiting for new server to arrive.
for the server i will get hp msa1000, with 14 discs (72g, 15000 rpm).
what kind of partitioning you suggest, to get maximum performance?
for system things i will have separate discs, so whole array is only for 
postgresql.


data processing is oltp, but with large amounts of write operations.

any hints?

should i go with 3 partitions, or just 1? or 2?



You should configure your discs to RAID 10 volumes.
You should set up a separate volume for WAL.
A volume for an additional table space may also be useful.

In your case I would do 2 partitions:

1. RAID 10 with 8 discs for general data
2. RAID 10 with 4 discs for WAL
(two discs as spares)

You may split the first volume into two volumes for a second table space 
if your server doesn't have enough RAM and you have high disk read I/O.
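To sketch what that looks like on the PostgreSQL side (mount points below are
only examples): WAL can be moved to its own volume with a symlink, and the
additional table space is created with CREATE TABLESPACE (8.0 or later).

    # stop the server first; /vol_wal is an example mount point
    mv $PGDATA/pg_xlog /vol_wal/pg_xlog
    ln -s /vol_wal/pg_xlog $PGDATA/pg_xlog

    -- an additional table space on its own volume (path is an example)
    CREATE TABLESPACE data2 LOCATION '/vol_data2/pgdata';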



Cheers
Sven.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[PERFORM] Precomputed constants?

2006-06-14 Thread Böszörményi Zoltán
Hi,

here's my problem:

# explain analyze select * from mxstrpartsbg where szam =
round(80*random())::integer;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on mxstrpartsbg  (cost=0.00..56875.04 rows=1 width=322) (actual
time=190.748..1271.664 rows=1 loops=1)
   Filter: (szam = (round((80::double precision * random())))::integer)
 Total runtime: 1271.785 ms
(3 rows)

# explain analyze select * from mxstrpartsbg where szam = 671478; 
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using mxstrpartsbg_pkey on mxstrpartsbg  (cost=0.00..5.87
rows=1 width=322) (actual time=71.642..71.644 rows=1 loops=1)
   Index Cond: (szam = 671478)
 Total runtime: 71.706 ms
(3 rows)

Is there a way to have PostgreSQL pre-compute all the constants in the
WHERE clause? It would be a huge performance gain. Thanks in advance.

Best regards,
Zoltán Böszörményi


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Precomputed constants?

2006-06-14 Thread Volkan YAZICI
On Jun 14 12:53, Böszörményi Zoltán wrote:
> # explain analyze select * from mxstrpartsbg where szam =
> round(80*random())::integer;

AFAIK, you can use something like this:

SELECT * FROM mxstrpartsbg
  WHERE szam = (SELECT round(80*random())::integer OFFSET 0);

This will prevent calculation of round() for every row.


Regards.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] how to partition disks

2006-06-14 Thread hubert depesz lubaczewski
On 6/14/06, Sven Geisler <[EMAIL PROTECTED]> wrote:
> You should configure your discs to RAID 10 volumes.
> You should set up a separate volume for WAL.
> A volume for an additional table space may also be useful.
> In your case I would do 2 partitions:
> 1. RAID 10 with 8 discs for general data

raid 10 is of course not questionable. but are you sure that it will work
faster than for example:

2 discs (raid 1) for xlog
6 discs (raid 10) for tables
6 discs (raid 10) for indices?

depesz
-- 
http://www.depesz.com/ - nowy, lepszy depesz


Re: [PERFORM] Precomputed constants?

2006-06-14 Thread Böszörményi Zoltán
> On Jun 14 12:53, Böszörményi Zoltán wrote:
>> # explain analyze select * from mxstrpartsbg where szam =
>> round(80*random())::integer;
>
> AFAIK, you can use sth like that:
>
> SELECT * FROM mxstrpartsbg
>   WHERE szam = (SELECT round(80*random())::integer OFFSET 0);
>
> This will prevent calculation of round() for every row.
>
> Regards.

Thanks, it worked.

Oh, I see now. It makes sense: random() isn't a constant and
it was computed for every row. Actually running the query produces
different result sets with 0, 1 or 2 rows.

Replacing random() with a true constant gives me an index scan
even if it's hidden inside other function calls. E.g.:

# explain analyze select * from mxstrpartsbg where szam =
round('80.71'::decimal(10,2))::integer;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Index Scan using mxstrpartsbg_pkey on mxstrpartsbg  (cost=0.00..5.87
rows=1 width=322) (actual time=0.020..0.022 rows=1 loops=1)
   Index Cond: (szam = 81)
 Total runtime: 0.082 ms
(3 rows)

Best regards,
Zoltán Böszörményi


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] how to partition disks

2006-06-14 Thread Sven Geisler

Hi Hubert,

hubert depesz lubaczewski schrieb:
On 6/14/06, *Sven Geisler* <[EMAIL PROTECTED]> wrote:

You should configure your discs to RAID 10 volumes.
You should set up a separate volume for WAL.
A volume for an additional table space may also be useful.
In your case I would do 2 partitions:
1. RAID 10 with 8 discs for general data


raid 10 is of course not questionable. but are you sure that it will 
work faster than for example:

2 discs (raid 1) for xlog
6 discs (raid 10) for tables
6 discs (raid 10) for indices?



This depends on your application. Do you have a lot of disc reads?
Anyhow, I would always put the xlog on a RAID 10 volume because most of 
the I/O for updates and inserts goes to the xlog.


4 discs xlog
6 discs tables
4 discs tables2

This should be better. You should distribute indices on separate spindle 
stacks to share the I/O. But again this depends on your application and 
your server. How are the indices used? How large is your file system 
cache? What does PostgreSQL effectively read from disc?


Don't forget to tune your postgresql.conf:
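For a write-heavy load the usual suspects are the memory and checkpoint
settings; the values below are only illustrative and have to be sized to the
machine's actual RAM:

    shared_buffers = 50000          # example value
    effective_cache_size = 200000   # example value
    wal_buffers = 64
    checkpoint_segments = 64        # larger values smooth out write bursts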



Cheers
Sven.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Confirmation of bad query plan generated by 7.4

2006-06-14 Thread Jim C. Nasby
On Tue, Jun 13, 2006 at 09:50:49PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> > On Tue, Jun 13, 2006 at 06:04:42PM -0400, Tom Lane wrote:
> >> It'd depend on the context, possibly, but it's easy to show that the
> >> current planner does fold "now() - interval_constant" when making
> >> estimates.  Simple example:
> 
> > Turns out the difference is between feeding a date vs a timestamp into the
> > query... I would have thought that since date is a date that the WHERE 
> > clause
> > would be casted to a date if it was a timestamptz, but I guess not...
> 
> Hmm ... worksforme.  Could you provide a complete test case?

I can't provide the data I used for that, but I'll try and come up with
something else.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] OT - select + must have from - sql standard syntax?

2006-06-14 Thread Dave Dutcher
> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Greg Stark
> Sent: Tuesday, June 13, 2006 11:16 PM
> Subject: Re: [PERFORM] OT - select + must have from - sql 
> standard syntax?
[SNIP]
> 
> Well you could always create a "dual", it was always just a 
> regular table. We
> used to joke about what would happen to Oracle if you 
> inserted an extra row in
> it...


I've never used Oracle, so I don't understand why it's called dual when
it only has one row.  Shouldn't it be called single?  :\


Dave


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] OT - select + must have from - sql standard syntax?

2006-06-14 Thread Tom Lane
Antoine <[EMAIL PROTECTED]> writes:
> ps. shame the standard isn't "freely" consultable to save you guys
> silly OT questions!

You can get free "draft" versions that are close-enough-to-final to be
perfectly usable.  See our developers' FAQ for some links.  I like the
drafts partly because they're plain ASCII, and so far easier to search
than PDFs ...

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] how to partition disks

2006-06-14 Thread Richard Broersma Jr
> > raid 10 is of course not questionable. but are you sure that it will 
> > work faster than for example:
> > 2 discs (raid 1) for xlog
> > 6 discs (raid 10) for tables
> > 6 discs (raid 10) for indices?
> > 
> 
> This depends on your application. Do you have a lot of disc reads?
> Anyhow, I would put the xlog always to a RAID 10 volume because most of 
> the I/O for update and inserts is going to the xlog.
> 
> 4 discs xlog
> 6 discs tables
> 4 discs tables2

I have a question in regard to the I/O bandwidth of various RAID
configurations.  Primarily, do the above suggested RAID partitions imply
that multiple (smaller) disk arrays have the potential for more I/O
bandwidth than one larger RAID 10 array?

Regards,

Richard



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] how to partition disks

2006-06-14 Thread Sven Geisler

Hi Richard,

Richard Broersma Jr schrieb:

This depends on your application. Do you have a lot of disc reads?
Anyhow, I would put the xlog always to a RAID 10 volume because most of 
the I/O for update and inserts is going to the xlog.


4 discs xlog
6 discs tables
4 discs tables2


I have a question in regards to I/O bandwidths of various raid configuration.  
Primary, does the
above suggested raid partitions imply that multiple (smaller) disk arrays have 
a potential for
more I/O bandwidth than a larger raid 10 array?


Yes.
Because the disc arms don't need to reposition as much as they would 
with one large volume.


For example, you run two queries with two clients and each query needs 
to read some indices from disk. In this case it is more efficient to read 
from different volumes than from one large volume where the disc arms 
have to jump.


Sven.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Confirmation of bad query plan generated by 7.4

2006-06-14 Thread Shaun Thomas
>>> On 6/13/2006 at 9:13 PM, Tom Lane <[EMAIL PROTECTED]> wrote:

> Read what you wrote, and rethink...

Hah.  Yes, I understand the irony of that statement, but the point is
that the value of the variable won't change during query execution.

> If you're desperate you can construct a query string with the variable
> value embedded as a literal, and then EXECUTE that.  This isn't a great
> solution since it forces a re-plan on every execution.

That's so gross... but it might work.  I'm not really desperate, just
frustrated.  I really can't wait until we can upgrade; 7.4 is driving
me nuts.  I'm not really worried about a re-plan, since this SP just
updates a fact table, so it only gets called twice a day.  Cutting the
execution time of the SP down to < 20 seconds from 15 minutes would be
nice, but not absolutely required.  I was just surprised at the large
difference in manual execution as opposed to the SP with the same
query.
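For reference, the workaround described above looks roughly like this in a
7.4-era plpgsql function; the table, column and variable names here are
invented for illustration:

    CREATE OR REPLACE FUNCTION refresh_fact(date) RETURNS void AS '
    DECLARE
        v_cutoff ALIAS FOR $1;
    BEGIN
        -- embed the value as a literal so the planner sees a real constant
        EXECUTE ''UPDATE daily_fact SET processed = true ''
             || ''WHERE load_date >= '' || quote_literal(v_cutoff);
        RETURN;
    END;
    ' LANGUAGE plpgsql;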

> We've occasionally debated ways to do it better, but no such
> improvement will ever appear in 7.4 ;-)

Agreed!  When we finally upgrade, I fully plan on putting a symbolic
bullet into our old installation. ;)

Thanks!


-- 

Shaun Thomas
Database Administrator

Leapfrog Online 
807 Greenwood Street 
Evanston, IL 60201 
Tel. 847-440-8253
Fax. 847-570-5750
www.leapfrogonline.com


Confidentiality Note:

The document(s) accompanying this e-mail transmission, if any, and the
e-mail transmittal message contain information from Leapfrog Online
Customer Acquisition, LLC is confidential or privileged. The information
is intended to be for the use of the individual(s) or entity(ies) named
on this e-mail transmission message. If you are not the intended
recipient, be aware that any disclosure, copying, distribution or use of
the contents of this e-mail is prohibited. If you have received this
e-mail in error, please immediately delete this e-mail and notify us by
telephone of the error

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Which processor runs better for Postgresql?

2006-06-14 Thread Steve Poe
Dave, Joshua, Scott (and all),

Thanks for your feedback; while I do appreciate it, I did not intend to
make this discussion "buy this instead"... I whole-heartedly agree with
you. Joshua, you made the best comment: it is a business decision for
the client. I don't agree with it, but I understand it. I've recommended
Sun or Penguin Computing, which I've had no technical issues with. They
did not dispute my recommendation but they ignored it. I have not liked
Dell, on the server side, since the 1998 - 2000 time period.

Excluding Dell's issues, has anyone seen performance differences between
AMD's Opterons and Intel's new Xeons (dual or quad CPU or dual-core)? If
anyone has done benchmark comparisons between them, any summary
information would be appreciated.

For now, I am asking the client to hold-off and wait for the AMD Opteron
availability on the Dell servers.

Thanks again.

Steve




---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread John E. Vincent
-- this is the third time I've tried sending this and I never saw it get 
through to the list. Sorry if multiple copies show up.


Hi all,

I've been lurking using the web archives for a while and haven't found 
anything that answers my questions about pg_dump.


We have a 206GB data warehouse running on version 8.0.3. The server is 
somewhat underpowered in terms of CPU: (1) 2.8 GHz Xeon 4GB Ram and a 
single HBA to our SAN (IBM DS4300). We are in the process of migrating to a 
new server that we've repurposed from our production OLTP database (8) 
2.0 GHz Xeon, 16GB Ram and dual HBAs to the same SAN running version 8.1.


Independent of that move, we still need to get by on the old system and 
I'm concerned that even on the new system, pg_dump will still perform 
poorly. I can't do a full test because we're also taking advantage of 
the table partitioning in 8.1 so we're not doing a dump and restore.


We backup the database using:

pg_dump -Fc -cv ${CURDB} > ${BACKDIR}/${CURDB}-${DATE}.bak

There are three different LUNs allocated to the old warehouse on the SAN - 
data, wal and a dump area for the backups. The SAN has two controllers 
(only 128MB of cache per) and the data is on one controller while the 
WAL and dump area are on the other. Still a single HBA though.


Creating the compressed backup of this database takes 12 hours. We start 
at 6PM and it's done a little after 1AM, just in time for the next day's 
load. The load itself takes about 5 hours.


I've watched the backup process and I/O is not a problem. Memory isn't a 
problem either. It seems that we're CPU bound but NOT in I/O wait. The 
server is a dedicated PGSQL box.


Here are our settings from the conf file:

maintenance_work_mem = 524288
work_mem = 1048576 ( I know this is high but you should see some of our 
sorts and aggregates)

shared_buffers = 5
effective_cache_size = 45
wal_buffers = 64
checkpoint_segments = 256
checkpoint_timeout = 3600

We're inserting around 3mil rows a night if you count staging, info, dim 
and fact tables. The vacuum issue is a whole other problem but right now 
I'm concerned about just the backup on the current hardware.


I've got some space to burn so I could go to an uncompressed backup and 
compress it later during the day.


If there are any tips anyone can provide I would greatly appreciate it. 
I know that the COPY performance was bumped up in 8.1 but I'm stuck on 
this 8.0 box for a while longer.


Thanks,
John E. Vincent

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] how to partition disks

2006-06-14 Thread Jim C. Nasby
On Wed, Jun 14, 2006 at 04:32:23PM +0200, Sven Geisler wrote:
> Hi Richard,
> 
> Richard Broersma Jr schrieb:
> >>This depends on your application. Do you have a lot of disc reads?
> >>Anyhow, I would put the xlog always to a RAID 10 volume because most of 
> >>the I/O for update and inserts is going to the xlog.
> >>
> >>4 discs xlog
> >>6 discs tables
> >>4 discs tables2
> >
> >I have a question in regards to I/O bandwidths of various raid 
> >configuration.  Primary, does the
> >above suggested raid partitions imply that multiple (smaller) disk arrays 
> >have a potential for
> >more I/O bandwidth than a larger raid 10 array?
> 
> Yes.
> Because the disc arms don't need to reposition as much as they would 
> with one large volume.
> 
> For example, You run two queries with two clients and each queries needs 
> to read some indices from disk. In this case it more efficient to read 
> from different volumes than to read from one large volume where the disc 
> arms has to jump.

But keep in mind that all of that is only true if you have very good
knowledge of how your data will be accessed. If you don't know that,
you'll almost certainly be better off just piling everything into one
RAID array and letting the controller deal with it.

Also, if you have a good RAID controller that's battery-backed,
separating pg_xlog onto its own array is much less likely to be a win.
The reason you normally put pg_xlog on its own partition is because the
database has to fsync pg_xlog *at every single commit*. This means you
absolutely want that fsync to be as fast as possible. But with a good,
battery-backed controller, this no longer matters. The fsync is only
going to push the data into the controller, and the controller will take
things from there. That means it's far less important to put pg_xlog on
its own array. I actually asked about this recently and one person did
reply that they'd done testing and found it was better to just put all
their drives into one array so they weren't wasting bandwidth on the
pg_xlog drives.

Even if you do decide to keep pg_xlog separate, a 4 drive RAID10 for
that is overkill. It will be next to impossible for you to generate
enough WAL traffic to warrant it.

Your best bet is to perform testing with your application. That's the
only way you'll truly find out what's going to work best. Short of
that, your best bet is to just pile all the drives together. If you do
testing, I'd start first with the effect of a separate pg_xlog. Only
after you have those results would I consider trying to do things like
split indexes from tables, etc. 

BTW, you should consider reserving some of the drives in the array as
hot spares.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Precomputed constants?

2006-06-14 Thread Jim C. Nasby
On Wed, Jun 14, 2006 at 01:30:10PM +0200, Böszörményi Zoltán wrote:
> Replacing random() with a true constant gives me index scan
> even if it's hidden inside other function calls. E.g.:

The database has no choice but to compute random() for every row; it's
marked VOLATILE.
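The volatility classification of any function, built-in or user-defined, can
be looked up in the pg_proc system catalog; for example:

    -- 'i' = immutable, 's' = stable, 'v' = volatile
    SELECT proname, provolatile
      FROM pg_proc
     WHERE proname IN ('random', 'now', 'timeofday');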
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread Tom Lane
"John E. Vincent" <[EMAIL PROTECTED]> writes:
> I've watched the backup process and I/O is not a problem. Memory isn't a 
> problem either. It seems that we're CPU bound but NOT in I/O wait.

Is it the pg_dump process, or the connected backend, that's chewing the
bulk of the CPU time?  (This should be pretty obvious in "top".)

If it's the pg_dump process, the bulk of the CPU time is likely going
into compression --- you might consider backing off the compression
level, perhaps --compress=1 or even 0 if size of the dump file isn't
a big concern.
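For example, mirroring the original command with a lighter compression level
(paths and variables taken from the original script):

    pg_dump -Fc --compress=1 -cv ${CURDB} > ${BACKDIR}/${CURDB}-${DATE}.bak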

Another possibility if your LAN is reasonably fast is to run pg_dump on
a different machine, so that you can put two CPUs to work.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread Scott Marlowe
On Wed, 2006-06-14 at 09:47, John E. Vincent wrote:
> -- this is the third time I've tried sending this and I never saw it get 
> through to the list. Sorry if multiple copies show up.
> 
> Hi all,

BUNCHES SNIPPED

> work_mem = 1048576 ( I know this is high but you should see some of our 
> sorts and aggregates)

Ummm.  That's REALLY high.  You might want to consider lowering the
global value here, and then crank it up on a case by case basis, like
during nighttime report generation.  Just one or two queries could
theoretically run your machine out of memory right now.  Just put a "set
work_mem=100" in your script before the big query runs.

> We're inserting around 3mil rows a night if you count staging, info, dim 
> and fact tables. The vacuum issue is a whole other problem but right now 
> I'm concerned about just the backup on the current hardware.
> 
> I've got some space to burn so I could go to an uncompressed backup and 
> compress it later during the day.

That's exactly what we do.  We just do a normal backup, and have a
script that gzips anything in the backup directory that doesn't end in
.gz...  If you've got space to burn, as you say, then use it at least a
few days to see how it affects backup speeds.
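A script along these lines would do it (the backup directory is an example
path):

    #!/bin/sh
    # compress anything in the backup directory that isn't gzipped yet
    BACKDIR=/mnt/backups
    find "$BACKDIR" -type f ! -name '*.gz' -exec gzip {} \;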

Seeing as how you're CPU bound, most likely the problem is just the
compressed backup.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Solaris shared_buffers anomaly?

2006-06-14 Thread Mischa Sandberg

Tom Lane wrote:

Mischa Sandberg <[EMAIL PROTECTED]> writes:



Tom Lane wrote:

Does Solaris have any call that allows locking a shmem segment in RAM?

Yes, mlock(). But want to understand what's going on before patching.


Sure, but testing it with mlock() might help you understand what's going
on, by eliminating one variable: we don't really know if the shmem is
getting swapped, or something else.



For a dedicated DB server machine, Solaris has a feature:
create "intimate" shared memory with shmat(..., SHM_SHARE_MMU).
All backends share the same TLB entries (!). 


We use that already.  (Hmm, might be interesting for you to turn it
*off* and see if anything changes.  See src/backend/port/sysv_shmem.c.)


Gah. Always must remember to RTFSource.
And reproduce the problem on a machine I control :-)

--
Engineers think that equations approximate reality.
Physicists think that reality approximates the equations.
Mathematicians never make the connection.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Which processor runs better for Postgresql?

2006-06-14 Thread Scott Marlowe
On Tue, 2006-06-13 at 15:11, Dave Page wrote:
>  
> > -Original Message-

> And how old are the 2600's now?
> 
> Anyhoo, I'm not saying the current machines are excellent performers or
> anything, but there are good business reasons to run them if you don't
> need to squeeze out every last pony.

Just thought I'd point you to Dell's forums.

http://forums.us.dell.com/supportforums/board?board.id=pes_linux&page=1

wherein you'll find plenty of folks who have problems with freezing RAID
controllers with 28xx and 18xx machines.  

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread John Vincent
On 6/14/06, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> On Wed, 2006-06-14 at 09:47, John E. Vincent wrote:
> > -- this is the third time I've tried sending this and I never saw it get
> > through to the list. Sorry if multiple copies show up.
> >
> > Hi all,
>
> BUNCHES SNIPPED
>
> > work_mem = 1048576 ( I know this is high but you should see some of our
> > sorts and aggregates)
>
> Ummm.  That's REALLY high.  You might want to consider lowering the
> global value here, and then crank it up on a case by case basis, like
> during nighttime report generation.  Just one or two queries could
> theoretically run your machine out of memory right now.  Just put a "set
> work_mem=100" in your script before the big query runs.

I know it is but that's what we need for some of our queries. Our ETL tool
(informatica) and BI tool (actuate) won't let us set those things as part of
our jobs. We need it for those purposes. We have some really nasty queries
that will be fixed in our new server.

E.g. we have a table called loan_account_agg_fact that has 200+ million rows
and it contains every possible combination of late status for a customer
account (i.e. 1 day late, 2 days late, 3 days late) so it gets inserted for
new customers but updated for existing records as part of our warehouse load.
Part of the new layout is combining late ranges, so instead of number of days
we have a range of days (i.e. 1-15, 16-30). Even with work_mem that large,
the load of that loan_account_agg_fact table creates over 3GB of temp tables!

> That's exactly what we do.  We just do a normal backup, and have a
> script that gzips anything in the backup directory that doesn't end in
> .gz...  If you've got space to burn, as you say, then use it at least a
> few days to see how it affects backup speeds.
>
> Seeing as how you're CPU bound, most likely the problem is just the
> compressed backup.

I'm starting to think the same thing. I'll see how this COPY I'm doing of the
single largest table does right now and make some judgement based on that.

-- 
John E. Vincent


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread A.M.
On Wed, June 14, 2006 1:04 pm, John Vincent wrote:

> I know it is but that's what we need for some of our queries. Our ETL
> tool (informatica) and BI tool (actuate) won't let us set those things as
> part of our jobs. We need it for those purposes. We have some really nasty
> queries that will be fixed in our new server.

You could modify pgpool to insert the necessary set commands and point the
tools at pgpool.

-M


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread John Vincent
Out of curiosity, does anyone have any idea what the ratio of actual
datasize to backup size is if I use the custom format with -Z 0 compression
or the tar format? Thanks.

On 6/14/06, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> On Wed, 2006-06-14 at 09:47, John E. Vincent wrote:
> > -- this is the third time I've tried sending this and I never saw it get
> > through to the list. Sorry if multiple copies show up.
> >
> > Hi all,
>
> BUNCHES SNIPPED
>
> > work_mem = 1048576 ( I know this is high but you should see some of our
> > sorts and aggregates)
>
> Ummm.  That's REALLY high.  You might want to consider lowering the
> global value here, and then crank it up on a case by case basis, like
> during nighttime report generation.  Just one or two queries could
> theoretically run your machine out of memory right now.  Just put a "set
> work_mem=100" in your script before the big query runs.
>
> > We're inserting around 3mil rows a night if you count staging, info, dim
> > and fact tables. The vacuum issue is a whole other problem but right now
> > I'm concerned about just the backup on the current hardware.
> >
> > I've got some space to burn so I could go to an uncompressed backup and
> > compress it later during the day.
>
> That's exactly what we do.  We just do a normal backup, and have a
> script that gzips anything in the backup directory that doesn't end in
> .gz...  If you've got space to burn, as you say, then use it at least a
> few days to see how it affects backup speeds.
>
> Seeing as how you're CPU bound, most likely the problem is just the
> compressed backup.

-- 
John E. Vincent
[EMAIL PROTECTED]


Re: [PERFORM] Which processor runs better for Postgresql?

2006-06-14 Thread Dave Page
 

  
  
> From: Scott Marlowe [mailto:[EMAIL PROTECTED] 
> Sent: 14 June 2006 18:04
> To: Dave Page
> Cc: Joshua D. Drake; [EMAIL PROTECTED]; pgsql-performance@postgresql.org
> Subject: RE: [PERFORM] Which processor runs better for Postgresql?
>
> On Tue, 2006-06-13 at 15:11, Dave Page wrote:
> > > -Original Message-
> > And how old are the 2600's now?
> >
> > Anyhoo, I'm not saying the current machines are excellent performers or
> > anything, but there are good business reasons to run them if you don't
> > need to squeeze out every last pony.
>
> Just thought I'd point you to Dell's forums.
>
> http://forums.us.dell.com/supportforums/board?board.id=pes_linux&page=1
>
> wherein you'll find plenty of folks who have problems with freezing RAID
> controllers with 28xx and 18xx machines.

Never had any such problems in the dozen or so machines we run (about a
50-50 split of Linux to Windows).

Regards, Dave


Re: [PERFORM] OT - select + must have from - sql standard syntax?

2006-06-14 Thread Antoine

You can get free "draft" versions that are close-enough-to-final to be
perfectly usable.  See our developers' FAQ for some links.  I like the
drafts partly because they're plain ASCII, and so far easier to search
than PDFs ...


Great to know - thanks!
Cheers
Antoine

--
This is where I should put some witty comment.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[PERFORM] Postgres consuming way too much memory???

2006-06-14 Thread jody brownell
I have a box with an app and postgresql on it. Hardware includes 2 2.8 GHz 
Xeons with 512KB cache, 4 GB of memory, 6 scsi disks in a software 
raid 5 on a trustix 2.2 with a 2.6.15.3 kernel. The data and indexes are on the 
raid array while the tx log is on disk 
with the OS. All is well.

The one application executes one transaction every 60 seconds or so. The 
transaction can range from tiny 
to relatively large. Maybe 30-70k inserts, 60-100k updates... nothing too 
heavy, taking about 8-12 seconds 
to finish the entire update in the worst case. The application is using the 
latest jdbc I am using 
preparedStatements with addBatch/executebatch/clearBatch to send statements in 
batches of 10 thousand... 
(is that high?)

The box itself is a little over subscribed for memory which is causing us to 
swap a bit... As the 
application runs, I notice the postgres process which handles this particular 
app connection grows in memory seemingly 
uncontrollably until kaboom. Once the kernel kills off enough processes and 
the system settles, I see the postgres process is at 1.9GB 
of res memory and 77MB of shared memory. This challenges a number of 
assumptions I have made in the last while and raises a 
few questions... BTW, I am assuming this is not a memory leak b/c the same 
install of our software on a box 
with 8GB of memory and no swap being used has no unexplained growth in the 
memory... it is perfectly healthy 
and quite performant.

Anyway, due to errors in the transaction, it is rolled back, after which the 
postgres process remains at 901MB of 
resident memory and 91MB of shared memory.

27116 postgres  15   0 1515m 901m  91m S  0.0 22.9  18:33.96 postgres: qradar 
qradar :::x.x.x.x(51149) idle

There are a few things I would like to understand. 

- What in postgres will grow at an uncontrolled rate when the system is 
under heavy load or the transaction 
  is larger... there must be something not governed by the shared memory or 
other configuration in postgresql.conf. 
  It seems like, once we start hitting swap, postgres grows in memory resulting 
in more swapping... until applications 
  start getting killed.
- when the transaction was rolled back why did the process hold onto the 901MB 
of memory? 
- when is a transaction too big? is this determined by the configuration and 
performance of wal_buffers and wal log or is there 
  house cleaning which MUST be done at commit/rollback to avoid situations like 
this thus indicating there is an upper bound.

I have been configuring postgres from tidbits I collected reading this list in 
the last few months; 
not sure if what I have is totally right for the workload, but when I have 
adequate memory and avoid swap, we are more than 
happy with performance. Configuration which is not below is just the default.

shared_buffers = 32767
work_mem = 20480
maintenance_work_mem = 32768
max_fsm_pages = 4024000
max_fsm_relations = 2000
fsync = false
wal_sync_method = fsync
wal_buffers = 4096
checkpoint_segments = 32
checkpoint_timeout = 1200
checkpoint_warning = 60
commit_delay = 5000
commit_siblings = 5
effective_cache_size = 175000
random_page_cost = 2
autovacuum = true
autovacuum_naptime = 60
autovacuum_vacuum_threshold = 500
autovacuum_analyze_threshold = 250
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay=100
autovacuum_vacuum_cost_limit=100
default_statistics_target = 40

Is there anything here which looks weird or misconfigured? I am just starting 
to play with the bg writer configuration so I did not include it.
Typically, there is little or no iowait... and no reason to think there is 
something misconfigured... from what I have seen.

In one transaction I have seen as many as 5 checkpoint_segments be created/used 
so I was considering increasing wal_buffers to 8192 from 4096 
given as many as 4 segments in memory/cache at once... need to test this though 


Anyone have any thoughts on what could have caused the bloat? 

thanks

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Which processor runs better for Postgresql?

2006-06-14 Thread Scott Marlowe
On Wed, 2006-06-14 at 13:43, Dave Page wrote:
>  

> 
> Just thought I'd point you to Dell's forums.
> 
> 
> http://forums.us.dell.com/supportforums/board?board.id=pes_linux&page=1
> 
> wherein you'll find plenty of folks who have problems with
> freezing RAID
> controllers with 28xx and 18xx machines. 
>  
> 
> Never had any such problems in the dozen or so machines we run
> (about a 50-50 split of Linux to Windows). 
> 
> Regards, Dave
> 

Yeah, We've got a mix of 2650 and 2850s, and our 2850s have been rock
solid stable, unlike the 2650s.  I was actually kinda surprised to see
how many people have problems with the 2850s.

Apparently, the 2850 mobos have a built-in RAID that's pretty stable
(it's got a PERC number I can't remember), but ordering them with an add-on
Perc RAID controller appears to make them somewhat unstable as well.

One recommendation I've seen repeatedly is to use the --noapic option at
boot time.

Just FYI

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Which processor runs better for Postgresql?

2006-06-14 Thread Dave Page
 

> -Original Message-
> From: Scott Marlowe [mailto:[EMAIL PROTECTED] 
> Sent: 14 June 2006 20:52
> To: Dave Page
> Cc: Joshua D. Drake; [EMAIL PROTECTED]; 
> pgsql-performance@postgresql.org
> Subject: RE: [PERFORM] Which processor runs better for Postgresql?
> 
> 
> Yeah, We've got a mix of 2650 and 2850s, and our 2850s have been rock
> solid stable, unlike the 2650s.  I was actually kinda surprised to see
> how many people have problems with the 2850s.
> 
> Apparently, the 2850 mobos have a built in RAID that's pretty stable
> (it's got a PERC number I can't remembeR), but ordering them 
> with an add
> on Perc RAID controller appears to make them somewhat 
> unstable as well.

That might be it - we always chose the onboard PERC because it has twice
the cache of the other options. 

Regards, Dave.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Postgres consuming way too much memory???

2006-06-14 Thread Tom Lane
"jody brownell" <[EMAIL PROTECTED]> writes:
> 27116 postgres  15   0 1515m 901m  91m S  0.0 22.9  18:33.96 postgres: qradar 
> qradar :::x.x.x.x(51149) idle

This looks like a memory leak, but you haven't provided enough info to
let someone else reproduce it.  Can you log what your application is
doing and extract a test case?  What PG version is this, anyway?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread Scott Marlowe
On Wed, 2006-06-14 at 12:04, John Vincent wrote:
> 
> On 6/14/06, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> On Wed, 2006-06-14 at 09:47, John E. Vincent wrote:
> > -- this is the third time I've tried sending this and I
> never saw it get
> > through to the list. Sorry if multiple copies show up.
> >
> > Hi all,
> 
> BUNCHES SNIPPED
> 
> > work_mem = 1048576 ( I know this is high but you should see
> some of our
> > sorts and aggregates)
> 
> Ummm.  That's REALLY high.  You might want to consider
> lowering the
> global value here, and then crank it up on a case by case
> basis, like 
> during nighttime report generation.  Just one or two queries
> could
> theoretically run your machine out of memory right now.  Just
> put a "set
> work_mem=100" in your script before the big query runs.
> 
> 
> I know it is but that's what we need for some of our queries. Our ETL
> tool (informatica) and BI tool (actuate) won't let us set those things
> as part of our jobs. We need it for those purposes. We have some
> really nasty queries that will be fixed in our new server. 

Description of "Queries gone wild" redacted.  hehe.

Yeah, I've seen those kinds of queries before too.  you might be able to
limit your exposure by using alter user:

alter user userwhoneedslotsofworkmem set work_mem=100;

and then only that user will have that big of a default.  You could even
make it so that only queries that need that much log in as that user,
and all other queries log in as other folks.  Just a thought.  I just
get REAL nervous seeing a production machine with a work_mem set that
high.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread John Vincent
On 6/14/06, Scott Marlowe <[EMAIL PROTECTED]> wrote:
Description of "Queries gone wild" redacted.  hehe.Yeah, I've seen those kinds of queries before too.  you might be able tolimit your exposure by using alter user:alter user userwhoneedslotsofworkmem set work_mem=100;
Is this applicable on  8.0? We were actually LOOKING for a governor of some sort for these queries.  And something that is not explicitly stated, is that allocated up front or is that just a ceiling?
and then only that user will have that big of a default.  You could evenmake it so that only queries that need that much log in as that user,
and all other queries log in as other folks.  Just a thought.  I justget REAL nervous seeing a production machine with a work_mem set thathigh.Which is actually how it's configured. We have a dedicated user connecting from  Actuate. The reports developers use thier own logins when developing new reports. Only when they get published do they convert to the Actuate user.
-- John E. Vincent[EMAIL PROTECTED]


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread Scott Marlowe
How long does gzip take to compress this backup?

On Wed, 2006-06-14 at 15:59, John Vincent wrote:
> Okay I did another test dumping using the uncompressed backup on the
> system unloaded and the time dropped down to 8m for the backup.
> There's still the size issue to contend with but as I said, I've got a
> fair bit of space left on the SAN to work with. 
> 
> On 6/14/06, John Vincent <[EMAIL PROTECTED]> wrote:
> Well I did a test to answer my own question:
> 
> -rw-r--r--  1 postgres postgres 167M Jun 14 01:43
> claDW_PGSQL-20060613170001.bak
> -rw-r--r--  1 root root 2.4G Jun 14 14:45
> claDW_PGSQL.test.bak 
> 
> the claDW_PGSQL database is a subset of the data in the main
> schema that I'm dealing with. 
> 
> I did several tests using -Fc -Z0 and a straight pg_dump with
> no format option.
> 
> The file size is about 1300% larger and takes just as long to
> dump even for that small database. 
> 
> Interestingly enough gzip compresses about 1M smaller with no
> gzip options.
> 
> I don't know that the uncompressed is really helping much. I'm
> going to run another query when there's no other users on the
> system and see how it goes. 
> 
> 
> 
> -- 
> John E. Vincent

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread Jim C. Nasby
On Wed, Jun 14, 2006 at 02:11:19PM -0400, John Vincent wrote:
> Out of curiosity, does anyone have any idea what the ratio of actual
> datasize to backup size is if I use the custom format with -Z 0 compression
> or the tar format?

-Z 0 should mean no compression.

Something you can try is piping the output of pg_dump to gzip/bzip2. On
some OSes, that will let you utilize 1 CPU for just the compression. If
you wanted to get even fancier, there is a parallelized version of bzip2
out there, which should let you use all your CPUs.
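For example, something along these lines (database name and path are
placeholders):

    pg_dump -Fc -Z0 mydb | gzip -1 > /backups/mydb-$(date +%Y%m%d).dump.gz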

Or if you don't care about disk IO bandwidth, just compress after the
fact (though, that could just put you in a situation where pg_dump
becomes bandwidth constrained).
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread John Vincent
time gzip -6 claDW_PGSQL.test.bak

real    3m4.360s
user    1m22.090s
sys     0m6.050s

Which is still less time than it would take to do a compressed pg_dump.

On 6/14/06, Scott Marlowe <[EMAIL PROTECTED]> wrote:
> How long does gzip take to compress this backup?
>
> On Wed, 2006-06-14 at 15:59, John Vincent wrote:
> > Okay I did another test dumping using the uncompressed backup on the
> > system unloaded and the time dropped down to 8m for the backup.
> > There's still the size issue to contend with but as I said, I've got a
> > fair bit of space left on the SAN to work with.


Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread John Vincent
On 6/14/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> On Wed, Jun 14, 2006 at 02:11:19PM -0400, John Vincent wrote:
> > Out of curiosity, does anyone have any idea what the ratio of actual
> > datasize to backup size is if I use the custom format with -Z 0 compression
> > or the tar format?
>
> -Z 0 should mean no compression.

But the custom format is still a binary backup, no?

> Something you can try is piping the output of pg_dump to gzip/bzip2. On
> some OSes, that will let you utilize 1 CPU for just the compression. If
> you wanted to get even fancier, there is a parallelized version of bzip2
> out there, which should let you use all your CPUs.
>
> Or if you don't care about disk IO bandwidth, just compress after the
> fact (though, that could just put you in a situation where pg_dump
> becomes bandwidth constrained).

Unfortunately if we are working with our current source box, the 1 CPU is
already the bottleneck in regards to compression. If I run the pg_dump
from the remote server though, I might be okay.

> --
> Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Performance of pg_dump on PGSQL 8.0

2006-06-14 Thread Jim C. Nasby
On Wed, Jun 14, 2006 at 05:18:14PM -0400, John Vincent wrote:
> On 6/14/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> >
> >On Wed, Jun 14, 2006 at 02:11:19PM -0400, John Vincent wrote:
> >> Out of curiosity, does anyone have any idea what the ratio of actual
> >> datasize to backup size is if I use the custom format with -Z 0
> >compression
> >> or the tar format?
> >
> >-Z 0 should mean no compression.
> 
> 
> But the custom format is still a binary backup, no?
 
I fail to see what that has to do with anything...

> Something you can try is piping the output of pg_dump to gzip/bzip2. On
> >some OSes, that will let you utilize 1 CPU for just the compression. If
> >you wanted to get even fancier, there is a parallelized version of bzip2
> >out there, which should let you use all your CPUs.
> >
> >Or if you don't care about disk IO bandwidth, just compress after the
> >fact (though, that could just put you in a situation where pg_dump
> >becomes bandwidth constrained).
> 
> 
> Unfortunately if we working with our current source box, the 1 CPU is
> already the bottleneck in regards to compression. If I run the pg_dump from
> the remote server though, I might be okay.

Oh, right, forgot about that. Yeah, your best bet could be to use an
external machine for the dump.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Solaris shared_buffers anomaly?

2006-06-14 Thread Josh Berkus
Folks,

First off, you'll be glad to know that I've persuaded two of the Sun 
performance engineers to join this list soon.   So you should be able to 
get more definitive answers to these questions.

Second, 7.4 still did linear scanning of shared_buffers as part of LRU and 
for other activities.   I don't know how that would cause swapping, but it 
certainly could cause dramatic slowdowns (like 2-5x) if you overallocated 
shared_buffers.   

Possibly this is also triggering a bug in Solaris 2.6.   2.6 is pretty 
darned old (1997); maybe you should upgrade?   We're testing with s_b set 
to 300,000 on Solaris 10 (Niagara) so this is obviously not a current 
Solaris issue.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Dan Gorman

All,
  So I thought I'd pose this question:

If I have a pg database attached to a powervault (PV) with just an  
off-the-shelf SCSI card, I generally want fsync on to prevent data  
corruption in case the PV should lose power.
However, if I have it attached to a NetApp that ensures data writes  
via the NVRAM, can I safely turn fsync off to gain additional  
performance?


Best Regards,
Dan Gorman



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread D'Arcy J.M. Cain
On Wed, 14 Jun 2006 14:48:04 -0700
Dan Gorman <[EMAIL PROTECTED]> wrote:
> If I have a pg database attached to a powervault (PV) with just an  
> off-the-shelf SCSI card I generally want fsync on to prevent data  
> corruption in case the PV should loose power.
> However, if I have it attached to a NetApp that ensures data writes  
> to via the NVRAM can I safely turn fsync off to gain additional  
> performance?

I wouldn't.  Remember, you still have to get the data to the NetApp.
You don't want things sitting in the computer's buffers when its power
goes down.

-- 
D'Arcy J.M. Cain  |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Mark Lewis
No.  You need fsync on in order to force the data to get TO the NetApp
at the right time.  With fsync off, the data gets cached in the
operating system.

-- Mark Lewis

On Wed, 2006-06-14 at 14:48 -0700, Dan Gorman wrote:
> All,
>So I thought I'd pose this question:
> 
> If I have a pg database attached to a powervault (PV) with just an  
> off-the-shelf SCSI card I generally want fsync on to prevent data  
> corruption in case the PV should loose power.
> However, if I have it attached to a NetApp that ensures data writes  
> to via the NVRAM can I safely turn fsync off to gain additional  
> performance?
> 
> Best Regards,
> Dan Gorman
> 
> 
> 
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Effects of cascading references in foreign keys

2006-06-14 Thread Bruce Momjian

Would someone please find the answer to Tom's last question?

---

Tom Lane wrote:
> I wrote:
> > Looking at this, I wonder if there isn't a bug or at least an
> > inefficiency in 8.1.  The KeysEqual short circuit tests are still there
> > in ri_triggers.c; aren't they now redundant with the test in triggers.c?
> > And don't they need to account for the special case mentioned in the
> > comment in triggers.c, that the RI check must still be done if we are
> > looking at a row updated by the same transaction that created it?
> 
> OK, I take back the possible-bug comment: the special case only applies
> to the FK-side triggers, which is to say RI_FKey_check, and that routine
> doesn't attempt to skip the check on equal old/new keys.  I'm still
> wondering though if the KeysEqual tests in the other RI triggers aren't
> now a waste of cycles.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq
> 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Confirmation of bad query plan generated by 7.4

2006-06-14 Thread Jim Nasby

On Jun 13, 2006, at 8:50 PM, Tom Lane wrote:


"Jim C. Nasby" <[EMAIL PROTECTED]> writes:

On Tue, Jun 13, 2006 at 06:04:42PM -0400, Tom Lane wrote:

It'd depend on the context, possibly, but it's easy to show that the
current planner does fold "now() - interval_constant" when making
estimates.  Simple example:


Turns out the difference is between feeding a date vs a timestamp  
into the
query... I would have thought that since date is a date that the  
WHERE clause

would be casted to a date if it was a timestamptz, but I guess not...


Hmm ... worksforme.  Could you provide a complete test case?


decibel=# create table date_test(d date not null, i int not null);
CREATE TABLE
decibel=# insert into date_test select now()-x*'1 day'::interval, i  
from generate_series(0,3000) x, generate_series(1,10) i;

INSERT 0 30010
decibel=# analyze verbose date_test;
INFO:  analyzing "decibel.date_test"
INFO:  "date_test": scanned 3 of 1622163 pages, containing  
555 live rows and 0 dead rows; 3 rows in sample, 300100155  
estimated total rows

ANALYZE
decibel=# explain select * from date_test where d >= now()-'15  
days'::interval;

                              QUERY PLAN
-------------------------------------------------------------------------
Seq Scan on date_test  (cost=0.00..6873915.80 rows=1228164 width=8)
   Filter: (d >= (now() - '15 days'::interval))
(2 rows)

decibel=# explain select * from date_test where d >= (now()-'15  
days'::interval)::date;

                              QUERY PLAN
-------------------------------------------------------------------------
Seq Scan on date_test  (cost=0.00..7624166.20 rows=1306467 width=8)
   Filter: (d >= ((now() - '15 days'::interval))::date)
(2 rows)

decibel=# select version();
                                 version
---------------------------------------------------------------------------
PostgreSQL 8.1.4 on amd64-portbld-freebsd6.0, compiled by GCC cc  
(GCC) 3.4.4 [FreeBSD] 20050518

(1 row)

decibel=#

--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Confirmation of bad query plan generated by 7.4

2006-06-14 Thread Tom Lane
Jim Nasby <[EMAIL PROTECTED]> writes:
> On Jun 13, 2006, at 8:50 PM, Tom Lane wrote:
>> Hmm ... worksforme.  Could you provide a complete test case?

> decibel=# create table date_test(d date not null, i int not null);
> [etc]

Not sure what you are driving at.  The estimates are clearly not
defaults (the default estimate would be 1/3rd of the table, or
about 100mil rows).  Are you expecting them to be the same?  If so why?
The comparison values are slightly different after all.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Greg Stark

Mark Lewis <[EMAIL PROTECTED]> writes:

> On Wed, 2006-06-14 at 14:48 -0700, Dan Gorman wrote:
> >
> > However, if I have it attached to a NetApp that ensures data writes  
> > to via the NVRAM can I safely turn fsync off to gain additional  
> > performance?
>
> No.  You need fsync on in order to force the data to get TO the NetApp
> at the right time.  With fsync off, the data gets cached in the
> operating system.

In fact the benefit of the NVRAM is precisely that it makes sure you *don't*
have any reason to turn fsync off. It should make the fsync essentially free.

-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Precomputed constants?

2006-06-14 Thread Zoltan Boszormenyi

Jim C. Nasby wrote:

On Wed, Jun 14, 2006 at 01:30:10PM +0200, Böszörményi Zoltán wrote:
  

Replacing random() with a true constant gives me index scan
even if it's hidden inside other function calls. E.g.:



The database has no choice but to compute random() for every row; it's
marked VOLATILE.
  


I see now, the docs about CREATE FUNCTION mention random(),
currval() and timeofday() as examples for VOLATILE.
But where in the documentation can I find this info about all
built-in functions? Thanks.

Best regards,
Zoltán Böszörményi


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Jonah H. Harris

On 14 Jun 2006 23:33:53 -0400, Greg Stark <[EMAIL PROTECTED]> wrote:

In fact the benefit of the NVRAM is precisely that it makes sure you *don't*
have any reason to turn fsync off. It should make the fsync essentially free.


Having run PostgreSQL on a NetApp with input from NetApp, this is
correct.  fsync should be turned on, but you will not incur the *real*
direct-to-disk cost of the sync, it will be direct-to-NVRAM.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Dan Gorman
That makes sense. Speaking of NetApp, we're using the 3050C with 4 FC  
shelves. Any generic advice, other than NetApp's own recommendations (their  
NFS Oracle tuning options), that might be useful? (e.g. turning off snapshots)

Regards,
Dan Gorman

On Jun 14, 2006, at 10:14 PM, Jonah H. Harris wrote:


On 14 Jun 2006 23:33:53 -0400, Greg Stark <[EMAIL PROTECTED]> wrote:
In fact the benefit of the NVRAM is precisely that it makes sure  
you *don't*
have any reason to turn fsync off. It should make the fsync  
essentially free.


Having run PostgreSQL on a NetApp with input from NetApp, this is
correct.  fsync should be turned on, but you will not incur the *real*
direct-to-disk cost of the sync, it will be direct-to-NVRAM.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Jonah H. Harris

On 6/15/06, Dan Gorman <[EMAIL PROTECTED]> wrote:

shelfs. Any generic advice other than the NetApp (their NFS oracle
tuning options) that might be useful? (e.g. turning off snapshots)


I was using PostgreSQL on a 980c, but feature-wise they're probably
pretty close.

What type of application are you running?  OLTP?  If so, what type of
transaction volume?  Are you planning to use any Flex* or Snap*
features?  What type of volume layouts are you using?

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Jonah H. Harris

On 6/15/06, Jonah H. Harris <[EMAIL PROTECTED]> wrote:

On 6/15/06, Dan Gorman <[EMAIL PROTECTED]> wrote:
> shelfs. Any generic advice other than the NetApp (their NFS oracle
> tuning options) that might be useful? (e.g. turning off snapshots)

I was using PostgreSQL on a 980c, but feature-wise they're probably
pretty close.

What type of application are you running?  OLTP?  If so, what type of
transaction volume?  Are you planning to use any Flex* or Snap*
features?  What type of volume layouts are you using?


Also, you mentioned NFS... is that what you were planning?  If you
licensed iSCSI, it's a bit better for the database from a performance
angle.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Joe Conway

Dan Gorman wrote:
That makes sense. Speaking of NetApp, we're using the 3050C with 4 FC  
shelfs. Any generic advice other than the NetApp (their NFS oracle  
tuning options)

that might be useful? (e.g. turning off snapshots)


I'm not sure if this is in the tuning advice you already have, but we 
use a dedicated gigabit interface to the NetApp, with jumbo (9K) frames, 
and an 8K NFS blocksize. We use this for both Oracle and Postgres when 
the database resides on NetApp.
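On Linux that ends up looking something like the following fstab entry (filer 
name and mount point are invented; the 8K block size is the rsize/wsize):

    netapp-db:/vol/pgdata  /pgdata  nfs  rw,hard,intr,tcp,rsize=8192,wsize=8192  0 0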


Joe

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[PERFORM]Is it possible to start two instances of postgresql?

2006-06-14 Thread kah_hang_ang




Hi,

Is it possible to start two instances of postgresql, with different ports and
directories, which run simultaneously?
If so, will this cause any problems or a performance drop?

Thanks.


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Postgres fsync off (not needed) with NetApp

2006-06-14 Thread Dan Gorman
Currently I have jumbo frames enabled on the NA and the switches and  
am also using the 32K R/W NFS options. Everything is gigE.


Regards,
Dan Gorman


On Jun 14, 2006, at 10:51 PM, Joe Conway wrote:


Dan Gorman wrote:
That makes sense. Speaking of NetApp, we're using the 3050C with 4  
FC  shelfs. Any generic advice other than the NetApp (their NFS  
oracle  tuning options)

that might be useful? (e.g. turning off snapshots)


I'm not sure if this is in the tuning advice you already have, but  
we use a dedicated gigabit interface to the NetApp, with jumbo (9K)  
frames, and an 8K NFS blocksize. We use this for both Oracle and  
Postgres when the database resides on NetApp.


Joe




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM]Is it possible to start two instances of postgresql?

2006-06-14 Thread A. Kretschmer
On 15.06.2006 at 13:58:20 +0800, [EMAIL PROTECTED] wrote the following:
> 
> 
> 
> 
> Hi,
> 
> Is it possible to start two instances of postgresql with different port and
> directory which run simultaneously?

Yes, this is possible, and this is the Debian way for updates.
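A minimal sketch of what that looks like (data directory and port below are
arbitrary examples):

    # second cluster in its own directory, listening on its own port
    initdb -D /var/lib/postgresql/data2
    pg_ctl -D /var/lib/postgresql/data2 -o "-p 5433" -l data2.log start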


> If can then will this cause any problem or performance drop down?

Of course, if you have high load in one database ... you have only one
machine.


HTH, Andreas
-- 
Andreas Kretschmer(Kontakt: siehe Header)
Heynitz:  035242/47215,  D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
 ===Schollglas Unternehmensgruppe=== 

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM]Is it possible to start two instances of postgresql?

2006-06-14 Thread kah_hang_ang




so what is the best way to implement two databases on one machine?
implement two postgresql instances with separate directories, or
implement both under one instance?

if I implement two databases in one instance and one of the databases
crashes, will it affect the other?




 
  "A. Kretschmer"   
 
  <[EMAIL PROTECTED]To:   
pgsql-performance@postgresql.org  
  las.com>   cc:
 
  Sent by:   Subject:  Re: 
[PERFORM]Is it possible to start two instances of postgresql? 
  [EMAIL PROTECTED] 
   
  tgresql.org   
 

 

 
  06/15/2006 02:07 PM   
 

 

 




On 15.06.2006 at 13:58:20 +0800, [EMAIL PROTECTED]
wrote the following:
>
>
>
>
> Hi,
>
> Is it possible to start two instances of postgresql with different port
and
> directory which run simultaneously?

Yes, this is possible, and this is the Debian way for updates.


> If can then will this cause any problem or performance drop down?

Of course, if you have high load in one database ... you have only one
machine.


HTH, Andreas
--
Andreas Kretschmer(Kontakt: siehe Header)
Heynitz:  035242/47215,  D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
 ===Schollglas Unternehmensgruppe===

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq