Re: [PERFORM] Problem with bitmap-index-scan plan

2006-07-18 Thread Kapadaidakis Yannis


Yes, that was the problem! Thank you very much.

On Thu, 13 Jul 2006, Tom Lane wrote:


[EMAIL PROTECTED] writes:

... is quite reasonable. The table has 1,000,000 rows (17,242 pages). From
pg_stat_get_blocks_fetched I can see that there were 102 page requests for
the table. So all things seem to work great here!



But if I multiply the size of the table ten-fold (10,000,000 rows - 172,414
pages) and run the same query I get:
...
which is even slower than a seq scan. Now I see that there were 131,398 page
requests for the table in order to retrieve almost 1,250 tuples! Can someone
explain why this is happening? All memory parameters are set to default.


You probably need to increase work_mem so that the bitmaps don't become
lossy ...

regards, tom lane
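[For reference, a minimal way to try that suggestion - the value below is
only an example, and note that 8.x's EXPLAIN does not report bitmap
lossiness directly, so the thing to watch is pg_stat_get_blocks_fetched
dropping back toward the number of matching tuples:

  SET work_mem = 32768;   -- in kB; unit suffixes like '32MB' need 8.2+
  -- then re-run the query under EXPLAIN ANALYZE and recheck
  -- pg_stat_get_blocks_fetched

Once the bitmap fits in work_mem it stays exact (one bit per tuple) rather
than degrading to one bit per heap page, which is what forces whole pages
to be fetched and rechecked.]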



Re: [PERFORM] Big differences in plans between 8.0 and 8.1

2006-07-18 Thread Gabriele Turchi
On Sun, 16/07/2006 at 11.08 -0700, Joe Conway wrote:
> Gabriele Turchi wrote:
> > On Sat, 15/07/2006 at 13.04 -0700, Joe Conway wrote:
> >>Why not just periodically (once an hour?) run "ANALYZE registrazioni;" 
> >>during the day. This will only update the statistics, and should be very 
> >>low impact.
> > 
> > This is my "solution" too... but is it enough? Or is there a better
> > way to do this? If performance in the best case is 50 times better
> > than in the worst case, how far can performance fall during the hour
> > (50-100 records inserted into "registrazioni") before the next
> > "ANALYZE" runs? And conversely, could running ANALYZE more frequently
> > hurt overall performance?
> 
> One thing I noticed is that in both plans there is a seq scan on 
> registrazioni. Given that performance degrades so quickly as records are 
> inserted into registrazioni, I'm wondering if you're missing an index. 
> What indexes do you have on registrazioni?
> 
> Joe

None. The application was not fine-tuned, because the original
performance (under 8.0) was "more than enough". I thought that creating
an index on a table with no more than a few hundred records would not be
useful...
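[A sketch of both suggestions; the indexed column is invented here, since
the actual schema of registrazioni was not posted:

  CREATE INDEX registrazioni_data_idx ON registrazioni (data);
  ANALYZE registrazioni;

and for the hourly statistics refresh, a cron entry along the lines of:

  0 * * * *  psql -d mydb -c 'ANALYZE registrazioni;'

ANALYZE only samples the table to update the planner statistics, so running
it hourly against a table of a few hundred rows should be essentially free.]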

My biggest doubt, in any case, concerns the very big difference between
the plans in 8.0 and 8.1 under the same conditions.

Thank you,
Gabriele




Re: [PERFORM] Temporary table retains old contents on update eventually causing slow temp file usage.

2006-07-18 Thread Rusty Conover

Sorry for replying to my own post.

I forgot to include my version information. I used:

PostgreSQL 8.1.0 on powerpc-apple-darwin8.3.0, compiled by GCC
powerpc-apple-darwin8-gcc-4.0.0 (GCC) 4.0.0 (Apple Computer, Inc. build 5026)

and

PostgreSQL 8.1.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC)
4.0.1 20050727 (Red Hat 4.0.1-5)


The same result happens on both.

Cheers,

Rusty
--
Rusty Conover
InfoGears Inc.
http://www.infogears.com






[PERFORM] Temporary table retains old contents on update eventually causing slow temp file usage.

2006-07-18 Thread Rusty Conover

Hi,

It would seem that making any changes to a temp table forces a copy of
the entire contents of the table to be retained in memory/disk. Is this
happening due to MVCC? Is there a way to change this behavior? It would
be very useful when you have really huge temp tables that need to be
updated a few times before they can be dropped.


Below is an example of the problem. I'll create a temp table, insert
600 rows (just a bunch of urls, you can use anything really), then
update the table a few times without actually changing anything. Of
course this test case doesn't really show the extent of the problem,
because it's such a small amount of data. When I have a temp table of
about 150 megs and do more than a few updates on it, it forces
postgresql to use the disk, making things really slow. Originally the
entire temp table fit into RAM.


I tried using savepoints and releasing them to see if it would make any
difference, and it did not, which isn't unexpected. Could
pg_relation_size() be incorrect in this case?


Cheers,

Rusty
--
Rusty Conover
InfoGears Inc.
http://www.infogears.com



test=# begin;
BEGIN
test=# create temp table test_urls (u text);
CREATE TABLE
test=# insert into test_urls (u) select url from url limit 600;
INSERT 0 600
test=# select pg_relation_size('test_urls');
 pg_relation_size
------------------
            73728
(1 row)

test=# update test_urls set u = u;
UPDATE 600
test=# select pg_relation_size('test_urls');
 pg_relation_size
------------------
           147456
(1 row)

test=# update test_urls set u = u;
UPDATE 600
test=# select pg_relation_size('test_urls');
 pg_relation_size
------------------
           212992
(1 row)

test=# update test_urls set u = u;
UPDATE 600
test=# select pg_relation_size('test_urls');
 pg_relation_size
------------------
           286720
(1 row)

test=# update test_urls set u = u;
UPDATE 600
test=# select pg_relation_size('test_urls');
 pg_relation_size
------------------
           352256
(1 row)

test=# update test_urls set u = u;
UPDATE 600
test=# select pg_relation_size('test_urls');
 pg_relation_size
------------------
           425984
(1 row)
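[The numbers above line up with the MVCC explanation in the reply below:
each UPDATE of 600 rows writes 600 new row versions while keeping the old
ones, so the relation grows by 65-74 kB (8-9 pages of 8 kB) per pass, about
what 600 short url tuples occupy. After five updates the table holds six
versions of every row, and 425984 / 73728 is roughly 5.8x the original
size.]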





Re: [PERFORM] Temporary table retains old contents on update eventually

2006-07-18 Thread Gavin Sherry
On Tue, 18 Jul 2006, Rusty Conover wrote:

> Hi,
>
> It would seem that doing any changes on a temp table forces a copy of
> the entire contents of the table to be retained in memory/disk. Is
> this  happening due to MVCC?  Is there a way to change this behavior?
> It could be very useful when you have really huge temp tables that
> need to be updated a few times before they can be dropped.

This is caused by our MVCC implementation. It cannot be easily changed. We
rely on MVCC for two things: concurrency and rolling back of aborted
commands. Without the latter, we couldn't support the following trivially:

template1=# create temp table bar (i int);
CREATE TABLE
template1=# begin;
BEGIN
template1=# insert into bar values(1);
INSERT 0 1
template1=# abort;
ROLLBACK
template1=# select * from bar;
 i
---
(0 rows)

It would be nice if we could special-case temp tables, since concurrency
does not come into the equation, but I cannot see it happening without a
generalised overwriting MVCC system.

The only alternative in the meantime is to vacuum your temporary table(s)
as part of your interaction with them.

Thanks,

Gavin
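[A sketch of that workaround, using the test_urls example from the original
post. Note that VACUUM cannot run inside a transaction block (see the
follow-up below), so this only helps between transactions:

  UPDATE test_urls SET u = u;
  VACUUM test_urls;                      -- marks the dead row versions reusable
  SELECT pg_relation_size('test_urls');  -- growth should now level off

Lazy VACUUM does not shrink the file, but the freed space is reused by
later updates instead of the relation being extended.]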



Re: [PERFORM] RAID stripe size question

2006-07-18 Thread Ron Peacetree
>From: Alex Turner <[EMAIL PROTECTED]>
>Sent: Jul 18, 2006 12:21 AM
>To: Ron Peacetree <[EMAIL PROTECTED]>
>Cc: Mikael Carneholm <[EMAIL PROTECTED]>, pgsql-performance@postgresql.org
>Subject: Re: [PERFORM] RAID stripe size question
>
>On 7/17/06, Ron Peacetree <[EMAIL PROTECTED]> wrote:
>>
>> -Original Message-
>> >From: Mikael Carneholm <[EMAIL PROTECTED]>
>> >Sent: Jul 17, 2006 5:16 PM
>> >To: Ron  Peacetree <[EMAIL PROTECTED]>,
>> pgsql-performance@postgresql.org
>> >Subject: RE: [PERFORM] RAID stripe size question
>> >
>> >I use 90% of the raid cache for writes, don't think I could go higher
>> >than that.
>> >Too bad the emulex only has 256Mb though :/
>> >
>> If your RAID cache hit rates are in the 90+% range, you probably would
>> find it profitable to make it greater.  I've definitely seen access patterns
>> that benefitted from increased RAID cache for any size I could actually
>> install.  For those access patterns, no amount of RAID cache commercially
>> available was enough to find the "flattening" point of the cache percentage
>> curve.  256MB of BB RAID cache per HBA is just not that much for many IO
>> patterns.
>
>90% as in 90% of the RAM, not 90% hit rate I'm imagining.
>
Either way, =particularly= for OLTP-like I/O patterns, the more RAID cache the
better, unless the IO pattern is completely random, in which case the best you
can do is cache the entire sector map of the RAID set and use as many spindles
as possible for the tables involved.  I've seen high-end setups in Fortune
2000 organizations that look like some of the things you read about on tpc.org:
=hundreds= of HDs are used.

Clearly, completely random IO patterns are to be avoided whenever and however 
possible.

Thankfully, most things can be designed not to have completely random IO, and
stuff like WAL IO is definitely not random.

The important point here about cache size is that unless you make the cache
large enough to see a flattening in the cache behavior, you probably can still
use more cache.  Working sets are often very large for DB applications.

>>The controller is a FC2143 (
>> http://h71016.www7.hp.com/dstore/MiddleFrame.asp?page=config&ProductLineId=450&FamilyId=1449&BaseId=17621&oi=E9CED&BEID=19701&SBLID=),
>> which uses PCI-E. Don't know how it compares to other controllers, haven't
>> had the time to search for / read any reviews yet.
>> >
>> This is a relatively low end HBA with 1 4Gb FC on it.  Max sustained IO on
>> it is going to be ~320MBps.  Or ~ enough for an 8 HD RAID 10 set made of
>> 75MBps ASTR HD's.
>>
>> 28 such HDs are =definitely= IO choked on this HBA.
>
>No, they aren't.  This is OLTP, not data warehousing.  I already posted the
>math for OLTP throughput, which is on the order of 8-80MB/second of actual
>data throughput based on maximum theoretical seeks/second.
>
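[A rough sketch of that arithmetic, with assumed figures: a 10K rpm drive
manages on the order of 100-150 random seeks/s, so 28 spindles give roughly
2800-4200 seeks/s in theory; at 8KB fetched per random seek that is only
about 22-33MB/s of actual data, well under the HBA's ~320MBps sequential
ceiling.]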
WAL IO patterns are not OLTP-like.  Neither are most decision-support IO
patterns.  Even in an OLTP system, there are usually only a few scenarios and
tables where the IO pattern is pessimal.
Alex is quite correct that those few will be the bottleneck on overall system
performance if the system's primary function is OLTP-like.

For those few, you dedicate as many spindles and as much RAID cache as you can
afford and as keep showing a performance benefit.  I've seen an entire HBA
maxed out with cache, with as many HDs as would saturate the attainable IO
rate, dedicated to =1= table (unfortunately SSD was not a viable option in
this case).


>>The arithmetic suggests you need a better HBA or more HBAs or both.
>>
>>
>> >>WAL's are basically appends that are written in bursts of your chosen
>> log chunk size and that are almost never read afterwards.  Big DB pages and
>> big RAID stripes makes sense for WALs.
>
>
>unless of course you are running OLTP, in which case a big stripe isn't
>necessary, spend the disks on your data parition, because your WAL activity
>is going to be small compared with your random IO.
>
Or to put it another way, the scenarios and tables with the most random-looking
IO patterns are going to be the performance bottleneck on the whole system.
In an OLTP-like system, WAL IO is unlikely to be your biggest performance
issue.  As in any other performance tuning effort, you only gain by speeding
up the current bottleneck.


>>
>> >According to
>> http://www.pcguide.com/ref/hdd/perf/raid/concepts/perfStripe-c.html, it
>> seems to be the other way around? ("As stripe size is decreased, files are
>> broken into smaller and smaller pieces. This increases the number of drives
>> that an average file will use to hold all the blocks containing the data of
>> that file, theoretically increasing transfer performance, but decreasing
>> positioning performance.")
>> >
>> >I guess I'll have to find out which theory holds by good ol' trial
>> and error... :)
>> >
>> IME, stripe sizes of 64, 128, or 256 are the most common found to be
>> optimal for most access patterns + SW + FS + OS + HW.
>
>
>New rec

Re: [PERFORM] RAID stripe size question

2006-07-18 Thread Mikael Carneholm
> This is a relatively low end HBA with 1 4Gb FC on it.  Max sustained
> IO on it is going to be ~320MBps.  Or ~ enough for an 8 HD RAID 10 set
> made of 75MBps ASTR HD's.

Looking at http://h30094.www3.hp.com/product.asp?sku=2260908&extended=1,
I notice that the controller has an Ultra160 SCSI interface, which implies
that the theoretical max throughput is 160MB/s. Ouch.

However, what's more important is the seeks/s - ~530/s on a 28 disk
array is quite lousy compared to the 1400/s on a 12 x 15K disk array as
mentioned by Mark here:
http://archives.postgresql.org/pgsql-performance/2006-07/msg00170.php.
It could be the disk RPM (10K vs 15K) that makes the difference here...

I will test another stripe size (128K) for the DATA lun (28 disks) to
see what difference that makes; I think I read somewhere that Linux
flushes blocks of 128K at a time, so it might be worth evaluating.

/Mikael





Re: [PERFORM] Temporary table retains old contents on update eventually causing slow temp file usage.

2006-07-18 Thread Rusty Conover


On Jul 18, 2006, at 6:22 AM, Gavin Sherry wrote:

> On Tue, 18 Jul 2006, Rusty Conover wrote:
>
>> It would seem that making any changes to a temp table forces a copy of
>> the entire contents of the table to be retained in memory/disk. Is
>> this happening due to MVCC? [...]
>
> This is caused by our MVCC implementation. It cannot be easily changed.
> We rely on MVCC for two things: concurrency and rolling back of aborted
> commands. [...]
>
> It would be nice if we could special-case temp tables, since concurrency
> does not come into the equation, but I cannot see it happening without a
> generalised overwriting MVCC system.
>
> The only alternative in the meantime is to vacuum your temporary
> table(s) as part of your interaction with them.


I forgot to add in my original post that the temporary tables I'm
dealing with have the "on commit drop" flag, so persisting beyond the
transaction isn't actually needed.  But I don't think that makes any
difference, because of savepoints' required functionality.
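[I.e., the tables in question are created along these lines (a sketch; the
real DDL was not posted):

  CREATE TEMP TABLE test_urls (u text) ON COMMIT DROP;

With ON COMMIT DROP the table only ever lives inside a single transaction,
which is exactly why vacuuming it is awkward: as noted below, VACUUM cannot
run inside a transaction block.]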


The problem with vacuuming is that right now you can't run it inside a
transaction.


Reading vacuum.c though, it leaves the door open:

/*
 * We cannot run VACUUM inside a user transaction block; if we were inside
 * a transaction, then our commit- and start-transaction-command calls
 * would not have the intended effect! Furthermore, the forced commit that
 * occurs before truncating the relation's file would have the effect of
 * committing the rest of the user's transaction too, which would
 * certainly not be the desired behavior.  (This only applies to VACUUM
 * FULL, though.  We could in theory run lazy VACUUM inside a transaction
 * block, but we choose to disallow that case because we'd rather commit
 * as soon as possible after finishing the vacuum.  This is mainly so that
 * we can let go the AccessExclusiveLock that we may be holding.)
 *
 * ANALYZE (without VACUUM) can run either way.
 */

Since we're dealing with a temporary table, we shouldn't have any
problems with the AccessExclusiveLock.  Would lazy vacuuming mark the
pages as free?  I assume it wouldn't release them or shrink the size of
the relation, but could they be reused by later updates in that same
transaction?


Cheers,

Rusty
--
Rusty Conover
InfoGears Inc.
Web: http://www.infogears.com






Re: [PERFORM] RAID stripe size question

2006-07-18 Thread Luke Lonergan



Mikael,

On 7/18/06 6:34 AM, "Mikael Carneholm" <[EMAIL PROTECTED]> wrote:

> However, what's more important is the seeks/s - ~530/s on a 28 disk
> array is quite lousy compared to the 1400/s on a 12 x 15Kdisk array

I'm getting 2500 seeks/second on a 36 disk SATA software RAID (ZFS, Solaris 10) on a Sun X4500:

=== Single Stream 

With a very recent update to the zfs module that improves I/O scheduling and prefetching, I get the following bonnie++ 1.03a results with a 36 drive RAID10, Solaris 10 U2 on an X4500 with 500GB Hitachi drives (zfs checksumming is off):

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
thumperdw-i-1   32G 120453  99 467814  98 290391  58 109371  99 993344  94  1801   4
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 +++++ +++ +++++ +++ +++++ +++ 30850  99 +++++ +++ +++++ +++

=== Two Streams 

Bumping the number of concurrent processes up to 2, we get about 1.5x the
single-stream read speed out of the RAID10 under a concurrent workload (you
have to add the two streams' rates together):

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
thumperdw-i-1   32G 111441  95 212536  54 171798  51 106184  98 719472  88  1233   2
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 26085  90 +++++ +++  5700  98 21448  97 +++++ +++  4381  97

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
thumperdw-i-1   32G 116355  99 212509  54 171647  50 106112  98 715030  87  1274   3
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 26082  99 +++++ +++  5588  98 21399  88 +++++ +++  4272  97

So that's 2500 seeks per second, 1440MB/s sequential block read, 212MB/s
per-character sequential read.
===

- Luke





Re: [PERFORM] RAID stripe size question

2006-07-18 Thread Alex Turner
This is a great testament to the fact that very often software RAID will
seriously outperform hardware RAID, because the OS guys who implemented it
took the time to do it right, as compared with some controller manufacturers
who seem to think it's okay to provide sub-standard performance.

Based on the bonnie++ numbers coming back from your array, I would also
encourage you to evaluate software RAID, as you might see significantly
better performance as a result.  RAID 10 is also a good candidate, as it's
not so heavy on the cache and CPU as RAID 5.

Alex.

On 7/18/06, Luke Lonergan <[EMAIL PROTECTED]> wrote:

> Mikael,
>
> I'm getting 2500 seeks/second on a 36 disk SATA software RAID (ZFS,
> Solaris 10) on a Sun X4500:
> [bonnie++ results as in the previous message]
>
> So that's 2500 seeks per second, 1440MB/s sequential block read,
> 212MB/s per-character sequential read.
>
> - Luke


Re: [PERFORM] RAID stripe size question

2006-07-18 Thread Scott Marlowe
On Tue, 2006-07-18 at 14:27, Alex Turner wrote:
> This is a great testament to the fact that very often software RAID
> will seriously outperform hardware RAID because the OS guys who
> implemented it took the time to do it right, as compared with some
> controller manufacturers who seem to think it's okay to provide
> sub-standard performance.
>
> Based on the bonnie++ numbers coming back from your array, I would
> also encourage you to evaluate software RAID, as you might see
> significantly better performance as a result.  RAID 10 is also a good
> candidate as it's not so heavy on the cache and CPU as RAID 5.

Also, consider testing a mix, where your hardware RAID controller does
the mirroring and the OS stripes (RAID 0) over the top of it.  I've
gotten good performance from mediocre hardware cards doing this.  It has
the advantage of still being able to use the battery-backed cache and
its instant fsync, while not relying on cards that have issues layering
one RAID level atop another.
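[On Linux, for instance, that layering might look like the following -
device names are purely illustrative, with each /dev/sdX standing in for a
mirrored pair exported by the hardware controller:

  mdadm --create /dev/md0 --level=0 --chunk=256 --raid-devices=2 /dev/sda /dev/sdb

i.e. a software RAID 0 stripe (256KB chunks) over hardware RAID 1 pairs,
giving RAID 1+0 with the battery-backed write cache still in the path.]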



Re: [PERFORM] RAID stripe size question

2006-07-18 Thread Ron Peacetree
Have you done any experiments implementing RAID 50 this way (HBA does RAID 5, 
OS does RAID 0)?  If so, what were the results?

Ron

-Original Message-
>From: Scott Marlowe <[EMAIL PROTECTED]>
>Sent: Jul 18, 2006 3:37 PM
>Subject: Re: [PERFORM] RAID stripe size question
>
>Also, consider testing a mix, where your hardware RAID controller does
>the mirroring and the OS stripes (RAID 0) over the top of it.  I've
>gotten good performance from mediocre hardware cards doing this. [...]




Re: [PERFORM] RAID stripe size question

2006-07-18 Thread Scott Marlowe
Nope, haven't tried that.  At the time I was testing this I didn't even
think of trying it.  I'm not even sure I'd heard of RAID 50 at the
time... :)

I basically had an old MegaRAID 4xx series card in a dual PPro 200 and a
stack of six 9-gig hard drives.  Spare parts.  And even though the RAID
1+0 was relatively much faster on this hardware, a dual P IV 2800 with
a pair of 15k USCSI drives and a much later model MegaRAID ate it for
lunch with a single mirror set, and was plenty fast for our use at the
time, so I never really had call to test it in production.

But it definitely made our test server, the aforementioned PPro 200
machine, more livable.

On Tue, 2006-07-18 at 14:43, Ron Peacetree wrote:
> Have you done any experiments implementing RAID 50 this way (HBA does
> RAID 5, OS does RAID 0)?  If so, what were the results?
>
> Ron



Re: [PERFORM] RAID stripe size question

2006-07-18 Thread Milen Kulev
According to
http://www.pcguide.com/ref/hdd/perf/raid/concepts/perfStripe-c.html, it seems
to be the other way around? ("As stripe size is decreased, files are broken
into smaller and smaller pieces. This increases the number of drives that an
average file will use to hold all the blocks containing the data of that
file, theoretically increasing transfer performance, but decreasing
positioning performance.")

Mikael,

In OLTP you absolutely need the best possible latency.  If you decompose
the response time of a physical request, you will see that positioning
performance plays the dominant role (ignoring for a moment caches and
their effects).

So, if you need really good response times for your SQL queries, choose
15K rpm disks (and add as much cache as possible to magnify the
effect ;) )

Best Regards,
Milen




Re: [PERFORM] Query plan issue when upgrading to postgres 8.1.4 (from

2006-07-18 Thread Ioana Danes
Hi Chris,

Here are the results of my query for postgresql 8.0.3 and 8.1.4. For
postgresql 8.1.4 there are 2 results: one for the test table having the same
indexes as in 8.0.3, and a second one for a new index on the test table on
(testtype, testid) that speeds up my query. This last index fixes my problem
for this particular query.

In the Test table there are 19,494,826 records, and 11,090 records have
testtype = 1455. The data on both servers is identical, and on both servers
I ran vacuum analyze prior to executing these queries. As can be seen, the
result in postgresql 8.1.4 is very slow, and I am wondering why that is.
Bug, missing configuration, ...?

1. Result on Postgresql 8.0.3:
------------------------------
# explain analyze select max(TESTID) from TEST where TESTTYPE = 1455;
 Aggregate  (cost=391.56..391.56 rows=1 width=8) (actual time=94.707..94.711 rows=1 loops=1)
   ->  Index Scan using ix_test_testtype on test  (cost=0.00..355.18 rows=14551 width=8) (actual time=0.036..51.089 rows=11090 loops=1)
         Index Cond: (testtype = 1455)
 Total runtime: 94.778 ms
(4 rows)

# select max(TESTID) from TEST where TESTTYPE = 1455;
   max
----------
 18527829
(1 row)
Time: 13.447 ms

2. Result on Postgresql 8.1.4 (with the same indexes as in 8.0.3):
------------------------------------------------------------------
 Result  (cost=32.78..32.79 rows=1 width=0) (actual time=1865.406..1865.408 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..32.78 rows=1 width=8) (actual time=1865.378..1865.381 rows=1 loops=1)
           ->  Index Scan Backward using pk_testid on test  (cost=0.00..464069.25 rows=14155 width=8) (actual time=1865.371..1865.371 rows=1 loops=1)
                 Filter: ((testid IS NOT NULL) AND (testtype = 1455))
 Total runtime: 1865.522 ms
(6 rows)

# select max(TESTID) from TEST where TESTTYPE = 1455;
   max
----------
 18527829
Time: 1858.076 ms

3. Result on Postgresql 8.1.4 (after creating an index on (testtype, testid)):
-------------------------------------------------------------------------------
# explain analyze select max(TESTID) from TEST where TESTTYPE = 1455;
 Result  (cost=1.71..1.72 rows=1 width=0) (actual time=0.069..0.070 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..1.71 rows=1 width=8) (actual time=0.055..0.056 rows=1 loops=1)
           ->  Index Scan Backward using ix_test2 on test  (cost=0.00..24248.92 rows=14155 width=8) (actual time=0.050..0.050 rows=1 loops=1)
                 Index Cond: (testtype = 1455)
                 Filter: (testid IS NOT NULL)
 Total runtime: 0.159 ms

# select max(TESTID) from TEST where TESTTYPE = 1455;
   max
----------
 18527829
Time: 1.029 ms

Thank you very much,
Ioana Danes

Chris <[EMAIL PROTECTED]> wrote:
> You've left out the best details. Post an 'explain analyze' from both
> versions, and don't cut anything out :)
>
> I'm guessing postgres is seeing an index on the table is faster because
> it doesn't think you have many rows in the table. How many are there,
> and have you done an analyze of the table after loading the data in?
>
> --
> Postgresql & php tutorials
> http://www.designmagick.com/
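[For reference, the fix described in point 3 is simply (index name as given
in the post):

  CREATE INDEX ix_test2 ON test (testtype, testid);

In 8.1, max() is planned as a backward index scan with an implicit LIMIT 1;
without testtype as the leading index column, that scan has to walk
pk_testid from the top, filtering out every row whose testtype is not
1455 - hence the 1865 ms.]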

[PERFORM] Performance penalty for remote access of postgresql (8.1.3)? any experience?

2006-07-18 Thread Guoping Zhang
Hello,

I am seeking advice/comments/experience you may have had regarding the
performance cost of remote access to postgresql 8.1.X.

I have two servers: one is a Sun V240 (say server A) and the other is a dual
Intel Xeon (say server B), both running Solaris 10.

Server A has postgresql 8.1.3 installed along with pgpool (pgpool-3.0.2);
server B has pgpool (v3.0.2) installed.

The test program is installed on both A and B; the test application on
server B accesses the DBMS on A through pgpool.

Note that the test code is not fancy but can insert a large number of records
(say 100k rows) with a configurable transaction size.

Following are the results (each repeated many times, so the mean values
should be accurate) for various settings, with a fixed 100k-row insert
workload and a transaction size of 100 rows:
--
1. Test program running on server A directly accessing the LOCAL postgresql:
24.03 seconds
2. Test program running on server A accessing the LOCAL postgresql through
pgpool: 30.05 seconds
3. Test program running on server B accessing the REMOTE postgresql (on A)
through its local pgpool: 74.06 seconds
--
I have to say that both machines are very lightly loaded and are
interconnected over a local LAN.

From 1 and 2, pgpool adds about 25% overhead; that sounds plausible, but is
there any way to reduce it?

From 2 and 3, it appears that remote access is much slower than local
access.

My questions are:
  a) Has anyone had a similar experience? How do you deal with it?
  b) Why does the TCP stack impose such a big delay? Are there any tuning
points I should look at?


The time call reports

  for test 2:
    real    0m32.71s
    user    0m2.42s
    sys     0m2.65s

  for test 3:
    real    1:14.0
    user    2.5
    sys     3.2

   c) Obviously, the CPU time (user + sys) for both tests is very similar,
but the overall time is quite different. I assume the time spent in the TCP
stack makes the difference.
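[Rough arithmetic supporting that assumption: test 3 takes about 44 seconds
longer than test 2 for 100k inserts, i.e. ~0.44 ms of extra wall-clock time
per INSERT - about one LAN round trip each - while the CPU numbers barely
move. The client simply waits on the network once per statement.]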


Many thanks,
Regards,
Guoping Zhang




Re: [PERFORM] Performance penalty for remote access of postgresql (8.1.3)? any experience?

2006-07-18 Thread Florian Weimer
* Guoping Zhang:

>   a) Has anyone had a similar experience? How do you deal with it?
>   b) Why does the TCP stack impose such a big delay? Are there any tuning
> points I should look at?

If you use INSERT, you'll incur a network round-trip delay for each
record.  Try using COPY FROM instead, possibly to a temporary table if
you need more complex calculations.  If you do this, there won't be a
huge difference between local and remote access as long as the
bandwidth is sufficient.
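[A sketch of that approach; the table and column names here are invented:

  CREATE TEMP TABLE staging (a int, b text);
  COPY staging FROM STDIN;   -- stream all rows over the connection in one go
  -- ... terminate the stream with \., then:
  INSERT INTO real_table SELECT a, b FROM staging;

From psql, \copy staging from 'data.txt' ships a local file the same way.]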

-- 
Florian Weimer<[EMAIL PROTECTED]>
BFK edv-consulting GmbH   http://www.bfk.de/
Durlacher Allee 47tel: +49-721-96201-1
D-76131 Karlsruhe fax: +49-721-96201-99
